Fallacies

A fallacy is a kind of error in reasoning. The list of fallacies below contains 231 names of the most common fallacies, and it provides brief explanations and examples of each of them. Fallacious reasoning should not be persuasive, but it too often is.

The vast majority of the commonly identified fallacies involve arguments, although some involve only explanations, or definitions, or questions, or other products of reasoning. Some researchers, although not most, use the term “fallacy” very broadly to indicate any false belief or cause of a false belief. The long list below includes some fallacies of these sorts if they have commonly-known names, but most are fallacies that involve kinds of errors made while arguing informally in natural language, that is, in everyday discourse.

A charge of fallacious reasoning always needs to be justified. The burden of proof is on your shoulders when you claim that someone’s reasoning is fallacious. Even if you do not explicitly give your reasons, it is your responsibility to be able to give them if challenged.

A piece of reasoning can have more than one fault and thereby commit more than one fallacy. If it is fallacious, this can be because of its form or its content or both. The formal fallacies are fallacious only because of their logical form, their structure. The Slippery Slope Fallacy is an informal fallacy that has the following form: Step 1 often leads to step 2. Step 2 often leads to step 3. Step 3 often leads to…until we reach an obviously unacceptable step, so step 1 is not acceptable. That form occurs in both good arguments and faulty arguments. The quality of an argument of this form depends crucially on the strength of the probabilities in going from one step to the next. The probabilities involve the argument’s content, not merely its logical form.
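
To see why the probabilities matter, consider a worked calculation, with numbers invented purely for illustration. Suppose each of the nine transitions from step 1 to step 10 succeeds with probability 0.9. Then the chance that step 1 eventually leads to step 10 is at most

    0.9 × 0.9 × … × 0.9 (nine factors) = 0.9^9 ≈ 0.39

so a conclusion built from individually probable steps can still be more likely false than true, which is why arguments of this form must be judged by their content and not merely their structure.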

The discussion below that precedes the long alphabetical list of fallacies begins with an account of the ways in which the term “fallacy” is imprecise. Attention then turns to some of the competing and overlapping ways to classify fallacies of argumentation. Researchers in the field of fallacies disagree about which name of a fallacy is more helpful to use, whether some fallacies should be de-emphasized in favor of others, and which is the best taxonomy of the fallacies. Researchers in the field are also deeply divided about how to define the term “fallacy” itself and how to define certain fallacies. There is no agreement on whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics regarding whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.

Table of Contents

  1. Introduction
  2. Taxonomy of Fallacies
  3. Pedagogy
  4. What is a Fallacy?
  5. Other Controversies
  6. Partial List of Fallacies
  7. References and Further Reading

1. Introduction

The first known systematic study of fallacies was due to Aristotle in his De Sophisticis Elenchis (Sophistical Refutations), an appendix to his Topics, which is one of his six works on logic. These six are collectively known as the Organon. He listed thirteen types of fallacies. Very few advances were made for many centuries after this. After the Dark Ages, fallacies again were studied systematically in Medieval Europe. This is why so many fallacies have Latin names. The third major period of study of the fallacies began in the later twentieth century due to renewed interest from the disciplines of philosophy, logic, communication studies, rhetoric, psychology, and artificial intelligence.

The more frequent the error within public discussion and debate, the more likely it is to have a name. Nevertheless, there is no specific name for the fallacy of subtracting five from thirteen and concluding that the answer is seven, even though the error is common.

The term “fallacy” is not a precise term. One reason is that it is ambiguous. Depending on the particular theory of fallacies, it might refer either to (a) a kind of error in an argument, (b) a kind of error in reasoning (including arguments, definitions, explanations, questions, and so forth), (c) a false belief, or (d) the cause of any of the previous errors including what are normally referred to as “rhetorical techniques.”

Regarding (d), being ill, being hungry, being stupid, being hypercritical, and being careless are all sources of potential error in reasoning, so they could qualify as fallacies of kind (d), but they are not included in the list below, and most researchers on fallacies normally do not call them fallacies. These sources of errors are more about why people commit a fallacy than about what the fallacy is. On the other hand, wishful thinking, stereotyping, being superstitious, rationalizing, and having a poor sense of proportion also are sources of potential error and are included in the list below, though they would not be included in the lists of some researchers. Thus there is a certain arbitrariness to what appears in lists such as this. What have been left off the list below are the following persuasive techniques commonly used to influence others and to cause errors in reasoning: apple polishing, ridiculing, applying financial pressure, being sarcastic, selecting terms with strong negative or positive associations, using innuendo, weaseling, and using other propaganda techniques. Basing any reasoning primarily on the effectiveness of one or more of these techniques is fallacious.

The fallacy literature has given some attention to the epistemic role of reasoning. Normally, the goal in reasoning is to take the audience from not knowing to knowing, or from not being justified in believing something to being justified in believing it. If a fallacy is defined as a failure to achieve this epistemic goal, then begging the question, which is a form of repeating the conclusion in the premises, counts as a fallacy even though it is deductively valid; so, reasoning validly is not a guarantee of avoiding a fallacy.

In describing the fallacies below, the custom is followed of not distinguishing between a reasoner using a fallacy and the reasoning itself containing the fallacy.

Real arguments are often embedded within a very long discussion. Richard Whately, one of the greatest of the 19th century researchers into informal logic, wisely said “A very long discussion is one of the most effective veils of Fallacy; …a Fallacy, which when stated barely…would not deceive a child, may deceive half the world if diluted in a quarto volume.”

2. Taxonomy of Fallacies

The importance of understanding the common fallacy labels is that they provide an efficient way to communicate criticisms of someone’s reasoning. However, there are a number of competing and overlapping ways to classify the labels. The taxonomy of the fallacies is in dispute.

Multiple names of fallacies are often grouped together under a common name intended to bring out how the specific fallacies are similar. Here are three examples. (1) Fallacies of relevance include fallacies that occur due to reliance on an irrelevant reason. There are different kinds of these fallacies. Ad Hominem, Appeal to Pity, and Affirming the Consequent are all fallacies of relevance. (2) Accent, Amphiboly and Equivocation are examples of fallacies of ambiguity. (3) The fallacies of illegitimate presumption include Begging the Question, False Dilemma, No True Scotsman, Complex Question and Suppressed Evidence.

The fallacies of argumentation can be classified as either formal or informal. A formal fallacy can be detected by examining the logical form of the reasoning, whereas an informal fallacy usually cannot be detected this way because it depends upon the content of the reasoning and possibly the purpose of the reasoning. So, informal fallacies are errors of reasoning that cannot easily be expressed in our standard system of formal logic, the first-order predicate logic. The long list below contains very few formal fallacies. Fallacious arguments (as well as perfectly correct arguments) can be classified as deductive or inductive, depending upon whether the fallacious argument is most properly assessed by deductive standards or instead by inductive standards. Deductive standards demand deductive validity, but inductive standards require inductive strength such as making the conclusion more likely.

Fallacies of argumentation can be divided into other categories. Some classifications depend upon the psychological factors that lead people to use them. Those fallacies also can be divided into categories according to the epistemological factors that cause the error. For example, arguments depend upon their premises, even if a person has ignored or suppressed one or more of them, and a premise can be justified at one time, given all the available evidence at that time, even if we later learn that the premise was false. Also, even though appealing to a false premise is often fallacious, it is not fallacious when we are reasoning counterfactually about what would have happened if the premise had been true, even though it was not.

3. Pedagogy

It is commonly claimed that giving a fallacy a name and studying it will help the student identify the fallacy in the future and will steer them away from using the fallacy in their own reasoning. As Steven Pinker says in The Stuff of Thought (p. 129),

If a language provides a label for a complex concept, that could make it easier to think about the concept, because the mind can handle it as a single package when juggling a set of ideas, rather than having to keep each of its components in the air separately. It can also give a concept an additional label in long-term memory, making it more easily retrievable than ineffable concepts or those with more roundabout verbal descriptions.

For pedagogical purposes, researchers in the field of fallacies disagree about the following topics: which name of a fallacy is more helpful to students’ understanding; whether some fallacies should be de-emphasized in favor of others; and which is the best taxonomy of the fallacies.

It has been suggested that, from a pedagogical perspective, having a representative set of fallacies pointed out to you in others’ reasoning is much more effective than taking the trouble to learn the rules for avoiding all fallacies in the first place. But fallacy theory is criticized by some teachers of informal reasoning for its over-emphasis on poor reasoning rather than good reasoning. Do colleges teach Calculus by emphasizing all the ways one can make mathematical mistakes? Besides, these critics say, studying fallacies encourages students to become overly critical. The critics want more emphasis on the forms of good arguments and on the implicit rules that govern proper discussion designed to resolve a difference of opinion.

4. What is a Fallacy?

Researchers disagree about how to define the very term “fallacy.” For example, most researchers say fallacies may be created unintentionally or intentionally, but some researchers say that a supposed fallacy created unintentionally should be called a blunder and not a fallacy.

Could there be a computer program, for instance, that could always successfully distinguish a fallacy from a non-fallacy? A fallacy is a mistake, but not every mistake is a fallacy.

Focusing just on fallacies of argumentation, some researchers define such a fallacy as an argument that is deductively invalid or that has very little inductive strength. Because standard examples of false dilemma, inconsistent premises, and begging the question are deductively valid arguments, this definition misses some standard fallacies. Other researchers say a fallacy is a mistake in an argument that arises from something other than merely false premises. But the false dilemma fallacy is due to false premises. Still other researchers define a fallacy as an argument that is not good. Good arguments are then defined as those that are deductively valid or inductively strong, and that contain only true, well-established premises, but are not question-begging. A complaint about this definition is that its requirement of truth would improperly lead to calling too much scientific reasoning fallacious; every time a new scientific discovery caused scientists to label a previously well-established claim as false, all the scientists who used that claim as a premise would become fallacious reasoners. This consequence of the definition is acceptable to some researchers but not to others. Because informal reasoning regularly deals with hypothetical reasoning and with premises for which there is great disagreement about whether they are true or false, many researchers would relax the requirement that every premise must be true or at least known to be true. One widely accepted definition defines a fallacious argument as one that either is deductively invalid or is inductively very weak or contains an unjustified premise or that ignores relevant evidence that is available and that should be known by the arguer. Finally, yet another theory of fallacy says a fallacy is a failure to provide adequate proof for a belief, the failure being disguised to make the proof look adequate.

Other researchers recommend characterizing a fallacy as a violation of the norms of good reasoning, the rules of critical discussion, dispute resolution, and adequate communication. The difficulty with this approach is that there is so much disagreement about how to characterize these norms.

In addition, all the above definitions are often augmented with some remark to the effect that the fallacies need to be convincing or persuasive to too many people. It is notoriously difficult to be very precise about these notions. Some researchers in fallacy theory have therefore recommended dropping the notions altogether; other researchers suggest replacing them with the phrase “can be used to persuade.”

Some researchers complain that all the above definitions of fallacy are too broad and do not distinguish between mere blunders and actual fallacies, the more serious errors.

Researchers in the field are deeply divided, not only about how to define the term “fallacy” and how to define some of the individual fallacies, but also about whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.

5. Other Controversies

How do we defend the claim that an item of reasoning should be labeled as a particular fallacy? A major goal in the field of informal logic is to provide some criteria for each fallacy. Schwartz presents the challenge this way:

Fallacy labels have their use. But fallacy-label texts tend not to provide useful criteria for applying the labels. Take the so-called ad verecundiam fallacy, the fallacious appeal to authority. Just when is it committed? Some appeals to authority are fallacious; most are not. A fallacious one meets the following condition: The expertise of the putative authority, or the relevance of that expertise to the point at issue, are in question. But the hard work comes in judging and showing that this condition holds, and that is where the fallacy-label texts leave off. Or rather, when a text goes further, stating clear, precise, broadly applicable criteria for applying fallacy labels, it provides a critical instrument [that is] more fundamental than a taxonomy of fallacies and hence to that extent goes beyond the fallacy-label approach. The further it goes in this direction, the less it needs to emphasize or even to use fallacy labels (Schwartz, 232).

The controversy here is the extent to which it is better to teach students what Schwartz calls “the critical instrument” than to teach the fallacy-label approach. Is the fallacy-label approach better for some kinds of fallacies than others? If so, which others?

One controversy involves the relationship between the fields of logic and rhetoric. In the field of rhetoric, the primary goal is to persuade the audience, not guide them to the truth. Philosophers concentrate on convincing the ideally rational reasoner.

Advertising in magazines and on television is designed to achieve visual persuasion. And a hug or the fanning of fumes from freshly baked donuts out onto the sidewalk are occasionally used for visceral persuasion. There is some controversy among researchers in informal logic as to whether the reasoning involved in this nonverbal persuasion can always be assessed properly by the same standards that are used for verbal reasoning.

6. Partial List of Fallacies

Consulting the list below will give a general idea of the kind of error involved in passages to which the fallacy name is applied. However, simply applying the fallacy name to a passage cannot substitute for a detailed examination of the passage and its context or circumstances because there are many instances of reasoning to which a fallacy name might seem to apply, yet, on further examination, it is found that in these circumstances the reasoning is really not fallacious.

Abusive Ad Hominem

See Ad Hominem.

Accent

The Accent Fallacy is a fallacy of ambiguity due to the different ways a word or syllable is emphasized or accented. Also called Accentus, Misleading Accent, and Prosody.

Example:

A member of Congress is asked by a reporter if she is in favor of the President’s new missile defense system, and she responds, “I’m in favor of a missile defense system that effectively defends America.”

With an emphasis on the word “favor,” her response is likely to be for the President’s missile defense system. With an emphasis, instead, on the word “effectively,” her remark is likely to be against the President’s missile defense system. And by using neither emphasis, she can later claim that her response was on either side of the issue. For an example of the Fallacy of Accent involving the accent of a syllable within a single word, consider the word “invalid” in the sentence, “Did you mean the invalid one?” When we accent the first syllable, we are speaking of a sick person, but when we accent the second syllable, we are speaking of an argument failing to meet the deductive standard of being valid. If we do not supply the accent, and we do not supply additional information to help disambiguate, then we commit the Fallacy of Accent.

Accentus

See the Fallacy of Accent.

Accident

We often arrive at a generalization but don’t or can’t list all the exceptions. When we then reason with the generalization as if it has no exceptions, our reasoning contains the Fallacy of Accident. This fallacy is sometimes called the “Fallacy of Sweeping Generalization.”

Example:

People should keep their promises, right? I loaned Dwayne my knife, and he said he’d return it. Now he is refusing to give it back, but I need it right now to slash up my neighbors who disrespected me.

People should keep their promises, but there are exceptions to this generalization as in this case of the psychopath who wants Dwayne to keep his promise to return the knife.

Ad Baculum

See Scare Tactic and Appeal to Emotions (Fear).

Ad Consequentiam

See Appeal to Consequence.

Ad Crumenum

See Appeal to Money.

Ad Hoc Rescue

Psychologically, it is understandable that you would try to rescue a cherished belief from trouble. When faced with conflicting data, you are likely to mention how the conflict will disappear if some new assumption is taken into account. However, if there is no good reason to accept this saving assumption other than that it works to save your cherished belief, your rescue is an Ad Hoc Rescue.

Example:

Yolanda: If you take four of these tablets of vitamin C every day, you will never get a cold.

Juanita: I tried that last year for several months, and still got a cold.

Yolanda: Did you take the tablets every day?

Juanita: Yes.

Yolanda: Well, I’ll bet you bought some bad tablets.

The burden of proof is definitely on Yolanda’s shoulders to prove that Juanita’s vitamin C tablets were probably “bad”—that is, not really vitamin C. If Yolanda can’t do so, her attempt to rescue her hypothesis (that vitamin C prevents colds) is simply a dogmatic refusal to face up to the possibility of being wrong.

Ad Hominem

Your reasoning contains this fallacy if you make an irrelevant attack on the person arguing and suggest that this attack undermines the argument itself. “Ad Hominem” means “to the person” as in being “directed at the person.” It is a smear tactic.

Example:

What she says about Johannes Kepler’s astronomy of the 1600s must be just so much garbage. Do you realize she’s only fifteen years old?

This attack may undermine the young woman’s credibility as a scientific authority, but it does not undermine her reasoning itself because her age is irrelevant to the quality of her reasoning about Kepler. That reasoning should stand or fall on the scientific evidence, not on the arguer’s age or anything else about her personally.

The major difficulty with labeling a piece of reasoning an Ad Hominem Fallacy is deciding whether the personal attack is relevant or irrelevant. For example, attacks on a person for their immoral sexual conduct are irrelevant to the quality of the person’s reasoning about Kepler’s astronomy, but they are relevant to arguments promoting the person for a leadership position in a church or mosque or city council.

If the fallacious reasoner points out irrelevant circumstances that the reasoner is in, such as the arguer’s having a vested interest in people accepting the reasoning, then the ad hominem fallacy also may be called a Circumstantial Ad Hominem. If the fallacious attack points out some despicable trait of the arguer, it also may be called an Abusive Ad Hominem. An Ad hominem that attacks an arguer by attacking the arguer’s associates is called the Fallacy of Guilt by Association. If the fallacy focuses on a complaint about the origin of the arguer’s views, then it is a kind of Genetic Fallacy. If the fallacy is due to claiming the person does not practice what is preached, it is the Tu Quoque Fallacy. Two Wrongs do Not Make a Right is also a type of Ad Hominem fallacy.

The intentional use of the ad hominem fallacy is a tactic used by all dictators and authoritarian leaders. If you say something critical of them or their regime, their immediate response is to attack you as unreliable, or as being a puppet of the enemy, or as being a traitor.

Ad Hominem, Circumstantial

See Guilt by Association.

Ad Ignorantiam

See Appeal to Ignorance.

Ad Misericordiam

See Appeal to Emotions.

Ad Novitatem

See Bandwagon.

Ad Numerum

See Appeal to the People.

Ad Populum

See Appeal to the People.

Ad Verecundiam

See Appeal to Authority.

Affirming the Consequent

If you have enough evidence to affirm the consequent of a conditional and then suppose that as a result you have sufficient reason for affirming the antecedent, your reasoning contains the Fallacy of Affirming the Consequent. This formal fallacy is often mistaken for Modus Ponens, which is a valid form of reasoning also using a conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent. The following argument affirms the consequent, namely that she does speak Portuguese, and its form is invalid.

Example:

If she’s Brazilian, then she speaks Portuguese. Hey, she does speak Portuguese. So, she is Brazilian.

Noticing that she speaks Portuguese suggests that she might be Brazilian, but it is weak evidence by itself, and if the argument is assessed by deductive standards, then it is deductively invalid. That is, if the arguer believes or suggests that her speaking Portuguese definitely establishes that she is Brazilian, then the argumentation contains the Fallacy of Affirming the Consequent.
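
Set side by side schematically (a standard textbook presentation, added here for illustration), the valid and the invalid form differ only in which part of the conditional is affirmed:

    Modus Ponens (valid)          Affirming the Consequent (invalid)
    If P, then Q.                 If P, then Q.
    P.                            Q.
    Therefore, Q.                 Therefore, P.

In the example, P is “she’s Brazilian” and Q is “she speaks Portuguese.” Both premises can be true while the conclusion is false, say, if she is a Portuguese speaker from Lisbon, and that possibility is what makes the form invalid.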

Against the Person

See Ad Hominem.

All-or-Nothing

See Black-or-White Fallacy.

Ambiguity

Any fallacy that turns on ambiguity. See the fallacies of Amphiboly, Accent, and Equivocation. Amphiboly is ambiguity of syntax. Equivocation is ambiguity of semantics. Accent is ambiguity of emphasis.

Amphiboly

This is an error due to taking a grammatically ambiguous phrase in two different ways during the reasoning.

Example:

Tests show that the dog is not part wolf, as the owner suspected.

Did the owner suspect the dog was part wolf, or was not part wolf? Who knows? The sentence is ambiguous, and needs to be rewritten to remove the fallacy. Unlike Equivocation, which is due to multiple meanings of a phrase, Amphiboly is due to syntactic ambiguity, that is, ambiguity caused by multiple ways of understanding the grammar of the phrase.

Anecdotal Evidence

This is fallacious generalizing on the basis of some story that provides an inadequate sample. If you discount evidence arrived at by systematic search or by testing in favor of a few firsthand stories, then your reasoning contains the fallacy of overemphasizing anecdotal evidence.

Example:

Yeah, I’ve read the health warnings on those cigarette packs and I know about all that health research, but my brother smokes, and he says he’s never been sick a day in his life, so I know smoking can’t really hurt you.

Anthropomorphism

This is the error of projecting uniquely human qualities onto something that isn’t human. Usually this occurs with projecting the human qualities onto animals, but when it is done to nonliving things, as in calling the storm cruel, the Pathetic Fallacy is created. It is also, but less commonly, called the Disney Fallacy or the Walt Disney Fallacy.

Example:

My dog is wagging his tail and running around me. Therefore, he knows that I love him.

The fallacy would be averted if the speaker had said “My dog is wagging his tail and running around me. Therefore, he is happy to see me.” Animals do not have the ability to ascribe knowledge to other beings such as humans. Your dog knows where it buried its bone, but not that you also know where the bone is.

Appeal to Authority

You appeal to authority if you back up your reasoning by saying that it is supported by what some authority says on the subject. Most reasoning of this kind is not fallacious, and much of our knowledge properly comes from listening to authorities. However, appealing to authority as a reason to believe something is fallacious whenever the authority appealed to is not really an authority in this particular subject, when the authority cannot be trusted to tell the truth, when authorities disagree on this subject (except for the occasional lone wolf), when the reasoner misquotes the authority, and so forth. Although spotting a fallacious appeal to authority often requires some background knowledge about the subject matter and about who is claimed to be the authority, in brief it can be said that we reason fallaciously if we accept the words of a supposed authority when we should be suspicious of the authority’s words.

Example:

The moon is covered with dust because the president of our neighborhood association said so.

This is a Fallacious Appeal to Authority because, although the president is an authority on many neighborhood matters, you are given no reason to believe the president is an authority on the composition of the moon. It would be better to appeal to some astronomer or geologist. A TV commercial that gives you a testimonial from a famous film star who wears a Wilson watch and that suggests you, too, should wear that brand of watch is using a fallacious appeal to authority. The film star is an authority on how to act, not on which watch is best for you.

Appeal to Consequence

Arguing that a belief is false because it implies something you’d rather not believe. Also called Argumentum Ad Consequentiam.

Example:

That can’t be Senator Smith there in the videotape going into her apartment. If it were, he’d be a liar about not knowing her. He’s not the kind of man who would lie. He’s a member of my congregation.

Smith may or may not be the person in that videotape, but this kind of arguing should not convince us that it’s someone else in the videotape.

Appeal to Emotions

Your reasoning contains the Fallacy of Appeal to Emotions when you accept someone’s claim merely because their appeal arouses your feelings of anger, fear, grief, love, outrage, pity, pride, sexuality, sympathy, relief, and so forth. Example of appeal to relief from grief:

[The speaker knows he is talking to an aggrieved person whose house is worth much more than $100,000.] You had a great job and didn’t deserve to lose it. I wish I could help somehow. I do have one idea. Now your family needs financial security even more. You need cash. I can help you. Here is a check for $100,000. Just sign this standard sales agreement, and we can skip the realtors and all the headaches they would create at this critical time in your life.

There is nothing wrong with using emotions when you argue, but it’s a mistake to use emotions as the key premises or as tools to downplay relevant information. Regarding the Fallacy of Appeal to Pity, it is proper to pity people who have had misfortunes, but if as the person’s history instructor you accept Max’s claim that he earned an A on the history quiz because he broke his wrist while playing in your college’s last basketball game, then you’ve used the fallacy of appeal to pity.

Appeal to Force

See Scare Tactic.

Appeal to Ignorance

The Fallacy of Appeal to Ignorance comes in two forms: (1) Not knowing that a certain statement is true is taken to be a proof that it is false. (2) Not knowing that a statement is false is taken to be a proof that it is true. The fallacy occurs in cases where absence of evidence is not good enough evidence of absence. The fallacy uses an unjustified attempt to shift the burden of proof. The fallacy is also called “Argument from Ignorance.”

Example:

Nobody has ever proved to me there’s a God, so I know there is no God.

This kind of reasoning is generally fallacious. It would be proper reasoning only if the proof attempts were quite thorough, and it were the case that, if the being or object were to exist, then there would be a discoverable proof of this. Another common example of the fallacy involves ignorance of a future event: You people have been complaining about the danger of Xs ever since they were invented, but there’s never been any big problem with Xs, so there’s nothing to worry about.

Appeal to Money

The Fallacy of Appeal to Money uses the error of supposing that, if something costs a great deal of money, then it must be better, or supposing that if someone has a great deal of money, then they’re a better person in some way unrelated to having a great deal of money. Similarly it’s a mistake to suppose that if something is cheap it must be of inferior quality, or to suppose that if someone is poor financially then they’re poor at something unrelated to having money.

Example:

He’s rich, so he should be the president of our Parents and Teachers Organization.

Appeal to Past Practice

See Appeal to the People.

Appeal to Pity

See Appeal to Emotions.

Appeal to Snobbery

See Appeal to Emotions.

Appeal to the Gallery

See Appeal to the People.

Appeal to the Masses

See Appeal to the People.

Appeal to the Mob

See Appeal to the People.

Appeal to the People

If you suggest too strongly that someone’s claim or argument is correct simply because it’s what most everyone believes, then your reasoning contains the Fallacy of Appeal to the People. Similarly, if you suggest too strongly that someone’s claim or argument is mistaken simply because it’s not what most everyone believes, then your reasoning also uses the fallacy. Agreement with popular opinion is not necessarily a reliable sign of truth, and deviation from popular opinion is not necessarily a reliable sign of error, but if you assume it is and do so with enthusiasm, then you are using this fallacy. It is essentially the same as the fallacies of Ad Numerum, Appeal to the Gallery, Appeal to the Masses, Argument from Popularity, Argumentum ad Populum, Common Practice, Mob Appeal, Past Practice, Peer Pressure, and Traditional Wisdom. The “too strongly” mentioned above is important in the description of the fallacy because what most everyone believes is, for that reason, somewhat likely to be true, all things considered. However, the fallacy occurs when this degree of support is overestimated.

Example:

You should turn to channel 6. It’s the most watched channel this year.

This is fallacious because of its implicitly accepting the questionable premise that the most watched channel this year is, for that reason alone, the best channel for you. If you stress the idea of appealing to a new idea held by the gallery, masses, mob, peers, people, and so forth, then it is a Bandwagon Fallacy.

Appeal to the Stick

See Appeal to Emotions (fear).

Appeal to Unqualified Authority

See Appeal to Authority.

Appeal to Vanity

See Appeal to Emotions.

Argument from Ignorance

See Appeal to Ignorance.

Argument from Outrage

See Appeal to Emotions.

Argument from Popularity

See Appeal to the People.

Argumentum Ad ….

See Ad …. without the word “Argumentum.”

Argumentum Consensus Gentium

See Appeal to Traditional Wisdom.

Availability Heuristic

We have an unfortunate instinct to base an important decision on an easily recalled, dramatic example, even though we know the example is atypical. It is a specific version of the fallacy of Confirmation Bias.

Example:

I just saw a video of a woman dying by fire in a car crash because she was unable to unbuckle her seat belt as the flames increased in intensity. So, I am deciding today no longer to wear a seat belt when I drive.

This reasoning commits the Fallacy of the Availability Heuristic because, if the reasoner stopped to think for a moment, he would realize that many more lives are saved by wearing seat belts than by not wearing them, and that the video of the woman unable to unbuckle her seat belt shows an atypical situation. The name of this fallacy is not very memorable, but it is in common use.

Avoiding the Issue

A reasoner who is supposed to address an issue but instead goes off on a tangent is properly accused of using the Fallacy of Avoiding the Issue. Also called missing the point, straying off the subject, digressing, and not sticking to the issue.

Example:

A city official is charged with corruption for awarding contracts to his wife’s consulting firm. In speaking to a reporter about why he is innocent, the city official talks only about his wife’s conservative wardrobe, the family’s lovable dog, and his own accomplishments in supporting Little League baseball.

However, the fallacy isn’t used by a reasoner who says that some other issue must first be settled and then continues by talking about this other issue, provided the reasoner is correct in claiming this dependence of one issue upon the other.

Avoiding the Question

The Fallacy of Avoiding the Question is a type of Fallacy of Avoiding the Issue that occurs when the issue is how to answer some question. The fallacy occurs when someone’s answer doesn’t really respond to the question asked. The fallacy is also called “Changing the Question.”

Example:

Question: Would the Oakland Athletics be in first place if they were to win tomorrow’s game?

Answer: What makes you think they’ll ever win tomorrow’s game?

Bad Seed

Attempting to undermine someone’s reasoning by pointing out their “bad” family history, when it is an irrelevant point. See Genetic Fallacy.

Bald Man

See Line-Drawing.

Bandwagon

If you suggest that someone’s claim is correct simply because it’s what most everyone is coming to believe, then you are using the Bandwagon Fallacy. Get up here with us on the wagon where the band is playing, and go where we go, and don’t think too much about the reasons. The Latin term for this Fallacy of Appeal to Novelty is Argumentum ad Novitatem.

Example:

[Advertisement] More and more people are buying sports utility vehicles. It is time you bought one, too.

Like its close cousin, the Fallacy of Appeal to the People, the Bandwagon Fallacy needs to be carefully distinguished from properly defending a claim by pointing out that many people have studied the claim and have come to a reasoned conclusion that it is correct. What most everyone believes is likely to be true, all things considered, and if one defends a claim on those grounds, this is not a fallacious inference. What is fallacious is to be swept up by the excitement of a new idea or new fad and to unquestionably give it too high a degree of your belief solely on the grounds of its new popularity, perhaps thinking simply that ‘new is better.’ The key ingredient that is missing from a bandwagon fallacy is knowledge that an item is popular because of its high quality.

Begging the Question

A form of circular reasoning in which a conclusion is derived from premises that presuppose the conclusion. Normally, the point of good reasoning is to start out at one place and end up somewhere new, namely having reached the goal of increasing the degree of reasonable belief in the conclusion. The point is to make progress, but in cases of begging the question there is no progress, and the arguer is essentially arguing by repeating the point.

Example:

“Women have rights,” said the Bullfighters Association president. “But women shouldn’t fight bulls because a bullfighter is and should be a man.”

The president is saying basically that women shouldn’t fight bulls because women shouldn’t fight bulls. This reasoning isn’t making any progress.

Insofar as the conclusion of a deductively valid argument is “contained” in the premises from which it is deduced, this containing might seem to be a case of presupposing, and thus any deductively valid argument might seem to be begging the question. It is still an open question among logicians as to why some deductively valid arguments are considered to be begging the question and others are not. Some logicians suggest that, in informal reasoning with a deductively valid argument, if the conclusion is psychologically new insofar as the premises are concerned, then the argument isn’t an example of the fallacy. Other logicians suggest that we need to look instead to surrounding circumstances, not to the psychology of the reasoner, in order to assess the quality of the argument. For example, we need to look to the reasons that the reasoner used to accept the premises. Was the premise justified on the basis of accepting the conclusion? A third group of logicians say that, in deciding whether the fallacy is present, more evidence is needed. We must determine whether any premise that is key to deducing the conclusion is adopted rather blindly or instead is a reasonable assumption made by someone accepting their burden of proof. The premise would here be termed reasonable if the arguer could defend it independently of accepting the conclusion that is at issue.

Beside the Point

Arguing for a conclusion that is not relevant to the current issue. Also called Irrelevant Conclusion. It is a form of the Red Herring Fallacy.

Biased Generalizing

Generalizing from a biased sample. Using an unrepresentative sample and overestimating the strength of an argument based on that sample. See Unrepresentative Sample.

Biased Sample

See Unrepresentative Sample.

Biased Statistics

See Unrepresentative Sample.

Bifurcation

See Black-or-White.

Black-or-White

The Black-or-White fallacy or Black-White fallacy is a False Dilemma Fallacy that limits you unfairly to only two choices, as if you were made to choose between black and white.

Example:

Well, it’s time for a decision. Will you contribute $20 to our environmental fund, or are you on the side of environmental destruction?

A proper challenge to this fallacy could be to say, “I do want to prevent the destruction of our environment, but I don’t want to give $20 to your fund. You are placing me between a rock and a hard place.” The key to diagnosing the Black-or-White Fallacy is to determine whether the limited menu is fair or unfair. Simply saying, “Will you contribute $20 or won’t you?” is not unfair. The fallacy shows up in psychology when a person is too apt to treat people simply as friend or enemy, or smart or an idiot. The black-or-white fallacy is often committed intentionally in jokes such as: “My toaster has two settings—burnt and off.” In thinking about this kind of fallacy it is helpful to remember that everything is either black or not black, but not everything is either black or white.
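
The last remark can be put schematically (standard logical notation, used here only for illustration):

    B(x) or not-B(x)    (a logical truth: everything is black or not black)
    B(x) or W(x)        (not a logical truth: a red thing falsifies it)

A false dilemma trades on confusing a genuinely exhaustive pair of options, B or not-B, with a non-exhaustive pair, B or W.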

Caricaturization

Attacking a person’s argument by presenting a caricaturization is a form of the Straw Man Fallacy and the Ad Hominem Fallacy. A critical thinker should attack the real man and his argument, not a caricaturization of the man or the argument. Ditto for women, of course. The fallacy is a form of the Straw Man Fallacy because, ideally, an argument should not be assessed by a technique that unfairly misrepresents it. The Caricaturization Fallacy is the same as the Fallacy of Refutation by Caricature.

Changing the Question

This is another name for the Fallacy of Avoiding the Question.

Cherry-Picking

Cherry-Picking the Evidence is another name for the Fallacy of Suppressed Evidence.

Circular Reasoning

The Fallacy of Circular Reasoning occurs when the reasoner begins with what he or she is trying to end up with.

Here is Steven Pinker’s example:

Definition: endless loop, n. See loop, endless.

Definition: loop, endless, n. See endless loop.

The most well known examples of circular reasoning are cases of the Fallacy of Begging the Question. Here the circle is as short as possible. However, if the circle is very much larger, including a wide variety of claims and a large set of related concepts, then the circular reasoning can be informative and so is not considered to be fallacious. For example, a dictionary contains a large circle of definitions that use words which are defined in terms of other words that are also defined in the dictionary. Because the dictionary is so informative, it is not considered as a whole to be fallacious. However, a small circle of definitions is considered to be fallacious.

In properly-constructed recursive definitions, defining a term by using that same term is not fallacious. For example, here is an appropriate recursive definition of the term “a stack of coins.” Basis step: Two coins, with one on top of the other, is a stack of coins. Recursion step: If p is a stack of coins, then adding a coin on top of p produces a stack of coins. For a deeper discussion of circular reasoning see Infinitism in Epistemology.
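
As an illustration only, the same basis-plus-recursion pattern can be written as a short program; the function name and list representation below are invented for this sketch and are not part of the fallacy literature. The definition mentions itself without circularity because each recursive use applies to a smaller pile and bottoms out at the basis step.

    def is_stack_of_coins(pile):
        # pile is a list of coins, bottom coin first, e.g. ["penny", "dime"].
        if len(pile) == 2:
            return True                          # basis step: two coins, one atop the other
        if len(pile) > 2:
            return is_stack_of_coins(pile[:-1])  # recursion step: remove the top coin
        return False                             # zero or one coin is not a stack

    print(is_stack_of_coins(["penny", "nickel", "dime"]))  # True
    print(is_stack_of_coins(["penny"]))                    # False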

Circumstantial Ad Hominem

See Ad Hominem, Circumstantial.

Clouding the Issue

See Smokescreen.

Common Belief

See Appeal to the People and Traditional Wisdom.

Common Cause

This fallacy occurs during causal reasoning when a causal connection between two kinds of events is claimed when evidence is available indicating that both are the effect of a common cause.

Example:

Noting that the auto accident rate rises and falls with the rate of use of windshield wipers, one concludes that the use of wipers is somehow causing auto accidents.

However, it’s the rain that’s the common cause of both.

Common Practice

See Appeal to the People and Traditional Wisdom.

Complex Question

You use this fallacy when you frame a question so that some controversial presupposition is made by the wording of the question.

Example:

[Reporter’s question] Mr. President: Are you going to continue your policy of wasting taxpayer’s money on missile defense?

The question unfairly presumes the controversial claim that the policy really is a waste of money. The Fallacy of Complex Question is a form of Begging the Question.

Composition

The Composition Fallacy occurs when someone mistakenly assumes that a characteristic of some or all the individuals in a group is also a characteristic of the group itself, the group “composed” of those members. It is the converse of the Division Fallacy.

Example:

Each human cell is very lightweight, so a human being composed of cells is also very lightweight.

Confirmation Bias

The tendency to look for evidence in favor of one’s controversial hypothesis and not to look for disconfirming evidence, or to pay insufficient attention to it. This is the most common kind of Fallacy of Selective Attention, and it is the foundation of many conspiracy theories.

Example:

She loves me, and there are so many ways that she has shown it. When we signed the divorce papers in her lawyer’s office, she wore my favorite color. When she slapped me at the bar and called me a “handsome pig,” she used the word “handsome” when she didn’t have to. When I called her and she said never to call her again, she first asked me how I was doing and whether my life had changed. When I suggested that we should have children in order to keep our marriage together, she laughed. If she can laugh with me, if she wants to know how I am doing and whether my life has changed, and if she calls me “handsome” and wears my favorite color on special occasions, then I know she really loves me.

Using the Fallacy of Confirmation Bias is usually a sign that one has adopted some belief dogmatically and isn’t willing to disconfirm the belief, or is too willing to interpret ambiguous evidence so that it conforms to what one already believes. Confirmation bias often reveals itself in the fact that people of opposing views can each find support for those views in the same piece of evidence.

Conjunction

Mistakenly supposing that event E is less likely than the conjunction of events E and F. Here is an example from the psychologists Daniel Kahneman and Amos Tversky.

Example:

Suppose you know that Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice. Then you are asked to choose which is more likely: (A) Linda is a bank teller or (B) Linda is a bank teller and active in the feminist movement. If you choose (B), you commit the Conjunction Fallacy.
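
The underlying point is elementary probability theory, not anything special about Linda. For any events E and F,

    P(E and F) = P(E) × P(F given E) ≤ P(E), because P(F given E) ≤ 1.

So however well “feminist bank teller” fits the description of Linda, alternative (B) cannot be more probable than alternative (A).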

Confusing an Explanation with an Excuse

Treating someone’s explanation of a fact as if it were a justification of the fact. Explaining a crime should not be confused with excusing the crime, but it too often is.
Example:

Speaker: The German atrocities committed against the French and Belgians during World War I were in part due to the anger of German soldiers who learned that French and Belgian soldiers were ambushing German soldiers, shooting them in the back, or even poisoning, blinding and castrating them.

Respondent: I don’t understand how you can be so insensitive as to condone those German atrocities.

Consensus Gentium

Fallacy of Argumentum Consensus Gentium (argument from the consensus of the nations). See Traditional Wisdom.

Consequence

See Appeal to Consequence.

Contextomy

See Quoting out of Context.

Converse Accident

If we reason by paying too much attention to exceptions to the rule, and generalize on the exceptions, our reasoning contains this fallacy. This fallacy is the converse of the Accident Fallacy. It is a kind of Hasty Generalization, generalizing too quickly from a peculiar case.

Example:

I’ve heard that turtles live longer than tarantulas, but the one turtle I bought lived only two days. I bought it at Dowden’s Pet Store. So, I think that turtles bought from pet stores do not live longer than tarantulas.

The original generalization is “Turtles live longer than tarantulas.” There are exceptions, such as the turtle bought from the pet store. Rather than seeing this for what it is, namely an exception, the reasoner places too much trust in this exception and generalizes on it to produce the faulty generalization that turtles bought from pet stores do not live longer than tarantulas.

Cover-up

See Suppressed Evidence.

Cum Hoc, Ergo Propter Hoc

Latin for “with this, therefore because of this.” This is a False Cause Fallacy that doesn’t depend on time order (as does the post hoc fallacy), but on any other chance correlation of the supposed cause being in the presence of the supposed effect.

Example:

Loud musicians live near our low-yield cornfields. So, loud musicians must be causing the low yield.

Curve Fitting

Curve fitting is the process of constructing a curve that has the best fit to a series of data points. The curve is a graph of some mathematical function. The function or functional relationship might be between variable x and variable y, where x is the time of day and y is the temperature of the ocean. When you collect data about some relationship, you inevitably collect information that is affected by noise or statistical fluctuation. If you create a function between x and y that is too sensitive to your data, you will be overemphasizing the noise and producing a function that has less predictive value than it otherwise would. If you create your function by interpolating, that is, by drawing straight line segments between all the adjacent data points, or if you create a polynomial function that exactly fits every data point, it is likely that your function will be worse than if you’d produced a function with a smoother curve. Your original error of too closely fitting the data points is called the Fallacy of Curve Fitting or the Fallacy of Overfitting.

Example:

You want to know the temperature of the ocean today, so you measure it at 8:00 A.M. with one thermometer and get the temperature of 60.1 degrees. Then you measure the ocean at 8:05 A.M. with a different thermometer and get the temperature of 60.2 degrees; then at 8:10 A.M. and get 59.1 degrees, perhaps with the first thermometer, and so on. If you fit your curve exactly to your data points, then you falsely imply that the ocean’s temperature is shifting all around every five minutes. However, the temperature is probably constant, and the problem is that your prediction is too sensitive to your data, so your curve fits the data points too closely.
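
A minimal sketch of the same point, using readings like those above (the numbers and the choice of polynomial degree are invented for illustration): a polynomial forced through every noisy reading oscillates beyond the data, while a low-degree fit stays near the roughly constant temperature.

    import numpy as np

    # Five noisy readings of a roughly constant ocean temperature (degrees F),
    # taken every five minutes starting at 8:00 A.M.
    minutes = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
    temps = np.array([60.1, 60.2, 59.1, 60.0, 59.8])

    overfit = np.polyfit(minutes, temps, deg=4)  # degree 4 passes through all 5 points
    smooth = np.polyfit(minutes, temps, deg=0)   # degree 0 is just the average

    t = 2.5  # predict the temperature between readings, at 8:02:30 A.M.
    print(np.polyval(overfit, t))  # about 60.7, higher than any actual reading
    print(np.polyval(smooth, t))   # 59.84, a steadier and more credible estimate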

Definist

The Definist Fallacy occurs when someone unfairly defines a term so that a controversial position is made easier to defend. Same as the Persuasive Definition.

Example:

During a controversy about the truth or falsity of atheism, the fallacious reasoner says, “Let’s define ‘atheist’ as someone who doesn’t yet realize that God exists.”

Denying the Antecedent

You are using this fallacy if you deny the antecedent of a conditional and then suppose that doing so is a sufficient reason for denying the consequent. This formal fallacy is often mistaken for Modus Tollens, a valid form of argument using the conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent.

Example:

If she were Brazilian, then she would know that Brazil’s official language is Portuguese. She isn’t Brazilian; she’s from London. So, she surely doesn’t know this about Brazil’s language.
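
For comparison (again a standard schematic presentation, added here for illustration), the valid and the invalid form are:

    Modus Tollens (valid)         Denying the Antecedent (invalid)
    If P, then Q.                 If P, then Q.
    Not Q.                        Not P.
    Therefore, not P.             Therefore, not Q.

The Londoner may well know that Brazil’s official language is Portuguese despite not being Brazilian, so the premises can be true while the conclusion is false.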

Digression

See Avoiding the Issue.

Disregarding Known Science

This fallacy is committed when a person makes a claim that knowingly or unknowingly disregards well known science, science that weighs against the claim. They should know better. This fallacy is a form of the Fallacy of Suppressed Evidence.

Example:

John claims in his grant application that he will be studying the causal effectiveness of bone color on the ability of leg bones to support indigenous New Zealand mammals. He disregards well known scientific knowledge that color is not what causes any bones to work the way they do by saying that this knowledge has never been tested in New Zealand.

Distraction

See Smokescreen.

Division

Merely because a group as a whole has a characteristic, it often doesn’t follow that individuals in the group have that characteristic. If you suppose that it does follow, when it doesn’t, your reasoning contains the Fallacy of Division. It is the converse of the Composition Fallacy.

Example:

Joshua’s soccer team is the best in the division because it had an undefeated season and won the division title, so their goalie must be the best in the division.

As an example of division, Aristotle gave this argument: The number 5 is 2 and 3. But 2 is even and 3 is odd, so 5 is even and odd.

Domino

See Slippery Slope.

Double Standard

There are many situations in which you should judge two things or people by the same standard. If in one of those situations you use different standards for the two, your reasoning contains the Fallacy of Using a Double Standard.

Example:

I know we will hire any man who gets over a 70 percent on the screening test for hiring Post Office employees, but women should have to get an 80 to be hired because they often have to take care of their children.

This example is a fallacy if it can be presumed that men and women should have to meet the same standard for becoming a Post Office employee.

Either/Or

See Black-or-White.

Equivocation

Equivocation is the illegitimate switching of the meaning of a term that occurs twice during the reasoning; it is the use of one word taken in two ways. The fallacy is a kind of Fallacy of Ambiguity.

Example:

Brad is a nobody, but since nobody is perfect, Brad must be perfect, too.

The term “nobody” changes its meaning without warning in the passage. Equivocation can sometimes be very difficult to detect, as in this argument from Walter Burleigh:

If I call you a swine, then I call you an animal.
If I call you an animal, then I’m speaking the truth.
Therefore, if I call you a swine, then I’m speaking the truth.

Etymological

The Etymological Fallacy occurs whenever someone falsely assumes that the meaning of a word can be discovered from its etymology or origins.

Example:

The word “vise” comes from the Latin “that which winds,” so it means anything that winds. Since a hurricane winds around its own eye, it is a vise.

Every and All

The Fallacy of Every and All turns on errors due to the order or scope of the quantifiers “every” and “all” and “any.” This is a version of the Scope Fallacy.

Example:

Every action of ours has some final end. So, there is some common final end to all our actions.

In proposing this fallacious argument, Aristotle believed the common end is the supreme good, so he had a rather optimistic outlook on the direction of history.
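
The scope error can be displayed with standard quantifier schemas (used here only for illustration):

    For every action x, there is some end y such that x aims at y.   (premise)
    There is some end y such that every action x aims at y.          (does not follow)

Compare: every person has a mother, but it does not follow that there is one person who is the mother of everyone.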

Exaggeration

When we overstate or overemphasize a point that is a crucial step in a piece of reasoning, then we are guilty of the Fallacy of Exaggeration. This is a kind of error called Lack of Proportion.

Example:

She’s practically admitted that she intentionally yelled at that student while on the playground in the fourth grade. That’s verbal assault. Then she said nothing when the teacher asked, “Who did that?” That’s lying, plain and simple. Do you want to elect as secretary of this club someone who is a known liar prone to assault? Doing so would be a disgrace to our Collie Club.

When we exaggerate in order to make a joke, though, we do not use the fallacy because we do not intend to be taken literally.

Excluded Middle

See False Dilemma or Black-or-White.

False Analogy

The problem is that the items in the analogy are too dissimilar. When reasoning by analogy, the fallacy occurs when the analogy is irrelevant or very weak or when there is a more relevant disanalogy. See also Faulty Comparison.

Example:

The book Investing for Dummies really helped me understand my finances better. The book Chess for Dummies was written by the same author, was published by the same press, and costs about the same amount. So, this chess book would probably help me understand my finances, too.

False Balance

A specific form of the False Equivalence Fallacy that occurs in the context of news reporting, in which the reporter misleads the audience by suggesting the evidence on two sides of an issue is equally balanced, when the reporter knows that one of the two sides is an extreme outlier. Reporters regularly commit this fallacy in order to appear “fair and balanced.”

Example:

The news report of yesterday’s city council meeting says, “David Samsung challenged the council by saying the Gracie Mansion is haunted, so it should not be torn down. Councilwoman Miranda Gonzales spoke in favor of dismantling the old mansion saying its land is needed for an expansion of the water treatment facility. Both sides seemed quite fervent in promoting their position.” Then the news report stops there, covering up the facts that the preponderance of scientific evidence implies there is no such thing as being haunted, and that David Samsung is the well known “village idiot” who last month came before the council demanding a tax increase for Santa Claus’ workers at the North Pole.

False Cause

Improperly concluding that one thing is a cause of another. The Fallacy of Non Causa Pro Causa is another name for this fallacy. Its four principal kinds are the Post Hoc Fallacy, the Fallacy of Cum Hoc, Ergo Propter Hoc, the Regression Fallacy, and the Fallacy of Reversing Causation.

Example:

My psychic adviser says to expect bad things when Mars is aligned with Jupiter. Tomorrow Mars will be aligned with Jupiter. So, if a dog were to bite me tomorrow, it would be because of the alignment of Mars with Jupiter.

False Dichotomy

See False Dilemma or Black-or-White.

False Dilemma

A reasoner who unfairly presents too few choices and then implies that a choice must be made among this short menu of choices is using the False Dilemma Fallacy, as does the person who accepts this faulty reasoning.

Example:

A pollster asks you this question about your job: “Would you say your employer is drunk on the job about (a) once a week, (b) twice a week, or (c) more times per week?”

The pollster is committing the fallacy by limiting you to only those choices. What about the choice of “no times per week”? Think of the unpleasant choices as being the horns of a bull that is charging toward you. By demanding other choices beyond those on the unfairly limited menu, you thereby “go between the horns” of the dilemma, and are not gored. The fallacy is called the “False Dichotomy Fallacy” or the “Black-or-White” Fallacy when the unfair menu contains only two choices, and thus two horns.

False Equivalence

The Fallacy of False Equivalence is committed when someone implies falsely (and usually indirectly) that the two sides on some issue have basically equivalent evidence, while knowingly covering up the fact that one side’s evidence is much weaker. A form of the Fallacy of Suppressed Evidence.

Example:

A popular science article suggests there is no consensus about the Earth’s age, by quoting one geologist who says she believes the Earth is billions of years old, and then by quoting Bible expert James Ussher who says he calculated from the Bible that the world began on October 23, 4004 B.C.E. The article suppresses the evidence that geologists (who are the relevant experts on this issue) have reached a consensus that the Earth is billions of years old.

Far-Fetched Hypothesis

This is the fallacy of offering a bizarre (far-fetched) hypothesis as the correct explanation without first ruling out more mundane explanations.

Example:

Look at that mutilated cow in the field, and see that flattened grass. Aliens must have landed in a flying saucer and savaged the cow to learn more about the beings on our planet.

Faulty Comparison

If you try to make a point about something by comparison, and if you do so by comparing it with the wrong thing, then your reasoning uses the Fallacy of Faulty Comparison or the Fallacy of Questionable Analogy.

Example:

We gave half the members of the hiking club Durell hiking boots and the other half good-quality tennis shoes. After three months of hiking, you can see for yourself that Durell lasted longer. You, too, should use Durell when you need hiking boots.

Shouldn’t Durell hiking boots be compared with other hiking boots, not with tennis shoes?

Faulty Generalization

A fallacy produced by some error in the process of generalizing. See Hasty Generalization or Unrepresentative Generalization for examples.

Faulty Motives

An irrelevant appeal to the motives of the arguer, supposing that revealing those motives will thereby undermine the arguer’s reasoning. A kind of Ad Hominem Fallacy.

Example:

The councilman’s argument for the new convention center can’t be any good because he stands to gain if it’s built.

Formal Fallacy

Formal fallacies are all the cases or kinds of reasoning that fail to be deductively valid. Formal fallacies are also called Logical Fallacies or Invalidities. That is, they are deductively invalid arguments that are too often believed to be deductively valid.

Example:

Some cats are tigers. Some tigers are animals. So, some cats are animals.

This might at first seem to be a good argument, but actually it is fallacious because it has the same logical form as the following more obviously invalid argument:

Some women are Americans. Some Americans are men. So, some women are men.

Nearly all of the infinitely many types of invalid inferences have no specific fallacy names.
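
Both arguments above instantiate the same invalid schema, which can be displayed with schematic letters (the rendering below is an editorial illustration, not part of the original entry):

\[
\text{Some } C \text{ are } T. \quad \text{Some } T \text{ are } A. \quad \therefore \text{ Some } C \text{ are } A.
\]

Any substitution for \(C\), \(T\), and \(A\) that makes both premises true while making the conclusion false, such as women/Americans/men, shows the schema is invalid.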

Four Terms

The Fallacy of Four Terms (quaternio terminorum) occurs when four rather than three categorical terms are used in a standard-form syllogism.

Example:

All rivers have banks. All banks have vaults. So, all rivers have vaults.

The word “banks” occurs as two distinct terms, namely river bank and financial bank, so this example is also an equivocation. Without an equivocation, the Fallacy of Four Terms is trivially invalid.

Gambler’s

This fallacy occurs when the gambler falsely assumes that the history of outcomes will affect future outcomes.

Example:

I know this is a fair coin, but it has come up heads five times in a row now, so tails is due on the next toss.

The fallacious move was to conclude that the probability of the next toss coming up tails must be more than a half. The assumption that it’s a fair coin is important because, if the coin were to come up heads five times in a row, one would otherwise become suspicious that it’s not a fair coin and would therefore properly conclude that heads is more likely on the next toss.
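
A quick simulation makes the point vivid. The sketch below (illustrative code, not from the original entry; the function name is invented for this example) estimates the chance of tails on the toss that follows a run of five heads with a fair coin:

```python
import random

def tails_after_streak(trials=1_000_000, streak=5):
    """Estimate P(tails on the next toss) given that a fair coin
    has just come up heads `streak` times in a row."""
    tails_after, streaks_seen = 0, 0
    for _ in range(trials):
        # Generate `streak` tosses; keep only the all-heads runs.
        if all(random.random() < 0.5 for _ in range(streak)):
            streaks_seen += 1
            if random.random() < 0.5:  # the toss after the streak
                tails_after += 1
    return tails_after / streaks_seen

print(tails_after_streak())  # hovers around 0.5; tails is never "due"
```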

Genetic

A critic uses the Genetic Fallacy if the critic attempts to discredit or support a claim or an argument because of its origin (genesis) when such an appeal to origins is irrelevant.

Example:

Whatever your reasons are for buying that gift, they’ve got to be ridiculous. You said yourself that you got the idea for buying it from last night’s fortune cookie. Cookies can’t think!

Fortune cookies are not reliable sources of information about what gift to buy, but the reasons the person is willing to give are likely to be quite relevant and should be listened to. The speaker is committing the Genetic Fallacy by paying too much attention to the genesis of the idea rather than to the reasons offered for it.

If I learn that your plan for building the shopping center next to the Johnson estate originated with Johnson himself, who is likely to profit from the deal, then my request that the planning commission not accept your proposal without independent verification of its merits wouldn’t be committing the genetic fallacy. Because appeals to origins are sometimes relevant and sometimes irrelevant and sometimes on the borderline, in those latter cases it can be very difficult to decide whether the fallacy has been committed. For example, if Sigmund Freud shows that the genesis of a person’s belief in God is their desire for a strong father figure, then does it follow that their belief in God is misplaced, or is Freud’s reasoning committing the Genetic Fallacy?

Group Think

A reasoner uses the Group Think Fallacy if he or she substitutes pride of membership in the group for reasons to support the group’s policy: “If that’s what our group thinks, then that’s good enough for me. It’s what I think, too.” “Blind” patriotism is a rather nasty version of the fallacy.

Example:

We K-Mart employees know that K-Mart brand items are better than Wal-Mart brand items because, well, they are from K-Mart, aren’t they?

Guilt by Association

Guilt by Association is a version of the Ad Hominem Fallacy in which a person is said to be guilty of error because of the group he or she associates with. The fallacy occurs when we unfairly try to change the issue to be about the speaker’s circumstances rather than about the speaker’s actual argument. Also called “Ad Hominem, Circumstantial.”

Example:

Secretary of State Dean Acheson is too soft on communism, as you can see by his inviting so many fuzzy-headed liberals to his White House cocktail parties.

Has any evidence been presented here that Acheson’s actions are inappropriate in regard to communism? This sort of reasoning is an example of McCarthyism, the technique of smearing liberal Democrats that was so effectively used by the late Senator Joe McCarthy in the early 1950s. In fact, Acheson was strongly anti-communist and the architect of President Truman’s firm policy of containing Soviet power.

Hasty Conclusion

See Jumping to Conclusions.

Hasty Generalization

A Hasty Generalization is a Fallacy of Jumping to Conclusions in which the conclusion is a generalization. See also Biased Statistics.

Example:

I’ve met two people in Nicaragua so far, and they were both nice to me. So, all people I will meet in Nicaragua will be nice to me.

In any Hasty Generalization the key error is to overestimate the strength of an argument that is based on too small a sample for the implied confidence level or error margin. In this argument about Nicaragua, using the word “all” in the conclusion implies zero error margin. With zero error margin you’d need to sample every single person in Nicaragua, not just two people.

Heap

See Line-Drawing.

Hedging

You are hedging if you refine your claim simply to avoid counterevidence and then act as if your revised claim is the same as the original.

Example:

Samantha: David is a totally selfish person.

Yvonne: I thought he was a Boy Scout leader. Don’t you have to give a lot of your time for that?

Samantha: Well, David’s totally selfish about what he gives money to. He won’t spend a dime on anyone else.

Yvonne: I saw him bidding on things at the high school auction fundraiser.

Samantha: Well, except for that he’s totally selfish about money.

You do not use the fallacy if you explicitly accept the counterevidence, admit that your original claim is incorrect, and then revise it so that it avoids that counterevidence.

Hooded Man

This is an error in reasoning due to confusing the knowing of a thing with the knowing of it under all its various names or descriptions.

Example:

You claim to know Socrates, but you must be lying. You admitted you didn’t know the hooded man over there in the corner, but the hooded man is Socrates.

Hyperbolic Discounting

The Fallacy of Hyperbolic Discounting occurs when someone too heavily weighs the importance of a present reward over a significantly greater reward in the near future, but only slightly differs in their valuations of those two rewards if they are to be received in the far future. The person’s preferences are biased toward the present.

Example:

When asked to decide between receiving an award of $50 now or $60 tomorrow, the person chooses the $50; however, when asked to decide between receiving $50 in two years or $60 in two years and one day, the person chooses the $60.

If the person is in a situation in which $50 now will solve their problem but $60 tomorrow will not, then there is no fallacy in having a bias toward the present.
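
The preference reversal in the example can be reproduced with the hyperbolic discount function commonly used in the psychological literature, V = A / (1 + kD), where A is the amount, D the delay, and k an impatience parameter. The sketch below is an editorial illustration; the value of k is assumed for the example, not an empirical estimate:

```python
def hyperbolic_value(amount, delay_days, k=0.25):
    """Present value under hyperbolic discounting: V = A / (1 + k*D).
    k = 0.25 is an assumed, purely illustrative impatience parameter."""
    return amount / (1 + k * delay_days)

# Near future: $50 now is preferred to $60 tomorrow.
print(hyperbolic_value(50, 0) > hyperbolic_value(60, 1))      # True (50.0 > 48.0)

# Far future: $60 in two years and a day beats $50 in two years.
print(hyperbolic_value(60, 731) > hyperbolic_value(50, 730))  # True (0.33 > 0.27)
```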

Hypostatization

The error of inappropriately treating an abstract term as if it were a concrete one. Also known as the Fallacy of Misplaced Concreteness and the Fallacy of Reification.

Example:

Nature decides which organisms live and which die.

Nature isn’t capable of making decisions. The point can be made without reasoning fallaciously by saying: “Which organisms live and which die is determined by natural causes.” Whether a phrase commits the fallacy depends crucially upon whether the use of the inaccurate phrase is inappropriate in the situation. In a poem, it is appropriate and very common to reify nature, hope, fear, forgetfulness, and so forth, that is, to treat them as if they were objects or beings with intentions. In any scientific claim, it is inappropriate.

Ideology-Driven Argumentation

This occurs when an arguer presupposes some aspect of their own ideology that they are unable to defend.

Example:

Senator, if you pass that bill to relax restrictions on gun ownership and allow people to carry concealed handguns, then you are putting your own voters at risk.

The arguer is presupposing a liberal ideology which implies that permitting private citizens to carry concealed handguns increases crime and decreases safety. If the arguer is unable to defend this presumption, then the fallacy is committed regardless of whether the presumption is defensible. If the senator were to accept this liberal ideology, then the senator is likely to accept the arguer’s conclusion, and the argument could be considered to be effective, but still it would be fallacious—such is the difference between rhetoric and logic.

Ignoratio Elenchi

See Irrelevant Conclusion. Also called missing the point.

Ignoring a Common Cause

See Common Cause.

Ignoring Inconvenient Data

See Suppressed Evidence.

Improper Analogy

Another name for the Fallacy of False Analogy.

Incomplete Evidence

See Suppressed Evidence.

Inconsistency

The fallacy occurs when we accept an inconsistent set of claims, that is, when we accept a claim that logically conflicts with other claims we hold.

Example:

I never generalize because everyone who does is a hypocrite.

That last remark implies the speaker does generalize, although the speaker doesn’t notice this inconsistency with what is said.

Inductive Conversion

Improperly reasoning from a claim of the form “All As are Bs” to “All Bs are As” or from one of the form “Many As are Bs” to “Many Bs are As” and so forth.

Example:

Most professional basketball players are tall, so most tall people are professional basketball players.

The term “conversion” is a technical term in formal logic.

Insufficient Statistics

Drawing a statistical conclusion from a set of data that is clearly too small.

Example:

A pollster interviews ten London voters in one building about which candidate for mayor they support, and upon finding that Churchill receives support from six of the ten, declares that Churchill has the majority support of London voters.

This fallacy is a form of the Fallacy of Jumping to Conclusions.
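
A rough margin-of-error calculation shows how little ten interviews establish. The sketch below uses the standard normal approximation for a sample proportion (an editorial illustration; the approximation is crude for a sample this small, which only strengthens the point):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion
    (normal approximation; rough for very small n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

print(margin_of_error(6 / 10, 10))  # about 0.30: "60% support" means 60 +/- 30 points
```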

Intensional

The mistake of treating different descriptions or names of the same object as equivalent even in those contexts in which the differences between them matter. Reporting someone’s beliefs or assertions or making claims about necessity or possibility can be such contexts. In these contexts, replacing a description with another that refers to the same object is not valid and may turn a true sentence into a false one.

Example:

Michelle said she wants to meet her new neighbor Stalnaker tonight. But I happen to know Stalnaker is a spy for North Korea, so Michelle said she wants to meet a spy for North Korea tonight.

Michelle said no such thing. The faulty reasoner illegitimately assumed that what is true of a person under one description will remain true when said of that person under a second description even in this context of indirect quotation. What was true of the person when described as “her new neighbor Stalnaker” is that Michelle said she wants to meet him, but it wasn’t legitimate for me to assume this is true of the same person when he is described as “a spy for North Korea.”

Extensional contexts are those in which it is legitimate to substitute equals for equals with no worry. But any context in which this substitution of co-referring terms is illegitimate is called an intensional context. Intensional contexts are produced by quotation, modality, and intentionality (propositional attitudes). Intensionality is failure of extensionality, thus the name “Intensional Fallacy”.

Invalid Reasoning

An invalid inference. An argument can be assessed by deductive standards to see if the conclusion would have to be true if the premises were to be true. If the argument cannot meet this standard, it is invalid. An argument is invalid only if it is not an instance of any valid argument form. The Fallacy of Invalid Reasoning is a formal fallacy.

Example:

If it’s raining, then there are clouds in the sky. It’s not raining. Therefore, there are no clouds in the sky.

This invalid argument is an instance of Denying the Antecedent. Any invalid inference that is also inductively very weak is a Non Sequitur.
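
The invalid form, together with the valid form it is easily confused with, can be written schematically (an editorial rendering):

\[
\text{Denying the Antecedent (invalid):}\quad p \to q,\ \neg p,\ \therefore\ \neg q
\]
\[
\text{Modus Tollens (valid):}\quad p \to q,\ \neg q,\ \therefore\ \neg p
\]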

Irrelevant Conclusion

The conclusion that is drawn is irrelevant to the premises; it misses the point.

Example:

In court, Thompson testifies that the defendant is an honorable person, who wouldn’t harm a flea. The defense attorney uses the fallacy by rising to say that Thompson’s testimony shows once again that his client was not near the murder scene.

The testimony of Thompson may be relevant to a request for leniency, but it is irrelevant to any claim about the defendant not being near the murder scene. Other examples of this fallacy are Ad Hominem, Appeal to Authority, Appeal to Emotions, and Argument from Ignorance.

Irrelevant Reason

This fallacy is a kind of Non Sequitur in which the premises are wholly irrelevant to drawing the conclusion.

Example:

Lao Tze Beer is the top selling beer in Thailand. So, it will be the best beer for Canadians.

Is-Ought

The Is-Ought Fallacy occurs when a conclusion expressing what ought to be so is inferred from premises expressing only what is so, on the supposition that no implicit or explicit ought-premises are needed. There is controversy in the philosophical literature regarding whether this type of inference is always fallacious.

Example:

He’s torturing the cat.

So, he shouldn’t do that.

This argument would not use the fallacy if there were an implicit premise indicating that he is a person and that persons should not torture other beings.

Jumping to Conclusions

It is not always a mistake to make a quick decision, but when we draw a conclusion without taking the trouble to acquire enough of the relevant evidence, our reasoning commits the fallacy of jumping to conclusions, provided there was sufficient time to acquire and assess that extra evidence, and provided that the extra effort it takes to get the evidence isn’t prohibitive.

Example:

This car is really cheap. I’ll buy it.

Hold on. Before concluding that you should buy it, ask yourself whether you need to buy another car and, if so, whether you should lease or rent or just borrow a car when you need to travel by car. If you do need to buy a car, you ought to have someone check its operating condition, or else you should make sure you get a guarantee about the car’s being in working order. And, if you stop to think about it, there may be other factors you should consider before making the purchase, such as its age, size, appearance, and mileage.

Lack of Proportion

The Fallacy of Lack of Proportion occurs either by exaggerating or downplaying or simply not noticing a point that is a crucial step in a piece of reasoning. You exaggerate when you make a mountain out of a molehill. You downplay when you suppress relevant evidence. The Genetic Fallacy blows the genesis of an idea out of proportion.

Example:

Did you hear about that tourist being mugged in Russia last week? And then there was the awful train wreck last year just outside Moscow where three of the twenty-five persons killed were tourists. I’ll never visit Russia.

The speaker is blowing these isolated incidents out of proportion. Millions of tourists visit Russia with no problems. Another example occurs when the speaker simply lacks the information needed to give a factor its proper proportion or weight:

I don’t use electric wires in my home because it is well known that the human body can be injured by electric and magnetic fields.

The speaker does not realize all experts agree that electric and magnetic fields caused by home wiring are harmless. However, touching the metal within those wires is very dangerous.

Line-Drawing

If we improperly reject a vague claim because it is not as precise as we’d like, then we are using the Line-Drawing Fallacy. Being vague is not the same as being hopelessly vague. Also called the Bald Man Fallacy, the Fallacy of the Heap, and the Sorites Fallacy.

Example:

Dwayne can never grow bald. Dwayne isn’t bald now. Don’t you agree that if he loses one hair, that won’t make him go from not bald to bald? And if he loses one hair after that, then this one loss, too, won’t make him go from not bald to bald. Therefore, no matter how much hair he loses, he can’t become bald.

Loaded Language

Loaded language is emotive terminology that expresses value judgments. When used in what appears to be an objective description, the terminology unfortunately can cause the listener to adopt those values when in fact no good reason has been given for doing so. Also called Prejudicial Language.

Example:

[News broadcast] In today’s top stories, Senator Smith carelessly cast the deciding vote today to pass both the budget bill and the trailer bill to fund yet another excessive watchdog committee over coastal development.

This broadcast is an editorial posing as a news report.

Loaded Question

Asking a question in a way that unfairly presumes the answer. This fallacy occurs commonly in polls, especially push polls, which are polls designed to push information onto the person being polled and not designed to learn the person’s views.

Example:

“If you knew that candidate B was a liar and crook, would you support candidate A or instead candidate B who is neither a liar nor a crook?”

Logic Chopping

Obscuring the issue by using overly-technical logic tools, especially the techniques of formal symbolic logic, that focus attention on trivial details. A form of Smokescreen and Quibbling.

Logical

See Formal Fallacy.

Lying

A fallacy of reasoning that depends on intentionally saying something that is known to be false. If the lying occurs in an argument’s premise, then it is an example of the Fallacy of Questionable Premise.

Example:

Abraham Lincoln, Theodore Roosevelt, and John Kennedy were assassinated.

They were U.S. presidents.

Therefore, at least three U.S. presidents have been assassinated.

Roosevelt was never assassinated.

Maldistributed Middle

See Undistributed Middle.

Many Questions

See Complex Question.

Misconditionalization

See Modal Fallacy.

Misleading Accent

See the Fallacy of Accent.

Misleading Vividness

When the Fallacy of Jumping to Conclusions is due to a special emphasis on an anecdote or other piece of evidence, then the Fallacy of Misleading Vividness has occurred.

Example:

Yes, I read the side of the cigarette pack about smoking being harmful to your health. That’s the Surgeon General’s opinion, him and all his statistics. But let me tell you about my uncle. Uncle Harry has smoked cigarettes for forty years now and he’s never been sick a day in his life. He even won a ski race at Lake Tahoe in his age group last year. You should have seen him zip down the mountain. He smoked a cigarette during the award ceremony, and he had a broad smile on his face. I was really proud. I can still remember the cheering. Cigarette smoking can’t be as harmful as people say.

The vivid anecdote is the story about Uncle Harry. Too much emphasis is placed on it and not enough on the statistics from the Surgeon General.

Misplaced Burden of Proof

Committing the error of trying to get someone else to prove you are wrong, when it is your responsibility to prove you are correct.

Example:

Person A: I saw a green alien from outer space.
Person B: What!? Can you prove it?
Person A: You can’t prove I didn’t.

If someone says, “I saw a green alien from outer space,” you properly should ask for some proof. If the person responds with no more than something like, “Prove I didn’t,” then they are not accepting their burden of proof and are improperly trying to place it on your shoulders.

Misplaced Concreteness

Mistakenly supposing that something is a concrete object with independent existence, when it’s not. Also known as the Fallacy of Reification and the Fallacy of Hypostatization.

Example:

There are two footballs lying on the floor of an otherwise empty room. When asked to count all the objects in the room, John says there are three: the two balls plus the group of two.

John mistakenly supposed a group or set of concrete objects is also a concrete object.

A less metaphysical example would be a situation where John says a criminal was caught by K-9 aid, thereby supposing that K-9 aid is some sort of concrete object. John could have expressed the same point less misleadingly by saying a K-9 dog aided in catching a criminal.

Misrepresentation

If a misrepresentation occurs on purpose, then it is an example of lying. If, during a debate, an arguer misrepresents the opponent’s claim, then that misrepresentation causes a Straw Man Fallacy.

Missing the Point

See Irrelevant Conclusion.

Mob Appeal

See Appeal to the People.

Modal

This is the error of treating modal conditionals as if the modality applies only to the then-part of the conditional when it more properly applies to the entire conditional.

Example:

James has two children. If James has two children, then he necessarily has more than one child. So, it is necessarily true that James has more than one child.

This apparently valid argument is invalid. It is not necessarily true that James has more than one child; it’s merely true that he has more than one child. He could have had no children. It is logically possible that James has no children even though he actually has two. The solution to the fallacy is to see that the premise “If James has two children, then he necessarily has more than one child” requires the modality “necessarily” to apply logically to the entire conditional “If James has two children, then he has more than one child,” even though grammatically it applies only to “he has more than one child.” The Modal Fallacy is the best known of the infinitely many errors involving modal concepts. Modal concepts include necessity, possibility, and so forth.
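
In modal notation, with p for “James has two children” and q for “James has more than one child,” the contrast is between these two readings of the premise (an editorial rendering):

\[
\Box(p \to q) \qquad \text{(necessity governs the whole conditional; true)}
\]
\[
p \to \Box q \qquad \text{(necessity governs only the consequent; not established)}
\]

Only the second reading would license the conclusion that James necessarily has more than one child, but only the first reading is true.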

Monte Carlo

See Gambler’s Fallacy.

Name Calling

See Ad Hominem.

Naturalistic

On a broad interpretation of this fallacy, it applies to any attempt to argue from an “is” to an “ought,” that is, from a list of facts to a conclusion about what ought to be done.

Example:

Because women are naturally capable of bearing and nursing children while men are not, women ought to be the primary caregivers of children.

Here is another example. Owners of financially successful companies are more successful than poor people in the competition for wealth, power and social status. Therefore, the poor deserve to be poor. There is considerable disagreement among philosophers regarding what sorts of arguments the term “Naturalistic Fallacy” legitimately applies to.

Neglecting a Common Cause

See Common Cause.

No Middle Ground

See False Dilemma.

No True Scotsman

This error is a kind of Ad Hoc Rescue of one’s generalization in which the reasoner re-characterizes the situation solely in order to escape refutation of the generalization.

Example:

Smith: All Scotsmen are loyal and brave.

Jones: But McDougal over there is a Scotsman, and he was arrested by his commanding officer for running from the enemy.

Smith: Well, if that’s right, it just shows that McDougal wasn’t a TRUE Scotsman.

Non Causa Pro Causa

This label is Latin for mistaking the “non-cause for the cause.” See False Cause.

Non Sequitur

When a conclusion is supported only by extremely weak reasons or by irrelevant reasons, the argument is fallacious and is said to be a Non Sequitur. However, we usually apply the term only when we cannot think of how to label the argument with a more specific fallacy name. Any deductively invalid inference is a non sequitur if it also is very weak when assessed by inductive standards.

Example:

Nuclear disarmament is a risk, but everything in life involves a risk. Every time you drive in a car you are taking a risk. If you’re willing to drive in a car, you should be willing to have disarmament.

The following is not an example: “If she committed the murder, then there’d be his blood stains on her hands. His blood stains are on her hands. So, she committed the murder.” This deductively invalid argument uses the Fallacy of Affirming the Consequent, but it isn’t a non sequitur because it has significant inductive strength.

Obscurum per Obscurius

Explaining something obscure or mysterious by something that is even more obscure or more mysterious.

Example:

Let me explain what a lucky result is. It is a fortuitous collapse of the quantum mechanical wave packet that leads to a surprisingly pleasing result.

One-Sidedness

See the related fallacies of Confirmation Bias, Slanting, and Suppressed Evidence.

Opposition

Being opposed to someone’s reasoning because of who they are, usually because of what group they are associated with. See the Fallacy of Guilt by Association.

Over-Fitting

See Curve Fitting.

Overgeneralization

See Sweeping Generalization.

Oversimplification

You oversimplify when you cover up relevant complexities or make a complicated problem appear to be much simpler than it really is.

Example:

President Bush wants our country to trade with Fidel Castro’s Communist Cuba. I say there should be a trade embargo against Cuba. The issue in our election is Cuban trade, and if you are against it, then you should vote for me for president.

Whom to vote for should be decided by considering quite a number of issues in addition to Cuban trade. When an oversimplification results in falsely implying that a minor causal factor is the major one, then the reasoning also uses the False Cause Fallacy.

Past Practice

See Traditional Wisdom.

Pathetic

The Pathetic Fallacy is a mistaken belief due to attributing peculiarly human qualities to inanimate objects (but not to animals). The fallacy is caused by anthropomorphism.

Example:

Aargh, it won’t start again. This old car always breaks down on days when I have a job interview. It must be afraid that if I get a new job, then I’ll be able to afford a replacement, so it doesn’t want me to get to my interview on time.

Peer Pressure

See Appeal to the People.

Perfectionist

If you remark that a proposal or claim should be rejected solely because it doesn’t solve the problem perfectly, in cases where perfection isn’t really required, then you’ve used the Perfectionist Fallacy.

Example:

You said hiring a house cleaner would solve our cleaning problems because we both have full-time jobs. Now, look what happened. Every week, after cleaning the toaster oven, our house cleaner leaves it unplugged. I should never have listened to you about hiring a house cleaner.

Persuasive Definition

Some people try to win their arguments by getting you to accept their faulty definition. If you buy into their definition, they’ve practically persuaded you already. Same as the Definist Fallacy. Poisoning the Well when presenting a definition would be an example of using a persuasive definition.

Example:

Let’s define a Democrat as a leftist who desires to overtax the corporations and abolish freedom in the economic sphere.

Petitio Principii

See Begging the Question.

Poisoning the Well

Poisoning the well is a preemptive attack on a person in order to discredit their testimony or argument in advance of their giving it. A person who thereby becomes unreceptive to the testimony reasons fallaciously and has become a victim of the poisoner. This is a kind of Ad Hominem, Circumstantial Fallacy.

Example:

[Prosecuting attorney in court] When is the defense attorney planning to call that twice-convicted child molester, David Barnington, to the stand? OK, I’ll rephrase that. When is the defense attorney planning to call David Barnington to the stand?

Post Hoc

Suppose we notice that an event of kind A is followed in time by an event of kind B, and then hastily leap to the conclusion that A caused B. If so, our reasoning contains the Post Hoc Fallacy. Correlations are often good evidence of causal connection, so the fallacy occurs only when the leap to the causal conclusion is done “hastily.” The Latin term for the fallacy is Post Hoc, Ergo Propter Hoc (“After this, therefore because of this”). It is a kind of False Cause Fallacy.

Example:

I have noticed a pattern about all the basketball games I’ve been to this year. Every time I buy a good seat, our team wins. Every time I buy a cheap, bad seat, we lose. My buying a good seat must somehow be causing those wins.

Your background knowledge should tell you that this pattern probably won’t continue in the future; it’s just an accidental correlation that tells you nothing about the cause of your team’s wins.

Prejudicial Language

See Loaded Language.

Proof Surrogate

Substituting a distracting comment for a real proof.

Example:

I don’t need to tell a smart person like you that you should vote Republican.

This comment is trying to avoid a serious disagreement about whether one should vote Republican.

Prosecutor’s Fallacy

This is the mistake of over-emphasizing the strength of a piece of evidence while paying insufficient attention to the context.

Example:

Suppose a prosecutor is trying to gain a conviction and points to the evidence that at the scene of the burglary the police found a strand of the burglar’s hair. A forensic test showed that the burglar’s hair matches the suspect’s own hair. The forensic scientist testified that the chance of a randomly selected person producing such a match is only one in two thousand. The prosecutor concludes that the suspect has only a one in two thousand chance of being innocent. On the basis of only this evidence, the prosecutor asks the jury for a conviction.

That is fallacious reasoning, and if you are on the jury you should not be convinced. Here’s why. The prosecutor paid insufficient attention to the pool of potential suspects. Suppose that pool has six million people who could have committed the crime, all other things being equal. If the forensic lab had tested all those people, about one in every two thousand of them would have a hair match, and that is three thousand people. The suspect is just one of those three thousand, so the suspect is very probably innocent unless the prosecutor can provide more evidence. The prosecutor over-emphasized the strength of a piece of evidence by focusing on one suspect while paying insufficient attention to the context, which suggests a pool of many more suspects.
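
The arithmetic behind this, keeping the entry’s assumed pool of six million equally plausible suspects, can be displayed compactly:

\[
6{,}000{,}000 \times \frac{1}{2000} = 3000 \text{ expected matches}, \qquad P(\text{guilty} \mid \text{match}) \approx \frac{1}{3000},\ \text{not}\ \frac{1999}{2000}.
\]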

Prosody

See the Fallacy of Accent.

Quantifier Shift

Confusing the phrase “For all x there is some y” with “There is some (one) y such that for all x.”

Example:

Everybody loves someone, so there is someone whom everybody loves.

The error is also made if you reason this way: “Everything has a cause, so there’s one cause of everything.”
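
With L(x, y) read as “x loves y,” the invalid shift can be displayed in quantifier notation (an editorial rendering):

\[
\forall x\, \exists y\, L(x, y) \quad \not\Rightarrow \quad \exists y\, \forall x\, L(x, y)
\]

The left formula lets each person love a possibly different someone; the right formula demands a single someone loved by everyone.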

Question Begging

See Begging the Question.

Questionable Analogy

See False Analogy.

Questionable Cause

See False Cause.

Questionable Premise

If you have sufficient background information to know that a premise is questionable or unlikely to be acceptable, then you use this fallacy if you accept an argument based on that premise. This broad category of fallacies of argumentation includes Appeal to Authority, False Dilemma, Inconsistency, Lying, Stacking the Deck, Straw Man, Suppressed Evidence, and many others.

Quibbling

We quibble when we complain about a minor point and falsely believe that this complaint somehow undermines the main point. To avoid this error, the logical reasoner will not make a mountain out of a mole hill nor take people too literally. Logic Chopping is a kind of quibbling.

Example:

I’ve found typographical errors in your poem, so the poem is neither inspired nor perceptive.

Quoting out of Context

If you quote someone, but select the quotation so that essential context is not available and therefore the person’s views are distorted, then you’ve quoted “out of context.” Quoting out of context in an argument creates a Straw Man Fallacy. The fallacy is also called “contextomy.”

Example:

Smith: I’ve been reading about a peculiar game in this article about vegetarianism. When we play this game, we lean out from a fourth-story window and drop down strings containing “Free food” signs on the end in order to hook unsuspecting passers-by. It’s really outrageous, isn’t it? Yet isn’t that precisely what sports fishermen do for entertainment from their fishing boats? The article says it’s time we put an end to sport fishing.

Jones: Let me quote Smith for you. He says “We…hook unsuspecting passers-by.” What sort of moral monster is this man Smith?

Jones’s selective quotation is fallacious because it makes Smith appear to advocate this immoral activity when the context makes it clear that he doesn’t.

Rationalization

We rationalize when we inauthentically offer reasons to support our claim. We are rationalizing when we give someone a reason to justify our action even though we know this reason is not really our own reason for our action, usually because the offered reason will sound better to the audience than our actual reason.

Example:

“I bought the matzo bread from Kroger’s Supermarket because it is the cheapest brand and I wanted to save money,” says Alex [who knows he bought the bread from Kroger’s Supermarket only because his girlfriend works there].

Red Herring

A red herring is a smelly fish that would distract even a bloodhound. It is also a digression that leads the reasoner off the track of considering only relevant information.

Example:

Will the new tax in Senate Bill 47 unfairly hurt business? I notice that the main provision of the bill is that the tax is higher for large employers (fifty or more employees) as opposed to small employers (six to forty-nine employees). To decide on the fairness of the bill, we must first determine whether employees who work for large employers have better working conditions than employees who work for small employers. I am ready to volunteer for a new committee to study this question. How do you suppose the committee should go about collecting the data we need?

Bringing up the issue of working conditions and the committee is the red herring diverting us from the main issue of whether Senate Bill 47 unfairly hurts business. An intentional false lead in a criminal investigation is another example of a red herring.

Refutation by Caricature

See the Fallacy of Caricaturization.

Regression

This fallacy occurs when regression to the mean is mistaken for a sign of a causal connection. Also called the Regressive Fallacy. It is a kind of False Cause Fallacy.

Example:

You are investigating the average heights of groups of people living in the United States. You sample some people living in Columbus, Ohio and determine their average height. You have the numerical figure for the mean height of people living in the U.S., and you notice that members of your sample from Columbus have an average height that differs from this mean. Your second sample of the same size is from people living in Dayton, Ohio. When you find that this group’s average height is closer to the U.S. mean height [as it is very likely to be due to common statistical regression to the mean], you falsely conclude that there must be something causing people living in Dayton to be more like the average U.S. resident than people living in Columbus.

There is most probably nothing causing people from Dayton to be more like the average U.S. resident than people from Columbus; rather, the sample averages are simply regressing to the mean.
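
A small simulation illustrates the statistical effect. The sketch below (illustrative code and parameters, not from the original entry) shows that whenever a first sample mean happens to fall unusually far from the population mean, an independent second sample of the same size usually falls closer, with no cause involved:

```python
import random
import statistics

POP_MEAN, POP_SD, N = 170.0, 10.0, 25  # assumed height parameters (cm)

def sample_mean():
    return statistics.mean(random.gauss(POP_MEAN, POP_SD) for _ in range(N))

extreme, closer = 0, 0
for _ in range(10_000):
    first, second = sample_mean(), sample_mean()
    if abs(first - POP_MEAN) > 2.0:  # the first city's sample was unusual
        extreme += 1
        if abs(second - POP_MEAN) < abs(first - POP_MEAN):
            closer += 1              # the second city's sample looks more typical
print(closer / extreme)  # well above 0.5: regression to the mean, not causation
```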

Reification

Considering a word to be referring to an object, when the meaning of the word can be accounted for more mundanely without assuming the object exists. Also known as the Fallacy of Misplaced Concreteness and the Fallacy of Hypostatization.

Example:

The 19th-century composer Tchaikovsky described the introduction to his Fifth Symphony as “a complete resignation before fate.”

He is treating “fate” as if it is naming some object, when it would be less misleading, but also less poetic, to say the introduction suggests that listeners will resign themselves to accepting whatever events happen to them. The Fallacy occurs also when someone says, “I succumbed to nostalgia.” Without committing the fallacy, one can make the same point by saying, “My mental state caused actions that would best be described as my reflecting an unusual desire to return to some past period of my life.” Another common way the Fallacy is used is when someone says that if you understand what “Sherlock Holmes” means, then Sherlock Holmes exists in your understanding. The larger point being made in this last example is that nouns can be meaningful without them referring to an object, yet those who use the Fallacy of Reification do not understand this point.

Reversing Causation

Drawing an improper conclusion about causation due to a causal assumption that reverses cause and effect. A kind of False Cause Fallacy.

Example:

All the corporate officers of Miami Electronics and Power have big boats. If you’re ever going to become an officer of MEP, you’d better get a bigger boat.

The false assumption here is that having a big boat helps cause you to be an officer in MEP, whereas the reverse is true. Being an officer causes you to have the high income that enables you to purchase a big boat.

Scapegoating

If you unfairly blame an unpopular person or group of people for a problem, then you are scapegoating. This is a kind of Fallacy of Appeal to Emotions.

Example:

Augurs were official diviners of ancient Rome. During the pre-Christian period, when Christians were unpopular, an augur would make a prediction for the emperor about, say, whether a military attack would have a successful outcome. If the prediction failed to come true, the augur would not admit failure but instead would blame nearby Christians for their evil influence on his divining powers. The elimination of these Christians, the augur would claim, could restore his divining powers and help the emperor. By using this reasoning tactic, the augur was scapegoating the Christians.

Scare Tactic

If you suppose that terrorizing your opponent is giving him a reason for believing that you are correct, then you are using a scare tactic and reasoning fallaciously.

Example:

David: My father owns the department store that gives your newspaper fifteen percent of all its advertising revenue, so I’m sure you won’t want to publish any story of my arrest for spray painting the college.

Newspaper editor: Yes, David, I see your point. The story really isn’t newsworthy.

David has given the editor a financial reason not to publish, but he has not given a relevant reason why the story is not newsworthy. David’s tactics are scaring the editor, but it’s the editor who uses the Scare Tactic Fallacy, not David. David has merely used a scare tactic. This fallacy’s name emphasizes the cause of the fallacy rather than the error itself. See also the related Fallacy of Appeal to Emotions.

Scope

The Scope Fallacy is caused by improperly changing or misrepresenting the scope of a phrase.

Example:

Every concerned citizen who believes that someone living in the US is a terrorist should make a report to the authorities. But Shelley told me herself that she believes there are terrorists living in the US, yet she hasn’t made any reports. So, she must not be a concerned citizen.

The first sentence has ambiguous scope. It was probably originally meant in this sense: Every concerned citizen who believes (of someone that this person is living in the US and is a terrorist) should make a report to the authorities. But the speaker is clearly taking the sentence in its other, less plausible sense: Every concerned citizen who believes (that there is someone or other living in the US who is a terrorist) should make a report to the authorities. Scope fallacies usually are Amphibolies.

Secundum Quid

See Accident and Converse Accident, two versions of the fallacy.

Selective Attention

Improperly focusing attention on certain things and ignoring others.

Example:

Father: Justine, how was your school day today? Another C on the history test like last time?

Justine: Dad, I got an A- on my history test today. Isn’t that great? Only one student got an A.

Father: I see you weren’t the one with the A. And what about the math quiz?

Justine: I think I did OK, better than last time.

Father: If you really did well, you’d be sure. What I’m sure of is that today was a pretty bad day for you.

The pessimist who pays attention to all the bad news and ignores the good news thereby uses the Fallacy of Selective Attention. The remedy for this fallacy is to pay attention to all the relevant evidence. The most common examples of selective attention are the fallacy of Suppressed Evidence and the fallacy of Confirmation Bias. See also the Sharpshooter’s Fallacy.

Self-Fulfilling Prophecy

The fallacy occurs when the act of prophesying will itself produce the effect that is prophesied, but the reasoner doesn’t recognize this and believes the prophecy is a significant insight.

Example:

A group of students are selected to be interviewed individually by the teacher. Each selected student is told that the teacher has predicted they will do significantly better in their future school work. Actually, though, the teacher has no special information about the students and has picked the group at random. If the students believe this prediction about themselves, then, given human psychology, it is likely that they will do better merely because of the teacher’s making the prediction.

The prediction will fulfill itself, so to speak, and the students’ reasoning contains the fallacy.

This fallacy can be dangerous in an atmosphere of potential war between nations when the leader of a nation predicts that their nation will go to war against their enemy. This prediction could very well precipitate an enemy attack because the enemy calculates that if war is inevitable then it is to their military advantage not to get caught by surprise.

Self-Selection

A Biased Generalization in which the bias is due to self-selection for membership in the sample used to make the generalization.

Example:

The radio announcer at a student radio station in New York asks listeners to call in and say whether they favor Jones or Smith for president. 80% of the callers favor Jones, so the announcer declares that Americans prefer Jones to Smith.

The problem here is that the callers selected themselves for membership in the sample, but clearly the sample is unlikely to be representative of Americans.

Sharpshooter’s

The Sharpshooter’s Fallacy gets its name from someone shooting a rifle at the side of a barn and then going over and drawing a target and bull’s-eye concentrically around the bullet hole. The fallacy is caused by overemphasizing random results or making selective use of coincidence. See the Fallacy of Selective Attention.

Example:

Psychic Sarah makes twenty-six predictions about what will happen next year. When one, but only one, of the predictions comes true, she says, “Aha! I can see into the future.”
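
Even modest luck makes one “hit” likely across many tries. Assuming, purely for illustration, that each vague prediction independently has a 10% chance of coming true by chance:

```python
p_single = 0.10                  # assumed per-prediction luck rate (illustrative)
n_predictions = 26
p_at_least_one = 1 - (1 - p_single) ** n_predictions
print(p_at_least_one)            # ~0.94: one success is close to guaranteed
```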

Slanting

This error occurs when the issue is not treated fairly because of misrepresenting the evidence by, say, suppressing part of it, or misconstruing some of it, or simply lying. See the following related fallacies: Confirmation Bias, Lying, Misrepresentation, Questionable Premise, Quoting out of Context, Straw Man, Suppressed Evidence.

Slippery Slope

Suppose someone claims that a first step (in a chain of causes and effects, or a chain of reasoning) will probably lead to a second step that in turn will probably lead to another step and so on until a final step ends in trouble. If the likelihood of the trouble occurring is exaggerated, the Slippery Slope Fallacy is present.

Example:

Mom: Those look like bags under your eyes. Are you getting enough sleep?

Jeff: I had a test and stayed up late studying.

Mom: You didn’t take any drugs, did you?

Jeff: Just caffeine in my coffee, like I always do.

Mom: Jeff! You know what happens when people take drugs! Pretty soon the caffeine won’t be strong enough. Then you will take something stronger, maybe someone’s diet pill. Then, something even stronger. Eventually, you will be doing cocaine. Then you will be a crack addict! So, don’t drink that coffee.

The form of a Slippery Slope Fallacy looks like this:

A often leads to B.

B often leads to C.

C often leads to D.

…

Y often leads to Z.

Z leads to HELL.

We don’t want to go to HELL.

So, don’t take that first step A.

The key claim in the fallacy is that taking the first step will lead to the final, unacceptable step. Arguments of this form may or may not be fallacious depending on the probabilities involved in each step. The analyst asks how likely it is that taking the first step will lead to the final step. For example, if A leads to B with a probability of 80 percent, and B leads to C with a probability of 80 percent, and C leads to D with a probability of 80 percent, is it likely that A will eventually lead to D? No, not at all; there is about a 50% chance. The proper analysis of a slippery slope argument depends on sensitivity to such probabilistic calculations. Regarding terminology, if the chain of reasoning A, B, C, D, …, Z is about causes, then the fallacy is called the Domino Fallacy.
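
The “about a 50% chance” figure comes from multiplying the step probabilities, assuming the steps are independent:

\[
0.8 \times 0.8 \times 0.8 = 0.512
\]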

Small Sample

This is the fallacy of using too small a sample. If the sample is too small to be representative of the population, and if we have the background information to know that there is this problem with sample size, yet we still accept the generalization based on the sample results, then we use the fallacy. This fallacy is a version of the Fallacy of Hasty Generalization that emphasizes statistical sampling techniques.

Example:

I’ve eaten in restaurants twice in my life, and both times I’ve gotten sick. I’ve learned one thing from these experiences: restaurants make me sick.

How big a sample do you need to avoid the fallacy? Relying on background knowledge about a population’s lack of diversity can reduce the sample size needed for the generalization. With a completely homogeneous population, a sample of one is large enough to be representative of the population; if we’ve seen one electron, we’ve seen them all. However, eating in one restaurant is not like eating in any restaurant, so far as getting sick is concerned. We cannot place a specific number on sample size below which the fallacy is produced unless we know about homogeneity of the population and the margin of error and the confidence level.
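
The trade-off mentioned here can be made explicit with the standard sample-size formula for estimating a proportion (a sketch assuming simple random sampling and the worst case p = 0.5):

```python
import math

def required_sample_size(margin, z=1.96, p=0.5):
    """Smallest n giving the stated margin of error at ~95% confidence
    (worst case p = 0.5; assumes simple random sampling)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(required_sample_size(0.03))  # ~1068 respondents for +/- 3 points
print(required_sample_size(0.10))  # ~97 respondents for +/- 10 points
```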

Smear Tactic

A smear tactic is an unfair characterization either of the opponent or the opponent’s position or argument. Smearing the opponent causes an Ad Hominem Fallacy. Smearing the opponent’s argument causes a Straw Man Fallacy.

Smokescreen

This fallacy occurs by offering too many details in order either to obscure the point or to cover up counter-evidence. In the latter case it would be an example of the Fallacy of Suppressed Evidence. If you produce a smokescreen by bringing up an irrelevant issue, then you produce a Red Herring Fallacy. Sometimes called Clouding the Issue.

Example:

Senator, wait before you vote on Senate Bill 88. Do you realize that Delaware passed a bill on the same subject in 1932, but it was ruled unconstitutional for these twenty reasons. Let me list them here…. Also, before you vote on SB 88 you need to know that …. And so on.

There is no recipe to follow in distinguishing smokescreens from reasonable appeals to caution and care.

Sorites

See Line-Drawing.

Special Pleading

Special pleading is a form of inconsistency in which the reasoner doesn’t apply his or her principles consistently. It is the fallacy of applying a general principle to various situations but not applying it to a special situation that interests the arguer even though the general principle properly applies to that special situation, too.

Example:

Everyone has a duty to help the police do their job, no matter who the suspect is. That is why we must support investigations into corruption in the police department. No person is above the law. Of course, if the police come knocking on my door to ask about my neighbors and the robberies in our building, I know nothing. I’m not about to rat on anybody.

In our example, the principle of helping the police is applied to investigations of police officers but not to one’s neighbors.

Specificity

Drawing an overly specific conclusion from the evidence. A kind of jumping to conclusions.

Example:

The trigonometry calculation came out to 5,005.6833 feet, so that’s how wide the cloud is up there.

Stacking the Deck

See Suppressed Evidence and Slanting.

Stereotyping

Using stereotypes as if they are accurate generalizations for the whole group is an error in reasoning. Stereotypes are general beliefs we use to categorize people, objects, and events; but these beliefs are overstatements that shouldn’t be taken literally. For example, consider the stereotype “She’s Mexican, so she’s going to be late.” This conveys a mistaken impression of all Mexicans. On the other hand, even though most Mexicans are punctual, a German is more apt to be punctual than a Mexican, and this fact is said to be the “kernel of truth” in the stereotype. The danger in our using stereotypes is that speakers or listeners will not realize that even the best stereotypes are accurate only when taken probabilistically. As a consequence, the use of stereotypes can breed racism, sexism, and other forms of bigotry.

Example:

German people aren’t good at dancing our sambas. She’s German. So, she’s not going to be any good at dancing our sambas.

This argument is deductively valid, but it’s unsound because it rests on a false, stereotypical premise. The grain of truth in the stereotype is that the average German doesn’t dance sambas as well as the average South American, but to overgeneralize and presume that ALL Germans are poor samba dancers compared to South Americans is a mistake called “stereotyping.”

Straw Man

Your reasoning contains the Straw Man Fallacy whenever you attribute an easily refuted position to your opponent, one that the opponent would not endorse, and then proceed to attack the easily refuted position (the straw man) believing you have thereby undermined the real man, the opponent’s actual position. If the unfair and inaccurate representation is on purpose, then the Straw Man Fallacy is caused by lying.

Example (a debate before the city council):

Opponent: Because of the killing and suffering of Indians that followed Columbus’s discovery of America, the City of Berkeley should declare that Columbus Day will no longer be observed in our city.

Speaker: This is ridiculous, fellow members of the city council. It’s not true that everybody who ever came to America from another country somehow oppressed the Indians. I say we should continue to observe Columbus Day, and vote down this resolution that will make the City of Berkeley the laughing stock of the nation.

The Opponent is likely to respond with “Wait! That’s not what I said.” The Speaker has twisted what his Opponent said. The Opponent never said nor even indirectly suggested that everybody who ever came to America from another country somehow oppressed the Indians.

Style Over Substance

Unfortunately the style with which an argument is presented is sometimes taken as adding to the substance or strength of the argument.

Example:

You’ve just been told by the salesperson that the new Maytag is an excellent washing machine because it has a double washing cycle. If you notice that the salesperson smiled at you and was well dressed, this does not add to the quality of the salesperson’s argument, but unfortunately it does for those who are influenced by style over substance, as most of us are.

Subjectivist

The Subjectivist Fallacy occurs when it is mistakenly supposed that a good reason to reject a claim is that truth on the matter is relative to the person or group.

Example:

Justine has just given Jake her reasons for believing that the Devil is an imaginary evil person. Jake, not wanting to accept her conclusion, responds with, “That’s perhaps true for you, but it’s not true for me.”

Superstitious Thinking

Reasoning deserves to be called superstitious if it is based on reasons that are well known to be unacceptable, usually due to unreasonable fear of the unknown, trust in magic, or an obviously false idea of what can cause what. A belief produced by superstitious reasoning is called a superstition. The fallacy is an instance of the False Cause Fallacy.

Example:

I never walk under ladders; it’s bad luck.

It may be a good idea not to walk under ladders, but a proper reason to believe this is that workers on ladders occasionally drop things, and that ladders might have dripping wet paint that could damage your clothes. An improper reason for not walking under ladders is that it is bad luck to do so.

Suppressed Evidence

Intentionally failing to use information suspected of being relevant and significant is committing the fallacy of suppressed evidence. This fallacy usually occurs when the information counts against one’s own conclusion. Perhaps the arguer is not mentioning that experts have recently objected to one of his premises. The fallacy is a kind of Fallacy of Selective Attention.

Example:

Buying the Cray Mac 11 computer for our company was the right thing to do. It meets our company’s needs; it runs the programs we want it to run; it will be delivered quickly; and it costs much less than what we had budgeted.

This appears to be a good argument, but you’d change your assessment of the argument if you learned the speaker has intentionally suppressed the relevant evidence that the company’s Cray Mac 11 was purchased from his brother-in-law at a 30 percent higher price than it could have been purchased elsewhere, and if you learned that a recent unbiased analysis of ten comparable computers placed the Cray Mac 11 near the bottom of the list.

If the relevant information is not intentionally suppressed but rather inadvertently overlooked, the fallacy of suppressed evidence also is said to occur, although the fallacy’s name is misleading in this case. The fallacy is also called the Fallacy of Incomplete Evidence and Cherry-Picking the Evidence. See also Slanting.

Sweeping Generalization

See Fallacy of Accident.

Syllogistic

Syllogistic fallacies are kinds of invalid categorical syllogisms. This list contains the Fallacy of Undistributed Middle and the Fallacy of Four Terms, among a few others, though there are a great many such formal fallacies.

Tokenism

If you interpret a merely token gesture as an adequate substitute for the real thing, you’ve been taken in by tokenism.

Example:

How can you call our organization racist? After all, our receptionist is African American.

If you accept this line of reasoning, you have been taken in by tokenism.

Traditional Wisdom

If you say or imply that a practice must be OK today simply because it has been the apparently wise practice in the past, then your reasoning contains the fallacy of traditional wisdom. Procedures that are practiced and that have a tradition of being practiced might or might not admit of a good justification, but merely saying that they have been practiced in the past is not always good enough, in which case the fallacy is present. Also called Argumentum Consensus Gentium when the traditional wisdom is that of nations.

Example:

Of course we should buy IBM’s computer whenever we need new computers. We have been buying IBM as far back as anyone can remember.

The “of course” is the problem. The traditional wisdom of IBM being the right buy is some reason to buy IBM next time, but it’s not a good enough reason in a climate of changing products, so the “of course” indicates that the Fallacy of Traditional Wisdom has occurred. The fallacy is essentially the same as the fallacies of Appeal to the Common Practice, Gallery, Masses, Mob, Past Practice, People, Peers, and Popularity.

Tu Quoque

The Fallacy of Tu Quoque occurs in our reasoning if we conclude that someone’s argument not to perform some act must be faulty because the arguer himself or herself has performed it. Similarly, when we point out that the arguer doesn’t practice what he or she preaches, and then suppose that there must be an error in the preaching for only this reason, then we are reasoning fallaciously and creating a Tu Quoque. This is a kind of Ad Hominem Circumstantial Fallacy.

Example:

Look who’s talking. You say I shouldn’t become an alcoholic because it will hurt me and my family, yet you yourself are an alcoholic, so your argument can’t be worth listening to.

Discovering that a speaker is a hypocrite is a reason to be suspicious of the speaker’s reasoning, but it is not a sufficient reason to discount it.

Two Wrongs do not Make a Right

When you defend your wrong action as being right because someone previously has acted wrongly, you are using the fallacy called “Two Wrongs do not Make a Right.” This is a special kind of Ad Hominem Fallacy.

Example:

Oops, no paper this morning. Somebody in our apartment building probably stole my newspaper. So, that makes it OK for me to steal one from my neighbor’s doormat while nobody else is out here in the hallway.

Undistributed Middle

In syllogistic logic, failing to distribute the middle term in at least one of the premises is the fallacy of undistributed middle. Also called the Fallacy of Maldistributed Middle.

Example:

All collies are animals.

All dogs are animals.

Therefore, all collies are dogs.

The middle term (“animals”) is in the predicate of both universal affirmative premises and therefore is undistributed. This formal fallacy has the logical form: All C are A. All D are A. Therefore, all C are D.
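To see the invalidity concretely, it suffices to exhibit a countermodel: an assignment of sets to C, D, and A on which both premises come out true and the conclusion false. The short Python sketch below checks one such assignment (the particular sets are illustrative choices, not part of this entry):

```python
# A countermodel for the form: All C are A; All D are A; therefore, all C are D.
# The particular sets below are illustrative choices, not from the source text.
A = {"collie", "beagle", "tabby", "siamese"}   # animals
C = {"tabby", "siamese"}                        # cats (substituted for collies)
D = {"collie", "beagle"}                        # dogs

premise_1 = C.issubset(A)   # All C are A  -> True
premise_2 = D.issubset(A)   # All D are A  -> True
conclusion = C.issubset(D)  # All C are D  -> False

print(premise_1, premise_2, conclusion)  # True True False: true premises, false conclusion
```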

Unfalsifiability

This error in explanation occurs when the explanation contains a claim that is not falsifiable, because there is no way to check on the claim. That is, there would be no way to show the claim to be false if it were false.

Example:

He lied because he’s possessed by demons.

This could be the correct explanation of his lying, but there’s no way to check on whether it’s correct. You can check whether he’s twitching and moaning, but this won’t be evidence about whether a supernatural force is controlling his body. The claim that he’s possessed can’t be verified if it’s true, and it can’t be falsified if it’s false. So, the claim is too odd to be relied upon for an explanation of his lying. Relying on the claim is an instance of fallacious reasoning.

Unrepresentative Generalization

If the plants on my plate are not representative of all plants, then the following generalization should not be trusted.

Example:

Each plant on my plate is edible.

So, all plants are edible.

The set of plants on my plate is called “the sample” in the technical vocabulary of statistics, and the set of all plants is called “the target population.” If you are going to generalize on a sample, then you want your sample to be representative of the target population, that is, to be like it in the relevant respects. This fallacy is the same as the Fallacy of Unrepresentative Sample.

Unrepresentative Sample

If the means of collecting the sample from the population are likely to produce a sample that is unrepresentative of the population, then a generalization upon the sample data is an inference using the fallacy of unrepresentative sample. It is a kind of Hasty Generalization. When some of the statistical evidence is expected to be relevant to the results but is hidden or overlooked, the fallacy is called Suppressed Evidence. There are many ways to bias a sample. Knowingly selecting atypical members of the population produces a biased sample.

Example:

The two men in the matching green suits that I met at the Star Trek Convention in Las Vegas had a terrible fear of cats. I remember their saying they were from France. I’ve never met anyone else from France, so I suppose everyone there has a terrible fear of cats.

Most people’s background information is sufficient to tell them that people at this sort of convention are unlikely to be representative, that is, are likely to be atypical members of the rest of society. Having a small sample does not by itself cause the sample to be biased. Small samples are OK if there is a corresponding large margin of error or low confidence level.
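As a rough quantitative gloss on this last point: for a simple random sample used to estimate a proportion, the worst-case 95% margin of error is about 0.98/√n, so a small unbiased sample simply means a wide interval rather than a wrong answer. A minimal sketch (the sample sizes here are illustrative):

```python
import math

def max_margin_of_error(n, z=1.96):
    """Worst-case 95% margin of error for estimating a proportion
    from a simple random sample of size n (p = 0.5 maximizes it)."""
    return z * math.sqrt(0.5 * 0.5 / n)

for n in (25, 100, 1000):
    print(f"n={n:>5}: margin of error = +/-{max_margin_of_error(n):.1%}")
# n=   25: +/-19.6%; n=  100: +/-9.8%; n= 1000: +/-3.1%
# A small, unbiased sample is fine so long as this wider margin is reported.
```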

Large samples can be unrepresentative, too.

Example:

We’ve polled over 400,000 Southern Baptists and asked them whether the best religion in the world is Southern Baptist. We have over 99% agreement, which proves our point about which religion is best.

Getting a larger sample size does not overcome sampling bias.
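A quick simulation can illustrate why. If the selection method favours one subgroup, the estimate settles on the wrong value no matter how large the sample grows. The sketch below is a minimal illustration under assumed numbers (the population proportion, the 5-to-1 selection bias, and the sample sizes are all hypothetical):

```python
import random

random.seed(0)

# Illustrative population: 30% of people hold some opinion (True), 70% do not.
population = [True] * 3000 + [False] * 7000

def biased_sample(pop, n):
    """Draw n people, with opinion-holders 5x as likely to be selected,
    mimicking a sampling method that over-recruits one subgroup."""
    weights = [5 if person else 1 for person in pop]
    return random.choices(pop, weights=weights, k=n)

for n in (100, 10_000, 1_000_000):
    sample = biased_sample(population, n)
    estimate = sum(sample) / n
    print(f"n={n:>9}: estimated proportion = {estimate:.3f} (true value = 0.300)")
# The estimate stabilizes near 0.68, not 0.30: a larger n sharpens
# the wrong answer rather than correcting the bias.
```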

Untestability

See Unfalsifiability.

Vested Interest

The Vested Interest Fallacy occurs when one argues that a person’s claim is incorrect, or that their recommended action is not worth following, because that person is motivated by an interest in gaining something by it, with the implication that, were it not for this vested interest, the person would not make the claim or recommend the action. Because this reasoning attacks the reasoner rather than the reasoning itself, it is a kind of Ad Hominem fallacy.

Example:

According to Samantha we all should vote for Anderson for Congress. Yet she’s a lobbyist in the pay of Anderson and will get a nice job in the capitol if he’s elected, so that convinces me that she is giving bad advice.

This is fallacious reasoning by the speaker because whether Samantha is giving good advice about Anderson ought to depend on Anderson’s qualifications, not on whether Samantha will or won’t get a nice job if he’s elected.

Victory by Definition

Same as the fallacy of Persuasive Definition.

Weak Analogy

See False Analogy.

Willed ignorance

Willed ignorance is the attitude expressed by “I’ve got my mind made up, so don’t confuse me with the facts.” It is usually a case of the Traditional Wisdom Fallacy.

Example:

Of course she’s made a mistake. We’ve always had meat and potatoes for dinner, and our ancestors have always had meat and potatoes for dinner, and so nobody knows what they’re talking about when they start saying meat and potatoes are bad for us.

Wishful Thinking

A reasoner who suggests that a claim is true, or false, merely because he or she strongly hopes it is, is using the fallacy of wishful thinking. Wishing something is true is not a relevant reason for claiming that it is actually true.

Example:

There’s got to be an error here in the history book. It says Thomas Jefferson had slaves. I don’t believe it. He was our best president, and a good president would never do such a thing. That would be awful.

You-Too

This is an informal name for the Tu Quoque fallacy.

7. References and Further Reading

  • Eemeren, Frans H. van, R. F. Grootendorst, F. S. Henkemans, J. A. Blair, R. H. Johnson, E. C. W. Krabbe, C. W. Plantin, D. N. Walton, C. A. Willard, J. A. Woods, and D. F. Zarefsky, 1996. Fundamentals of Argumentation Theory: A Handbook of Historical Backgrounds and Contemporary Developments. Mahwah, New Jersey, Lawrence Erlbaum Associates, Publishers.
  • Fearnside, W. Ward and William B. Holther, 1959. Fallacy: The Counterfeit of Argument. Englewood Cliffs, New Jersey, Prentice-Hall.
  • Fischer, David Hackett, 1970. Historians’ Fallacies: Toward a Logic of Historical Thought. New York, Harper & Row.
    • This book contains additional fallacies to those in this article, but they are much less common, and many have obscure names.
  • Groarke, Leo and C. Tindale, 2003. Good Reasoning Matters! 3rd edition, Toronto, Oxford University Press.
  • Hamblin, Charles L., 1970. Fallacies. London, Methuen.
  • Hansen, Hans V. and R. C. Pinto, 1995. Fallacies: Classical and Contemporary Readings. University Park, Pennsylvania State University Press.
  • Huff, Darrell, 1954. How to Lie with Statistics. New York, W. W. Norton.
  • Levi, D. S., 1994. “Begging What is at Issue in the Argument,” Argumentation, 8, 265-282.
  • Schwartz, Thomas, 1981. “Logic as a Liberal Art,” Teaching Philosophy 4, 231-247.
  • Walton, Douglas N., 1989. Informal Logic: A Handbook for Critical Argumentation. Cambridge, Cambridge University Press.
  • Walton, Douglas N., 1995. A Pragmatic Theory of Fallacy. Tuscaloosa, University of Alabama Press.
  • Walton, Douglas N., 1997. Appeal to Expert Opinion: Arguments from Authority. University Park, Pennsylvania State University Press.
  • Whately, Richard, 1836. Elements of Logic. New York, Jackson.
  • Woods, John and D. N. Walton, 1989. Fallacies: Selected Papers 1972-1982. Dordrecht, Holland, Foris.

Research on the fallacies of informal logic is regularly published in the following journals: Argumentation, Argumentation and Advocacy, Informal Logic, Philosophy and Rhetoric, and Teaching Philosophy.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

The Ethics and Epistemology of Trust

Trust is a topic of long-standing philosophical interest because it is indispensable to the success of almost every kind of coordinated human activity, from politics and business to sport and scientific research. Even more, trust is necessary for the successful dissemination of knowledge, and, by extension, for nearly any form of practical deliberation and planning that requires us to make use of more information than we are able to gather individually and verify ourselves. In short, without trust, we could achieve few of our goals and would know very little. Despite trust’s fundamental importance in human life, there is substantial philosophical disagreement about what trust is, and further, how trusting is normatively constrained and best theorized about in relation to other things we value. Consequently, contemporary philosophical literature on trust features a range of different theoretical options for making sense of trust, and these options differ in how they (among other things) take trust to relate to such things as reliance, optimism, belief, obligations, monitoring, expectations, competence, trustworthiness, assurance, and doubt. With the aim of exploring these myriad issues in an organized way, this article is divided into three sections, each of which offers an overview of key (and sometimes interconnected) ethical and epistemological themes in the philosophy of trust: (1) The Nature of Trust; (2) The Normativity of Trust; and (3) The Value of Trust.
Table of Contents

  1. The Nature of Trust
    a. Reliance vs. Interpersonal Trust
    b. Two-place vs. Three-place Trust
    c. Trust and Belief: Doxastic, Non-doxastic and Performance-theoretic Accounts
    d. Deception Detection and Monitoring
  2. The Normativity of Trust
    a. Entitlement to Trust
    b. Trust in Words
    c. Obligation to Trust
    d. Trustworthiness
  3. The Value of Trust
  4. References and Further Reading

1. The Nature of Trust

What is trust? To a first approximation, trust is an attitude or a hybrid of attitudes (for instance, optimism, hope, belief, and so forth) toward a trustee, one that involves some (non-negligible) vulnerability to being betrayed on the truster’s side. This general remark, of course, does not take us very far. For example, we may ask: what kind of attitude (or hybrid of attitudes) is trust exactly? Suppose that (as some philosophers of trust maintain) trust requires an attitude of optimism. Even if that is right, getting a grip on trust requires a further conception of what the truster, qua truster, must be optimistic about. One standard answer proceeds as follows: trust (at least, in the paradigmatic case of interpersonal trust) involves some form of optimism that the trustee will take care of things as we have entrusted them. In the special case of trusting the testimony of another—a topic at the centre of the epistemology of trust—this will involve at least some form of optimism that the speaker is living up to the expectations placed on her as a testifier; for instance, that the speaker knows what she says or, more weakly, is telling the truth.

Even at this level of specificity, though, the nature of trust remains fairly elusive. Does trusting involve (for example) merely optimism that the trustee will take care of things as entrusted, or does it also involve optimism that the trustee will do so compresently (that is, concurrently) with certain beliefs, non-doxastic attitudes, emotions or motivations on the part of the trustee, such as goodwill (Baier 1986; Jones 1996)? Moreover, and apart from such positive characterizations of trust, does trust also have a negative condition to the effect that one fails to genuinely trust another if one—past some threshold of vigilance—monitors the trustee (or otherwise reflects critically on the trust relationship so as to attempt to minimize risk)?

These are among the questions that occupy philosophers working on the nature of trust. This section explores four subthemes aimed at clarifying trust’s nature: these concern (a) the distinction between trust and reliance; (b) two-place vs. three-place trust; (c) doxastic vs. non-doxastic conditions on trust; and (d) deception detection and monitoring.

a. Reliance vs. Interpersonal Trust

Reliance is ubiquitous. You rely on the weather not to suddenly drop by 20 degrees, leaving you shivering; you rely on your chair not to give out, causing you to tumble to the floor. In these cases, are you trusting the weather and trusting your chair, respectively? Many philosophers working on trust believe the correct answer here is “no”. This is so even though, in each case, you are depending on these things in a way that leaves you potentially vulnerable.

The idea that trust is a kind of dependence that does not reduce to mere reliance (of the sort that might be apposite to things like chairs and the weather) is widely accepted. According to Annette Baier (1986: 244), the crux of the difference is that trust involves relying on another not just to take care of things any old way (for instance, out of fear, begrudgingly, accidentally, and so forth) but to do so out of goodwill toward the truster; relatedly, a salient kind of vulnerability one subjects oneself to in trusting is vulnerability to the limits of that goodwill. On this way of thinking, then, you are not trusting someone if you (for instance) rely on that person to act in a characteristically self-centred way, even if you depend on them to do so, and even if you fully expect them to do so.

Katherine Hawley (2014, 2019) rejects the idea that what distinguishes trust from mere reliance has anything to do with the trustee’s motives or goodwill. Instead, on her account, the crucial difference is that in cases of trust, but not of mere reliance, a commitment on the part of the trustee must be in place. Consider a situation in which you reliably bring too much lunch to work, because you are a bad judge of quantities, and I get to eat your leftovers. My attitude to you in this situation is one of reliance, but not trust; in Hawley’s view, that is because you have made no commitment to provide me with lunch:

However, if we adapt the case so as to suggest commitment, it starts to look more like a matter of trust. Suppose we enjoy eating together regularly, you describe your plans for the next day, I say how much I’m looking forward to it, and so on. To the extent that this involves a commitment on your part, it seems reasonable for me to feel betrayed and expect apologies if one day you fail to bring lunch and I go hungry (Hawley 2014: 10).

If it is right that trust differs in important ways from mere reliance, then a consequence is that while reliance is something we can have toward people (when we merely depend on them) as well as toward objects (for instance, when we depend on the weather and chairs), not just anything can be genuinely trusted. Karen Jones (1996) captures this point, one that circumscribes people as the fitting objects of genuine trust, as follows:

One can only trust things that have wills, since only things with wills can have goodwills—although having a will is to be given a generous interpretation so as to include, for example, firms and government bodies. Machinery can be relied on, but only agents, natural or artificial, can be trusted (1996: 14).

If, as the foregoing suggests, trust relationships are best understood as a special subset of reliance relationships, should we also expect the appropriate attitudes toward misplaced trust to be a subset of a more general attitude-type we might have in response to misplaced reliance?

Katherine Hawley (2014) thinks so. As she puts it, misplaced trust warrants a feeling of betrayal, but the same is not so for misplaced (mere) reliance. Suppose, to draw from an example she offers (2014: 2), that a shelf you rely on to support a vase gives out; it would be inappropriate, Hawley maintains, to feel betrayed, even if a more general attitude of (mere) disappointment befits such misplaced reliance. Misplaced trust, by contrast, does warrant a feeling of betrayal.

In contrast with the above line of thinking, on which disanalogies between trust and mere reliance are taken to support distinguishing the two, some philosophers have taken a more permissive approach to trust, distinguishing between two senses of trust that differ in how closely each resembles mere reliance.

Paul Faulkner (2011: 246; compare McMyler 2011), for example, distinguishes between what he calls predictive and affective trust. Predictive trust involves mere reliance in conjunction with a belief that the trustee will take care of things (namely, a prediction). Misplaced predictions warrant disappointment, not betrayal, and so predictive trust (like mere reliance) cannot be betrayed. Affective trust, by contrast, is a thick, interpersonal normative notion, and, according to Faulkner, it involves, along with reliance, a kind of normative expectation to the effect that the trustee (i) ought to prove dependable; and that they (ii) will prove dependable for that reason. On this view, it is affective trust that is uniquely subject to betrayal; predictive trust, though a genuine variety of trust, is not.

b. Two-place vs. Three-place Trust

The distinction between two-place and three-place trust, first drawn by Horsburgh (1960), is meant to capture a simple idea: sometimes when we trust someone, we trust them to do some particular thing (see also Holton 1994; Hardin 1992). For example, you might trust your neighbour to water your plant while you are away on holiday but not to look after your daughter. This is three-place trust, with an infinitival component (schematically: A trusts B to X). Not all trusting fits this schema. You might also simply trust your neighbour generally (schematically: A trusts B), in a way that does not involve any particular task. Three- and two-place trust are thus different in the sense that the object of trust is specified in the former case but not in the latter.

While there is nothing philosophically contentious about drawing this distinction, the relationship between two- and three-place trust becomes contested when one of these kinds of trust is taken to be, in some sense, more fundamental than the other. To be clear, it is uncontentious that philosophers, as Faulkner (2015: 242) notes, tend to “focus” on three-place trust. What is contentious is whether any—and if so, which—of these notions is theoretically more basic.

The overwhelming view in the literature maintains that three-place trust is the fundamental notion and that two-place (as well as one-place) trust is derivative upon it (Baier 1986; Holton 1994; Jones 1996; Faulkner 2007; Hieronymi 2008; Hawley 2014; compare Faulkner 2015). This view can be called three-place fundamentalism.

According to Baier, for instance, trust is centrally concerned with “one person trusting another with some valued thing” (1986: 236), and for Hawley, trust is “primarily a three-place relation, involving two people and a task” (2014: 2). We might think of two-place trust (X trusts Y) as derived from three-place trust (X trusts Y to phi) in a way that is broadly analogous to how one might extract a diachronic view of someone from discrete interactions, as opposed to starting with any such diachronic view. On this way of thinking, three-place trust leads over time to two-place trust, which is established on its basis.

Resistance to three-place fundamentalism has been advanced by Faulkner (2015) and Domenicucci and Holton (2017). Faulkner takes as a starting point Baier’s desideratum that any plausible account of trust should accommodate infant trust, and thus “that it not make essential to trusting the use of concepts or abilities which a child cannot be reasonably believed to possess” (Baier 1986: 244). As Faulkner (2015: 5) maintains, however, an infant, in trusting its mother, “need not have any further thought; the trust is no more than a confidence or faith – a trust, as we say – in his mother”. And so, Faulkner reasons, if we take Baier’s constraint seriously, then we “have to take two-place trust as basic rather than three-place trust.”

A second strand of arguments against three-place fundamentalism is owed to Domenicucci and Holton (2017). According to them, the kind of derivation of two-place trust from three-place trust that is put forward by three-place fundamentalists is implausible for other similar kinds of attitudes like love and friendship:

No one—or at least, hardly anyone—thinks that we should understand what it is for Antony to love Cleopatra in terms of the three place relation ‘Antony loves Cleopatra for her __’, or in terms of any other three-place relation. Likewise hardly anyone thinks that we should understand the two place relation of friendship in terms of some underlying three-place relation […]. To this extent at least, we suggest that trust might be like love and friendship (2017: 149-50).

In response to this kind of argument by association, a proponent of three-place fundamentalism might either deny that these three- to two-place derivations are really problematic in the case of love or friendship, or instead grant that they are and maintain that trust is disanalogous.

In order to get a better sense of whether two-place trust might be unproblematically derived from three-place trust, regardless of whether the same holds mutatis mutandis for love and friendship, it will be helpful to look at a concrete attempt to do so. For example, according to Hawley (2014), three-place trust should be analyzed as: X relies on Y to phi because X believes Y has a commitment to phi. Two-place trust is then defined simply as “reliance on someone to fulfil whatever commitments she may have” (2014: 16). If something like Hawley’s reduction is unproblematic, then, as one line of response might go, this trumps whatever concerns one might have about the prospects of making analogous moves in the love and friendship cases.

c. Trust and Belief: Doxastic, Non-doxastic and Performance-theoretic Accounts

Where does belief fit into an account of trust? In particular, what beliefs (if any) must a truster have about whether the trustee will prove trustworthy? Proponents of doxastic accounts (Adler 1994; Hieronymi 2008; Keren 2014; McMyler 2011) hold that trust involves a belief on the part of the truster. On the simplest, most straightforward incarnation of this view, when A trusts B to do X, A believes that B will do X. Other theorists propose more sophisticated belief-based accounts: on Hawley’s (2019) account, for instance, to trust someone to do something is to believe that she has a commitment to doing it, and to rely upon her to meet that commitment. Conversely, to distrust someone to do something is to believe that she has a commitment to doing it, and yet not rely upon her to meet that commitment.

Non-doxastic accounts (Jones 1996; McLeod 2002; Faulkner 2007, 2011; Baker 1987) have a negative and a positive thesis. The negative thesis is just the denial of the belief requirement on trust that proponents of doxastic accounts accept (namely, a denial that trusting someone to do something entails the corresponding belief that they will do that thing). This negative thesis, to note, is perfectly compatible with the idea that trust oftentimes involves such a belief. What is maintained is that it is not essential. The positive thesis embraced by non-doxastic accounts involves a characterization of some further non-doxastic attitude the truster, qua truster, must have with respect to the trustee’s proving trustworthy.

An example of such a further (non-doxastic) attitude, on non-doxastic accounts, is optimism. For example, on Jones’ (1996) view, you trust your neighbour to bring back the garden tools you loaned her only if you are optimistic that she will bring them back, and regardless of whether you believe she will. It should be pointed out that oftentimes, optimism will lead to the acquisition of a corresponding belief. Importantly for Jones, the kind of optimism that characterizes trust is not itself to be explained in terms of belief but rather in terms of affective attitudes entirely. Such a commitment is more generally shared by non-doxastic views which take trust to involve affective attitudes that might be apt to prompt corresponding beliefs.

Quite a few important debates about trust turn on the matter of whether a doxastic account or a non-doxastic account is correct. For example, discussions of the rationality of trust will look one way if trust essentially involves belief and another way if it does not (Jones 1996; Keren 2014). Relatedly, what one says about trust and belief will bear importantly on how one thinks about the relationship between trust and monitoring, as well as the distinction between paradigmatic trust and therapeutic trust (the kind of trust one engages in in order to build trustworthiness; see Horsburgh 1960; Hieronymi 2008; Frost-Arnold 2014).

A notable advantage of the doxastic account is that it simplifies the epistemology of trust—and in particular, how trust can provide reasons for belief. Suppose, for instance, that the doxastic account is correct, and so your trusting your colleague’s word that they will return your laptop involves believing that they will return your laptop. This belief, some think, conjoined with the fact that your colleague tells you they will return your laptop, gives you a reason to believe that they will return your laptop. As Faulkner (2017: 113) puts it, on the doxastic account, “[t]rust gives a reason for belief because belief can provide reason for belief”. Non-doxastic accounts, by contrast, require further explanation as to why trusting someone would ever give you a reason to believe what they say.

Another advantage of doxastic accounts is that they are well-positioned to distinguish trusting someone to do something from mere optimistic wishing. Suppose, for instance, you loan £100 to a loved one with a terrible track record for repaying debts. Such a person may have lost your trust years ago, and yet you may remain optimistic and wishful that they will be trustworthy on this occasion. What distinguishes this attitude from genuine trust, on the doxastic account, is simply that you lack any belief that your loved one will prove trustworthy. Explaining this difference is more difficult on non-doxastic accounts. This is especially the case on non-doxastic accounts according to which trust not only does not involve belief but positively precludes it, by essentially involving a kind of “leap of faith” (Möllering 2006) that differs in important ways from belief.

Nonetheless, non-doxastic accounts have been emboldened in light of various serious objections that have been raised to doxastic accounts. One often-raised objection of this kind highlights a key disanalogy with respect to how trust and belief interact with evidence, respectively (Faulkner 2007):

[Trust] need not be based on evidence and can demonstrate a wilful insensitivity to the evidence. Indeed there is a tension between acting on trust and acting on evidence that is illustrated in the idea that one does not actually trust someone to do something if one only believes they will do it when one has evidence that they will (2007: 876).

As Baker (1987) unpacks this idea, trusting can require ignoring counterevidence—as one might do when one trusts a friend despite evidence of guilt—whereas believing does not.

A particular type of example that puts pressure on doxastic accounts’ ability to accommodate disanalogies with belief concerns therapeutic trust. In cases of therapeutic trust, the purpose of trusting is to promote trustworthiness, and trusting is thereby not predicated on a prior belief in trustworthiness. Take a case in which one trusts a teenager with an important task in the hope that being trusted will lead them to become more trustworthy in the future. In this kind of case, we are plausibly trusting, but not on the basis of prior evidence or belief that the trustee will succeed on this occasion. To the contrary: we trust with the aim of establishing trustworthiness (Frost-Arnold 2014; Faulkner 2011). To the extent that this description of the case is right, therapeutic trust offers a counterexample to the doxastic account, as it involves trust in the absence of belief.

A third kind of account—the performance-theoretic account of trust (Carter 2020a, 2020c)—makes no essential commitment as to whether trusting involves belief. On the performance-theoretic account, what is essential to the attitude of trusting is how it is normatively constrained. An attitude is a trust attitude (toward a trustee, T, and a task, X) just in case the attitude is successful if and only if T takes care of X as entrusted. Just as there is a sense in which, for example, your archery shot is not successful if it misses the target (see, for example, Sosa 2010a, 2015; Carter 2020b), your trusting someone to keep a secret misses its mark, and so fails to be successful trust, if the trustee spills the beans. With reference to this criterion of successful (and unsuccessful) trust, the performance-theoretic account aims to explain what good and bad trusting involves (see §2.a), and also why some varieties of trust are more valuable than others (see §3).

d. Deception Detection and Monitoring

Given that trusting inherently involves the incurring of some level of risk to the truster, it is natural to think that trust would in some way be improved by the truster doing what she can to minimize such risk, for example, by monitoring the trustee with an eye to pre-empting any potential betrayal or at least mitigating the expected disvalue of potential betrayal.

This prima facie plausible suggestion, however, raises some perplexities. As Annette Baier (1986) puts it: “Trust is a fragile plant […] which may not endure inspection of its roots, even when they were, before inspection, quite healthy” (1986: 260). There is something intuitive about this point. If, for instance, A trusts B to drive the car straight home after work but then proceeds to surreptitiously drive behind B the entire way in order to make sure that B really does drive straight home, it seems that A, in doing so, is no longer trusting B. The trust, it seems, dissolves through the process of such monitoring.

Extrapolating from such cases, it seems that trust inherently involves not only subjecting oneself to some risk, but also remaining subject to such risk—or, at least, behaving in ways that are compatible with viewing oneself as remaining subject to such risk.

The above idea, of course, needs sharpening. For example, trusting is plausibly not destroyed by negligible monitoring. The crux of the idea seems to be, as Faulkner (2011, §5) puts it, that “too much reflection” on the trust relation, perhaps in conjunction with attempts to minimize the risk that trust will be betrayed, can undermine trust. Specifying what “too much reflection” or monitoring involves, however, and how reflecting relates to monitoring in the first place, remains a difficult task.

One form of monitoring—construed loosely—that is plausibly compatible with trusting is contingency planning (Carter 2020c). For example, suppose you trust your teenager to drive your car to work and back in order that they may undertake a summer job. A prudent mitigation against the additional risk incurred (for instance, that the car will be wrecked in the process) will be to buy some additional insurance upon entrusting the teenager with the car. The purchasing of this insurance, however, does not itself undermine the trusting relationship, even though it involves a kind of risk mitigating behaviour.

One explanation here turns on the distinction between (i) mitigating against the risk that trust will be betrayed; and (ii) mitigating against the extent or severity of the harm or damage incurred if trust is betrayed. Contingency planning involves type-(ii) mitigation, whereas, for example, trailing behind the teenager with your own car, which is plausibly incompatible with trusting, is of type-(i).

2. The Normativity of Trust

Norms of trust arise between the two parties of reciprocal trust: a norm to be trusting in response to the invitation to trust, and to be trustworthy in response to the other’s trusting reliance (Fricker 2018). The former normativity lies “on the truster’s side”, and the latter on the trustee’s side. In this section, we discuss norms on trusting by looking at these two kinds of norms—that govern the truster and the trustee, respectively—in turn.

This section first discusses general norms on trusting on the truster’s side, and then engages—in some detail—with the complex issue of the norms governing trust in another’s words specifically. Second, it discusses the normativity of trust on the trustee’s side and the nature of trustworthiness.

a. Entitlement to Trust

If—as doxastic accounts maintain—trust is a species of belief (Hieronymi 2008), then the rational norms governing trust are the norms governing belief, such that (for example) it will be irrational to trust someone whom you have strong evidence to be unreliable, and the norm violation here is the same kind of norm violation in play in a case where one simply believes, against the evidence, that an individual is trustworthy. Thus: to the extent that one is rationally entitled to believe the trustee is trustworthy with respect to F, one thereby has an entitlement (on these kinds of views) to trust the trustee to F.

The norms that govern trust on the truster’s side will look different on non-doxastic accounts. For example, on a proposal like Frost-Arnold’s (2014), according to which trust is characterized as a kind of non-doxastic acceptance rather than as belief, the rationality governing trusting will be the rationality of acceptance, where rational acceptance can in principle come apart from rational belief. For one thing, whereas the rationality of belief is exclusively epistemically constrained, the rationality of acceptance need not be. In cases of therapeutic trust, for example, it might be practically rational (namely, rational with reference to the adopted end of building a trusting relationship) to accept that the trustee will F, and thus, to use the proposition that they will F as a premise in practical deliberation (see Bratman 1992; Cohen 1989)—that is, to act as if it is true that they will F. Of course, acting as if a proposition is true neither implies nor is implied by believing that it is true.

On performance-theoretic accounts, trusting is subject, on the truster’s side, to three kinds of evaluative norms, which correspond with three kinds of positive evaluative assessments: success, competence, and aptness. Whereas trusting is successful if and only if the trustee takes care of things as entrusted, trusting is competent if and only if one’s trusting issues from a reliable disposition—namely, a competence—to trust successfully when appropriately situated (for discussion, see Carter 2020a).

Just as successful trust might be incompetent, as when one trusts someone with a well-known track record of unreliability who happens to prove trustworthy on this particular occasion, likewise, trust might fail to be successful despite being competent, as when one trusts an ordinarily reliable individual who, due to fluke luck, fails to take care of things as entrusted on this particular occasion. Even if trust is both successful and competent, however, there remains a sense in which it could fall short of the third kind of evaluative standard—namely, aptness. Aptness demands success because of competence, and not merely success and competence (see Sosa 2010a, 2015; Carter 2020a, 2020b). Trust is apt, accordingly, if and only if one trusts successfully in such a way that the successful trust manifests one’s trusting competence.

b. Trust in Words

Why not lie? (Or, more generally, why not promise to take care of things, and then renege on that promise whenever it is convenient to do so?) According to a fairly popular answer (Faulkner 2011; Simion 2020b), deception is bad not only for the deceived but also for the deceiver (see also Kant). If one cultivates a reputation for being untrustworthy, then this comes with practical costs in one’s community; the untrustworthy person, recognized as such, is outcast, and de facto forgoes the (otherwise possible) social benefits of trust.

Things are more complicated, however, in one-off trust-exchanges—where the risk of the disvalue of cultivating an untrustworthy reputation is minimal. The question can be reposed within the one-off context: why not lie and deceive, when it is convenient to do so, in one-off exchanges? In one-off interactions where we (i) do not know others’ motivations but (ii) do appreciate that there is a general motivation to be unreliable (for example, to reap gains of betrayal), it is surprising that we find as much trustworthy behaviour as we do. Why do people not betray to a greater extent than they do in such circumstances, given that betrayal seems prima facie the most rational decision-theoretic move?

According to Faulkner, when we communicate with another as to the facts, we face a situation akin to a prisoner’s dilemma (2011: 6). In a prisoner’s dilemma, our aggregate well-being will be maximized if we both cooperate. However, given the logic of the situation, it looks like the rational thing to do for each of us is to defect. We are then faced with a problem: how to ensure the cooperative outcome?
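The structure Faulkner has in mind can be made explicit with a standard payoff table. The numbers below are the usual illustrative conventions for a prisoner’s dilemma, not figures from Faulkner’s text; what matters is only the ordering of the payoffs:

```python
# Illustrative prisoner's dilemma payoffs (higher is better).
# Format: payoffs[(my_move, your_move)] = (my_payoff, your_payoff)
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

# Defection strictly dominates: it pays more whatever the other party does.
for your_move in ("cooperate", "defect"):
    coop = payoffs[("cooperate", your_move)][0]
    defect = payoffs[("defect", your_move)][0]
    print(f"If you {your_move}: I get {coop} by cooperating, {defect} by defecting")
# Yet (defect, defect) yields (1, 1), worse for both than mutual cooperation's (3, 3).
```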

Similarly, Faulkner argues, speakers and audiences have different interests in communication. The audience is interested in learning the truth. In contrast, engaging in conversation is to the advantage of speakers because it is a means of influencing others: through an audience’s acceptance of what we say, we can get an audience to think, feel, and act in specific ways. So, according to Faulkner, our interest, qua speakers, is in being believed, because we have a more basic interest in influencing others. A commitment to telling the truth would not be best for the speaker. The best outcome for a speaker would be to receive an audience’s trust and yet retain the liberty to tell the truth or not (2011: 5-6).

There are four main reactions to this problem in the literature on the epistemology of testimony. According to Reductionism (Adler 1994; Audi 1997, 2004, 2006; Faulkner 2011; Fricker 1994, 1995, 2017, 2018; Hume 1739; Lipton 1998; Lyons 1997), in virtue of this lack of alignment between hearer and speaker interests, a hearer needs positive, independent reasons to trust a speaker: since communication is like a prisoner’s dilemma, the hearer needs a reason for thinking or presuming that the speaker has chosen the cooperative, helpful outcome. Anti-Reductionism (Burge 1993, 1997; Coady 1973, 1992; Goldberg 2006, 2010; Goldman 1999; Graham 2010, 2012a, 2015; Greco 2015, 2019; Green 2008; Reid 1764; Simion 2020b; Simion and Kelp 2018) rejects this claim. According to these philosophers, we have a default (absent defeaters) entitlement to believe what we are being told. In turn, this default entitlement is derivable on a priori grounds from the nature of reason (Burge 1993, 1997), sourced in social norms of truth-telling (Graham 2012b), social roles (Greco 2015), the reliance on other people’s justification-conferring processes (Goldberg 2010), or the knowledge norm of assertion (Simion 2020b). Besides these two main views, we also encounter hybrid views (Lackey 2003, 2008; Pritchard 2004) that try to impose weaker conditions on testimonial justification than Reductionism while not being as liberal about it as Anti-Reductionism. Last but not least, a fourth reaction to Faulkner’s problem of cooperation for testimonial exchanges is scepticism (Graham 2012a; Simion 2020b); on this view, the problem does not get off the ground to begin with.

According to Faulkner himself, trust lies at the heart of the solution to his problem of cooperation; that is, it gives speakers reasons to tell the truth (2011, Ch. 1; 2017). Faulkner thinks that the problem is resolved “once one recognizes how trust itself can give reasons for cooperating” (2017: 9). When the hearer H believes that the speaker S can see that H is relying on S for information about whether p, and in addition H trusts S for that information, then H will make a number of presumptions:

  1. H believes that S recognizes H’s trusting dependence on S proving informative.
  2. H presumes that if S recognizes H’s trusting dependence, then S will recognize that H normatively expects S to prove informative.
  3. H presumes that if S recognizes H’s expectation that S should prove informative, then, other things being equal, S will prove informative for this reason.
  4. Hence, taking the attitude of trust involves presuming that the trusted will prove trustworthy (2011: 130).

The hearer’s presumption that the speaker will prove informative rationalizes the hearer’s uptake of the speaker’s testimony.

Furthermore, Faulkner claims, H’s trust gives S “a reason to be trustworthy”, such that S is, as a result, more likely to be trustworthy: it raises the objective probability that S will prove informative in utterance. In this fashion, “acts of trust can create as well as sustain trusting relations” (2011: 156-7). As Graham (2012a) puts it, “the hearer’s trust—the hearer’s normative expectation, which rationalizes uptake—then ‘engages,’ so to speak, the speaker’s internalization of the norm, which thereby motivates the speaker to choose the informative outcome.” Speakers who have internalized these norms will then often enough choose the informative outcome when they see that audiences need information; they will be “motivated to conform” because they have “internalized the norm” and so “intrinsically value” compliance (2011: 186). As such, the de facto reliability of testimony is explained by the fact that the trust placed in hearers by the speakers triggers, on the speakers’ side, the internalization of social norms of trust, which, in turn, makes speakers objectively likely to put hearers’ informational interests before their own.

According to Peter Graham (2012a), however, Faulkner’s own solution threatens to dissolve the problem of cooperation rather than solve it. Recall how the problem was set up: the thought was that speakers only care about being believed, whether they are speaking the truth or not, which is why the hearer needs some reason for thinking the speaker is telling them the truth. But if speakers have internalized social norms of trustworthiness, it is not true that speakers are just as apt to prove uninformative as informative. It is not true that they are only interested in being believed. Rather, they are out to inform, to prove helpful; due to having internalized the relevant trustworthiness norms, speakers are committed to informative outcomes (Graham 2012a).

Another version of scepticism about the problem of cooperation is voiced in Simion’s “Testimonial Contractarianism” (2020b). Recall that, according to Faulkner, in testimonial exchanges, the default position for speakers involves no commitment to telling the truth. If that is the case, he argues, the default position for hearers involves no entitlement to believe. Here is the argument unpacked:

(P1) Hearers are interested in truth; speakers are interested in being believed.

(P2) The default position for speakers is seeing to their own interests rather than to the interests of the hearers.

(P3) Therefore, it is not the case that the default position for speakers is telling the truth (from 1 and 2).

(P4) The default position for hearers is trust only if the default position for speakers is telling the truth.

(C) Therefore, it is not the case that the default position for hearers is trust (from 3 and 4).

There is one important worry for this argument: on the reconstruction above, the conclusion does not follow. In particular, the problem is with premise (P3), which is not supported by (P1) and (P2) (Simion 2020b). That is because being interested in being believed does not exclude also being interested in telling the truth. Speakers might still—by default—also be interested in telling the truth on independent grounds, that is, independently of their concern (or, rather, lack thereof) with hearers’ interests; indeed, the sources of entitlement proposed by the Anti-Reductionist—for instance, the existence of social norms of truth-telling, the knowledge norm of assertion, and so forth—may well constitute reasons for the speaker to tell the truth, absent overriding incentive to do otherwise. If that is the case, telling the truth will be the default for speakers, and therefore trust will be the default for hearers. What the defender of the Problem of Cooperation needs for validity, then, is to replace (P1) with the stronger (P1*): hearers are interested in truth; speakers are only interested in being believed. However, it is not clear that (P1*) spells out the correct utility profile of the case: are all speakers really such that they only care about being believed? This seems like a fairly heavy empirical assumption that is in need of further defence.

c. Obligation to Trust

A final normative species that merits discussion on the truster’s side is the obligation to trust. Obligations to trust can be generated, trivially, by promise-making (compare Owens 2017) or by other kinds of cooperative agreements (Faulkner 2011, Ch. 1). Of more philosophical interest are cases where obligations to trust are generated without explicit agreements.

One case of particular interest here arises in the literature on testimonial injustice, pioneered by Miranda Fricker (2007). Put roughly, testimonial injustice occurs when a speaker receives an unfair deficit of credibility from a hearer due to prejudice on the hearer’s part, resulting in the speaker’s being prevented from sharing what she knows.

An example of testimonial injustice that Fricker uses as a reference point is from Harper Lee’s To Kill a Mockingbird, in which Tom Robinson, a black man on trial after being falsely accused of raping a white woman, has his testimony dismissed due to prejudiced preconceptions on the part of the jury, owing to deep-seated racial stereotypes. In this case, the jury makes a deflated credibility judgement of Robinson, and as a result, he is unable to convey to them the knowledge that he has of the true events which occurred.

On one way of thinking about norms of trusting on the truster’s side, the members of the jury have mere entitlements to trust Robinson’s testimony though no obligation to do so; thus, their distrust of Robinson is not norm-violating. This gloss of the situation, on Fricker’s view, is incomplete; it fails to take into account the sense in which Robinson is wronged in his capacity as a knower as a result of this distrust. An appreciation of this wrong, according to Fricker, should lead us to think of the relevant norm on the hearer’s side as generating an obligation rather than a mere permission to believe; as such, on this view, distrust that arises from affording a speaker a prejudiced credibility deficit is not merely an instance of foregoing trusting when one is entitled to trust, but failing to trust when one should. For additional work discussing the relationship between trust and testimonial injustice see, for example, Origgi (2012); Medina (2011); Wanderer (2017); Carter and Meehan (2020).

Fricker’s ground-breaking proposal concerns cases when one is harmed in their capacity as a knower via distrust sourced in prejudice. That being said, several philosophers believe that the phenomenon generalizes beyond cases of distinctively prejudicial distrust; that is, that it lies in the nature and normativity of telling that we have a defeasible obligation to trust testifiers, and that failure to do so is impermissible, whether it is sourced in prejudice or not. Indeed, G. E. M. Anscombe (1979) and J. L. Austin (1946) famously believed that you can insult someone by refusing their testimony.

We can distinguish between three general accounts of what it is that hearers owe to speakers and why: presumption-based accounts, purport-based accounts, and function-based accounts. The key questions for all accounts are whether they successfully deliver an obligation to trust, what rationale they provide, and whether their rationale is ultimately satisfactory.

While there are differences in the details, the core idea behind presumption-based views (Gibbard 1990; Hinchman 2005; Moran 2006; Ridge 2014) is that when a speaker S tells a hearer H that p, say, S incurs certain responsibilities for the truth of p. Crucially, H, in virtue of recognising what S is doing, thereby acquires a reason for presuming S to be trustworthy in their assertion that p. But since anyone who is to be presumed trustworthy in asserting that p ought to be believed, we get exactly what we were looking for: an obligation to trust speakers alongside an answer to the rationale question.

Of course, the question remains whether the rationale provided is ultimately convincing. Sandy Goldberg (2020) argues that the answer is no. To see what he takes to be the most important reason for this, one should first look at a distinction Goldberg introduces between a practical entitlement to hold someone responsible and an epistemic entitlement to believe that they are responsible. Crucially, one can have the former without the latter. For instance, if your teenager tells you that they will be home by midnight and they are not, you will have a practical entitlement to hold them responsible even if you do not have an epistemic entitlement to believe that they are responsible. Importantly, to establish a presumption of trustworthiness, you need to make a case for an epistemic entitlement to believe. According to Goldberg, however, presumption-based accounts only deliver an entitlement to hold speakers responsible for their assertions, not an entitlement to believe that they are responsible. That is to say, when S tells H that p and thereby incurs certain responsibilities for the truth of p and when H recognises that this is what S is doing, H comes by an entitlement to hold S responsible for the truth of p. Crucially, to get to the presumption of trustworthiness we need more than this, as the case of the teenager clearly indicates. But presumption-based accounts do not offer more (Goldberg 2020, Ch. 4).

Another problem for these views is sourced in the fact that extant presumption-based accounts are distinctively personal: all share the idea that in telling an addressee that p, speakers perform a further operation on them, and that it is this further operation that generates the obligation on the addressee’s side. In virtue of this, presumption-based accounts deliver too limited a presumption of trustworthiness. To see this, we should go back to Fricker’s cases of epistemic injustice: it looks as though not believing what a testifier says, in virtue of prejudice, is equally bad whether one is the addressee of the instance of telling in question or merely overhears it (Goldberg 2020).

Goldberg’s own alternative proposal is purport-based: according to him, assertion has a job description, which is to present a content as true in such a way that, were the audience to accept it on the basis of accepting the speaker’s speech contribution, the resulting belief would be a candidate for knowledge (Goldberg 2020, Ch. 5). Since assertion has this job description, when speakers make assertions, they purport to achieve exactly what the job description says. Moreover, it is common knowledge that this is what speakers purport to do. But since assertion will achieve its job description only if the speaker meets certain epistemic standards and since this is also common knowledge, the audience will recognise that the performed speech act achieves its aim only if the relevant epistemic standards are met. Finally, this exerts normative pressure on hearers. To be more precise, hearers owe it to speakers to recognize them as agents who purport to be in compliance with the epistemic standards at work and to treat them accordingly.

According to Goldberg, our obligation toward speakers is weaker than presumption-based accounts would have it: in the typical case of testimony, what we owe to the speakers is not to outright believe them, but rather to properly assess their speech act epistemically. The reason for this, Goldberg argues, is that we do not have access to their evidence, or their deliberations; given that this is so, the best we can do is to adjust our doxastic reaction to “a proper (epistemic) assessment of the speaker’s epistemic authority, since in doing so they are adjusting their doxastic reaction to a proper (epistemic) assessment of the act in which she conveyed having such authority” (Goldberg 2020, Ch. 5).

As a first observation, note that Goldberg’s purport-based account deals better with cases of testimonial injustice than presumption-based accounts. After all, since the normative pressure is generated by the fact that it is common knowledge that in asserting speakers represent themselves as meeting the relevant epistemic standards, the normative pressure is on anyone who happens to listen in on the conversation, not just on the direct addressees of the speech act.

With this point in play, let us return to Goldberg’s argument that there is no obligation to believe. According to Goldberg, this is because hearers do not have access to speakers’ reasons and deliberations. One question is why exactly this should matter. After all, one might argue, the fact that the speaker asserted that p provides hearers with sufficient reason to believe that p (absent defeat, of course). That the assertion does not also give hearers access to the speakers’ own reasons and deliberations does nothing to detract from this, unless one endorses dramatically strong versions of reductionism about testimony (which Goldberg himself would not want to endorse). If so, the fact that assertions do not afford hearers access to speakers’ reasons and deliberations provides little reason to believe that there is no obligation to believe on the part of the hearer (Kelp & Simion 2020a).

An alternative way to ground an obligation to trust testimony (Kelp & Simion 2020a) relies on the plausible idea that the speech act of assertion has the epistemic function of generating true belief (Graham 2010), or knowledge (Kelp 2018; Kelp & Simion 2020a; Simion 2020a). According to this view, belief-responses on behalf of hearers contribute to the explanation of the continued existence of the practice of asserting: were hearers to stop believing what they are being told, speakers would lose the incentive to assert, and the practice would soon disappear. Since this is so, and since hearers are plausibly criticism-averse, it makes sense to have a norm that imposes an obligation on hearers to believe what they are being told (absent defeat). In this way, in virtue of their criticism-aversion, hearers will reliably obey the norm—that is, will reliably form the corresponding beliefs—which, in turn, will keep the practice of assertion going (Kelp & Simion 2020a, Ch. 6).

One potential worry for this view is that it does not deliver the “normative oomph” that we want from a satisfactory account of the hearer’s obligation to trust: think of paradigm cases of epistemic injustice again. The hearers in these cases seem to fail in substantive moral and epistemic ways. However, on the function-based view, their failure is restricted to breaking a norm internal to the practice of assertion. Since norms internal to practices need not deliver substantive oughts outside of the practice itself—think, for instance, of rules of games—the function-based view still owes us an account of the normative strength of the “ought to believe” that drops out of their picture.

d. Trustworthiness

As the previous sections of this article show, trust can be a two-place or a three-place relation. In the former case, it is a relation between a trustor and a trustee, as in “Ann trusts George”. Two-place trust seems to be a fairly demanding affair: when we say that Ann trusts George simpliciter, we seem to attribute a fairly robust attitude to Ann, one whereby she trusts him in (at least) several respects. In contrast, three-place trust is a less involved affair: when we say that Ann trusts George to do the dishes, we need not say much about their relationship otherwise.

This contrast is preserved when we switch from focusing on the trustor’s trust to the trustee’s trustworthiness. That is, one can be trustworthy simpliciter (corresponding to a two-place trust relation), but one can also be trustworthy with regard to a particular matter—that is, two-place trustworthiness (Jones 1996)—corresponding to three-place trust. For instance, a surgeon might well be extremely trustworthy when it comes to performing surgery well, but not in any other respects.

Some philosophers working on trustworthiness focus more on two-place trust. Since the two-place trust relation is intuitively the more robust one, they put forward accounts of trustworthiness that are generally quite demanding, in that they require the trustee not only to reliably make good on their commitments, but also to do so out of the right motive.

The classic account of this kind is Annette Baier’s (1986) goodwill-based account; in a similar vein, others combine reliance on goodwill with certain expectations (Jones 1996), including in one case a normative expectation of goodwill (Cogley 2012). According to this kind of view, the trustworthy person fulfils their commitments in virtue of their goodwill toward the trustor. This view, according to Baier, makes sense of the intuition that there is a difference between trustworthiness and mere reliability, one that corresponds to the difference between trust and mere reliance.

The most widespread worry about these accounts of trustworthiness is that they are too strong: we can trust other people without presuming that they have goodwill. Indeed, our everyday trust in strangers falls into this category. If so, the argument goes, this seems to suggest that whether or not people make good on their commitments out of goodwill is largely inconsequential: “[w]e are often content to trust without knowing much about the psychology of the one-trusted, supposing merely that they have psychological traits sufficient to get the job done” (Blackburn 1998).

Another worry for these accounts is that, while plausible as accounts of trustworthiness simpliciter, they give counterintuitive results in cases of two-place trustworthiness: whether George is trustworthy when it comes to washing the dishes seems to depend neither on his goodwill nor on other such noble motives. In this respect, the goodwill view is too strong.

Furthermore, it looks as though there is reason to believe the goodwill view is, at the same time, too weak. To see this, consider the case of a convicted felon and his mother: the two can have a goodwill-based relationship, and the felon can thus be trustworthy within the scope thereof, while not being someone whom we would describe as trustworthy (Potter 2002: 8).

If all of this is true, it begins to look as though the presence of goodwill is independent of the presence of trustworthiness. This observation motivates accounts of trustworthiness that rely on less highbrow motives underlying the trustee’s reliability. One such account is the social contract view of trustworthiness. According to this view, the motives underlying people’s making good on their commitments are sourced in social norms and the unfortunate consequences to one’s reputation and general wellbeing of breaking them (Hardin 2002: 53; see also O’Neill 2002; Dasgupta 2000). Self-interest determines trustworthiness on these accounts.

It is easy to see that social contract views do well in accounting for trustworthiness in three-place trust relations: on this view, George is trustworthy when it comes to washing the dishes because he makes good on his commitments in virtue of social norms that make it in his best interest to do so. The main worry for these views, however, is that they are too permissive, and thus have difficulties in distinguishing between trustworthiness proper and mere reliability. Relatedly, the worry goes, these views seem less well equipped to deal with trustworthiness simpliciter, that is, the kind of trustworthiness that corresponds to a two-place trust relation. For instance, on a social contract view, a sexist employer who treats female employees well only because he believes that he would face legal sanctions if he did not will come out as trustworthy (Potter 2002: 5). This is intuitively an unfortunate result.

One thought prompted by the case of the sexist employer is that trustworthiness is a character trait that virtuous people possess; after all, this seems to be something that the sexist employer is missing. On Nancy Potter’s view, trustworthiness is a disposition to respond to trust in appropriate ways, given “who one is in relation [to]” and given other virtues that one possesses or ought to possess (for example, justice, compassion) (2002: 25). According to Potter, a trustworthy person is “one who can be counted on, as a matter of the sort of person he or she is, to take care of those things that others entrust to one”.

When it comes to demandingness, the virtue-based view seems to lie somewhere in-between the goodwill view, on one hand, and the social contract view, on the other. It seems more permissive than the former in that it can account for the trustworthiness of strangers insofar as they display the virtue at stake. It seems more demanding than the latter in that it purports to account for the intuition that mere reliability is not enough for trustworthiness: rather, what is required is reliability sourced in good character.

An important criticism of virtue-based views comes from Jones (2012). According to her, trustworthiness does not fit the normative profile of virtue in the following way: if trustworthiness were a virtue, then being untrustworthy would be a vice. However, according to Jones, that cannot be right: after all, we are often required to be untrustworthy in one respect or another—for instance, because of conflicting normative constraints—but it cannot be that being vicious is ever required.

Another problem with Potter’s specific view is its apparent uninformativeness. First, defining the trustworthy person as “a person who can be counted on as a matter of the sort of person he or she is” threatens vicious circularity: after all, it defines the trustworthy as those who can be trusted. Relatedly, the account turns out to be too vague to give definite predictions in a series of cases. Take again the case of the sexist employer: why is it that he cannot be “counted on, as a matter of the sort of person he is, to take care of those things that others entrust to one” in his relationship with his female employees? After all, in virtue of the sort of person he is—that is, the sort of person who cares about not suffering the social consequences of mistreating them—he can be counted on to treat his employees well. If that is so, Potter’s view does not do much better than social contract views when it comes to distinguishing trustworthiness from mere reliability.

Several philosophers propose less demanding accounts of trustworthiness. Katherine Hawley’s (2019) view falls squarely within this camp. According to her, trustworthiness is a matter of avoiding unfulfilled commitments, which requires both caution in incurring new commitments and diligence in fulfilling existing commitments. Crucially, on this view, one can be trustworthy regardless of one’s motives for fulfilling one’s commitments. Hawley’s is a negative account of trustworthiness, which means that one can be trustworthy while avoiding commitments as far as possible. Untrustworthiness can arise from insincerity or bad intentions, but it can also arise from enthusiasm and becoming over-committed. A trustworthy person must not allow her commitments to outstrip her competence.

One natural question that arises for this view is: what about commitments that we do not take on, but should? Am I a trustworthy friend if I never take on any commitments toward my friends? According to Hawley, in practice, through friendship, work, and other social engagements we take on meta-commitments—commitments to incur future commitments. These can make it a matter of trustworthiness to take on certain new commitments.

Another view in a similar, externalist vein is developed by Kelp and Simion (2020b). According to them, trustworthiness is a disposition to fulfil one’s obligations. What drives the view is the thought that one can fail to fulfil one’s commitments in virtue of being in a bad environment—an environment that “masks” the normative disposition in question—while, at the same time, remaining a trustworthy person. On this view as well, whether the disposition in question is there in virtue of goodwill or not is inconsequential. That being said, the view can accommodate the thought that people who comply with a particular norm for the wrong reason are less trustworthy than their good-willing counterparts. To see how, take the sexist employer again: insofar as it is plausible that there are norms against sexism, as well as norms against mistreating one’s female employees, the sexist employer fulfils the obligations generated by the latter but not by the former. In this way, he is trustworthy when it comes to treating his employees well, but not trustworthy when it comes to treating them well for the right reason.

Another advantage of this view is that it explains the intuitive difference in robustness between two-place trustworthiness and trustworthiness simpliciter. According to this account, one is trustworthy simpliciter when one meets a contextually-variant threshold of two-place trustworthiness for contextually-salient obligations. For instance, a philosophy professor is trustworthy simpliciter in the philosophy department just in case she has a disposition to meet enough of her contextually salient obligations: do her research and teaching, not be late for meetings, answer emails promptly, help students with their essays and so forth. Plausibly, some of these contextually salient obligations will include doing these things for the right reasons. If so, the view is able to account for the fact that trustworthiness simpliciter is more demanding than two-place trustworthiness.

3. The Value of Trust

Trust is valuable. Without it, we face not only cooperation problems, but we also incur substantial risks to our well-being—namely, those ubiquitous risks to life that characterize—at the limit case—the Hobbesian (1651/1970) “state of nature”. Accordingly, one very general argument for the value of trust appeals to the disutility of its absence (see also Alfano 2020).

Moreover, apart from merely serving as an enabling condition for other valuable things (like the possibility of large-scale collective projects for societal benefit), trust is also instrumentally valuable for both the truster and the trustee as a way of resolving particular (including one-off) cooperation problems in such a way as to facilitate mutual profit (see §2). Furthermore, trust is instrumentally valuable as a way of building trusting relationships (Solomon and Flores 2003). For example, trust can effectively be used—as when one trusts a teenager with a car to help cultivate a trust relationship—in order to make more likely the attainment of the benefits of trust (for both the truster and the trustee) further down the road (Horsburgh 1960; Jones 2004; Frost-Arnold 2014; see also the discussion of therapeutic trust above).

Apart from trust’s uncontroversial instrumental value (for helpful discussion, see O’Neill 2002), some philosophers believe that trust also has final value. Something, X, is instrumentally valuable with respect to an end, Y, insofar as it is valuable as a means to Y; instrumental value can be contrasted with final value: something is finally valuable if and only if it is valuable for its own sake. A paradigmatic example of something instrumentally valuable is money, which we value because of its usefulness in getting other things; an example of something (arguably) finally valuable is happiness.

One way to defend the view that trust can be finally valuable, and not merely instrumentally valuable, is to supplement the performance-theoretic view of trust (see §1.c and §2.a) with some additional (albeit somewhat contentious) axiological premises as follows:

(P1) Apt trust is trust that is successful because of trust-relevant competence. (from the performance-theoretic view of trust)

(P2) Something is an achievement if and only if it is a success because of competence. (Premise)

(C1) So, apt trust is an achievement. (from P1 and P2)

(P3) Achievements are finally valuable. (Premise)

(C2) So, apt trust has final value. (from C1 and P3)
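Since the argument is a simple chain of conditionals, its validity can be checked mechanically. Below is a minimal sketch in the Lean 4 proof assistant, offered only as an illustration of the argument’s logical form; the type T and the predicate names (AptTrust, SuccessBecauseOfCompetence, Achievement, FinallyValuable) are hypothetical labels introduced here, not terminology from the trust literature.

    -- A hypothetical formalization of (P1)-(P3); all names are illustrative.
    variable {T : Type}
    variable (AptTrust SuccessBecauseOfCompetence Achievement FinallyValuable : T → Prop)

    -- (P1): apt trust is trust that is successful because of trust-relevant competence.
    -- (P2): something is an achievement iff it is a success because of competence.
    -- (P3): achievements are finally valuable.
    -- The conclusion (C2) follows, with (C1) as the intermediate step.
    example
        (P1 : ∀ t, AptTrust t → SuccessBecauseOfCompetence t)
        (P2 : ∀ t, SuccessBecauseOfCompetence t ↔ Achievement t)
        (P3 : ∀ t, Achievement t → FinallyValuable t) :
        ∀ t, AptTrust t → FinallyValuable t :=
      fun t h => P3 t ((P2 t).mp (P1 t h))

If this sketch is right, the argument is formally valid, so the philosophical work lies entirely in assessing the premises, and (P3) in particular, as the discussion below indicates.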

Premise (P2) of the argument is mostly uncontentious, and is widely taken for granted in contemporary virtue epistemology (for instance, Greco 2009, 2010; Haddock, Millar, and Pritchard 2009; Sosa 2010b) and elsewhere (Feinberg 1970; Bradford 2013, 2015).

Premise (P3), however, is where the action lies. Even if apt trust is an achievement, given that it involves a kind of success because of ability (that is, trust-relevant competences), we would need some positive reason to connect the “success because of ability” structure with final value if we are to accept (P3).

A strong line here defends (P3) by maintaining that all achievements (including evil achievements and “trivial” achievements) are finally valuable, on the grounds that successes because of ability (no matter what the success, no matter what the ability used) have a value that is not reducible to the value of the success alone.

This kind of argument faces some well-worn objections (for some helpful discussions, see Kelp and Simion 2016; Dutant 2013; Goldman and Olsson 2009; Sylvan 2017). A more nuanced line of argument for (C2) weakens (P3) so that it says, instead, that (P3*) some achievements are finally valuable. But with this weaker premise in play, (P3*) and (C1) no longer entail (C2); what would be needed—and this remains an open problem for work on the axiology of trust—is a further premise to the effect that the kind of achievement that features in apt trust, specifically, is among the finally valuable rather than the non-finally valuable achievements. And a defence of such a further premise, of course, will turn on further considerations about (among other things) the value of successful and competent trust, perhaps also in the context of wider communities of trust.

4. References and Further Reading

  • Adler, Jonathan E. 1994. ‘Testimony, Trust, Knowing’. The Journal of Philosophy 91 (5): 264–275.
  • Alfano, Mark. 2020. ‘The Topology of Communities of Trust’. Russian Sociological Review 15 (4): 3-57. https://doi.org/10.17323/1728-192X-2016-4-30-56/.
  • Anscombe, G. E. M. 1979. ‘What Is It to Believe Someone?’ In Rationality and Religious Belief, edited by C. F. Delaney, 141–151. South Bend: University of Notre Dame Press.
  • Audi, Robert. 1997. ‘The Place of Testimony in the Fabric of Knowledge and Justification’. American Philosophical Quarterly 34 (4): 405–422.
  • Audi, Robert. 2004. ‘The a Priori Authority of Testimony’. Philosophical Issues 14: 18–34.
  • Audi, Robert. 2006. ‘Testimony, Credulity, and Veracity’. In The Epistemology of Testimony, edited by Jennifer Lackey and Ernest Sosa, 25–49. Oxford University Press.
  • Austin, J. L. 1946. ‘Other Minds.’ Proceedings of the Aristotelian Society Supplement 20: 148–187.
  • Baier, Annette. 1986. ‘Trust and Antitrust’. Ethics 96 (2): 231–260. https://doi.org/10.1086/292745.
  • Baker, Judith. 1987. ‘Trust and Rationality’. Pacific Philosophical Quarterly 68 (1): 1–13. https://doi.org/10.1111/j.1468-0114.1987.tb00280.x.
  • Blackburn, Simon. 1998. Ruling Passions: A Theory of Practical Reasoning. Oxford University Press UK.
  • Bond Jr, Charles F., and Bella M. DePaulo. 2006. ‘Accuracy of Deception Judgments’. Personality and Social Psychology Review 10 (3): 214–234.
  • Bradford, Gwen. 2013. ‘The Value of Achievements’. Pacific Philosophical Quarterly 94 (2): 204–224.
  • Bradford, Gwen. 2015. Achievement. Oxford University Press.
  • Bratman, M. 1992. ‘Practical Reasoning and Acceptance in a Context’. Mind 101 (401): 1–16.
  • Burge, Tyler. 1993. ‘Content Preservation’. Philosophical Review 102 (4): 457–488.
  • Burge, Tyler. 1997. ‘Interlocution, Perception, and Memory’. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 86 (1): 21–47.
  • Carter, J. Adam. 2020a. ‘On Behalf of a Bi-Level Account of Trust’. Philosophical Studies 177: 2299–2322.
  • Carter, J. Adam. 2020b. ‘De Minimis Normativism: A New Theory of Full Aptness’. Philosophical Quarterly.
  • Carter, J. Adam. 2020c. ‘Therapeutic Trust’. Manuscript.
  • Carter, J. Adam, and Daniella Meehan. 2020. ‘Trust, Distrust, and Epistemic Injustice’. Educational Philosophy and Theory.
  • Coady, C. A. J. 1973. ‘Testimony and Observation’. American Philosophical Quarterly 10 (2): 149–155.
  • Coady, C. A. J. 1992. Testimony: A Philosophical Study. Oxford University Press.
  • Cogley, Zac. 2012. ‘Trust and the Trickster Problem’. Analytic Philosophy 53 (1): 30–47. https://doi.org/10.1111/j.2153-960X.2012.00546.x.
  • Cohen, L. Jonathan. 1989. ‘Belief and Acceptance’. Mind 98 (391): 367–389.
  • Dasgupta, Partha. 2000. ‘Trust as a Commodity’. Trust: Making and Breaking Cooperative Relations 4: 49–72.
  • deTurck, Mark A., Janet J. Harszlak, Darlene J. Bodhorn, and Lynne A. Texter. 1990. ‘The Effects of Training Social Perceivers to Detect Deception from Behavioral Cues’. Communication Quarterly 38 (2): 189–199.
  • Domenicucci, Jacopo, and Richard Holton. 2017. ‘Trust as a Two-Place Relation’. The Philosophy of Trust, 149–160.
  • Dutant, Julien. 2013. ‘In Defence of Swamping’. Thought: A Journal of Philosophy 2 (4): 357–366.
  • Faulkner, Paul. 2007. ‘A Genealogy of Trust’. Episteme 4 (3): 305–321. https://doi.org/10.3366/E174236000700010X.
  • Faulkner, Paul. 2011. Knowledge on Trust. Oxford: Oxford University Press.
  • Faulkner, Paul. 2015. ‘The Attitude of Trust Is Basic’. Analysis 75 (3): 424–429.
  • Faulkner, Paul. 2017. ‘The Problem of Trust’. The Philosophy of Trust, 109–28.
  • Feinberg, Joel. 1970. Doing and Deserving; Essays in the Theory of Responsibility. Princeton: Princeton University Press.
  • Fricker, Elizabeth. 1994. ‘Against Gullibility’. In Knowing from Words, 125–161. Springer.
  • Fricker, Elizabeth. 1995. ‘Critical Notice’. Mind 104 (414): 393–411.
  • Fricker, Elizabeth. 2017. ‘Inference to the Best Explanation and the Receipt of Testimony: Testimonial Reductionism Vindicated’. Best Explanations: New Essays on Inference to the Best Explanation, 262–94.
  • Fricker, Elizabeth. 2018. Trust and Testimonial Justification.
  • Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford University Press.
  • Frost-Arnold, Karen. 2014. ‘The Cognitive Attitude of Rational Trust’. Synthese 191 (9): 1957–1974.
  • Gibbard, Allan. 1990. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge, MA: Harvard University Press.
  • Goldberg, Sanford C. 2006. ‘Reductionism and the Distinctiveness of Testimonial Knowledge’. The Epistemology of Testimony, 127–44.
  • Goldberg, Sanford C. 2010. Relying on Others: An Essay in Epistemology. Oxford University Press.
  • Goldberg, Sanford C. 2020. Conversational Pressure. Oxford University Press.
  • Goldman, Alvin I. 1999. Knowledge in a Social World. Oxford University Press.
  • Goldman, Alvin, and Erik J. Olsson. 2009. ‘Reliabilism and the Value of Knowledge’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 19–41. Oxford University Press.
  • Graham, Peter J. 2010. ‘Testimonial Entitlement and the Function of Comprehension’. In Social Epistemology, edited by Duncan Pritchard, Alan Millar, and Adrian Haddock, 148–74. Oxford University Press.
  • Graham, Peter J. 2012a. ‘Testimony, Trust, and Social Norms’. Abstracta 6 (3): 92–116.
  • Graham, Peter J. 2012b. ‘Epistemic Entitlement’. Noûs 46 (3): 449–82. https://doi.org/10.1111/j.1468-0068.2010.00815.x.
  • Graham, Peter J. 2015. ‘Epistemic Normativity and Social Norms’. In Epistemic Evaluation: Purposeful Epistemology, edited by David Henderson, and John Greco, 247-273. Oxford University Press.
  • Greco, John. 2009. ‘The Value Problem’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 313–22. Oxford: Oxford University Press.
  • Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity. Cambridge University Press.
  • Greco, John. 2015. ‘Testimonial Knowledge’. Epistemic Evaluation: Purposeful Epistemology, 274-290.
  • Greco, John. 2019. ‘The Transmission of Knowledge and Garbage’. Synthese 197: 1–12.
  • Green, Christopher R. 2008. ‘Epistemology of Testimony’. Internet Encyclopedia of Philosophy, 1–42. https://iep.utm.edu/ep-testi/
  • Haddock, Adrian, Alan Millar, and Duncan Pritchard, eds. 2009. Epistemic Value. Oxford University Press.
  • Hardin, Russell. 1992. ‘The Street-Level Epistemology of Trust’. Analyse & Kritik 14 (2): 152–176.
  • Hardin, Russell. 2002. Trust and Trustworthiness. Russell Sage Foundation.
  • Hawley, Katherine. 2014. ‘Trust, Distrust and Commitment’. Noûs 48 (1): 1–20.
  • Hawley, Katherine. 2019. How to Be Trustworthy. Oxford University Press, USA.
  • Hieronymi, Pamela. 2008. ‘The Reasons of Trust’. Australasian Journal of Philosophy 86 (2): 213–36. https://doi.org/10.1080/00048400801886496.
  • Hinchman, Edward. 2005. ‘Telling as Inviting to Trust’. Philosophy and Phenomenological Research 70: 562–587.
  • Hobbes, Thomas. 1970(1651). Leviathan. Glasgow.
  • Holton, Richard. 1994. ‘Deciding to Trust, Coming to Believe’. Australasian Journal of Philosophy 72 (1): 63–76. https://doi.org/10.1080/00048409412345881.
  • Horsburgh, H. J. N. 1960. ‘The Ethics of Trust’. The Philosophical Quarterly (1950-) 10 (41): 343–54. https://doi.org/10.2307/2216409.
  • Hume, David. 2000(1739). A Treatise of Human Nature. Oxford University Press.
  • Jones, Karen. 1996. ‘Trust as an Affective Attitude’. Ethics 107 (1): 4–25.
  • Jones, Karen. 2004. ‘Trust and Terror’. In Moral Psychology: Feminist Ethics and Social Theory, edited by Peggy DesAutels and Margaret Urban Walker, 3–18. Rowman & Littlefield.
  • Jones, Karen. 2012. ‘Trustworthiness’. Ethics 123 (1): 61–85.
  • Kelp, Christoph. 2018. ‘Assertion: A Function First Account’. Noûs 52: 411–442.
  • Kelp, Christoph, and Simion, Mona. 2016. ‘The Tertiary Value Problem and the Superiority of Knowledge’. American Philosophical Quarterly 53 (4): 397–411.
  • Kelp, Christoph, and Simion, Mona. 2020a. Knowledge Sharing: A Functionalist Account of Assertion. Manuscript.
  • Kelp, Christoph, and Simion, Mona. 2020b. ‘What Is Trustworthiness?’ Manuscript.
  • Keren, Arnon. 2014. ‘Trust and Belief: A Preemptive Reasons Account’. Synthese 191 (12): 2593–2615.
  • Kraut, Robert. 1980. ‘Humans as Lie Detectors’. Journal of Communication 30 (4): 209–218.
  • Lackey, Jennifer. 2003. ‘A Minimal Expression of Non–Reductionism in the Epistemology of Testimony’. Noûs 37 (4): 706–723.
  • Lackey, Jennifer. 2008. Learning from Words: Testimony as a Source of Knowledge. Oxford University Press.
  • Lipton, Peter. 1998. ‘The Epistemology of Testimony’. Studies in History and Philosophy of Science Part A 29 (1): 1–31.
  • Lyons, Jack. 1997. ‘Testimony, Induction and Folk Psychology’. Australasian Journal of Philosophy 75 (2): 163–178.
  • McLeod, Carolyn. 2002. Self-Trust and Reproductive Autonomy. MIT Press.
  • McMyler, Benjamin. 2011. Testimony, Trust, and Authority. Oxford University Press USA.
  • Medina, José. 2011. ‘The Relevance of Credibility Excess in a Proportional View of Epistemic Injustice: Differential Epistemic Authority and the Social Imaginary’. Social Epistemology 25 (1): 15–35.
  • Medina, José. 2013. The Epistemology of Resistance: Gender and Racial Oppression, Epistemic Injustice, and the Social Imagination. Oxford University Press.
  • Möllering, Guido. 2006. Trust: Reason, Routine, Reflexivity. Elsevier.
  • Moran, Richard. 2006. ‘Getting Told and Being Believed’. In Jennifer Lackey and Ernest Sosa (eds.), The Epistemology of Testimony. Oxford: Oxford University Press.
  • O’Neill, Onora. 2002. Autonomy and Trust in Bioethics. Cambridge University Press.
  • Origgi, Gloria. 2012. ‘Epistemic Injustice and Epistemic Trust’. Social Epistemology 26 (2): 221–235.
  • Owens, David. 2017. ‘Trusting a Promise and Other Things’. In Paul Faulkner and Thomas Simpson (eds.), New Philosophical Perspectives on Trust, 214–29. Oxford University Press.
  • Piovarchy, Adam. 2020. ‘Responsibility for Testimonial Injustice’. Philosophical Studies, 1–19. https://doi.org/10.1007/s11098-020-01447-6
  • Pohlhaus Jr, Gaile. 2014. ‘Discerning the Primary Epistemic Harm in Cases of Testimonial Injustice’. Social Epistemology 28 (2): 99–114.
  • Potter, Nancy Nyquist. 2002. How Can I Be Trusted? A Virtue Theory of Trustworthiness. Rowman & Littlefield.
  • Pritchard, Duncan. 2004. ‘The Epistemology of Testimony’. Philosophical Issues 14: 326–348.
  • Rabinowicz, Wlodek, and Toni Rønnow-Rasmussen. 2000. ‘II-A Distinction in Value: Intrinsic and For Its Own Sake’. Proceedings of the Aristotelian Society 100 (1): 33–51.
  • Reid, Thomas. 1764. ‘An Inquiry into the Human Mind on the Principles of Common Sense’. In The Works of Thomas Reid, edited by Sir William Hamilton. Maclachlan & Stewart.
  • Ridge, Michael. 2014. Impassioned Belief. Oxford: Oxford University Press.
  • Simion, Mona. 2020a. Shifty Speech and Independent Thought: Epistemic Normativity in Context. Oxford: Oxford University Press.
  • Simion, Mona. 2020b. ‘Testimonial Contractarianism: A Knowledge-First Social Epistemology’. Noûs 1-26. https://doi.org/10.1111/nous.12337
  • Simion, Mona, and Christoph Kelp. 2018. ‘How to Be an Anti-Reductionist’. Synthese. https://doi.org/10.1007/s11229-018-1722-y.
  • Solomon, Robert C., and Fernando Flores. 2003. Building Trust: In Business, Politics, Relationships, and Life. Oxford University Press USA.
  • Sosa, Ernest. 2010a. ‘How Competence Matters in Epistemology’. Philosophical Perspectives 24 (1): 465–475.
  • Sosa, Ernest. 2010b. ‘Value Matters in Epistemology’. The Journal of Philosophy 107 (4): 167–190.
  • Sosa, Ernest. 2015. Judgment and Agency. Oxford: Oxford University Press.
  • Sylvan, Kurt. 2017. ‘Veritism Unswamped’. Mind 127 (506): 381–435.
  • Wanderer, Jeremy. 2017. ‘Varieties of Testimonial Injustice’. In Ian James Kidd, José Medina, and Gaile Pohlhaus Jr. (eds.), The Routledge Handbook of Epistemic Injustice, 27–40. Routledge.
  • Williamson, Timothy. 2000. Knowledge and Its Limits. Oxford University Press.

 

Author Information

J. Adam Carter
Email: adam.carter@glasgow.ac.uk
University of Glasgow
United Kingdom

and

Mona Simion
Email: mona.simion@glasgow.ac.uk
University of Glasgow
United Kingdom

Tyler Burge (1946—)

Tyler Burge is an American philosopher who has done influential work in several areas of philosophy. These include philosophy of language, logic, philosophy of mind, epistemology, philosophy of science (primarily philosophy of psychology), and history of philosophy (focusing especially on Frege, but also on the classical rationalists—Descartes, Leibniz, and Kant). Burge has also done some work in psychology itself.

Burge is best known for his extended elaboration and defense of the thesis of anti-individualism. This is the thesis that most representational mental states depend for their natures upon phenomena that are not determined by the individual’s own body and other characteristics. In other words, what it means to represent a subject matter—whether in perception, language, or thought—is not fully determined by individualistic characteristics of the brain, body, or person involved. One of the most famous illustrations of this point is Burge’s argument that psychologically representing a kind such as water requires the fulfillment of certain non-individualistic conditions, such as having been in causal contact with instances of the kind, having acquired the representational content through communication with others, having theorized about it, and so forth. A consequence of Burge’s anti-individualism, in this case, is that two thinkers who are physically indiscernible (who are, for example, neurologically indistinguishable in a certain sense) can differ in that one of them, but not the other, has thoughts containing the concept “water”.

When Burge first proposed the thesis of anti-individualism, it was common for philosophers to reject it for one reason or another. It is a measure of Burge’s influence, and the power of his arguments, that the early 21st century saw few philosophers deny the truth of the view.

Nevertheless, there is much more to Burge’s philosophical contributions than anti-individualism. Most of Burge’s more influential theses and arguments are briefly described in this article. An attempt is made to convey how the seemingly disparate topics addressed in Burge’s corpus are unified by certain central commitments and interests. Foremost among these is Burge’s long-standing interest in understanding the differences between the minds of human beings, on one hand, and the minds of other animals, on the other. This interest colors and informs Burge’s work on language, mind, and epistemology in particular.

Table of Contents

  1. Life and Influence
  2. Language and Logic
  3. Anti-Individualism
  4. De Re Representation
  5. Mind and Body
  6. Justification and Entitlement
  7. Interlocution
  8. Self-Knowledge
  9. Memory and Reasoning
  10. Reflection
  11. Perception
  12. History of Philosophy
  13. Psychology
  14. References and Further Reading
    1. Primary Literature
      1. Books
      2. Articles
    2. Secondary Literature

1. Life and Influence

Charles Tyler Burge graduated from Wesleyan University in 1967. He obtained his Ph.D. at Princeton University in 1971, his dissertation being directed by Donald Davidson. He is married with two sons. Burge’s wife, Dorli Burge, was prior to her retirement a clinical psychologist. Burge’s eldest son, Johannes, is Assistant Professor in Vision Science at the University of Pennsylvania. His younger son, Daniel, completed a Ph.D. in 20th century American History at Boston University.

Burge is a fan of sport and enjoys traveling and hiking. He also reads widely outside of philosophy (particularly literature, history, history of science, history of mathematics, psychology, biology, music, and art history). Three of Burge’s interests are classical music, fine food, and fine wine.

A list of Burge’s main philosophical contributions would include the following seven areas. First, in his dissertation and the 1970s more generally, Burge focused attention upon the central significance of context-dependent referential and representational devices, including many uses of proper names, as well as what he came to call “applications” in language and thought. This was during a philosophical era in which it was widely believed that such devices were reducible to context-independent representational elements such as linguistic descriptions and concepts in thought. Burge also appealed to demonstrative- or indexical-like elements in perhaps unexpected areas, such as in his treatment of the semantical paradox. A concern with referential representation, which Burge does not believe to be confined solely to empirical cases, has been as close to his central philosophical interest as any topic. Much the same could be said about Burge’s long-standing interest in predication and attribution. (See sections 2 and 4.) Burge’s work on context-dependent aspects of representation is indebted to Keith Donnellan and Saul Kripke.

Second, while broadly-understood anti-individualism has been a dominant view in the history of philosophy, Burge was the first philosopher to articulate the doctrine, to argue for it, and to mine it for many of its implications. Anti-individualism is the view that the natures of most representational states and events are partly dependent on relations to matters beyond the individuals with representational abilities. Within a decade or so of Burge’s discussions of its several forms and aspects, anti-individualism went from being a minority view to a view that is rarely even questioned in serious philosophical work today. Furthermore, the discussion of anti-individualism engendered by Burge’s work breathed new life into at least two somewhat languishing areas of philosophy: the problem of mental causation and the nature of authoritative self-knowledge, each of which has since become recognized as a central topic in philosophy of mind and epistemology, respectively. (See sections 3, 5 and 8.)

Third, Burge’s work on interlocution (commonly called “testimony”) has been widely discussed. He was among the first to defend a non-reductionist view of interlocution, one which remains among the best-articulated and supported accounts of our basic epistemic warrant for relying upon the words of others (see section 7); and Burge later extended this work to provide new ways of thinking about the problem of other minds, on one hand, and the epistemology of computer-aided mathematical proofs, on the other.

Fourth, beginning with his work on self-knowledge and interlocution, Burge began a rationalist initiative in epistemology that has been influential, in addition to the areas already mentioned, in discussions of memory, the first-person concept, reflection and understanding, and other abilities, such as certain forms of reasoning, that seem to be peculiar to human beings. Central to Burge’s limited form of rationalism is his powerful case against the once-common view that both analytic and a priori truths are in some way insubstantial or vacuous, as well as his rejection of the closely related view that apriority is to be reduced to analyticity. (See sections 6-10.)

Fifth, from the mid- to late 1980s up to the early 21st century, Burge developed a detailed understanding of the nature of perception. Integral to this understanding has been the extent to which Burge has immersed himself in the psychology of perception as well as developmental psychology and ethology. Some of Burge’s work on perception is as much a contribution to psychology as to philosophy; one of the articles he has published on the topic covers a prominent and hotly contested question in psychology—the question whether infants and non-human animals attribute psychological states to agents with whom they interact. Parallel with these developments has been Burge’s articulation of a novel account of perceptual epistemic warrant. (See sections 6, 11 and 13.)

Sixth, throughout his career Burge has resisted the tendency of philosophers of mind, especially in the United States, to accept some form of materialism. While it may not have been a central focus of his published work, Burge has over time formulated and defended a version of dualism about the relation between the mind and the body. Burge’s view holds that minds, mental states, and mental events are not identical with bodies, physical states, or physical events. It is important to note, however, that Burge’s dualism is not a substance dualism such as the view commonly attributed to Descartes. It is instead a “modest dualism” motivated by the view that counting mental events as physical events does no scientific or other good conceptual work; similarly for mental properties and minds themselves. This is one example of Burge’s more general resistance to forms of reductionism in philosophy. (See section 5.)

The seventh respect in which Burge’s work has been influential is not confined to a certain body of work or a defended thesis. It lies in providing an antidote to the pervasive tendency, in several areas of philosophy, toward hyper-intellectualization. The earliest paper in which Burge discusses hyper-intellectualization is his short criticism of David Lewis’s account of convention (1975). The tendency toward hyper-intellectualization is exhibited in individualism about linguistic, mental, or perceptual representational content—the idea being that the individual herself must somehow be infallible concerning the proper application conditions of terms, concepts, and even perceptual attributives. It is at the center of the syndrome of views, called Compensatory Individual Representationalism, that Burge criticizes at some length. These views insist that objective empirical representation requires that the individual must in some way herself represent necessary conditions for objective representation. Hyper-intellectualization motivates various forms of epistemic internalism, according to which epistemic warrant requires that the individual be able in some way to prove that her beliefs are warranted, or at least to have good, articulable grounds for believing that they are. Finally, hyper-intellectualization permeates even action theory, which tends to model necessary conditions for animal action upon the intentional actions of mature human language-users. Burge has resisted all of these hyper-intellectualizing tendencies within philosophy, and to a lesser extent in psychology. (See sections 3, 6, 7 and 11.)

If there is a single, overriding objective running throughout Burge’s long and productive career, it is to understand wherein human beings are similar to, and different from, other animals in representational and cognitive respects. As he put the point early on, in the context of a discussion of anti-individualism:

I think that ultimately the greatest interest of the various arguments lies not in defeating individualism, but in opening routes for exploring the concepts of objectivity and the mental, and more especially those aspects of the mental that are distinctive of persons. (1986b, 194 fn. 1)

This large program has involved not only investigating the psychological powers that seem to be unique to human beings—such as a priori warranted cognition and reflection, and authoritative self-knowledge and self-understanding—but also competencies that we share with a wide variety of non-human animals, principally memory, action, and perception. (See sections 3, 4, 7, and 8-11.)

2. Language and Logic

Burge’s early work in philosophy of language and logic centered on semantics and logical form. The work on semantics constitutes the beginning of Burge’s lifelong goal of understanding reference and representation—beginning in language and proceeding to thought and perception. This work includes the logical form of de re thought (1977); the semantics of proper names (1973); demonstrative and indexical constructions (1974a); and also mass and singular terms (1972; 1974b). While the work on context-dependent reference was the dominant special case of Burge’s thought and writing on semantics and logical form, it also includes Burge’s work on paradoxes, especially the strengthened liar (semantic) paradox and the epistemic paradox.

Significant later work on logic and language prominently includes articles on logic and analyticity, and on predication and truth (2003a; 2007b).

3. Anti-Individualism

Anti-individualism is the view that the natures of most thoughts, and perceptual states and events, are partly determined by matters beyond the natures of individual thinkers and perceivers. By the “nature” of these mental states we understand that without which they would not be the mental states they are. So the representational contents of thoughts and perceptual states, for example, are essential to their natures. If “they” had different contents, they would be different thoughts or states. As Burge emphasizes, anti-individualism has been the dominant view in the history of philosophy. It was present in Aristotle, arguably in Descartes, and in many other major philosophers in the Western canon. When Burge, partly building upon slightly earlier work by Hilary Putnam, came explicitly to formulate and defend the view, it became controversial. There are several reasons for this. One is that materialistic views in philosophy of mind seemed incompatible with the implications of anti-individualism. Another was a tendency, which began to be dislodged only after the mid-20th century, to place very high emphasis upon phenomenology and introspective “flashes” of insight when it came to discussions of the natures of representational mental states and events. There are rear-guard defenses of the cognitive relevance of phenomenology that still have currency today. But anti-individualism appears to have become widely, if sometimes reluctantly, accepted.

As noted, anti-individualism is the view that most psychological representational states and events depend for their natures upon relations to subject matters beyond the representing individuals or their psychological sub-systems. This view was first defended in Burge’s seminal article, “Individualism and the Mental” (1979a). Some of Burge’s arguments for anti-individualism employ the Twin-Earth thought-experiment methodology originally set out by Putnam. Burge went beyond Putnam, among other ways, by arguing that the intentional natures of many mental states themselves, rather than merely associated linguistic meanings, depend for their natures on relations to a subject matter. Burge has also argued at length against Putnam’s view (which Putnam has since given up) that meanings and thought contents involving natural kinds are indexical in character.

There are five distinct arguments for anti-individualism in Burge’s work. The order in which they were published is as follows. First, Burge argued that many representational mental states depend for their natures upon relations to a social environment (1979a). Second, he argued that psychologically representing natural kinds such as water and aluminum depends upon relations to entities in the environment (1982). Third, Burge argued that having thoughts containing concepts corresponding to artefactual kinds such as sofas is compatible with radical, non-standard theorizing about the kinds (1986a). Fourth, Burge constructed a thought experiment that appears to show that even the contents of perception may depend for their natures on relations to entities purportedly perceived (1986b; 1986c). Fifth, Burge has provided an argument for a version of empirical anti-individualism that he regards as both necessarily true and a priori: “empirical representational states as of the environment constitutively depend partly on entering into environment-individual causal relations” (2010, 69). This final argument has superseded the fourth as the main ground of perceptual anti-individualism. It could also be said that it provides the strongest ground for anti-individualism in general, at least for empirical cases, since propositional attitudes containing concepts such as “arthritis”, “water”, and “sofa” are all parasitic, albeit in complex and not fully understood ways, upon basic perceptual categories covered by the fifth argument. Finally, while it is a priori that perceptual systems and states are partly individuated by relations to an environment, it is an empirical fact that there are perceptual states and events.

Rather than discussing each of these arguments in detail, the remainder of the section focuses on one of Burge’s schematic representations of the common structure of several of the arguments. The thought experiments in question involve three steps. In the first, one judges that someone could have thoughts about “a given kind or property as such, even though that person is not omniscient about its nature” (2013a, 548). For example, one can think thoughts about electrons without being infallible about the natures of electrons. This lack of omniscience can take the form of incomplete understanding, as in the case of the concept of arthritis. It can stem from an inability to distinguish the kind water from a look-alike liquid in a world that contains no water and no theorizing about water. Or it can issue from non-standard theorizing about entities, say sofas, despite fully grasping the concept of sofa.

In the second step, one imagines a situation just like the one considered in the first, but in which the person’s mistaken beliefs are in fact true. That is to say, one considers a situation in which the kind or property differs from its counterpart in the first situation, but in ways such that the individual cannot discriminate the kind or property in the first situation from its counterpart in the second. Thus, in this step of the argument, the thoughts one would normally express with the words used by the subject in the first step are in fact true.

In the third step “one judges that in the second environment, the individual could not have thoughts about arthritis … [or] sofas, as such” (2013a, 549). The reason, of course, is that the relevant entities in the second step are not the same as those in the first step. There are additional qualifications that must be made, such as that it must be presupposed that, while there is no arthritis, water, or sofas in the second step, no alternative ways of acquiring the concepts of arthritis, water, or sofa are available or utilized. Burge continues:

The conclusion is that what thoughts an individual can have—indeed, the nature of the individual’s thoughts—depends partly on relations that the individual bears to the relevant environments. For we can imagine the individual’s make-up invariant between the actual and counterfactual situations in all other ways pertinent to his psychology. What explains the possibility of thinking the thoughts in the first environment and the impossibility of thinking them in the second is a network of relations that the individual bears to his physical or social surroundings. (2013a, 549)

In other words, the person is able to use the concepts of arthritis, water, and sofa in the first step of the argument for the same reasons that all of us can think with these concepts. Even if the person were indiscernible in individualistic respects, however, changes in the environment could preclude him from thinking with these concepts. If this is correct, then it cannot be the case that the thoughts one can think are fully determined by individualistic factors. That is to say, two possible situations in which a person is indistinguishable with respect to individualist factors can differ as regards the thoughts that she thinks.

What this schematic formulation of the first three thought experiments for anti-individualism emphasizes is arguably also the reason that the view has come to be so widely accepted. As Burge had earlier put the point: the schematic representation of the arguments “exploits the lack of omniscience that is the inevitable consequence of objective reference to an empirical subject matter” (2007, 22-23). Thus, opposition to anti-individualism, or at least opposition to the three arguments in question, must in some way deny our lack of omniscience about the natures of our thoughts, or the conditions necessary for our thinking them. This denial appears to be unreasonable and without a solid foundation.

4. De Re Representation

To a first approximation, de dicto representation is representation that is entirely conceptualized and does not in any way rely upon non-inferential or demonstrative-like relations for its nature. By contrast, de re representation is both partly nonconceptual and reliant upon demonstrative-like relations (at least in empirical cases) for the determination of its nature. For example, the representational content in “that red sphere” is de re; it depends for its nature on a demonstrative-like relation holding between the representer and the putative subject matter. By contrast, “the shortest spy in all the world in 2019” is de dicto. It is completely conceptualized and is not in any way dependent for its nature on demonstrative-like relations. When Burge first began publishing on the topic, it was very common to hold that de re belief attributions (for example) could be reduced to de dicto ascriptions of belief.

Burge’s early work on de re representation sought to achieve three primary goals (1977). First, he provided a pair of characterizations of the fundamental nature of de re representation in language and in thought: a semantical and an epistemic characterization. The semantical account “maintains that an ascription ascribes a de re attitude by ascribing a relation between what is expressed by an open sentence, understood as having a free variable marking a demonstrative-like application, and a re to which the free variable is referentially related” (2007f, 68). The epistemic account, by contrast, maintains that an attitude is de re if it is not completely conceptualized. The second goal of Burge’s early paper on de re belief was to argue that any individual with de dicto beliefs must also have de re beliefs (1977, section II). Finally, Burge argued that the converse does not hold: it is possible to have de re beliefs but not de dicto beliefs. From the second and third claims it follows, contrary to most work on the topic at the time, that de re representation is in important respects more fundamental than de dicto representation.

Burge’s later work on de re representation includes a presentation of and an argument for five theses concerning de re states and attitudes. The first four theses concern specifically perception and perception-based belief. Thesis one is that all representation involves representation-as (2009a, 249-250). This thesis follows from the view that all perceptual content, and the content of all perception-based belief, involves attribution as well as reference. There is no such thing as “neat” perception. All perception is perspectival and involves attribution of properties (which may or may not correctly characterize the objects of perception, even assuming that perceptual reference succeeds). Thesis two is that all perception and perception-based belief is guided by general attributives (2009a, 252). An attributive is the perceptual analog of a predicate, for example, “red” in the perceptual content “that red sphere”. Perceptual representation must be carried out in such a way that one or more attributives are associated with the perception and guide the ostensible perceptual reference. The third thesis is that successful perceptual reference requires that some perceptual attribution veridically characterize the entity perceived (2009a, 289-290). A main idea of this thesis is that something must make it the case that perceptual reference has succeeded, in a given instance, rather than failed. What must be so is not only that the right sort of causal relation obtains between the perceiver and the perceptual referent, but that some attributive associated with the object of perception veridically applies to it. Like the second thesis, this one is fully compatible with the fact that perceptual reference can succeed even where many attributions, including those most salient, fail. The difference is that the second thesis concerns only purported perceptual reference, while the third concerns successful reference. Successful reference is compatible with the incorrectness of some perceptual attribution, even if an attributive that functions to guide the reference fails to apply to the referent. But the third thesis, to reiterate, does require that some perceptual attributive present in the psychology of the individual correctly applies to the referent.

To summarize: the first thesis says that every representation must have a mode of representation. It is impossible for representation to occur neat. The second thesis holds that even (merely) purported reference requires attribution. And the third thesis states that successful perceptual reference requires that some attributives associated in the psychology of the individual with the reference correctly apply to the referent.

Burge’s final two theses concerning de re representation are more general and abstract. The fourth thesis states that necessary preconditions on perception and perceptual reference provide opportunities for a priori warranted belief and knowledge concerning perception. In Burge’s words: “Some of our perceptually based de re states and attitudes, involving context-based singular representations, can yield apriori warranted beliefs that are not parasitic on purely logical or mathematical truths” (2009a, 298). An example of such knowledge might be the following:

(AC*) If that object [perceptually presented as a trackable, integral body] exists, it is trackable and integral. (compare Burge 2009a, 301)

This thesis arguably follows from the third thesis concerning de re perceptual representation. It follows, to reiterate, because a minimal, necessary condition upon successful perceptual reference is that some attributive associated (by the individual or its perceptual system) with the referent veridically applies to the referent of perception—and the most general, necessarily applicable attributive where perceptual reference is concerned is that the ostensible entity perceived be a trackable, integral body. Finally, the fifth thesis concerning de re representational states and events provides a general characterization of de re representation that does not apply merely to empirical cases:

A mental state or attitude is autonomously (and proleptically) de re with respect to a representational position in its representational content if and only if the representational position contains a representational content that represents (purports to refer) nondescriptively and is backed by an epistemic competence to make non-inferential, immediate, nondiscursive attributions to the re. (2009a, 316)

The use of “autonomously” here is necessary to exclude reliance upon others in perception-based reference. Such reliance can be de re even if the third thesis fails (2009a, 290-291). “Proleptically” is meant to allow for representation that fails to refer. Technically speaking, failed perceptual or perception-based reference is never de re. But it is nevertheless purported de re reference and so is covered by the fifth thesis.

For discussion of non-empirical cases of de re representation, which Burge allowed for even in “Belief De Re”, see Burge (2007f, 69-75) and (2009a, 309-316).

It should be re-emphasized that two of Burge’s primary philosophical interests, throughout his career, have been de re reference and representation (1977; 2007f), on one hand, and the nature of predication, on the other (2007b; 2010a). These topics connect directly with the aforementioned interest in understanding representational and epistemic abilities that seem to be unique to human beings.

5. Mind and Body

Burge’s early work on the mind/body problem centered on sustained criticism of certain ways the problem of mental causation has been used to support versions of materialism (1992; 1993b). His criticisms of materialism about mind, including the argument against token-identity materialism, date back to “Individualism and the Mental” (1979a, section IV). As noted earlier, Burge’s position on the mind/body problem is a modest form of dualism, principally motivated by the failure of reductions of minds, mental states, and mental events, on one hand, to the body or brain, physical states, and physical events, on the other, to provide empirical or conceptual explanatory illumination. He has also done work on consciousness, and he has provided a pair of new arguments against what he calls “compositional materialism”.

Beginning in the late 1980s, many philosophers expressed doubts concerning the probity of our ordinary conception of mental causation. Discussion of anti-individualism partly provoked these debates. Some argued that, absent some reductive materialist understanding of mental causation, we are faced with the prospect of epiphenomenalism—the view that instances of mental properties do not do any genuine causal work but are mere impotent concomitants of instances of physical properties. Burge argues that the grounds for rejecting epiphenomenalism are far stronger than any of the reasons that have been advanced in favor of taking the epiphenomenalist threat seriously. He points out that, were there a serious worry about how mental properties can be causally efficacious, the properties of the special sciences such as biology and geology would be under as much threat as those in commonsense psychology or psychological science. Causal psychological explanation “works very well, within familiar limits, in ordinary life; it is used extensively in psychology and the social sciences; and it is needed in understanding physical science, indeed any sort of rational enterprise” (1993b, 362). Such explanatory success itself shows, other things equal, the “respectability” of the ordinary conception of mental causation: “Our best understanding of causation comes from reflecting on good instances of causal explanation and causal attribution in the context of explanatory theories” (2010b, 471).

Burge has also provided arguments against some forms of materialism. One such argument employs considerations made available by anti-individualism to contend that physical events, as ordinarily individuated, cannot in the general case be identical with mental events (1979a, 141f.; 1993b, 349f.). Anti-individualism entails that thinkers who are indiscernible in their individualistic, physical respects can nevertheless differ in their mental events, owing to differences in their environments. This variation in mental events across individualistically indiscernible thinkers would not be possible, of course, if mental events were identical with physical events. In other words, if mental events were identical with physical events, then mere variation in environment could not constitutively affect individuals’ mental events. Needless to say, the falsity of token-identity materialism entails the falsity of a claim of type-identity.

Burge has also provided another line of thought on the mind-body problem that supports his “modest dualism” concerning the relation of the mental to the physical. The most plausible of the various versions of materialism, Burge holds, is compositional materialism—the view that psychologies or minds, like tectonic plates and biological organisms, are composed of physical particles. However, like all forms of materialism, compositional materialism makes strong, empirically specific claims. Burge writes:

The burden on compositional materialism is heavy. It must correlate neural causes and their effects with psychological causes and their effects. And it must illuminate psychological causation, of both physical and psychological effects, in ways familiar from the material sciences (2010b, 479).

He holds that there is no support in science for the compositional materialist’s commitment to the view that mental states and events are identical with composites of physical materials.

The two new arguments against compositional materialism run roughly as follows. The first turns on the difficulty of seeing how “material compositional structures could ground causation by propositional psychological states or events” (2010b, 482). Physical causal structures—broadly construed, to include causation in the non-psychological special sciences—do not appear to have a rational structure. The propositional structures that help to type-individuate certain psychological kinds do have a rational structure. Hence, it is prima facie implausible that psychological causation could be reduced to physical-cum-compositional causal structures. The second argument is similar but does not turn on the notion of causation. Burge argues that:

the physical structure of material composites consists in physical bonds among the parts. According to modern natural science, there is no place in the physical structure of material composites for rational, propositional bonds. The structure of propositional psychological states and events constitutively includes propositional, rational structure. So propositional states and events are not material composites. (2010b, 483)

Burge admits the abstractness of the arguments, and he allows that subsequent theoretical developments might show how compositional materialism can overcome them. However, he suggests that such developments would have to alter fundamentally how either material states and events or psychological states and events are conceived.

Finally, Burge has written two articles on consciousness. The first of these defends three points. One is that all kinds of consciousness, including access consciousness, presuppose the presence of phenomenal consciousness. Phenomenal consciousness is the “what it is like” aspect of certain mental states and events. The claim of presupposition is that no individual can be conscious, in any way, unless it has mental states some of which are phenomenally conscious. The second point is that the notion of access consciousness, as understood by Ned Block, for example, needs refinement. As Block understands access consciousness, it concerns mental states that are poised for use in rational activity (1997). Burge argues that this dispositional characterization runs afoul of the general principle that consciousness, of whatever sort, is constitutively an occurrent phenomenon. Burge’s refinement of the notion of access consciousness is called “rational-access consciousness”. The third point is that we should make at least conceptual space for the idea of phenomenal qualities that are not conscious throughout their instantiation in an individual.

Burge’s second paper on consciousness: (a) notes mounting evidence that a person could have phenomenal qualities without the qualities being rationally accessible; (b) explores ways in which a state could be rationally-access conscious despite not being phenomenally conscious; (c) distinguishes phenomenal consciousness from other phenomena, such as attention, thought, and perception; and (d) sets out a unified framework for understanding all aspects of phenomenal consciousness, as a type of phenomenal presentation of qualities to subjects (2007e).

6. Justification and Entitlement

Burge draws a crucial distinction between two forms of epistemic warrant. One is justification. A justified belief is one that is warranted by reason or reasons. By contrast, an epistemic entitlement is an epistemic warrant that does not consist in the possession of reasons. Burge usually defines entitlement negatively, because there is no simple way to express what entitlement consists in that abstracts from the nature of the representational competence in question.

The distinction was first articulated in “Content Preservation” (1993a). Burge there explained that:

(t)he distinction between justification and entitlement is this: Although both have positive force in rationally supporting a propositional attitude or cognitive practice, and in constituting an epistemic right to it, entitlements are epistemic rights or warrants that need not be understood by or even accessible to the subject. We are entitled to rely, other things equal, on perception, memory, deductive and inductive reasoning, and on … the word of others. (230)

What entitlement consists in with respect to each of these cases is different. What they do have in common is the negative characteristics listed. Burge continues:

The unsophisticated are entitled to rely on their perceptual beliefs. Philosophers may articulate these entitlements. But being entitled does not require being able to justify reliance on these resources, or even to conceive such a justification. Justifications … involve reasons that people have and have access to. (1993a, 230)

Throughout his career, Burge has provided explanations of our entitlement to rely upon interlocution, certain types of self-knowledge and self-understanding, memory, reasoning, and perception. The last of these is briefly sketched below, before some common misunderstandings of the distinction between justification and entitlement are addressed. The case of perceptual entitlement provides one of the best illustrations of the nature of entitlement in general.

People are entitled to rely upon their perceptual beliefs just in case the beliefs in question: (a) are the product of a natural perceptual competence that is functioning properly; (b) are of types that are reliable, where the requirement of reliability is restricted to a certain type of environment; and (c) have contents that are normally transduced from perceptual states that themselves are reliably veridical (Burge 2003c, sections VI and VIII; 2020, section I). These points are part of a much larger and more complex discussion, of course. The point for now is that each of (a)-(c) is an example of an element of an entitlement. As is the case with all entitlements, individuals who are perceptually entitled to their beliefs do not have to know anything concerning (a)-(c); indeed, they need not even be able to understand the explanation of the entitlement, or the concept “entitlement”. A final key point is that while all entitlements, like all epistemic warrants generally for Burge, must be the product of reliable belief-forming competences, no entitlement consists purely in reliability. In the case of perception, the sort of reliability that is necessary for entitlement is reliability in the kind of environment that contributed to making the individual’s perceptual states and beliefs what they are (2003c, section VI; 2020, section I).

Numerous critics of Burge have misunderstood the nature of entitlement, or the distinction between justification and entitlement, or both. Rather than exhaustively cataloging these misinterpretations, the remainder of the section articulates the four main sources of misunderstanding. Keeping these in mind would help to prevent further interpretive mistakes. In increasing levels of subtlety, the mistakes are the following. The first error is simply to disregard Burge’s insistence that entitlements need not be appealed to by, or even be within the ken of, the entitled individual. The fact that an individual has no knowledge of any warranting conditions, in a given case, is not a reason for doubting that she is entitled to the relevant range of beliefs.

The second error is insisting that entitlement be understood in terms of “epistemic grounds”, or “evidence”. Each of these notions suggests the idea of epistemic materials in some way made use of by the believer. But entitlement is never something that accrues to a belief, or by extension to a believer, because of something that he or she does, or even recognizes. The example of perceptual entitlement, which accrues in virtue of conditions (a)-(c) above, illustrates these points. The individuation conditions of perceptual states or beliefs are in no sense epistemic grounds. The notion of evidence is even less appropriate for describing entitlement. While evidence can be made up of many different sorts of entities, or states of affairs, evidence must be possessed or appreciated by a subject in order for it to provide an epistemic warrant. But in that case, on Burge’s view, the warrant would be a justification rather than an entitlement.

A variant on this second source of misunderstanding is to assume that since justification is an epistemic warrant by reason, and reasons are propositional, all propositional elements of epistemic warrants are justifications (or parts of justifications). Several types of entitlements involve propositionality—examples of which are interlocution, authoritative self-knowledge, and even perception (in the sense that perceptual beliefs to which we are entitled must have a propositional structure appropriately derived from the content of relevant perceptual states). But none is a justification or an element in a justification. Being propositional is necessary, but not sufficient, for an element of an epistemic warrant to be, or to be involved in, a justification (as opposed to an entitlement). Another way to put the point is to explain that being propositional in structure is necessary, but not sufficient, for being a reason.

The third tendency that leads to misunderstandings of Burge’s two notions of epistemic warrant is the assumption that they are mutually exclusive. On this view, a belief warranted by justification (entitlement) cannot also be warranted by entitlement (justification). Not only is this not the case, but in fact all beliefs that are justified are also beliefs to which the relevant believer is entitled. Every belief that a thinker is justified in holding is also a belief that is produced by a relevantly reliable, natural competence. (Though the converse obviously does not hold.) Entitlement is the result of a well-functioning, natural, reliable belief-forming competence. There are two species of justification for Burge. In the first case, one is justified in believing a self-evident content such as “I am thinking”, or “2+2=4”. In effect, these contents are reasons for themselves—believing them is enough, other things equal, for the beliefs to be epistemically warranted and indeed to be knowledge. The second kind of justification involves inference. If a sound inference is made by a subject, the premises support the conclusion, and the believer understands why the inference is truth-preserving (or truth-tending), then the belief is justified. Notice that each of these kinds of justified belief is, for Burge, also the product of a well-functioning, natural, reliable belief-forming competence. The competence in the case of contents that are reasons for themselves is understanding; the competence in the second case is a complex of understanding the contents, understanding the pattern of reasoning, and actually reasoning from the content of the premises to the content of the conclusion. So all cases of justification are also cases in which the justified believer is entitled to his or her beliefs.

The subtlest mistake often made by commentators concerning Burge’s notions of justification and entitlement is to assume that what Burge says is not true of entitlement is true of his notion of justification. After all, in “Content Preservation”, Burge states that entitlement “need not be understood by or even accessible to the subject” (1993a, 230). And later, in “Perceptual Entitlement”, Burge makes a number of additional negative claims about entitlement. He writes that entitlement “does not require the warranted individual to be capable of understanding the warrant”, and that entitlement is a “warrant that need not be fully conceptually accessible, even on reflection, to the warranted individual” (2003c, 503). Finally, Burge argues that children, for example, are entitled to their perceptual beliefs, rather than being justified, because they lack sophisticated concepts such as epistemic, entails, perceptual state, and so forth (2003c, 521). So we have the following negative specifications concerning entitlement:

(I) It does not require understanding the warrant;

(II) It does not require being able to access the warrant; and

(III) It does not require the use of sophisticated concepts such as those mentioned above.

The mistake, of course, is to assume that these things that are not required by entitlement are required by justification, as Burge understands justification. This difficulty is a reflection of the fact that Burge, in these passages and others like them, is doing two things at once. He is not only explaining how he thinks of entitlement and justification, but also distinguishing entitlement from extant conceptions of justification. Since his conception of justification differs from most other conceptions, it is a fallacy to infer from (I)-(III), together with relevant context, that the understanding, access, and concepts that entitlement does not require are things that justification does require.

This is not to say that (I)-(III) are wholly irrelevant to Burge’s notion of justification. For his conception is not completely unlike others’ conceptions. For example, one who believes that 2+2=4 based on his or her understanding of the content does understand the warrant—for the warrant is the content itself. So what (I) denies of entitlement is sometimes true of Burge’s notion of justification. Similarly, a relative neophyte who understands at least basic logic, and makes a sound inference in which the premises support the conclusion, is in one perfectly good sense able to access his or her warrant for believing the conclusion—the sort of access that (II) says entitlement does not require. The notion of access in question, when Burge invokes it in characterizations of epistemic warrant, is conscious access. (See section 5 above.)

But the other two claims are more problematic. Burge’s conception of justification is not as demanding as one which holds that the denials of (II) and (III) correctly characterize what is necessary for justification. Thus, while perceptual entitlement is the primary form of epistemic warrant for those with empirical beliefs, it is not impossible for children or adults to have justifications for their perceptual beliefs. It is only that these will almost always be partial. They will usually be able to access and understand the warrant (the entitlement) only partially. In effect, they are justified only to the extent that they have an understanding of the nature of perceptual entitlement. Fully understanding the warrant, the entitlement, would require concepts such as those mentioned in (III). But even children and many adults, as noted, are capable of approximating understanding of the warrant. Burge gives the example of a person averring a perceptual belief and offering in support of the belief the claim that things look that way to him or her. This is a kind of justification. But there is no (full) understanding of the warrant, and likely not even possession of all the concepts employed in a discursive representation of the complete warrant. Finally, Burge’s notion of justification, or epistemic support by reason, is even weaker than these remarks suggest. For he holds that some nonhuman animals probably have reasons for some of their perceptual beliefs (and therefore have justifications for them)—but these animals can in no sense at all access or understand the warrant. As Burge writes, “My notion of having a reason or justification does not require reflection or understanding. That is a further matter” (2003c, 505 fn. 1). This passage brings out how different Burge’s notion of justification is from many others’ conceptions; and it helps to explain why it is an error to assume that what Burge says is not true of entitlement is true of (his notion of) justification.

7. Interlocution

Burge’s early work on interlocution (or testimony) defended two principal theses. One is the “Acceptance Principle”—the view, roughly speaking, that one is prima facie epistemically warranted in relying upon the word of another. The argument for this principle draws upon three a priori theses: (a) speech and the written word are indications of propositional thought; (b) propositional thought is an indication of a rational source; and (c) rational sources can be relied upon to present truth. The other thesis Burge defended was that it is possible to be purely a priori warranted in believing a proposition on the basis of interlocution (1993a). Burge came to regard this second thesis as a large mistake (2013b, section III), and he has since held that the required initial perceptual uptake of the words in question—the uptake relied upon in (a)—makes all interlocutionary knowledge and warranted belief at least minimally empirical in epistemic support. It should be noted, however, that Burge’s view on our most basic interlocutionary warrant remains distinctive in that he regards it as fundamentally non-inferential in character. It is an entitlement—whose nature is structured and supported by the Acceptance Principle, and the argument for it—rather than a justification. Furthermore, none of the critics of Burge’s early view on interlocutionary entitlement identified the specific problem that eventually convinced him that the early view had to be given up.

The specific problem in question was that Burge had initially held that since interlocutionary warrant could persist in certain cases, even as perceptual identification of an utterance failed, the warrant could not be based, even partly, on perception. Burge came to believe that this persistence was possible only because of a massive presumption of reliability where perception was concerned. So the fact that interlocutionary warrant could obtain even where perception failed does not show that the warrant is epistemically independent of perception (2013b, section III).

8. Self-Knowledge

Burge’s views on self-knowledge developed over three periods. The first of these consisted largely in a demonstration that anti-individualism is not, contrary to a common view at the time, inconsistent with or in any tension with our possession of some authoritative self-knowledge (1986d; compare 2013, 8). Burge pointed to certain “basic cases” of self-knowledge—such as those involving the content of “I am now entertaining the thought that water is wet”—which are infallible despite consisting partly in concepts that are anti-individualistically individuated. Using the terms that Burge introduced later, this content is a pure cogito case. It is infallible in the sense that thinking the content makes it true. It is also self-verifying in the sense that thinking the content provides an epistemic warrant, and indeed knowledge, that it is the case. There are also impure cogito cases, an example of which is “I am hereby thinking [in the sense of committing myself to the view] that writing requires concentration”. This self-ascription is not infallible. One can think the content, even taking oneself to endorse the first-order content in question, while failing actually to commit oneself to it. But impure cogito cases are still self-verifying. The intentional content in such cases “is such that its normal use requires a performative, reflexive, self-verifying thought” (2003e, 417-418). What Burge calls “basic self-knowledge” in his early work on self-knowledge comprises cogito cases, pure and impure. He is explicit, however, that not all authoritative self-knowledge, much less all self-knowledge in general, has these features.

To reiterate, the central point of this early work was simply to demonstrate that there is no incompatibility between our possession of authoritative self-knowledge and anti-individualism. Basic cases of self-knowledge illustrate this. One further way to explain why there is no incompatibility is to note that the conditions that, in accordance with anti-individualism, must be in place for the first-order contents to be thought are necessarily also in place when one self-ascribes such an attitude to oneself (2013, 8).

The second period of Burge’s work on self-knowledge centered on a more complete discussion of the different forms of authoritative self-knowledge, together with a defense of the thesis that a significant part of our warrant for non-basic cases of such self-knowledge derives from its indispensable role in critical reasoning (1996). Critical reasoning is meta-representational reasoning that conceptualizes attitudes and reasons as such. The role of (non-basic) authoritative self-knowledge in critical reasoning is part of our entitlement to relevant self-ascriptions of attitudes in general. This second period thus extended Burge’s account of authoritative self-knowledge to non-cogito instances of self-knowledge. It also began the project of explaining wherein we are entitled to authoritative self-knowledge in instances where the self-ascriptions are not self-verifying. Since cogito cases provide reasons for themselves, as it were, basic cases of self-knowledge involve justification. By contrast, non-basic cases of authoritative self-knowledge are warranted by entitlement rather than justification. (See section 6.)

The third period of Burge’s work on self-knowledge consisted in a full discussion of the nature and foundations of authoritative self-knowledge (2011a). Burge argues that authoritative self-knowledge, including a certain sort of self-understanding, is necessary for our role in making attributions concerning, and being subject to, norms of critical reasoning and morality. A key to authoritative self-knowledge, as stressed by Burge from the beginning of his work on the topic, is the absence of the possibility of brute error. Brute error is an error that is not in any way due to malfunctioning or misuse of a representational competence. In perception, for example, one can be led into error despite the fact that one’s perceptual system is working fully reliably, if, say, light is manipulated in certain ways. By contrast, while error is possible in most cases of authoritative self-knowledge, it is possible only when there is misuse or malfunction. Since misuse and malfunction undermine the epistemic warrant, it can be said that instances of authoritative self-knowledge for Burge are “warrant factive”—warrant entails, in such cases, true self-ascriptions of mental states.

The full, unified account of self-knowledge in Burge (2011a) explains each element in our entitlement to self-knowledge and self-understanding. The account is extended to cover, not only basic cases of self-knowledge, but also knowledge of standing mental states; of perceptual states; and of phenomenal states such as pain. The unified treatment explains why its indispensable role in critical reasoning is not all there is to our entitlement to (non-basic cases of) self-knowledge and self-understanding. Burge’s explanation of the impossibility of brute error with respect to authoritative self-knowledge makes essential use of the notion of “preservational psychological powers”, such as purely preservative memory and betokening understanding. Betokening understanding is understanding of particular instances of propositional representational content. The unification culminates in an argument that shows how immunity to brute error follows from the nature of certain representational competencies, along with the nature of epistemic entitlement (2011a, 213f). In yet later work, Burge explained in detail the relation between authoritative self-knowledge and critical reasoning (2013, 23-24).

9. Memory and Reasoning

Two of Burge’s most important philosophical contributions are his identification and elucidation of the notion of purely preservative memory, on one hand, and his discussion of critical reasoning, particularly its relation to self-knowledge and the first-person concept, on the other.

Burge’s discussion of memory and persons distinguishes three different forms of memory: experiential memory; substantive content memory; and purely preservative memory (2003b, 407-408). Experiential memory is memory of something one did, or that happened to one, from one’s own perspective. Substantive content memory is closer to our ordinary notion of simply recalling a fact, or something that happened, without having experienced it personally. Purely preservative memory, by contrast, simply holds a remembered (or seemingly remembered) content, along with the content’s warrant and the associated attitude or state, in place for later use. When I remember blowing out the candles at my 14th birthday party, this is normally experiential memory. Remembering that the United States tried at least a dozen times to assassinate Fidel Castro, in most cases, is an example of substantive content memory. When one conducts inference over time, by contrast, memory functions simply to hold earlier steps along with their respective warrants in place for later use in the reasoning. This sort of memory is purely preservative. Burge argues that no successful reasoning over time is possible without purely preservative memory. Purely preservative memory also plays an important role in Burge’s earlier account of the epistemology of interlocution (1993a; 2013b); and in his most developed account of the epistemology of self-knowledge and self-understanding (2011a).

In “Memory and Persons” Burge discusses the role of memory in psychological representation as well as the issue of personal identity. He argues that memory is “integral to being a person, indeed to having a representational mind” (2003b, 407). He does this by arguing that three common sorts of mental acts, states, and events—those involving intentional agency, perception, and inference—presuppose the retention of de se representational elements in memory. De se states have two functions. First, they mark an origin of representation. In the case of a perceptual state this might be between an animal’s eyes. Second, they are constitutively associated with an animal’s perspectives, needs, and goals. Thus, a dog might not simply represent in perceptual memory the location of a bone—but instead, the location of its bone. De se markers are also called by Burge “ego-centric indexes” (2003b; 2019).

Intentional agency requires retention in memory of de se representational elements because intention formation and fulfillment frequently take place over time. If someone else executes the sort of action that one intends for oneself, this would not count as fulfillment of the veridicality condition of one’s intention. Marking one’s own fulfillment (or the lack of it) requires retention in memory of one’s own de se representational elements. Another example is perception. It requires the use of perceptual contents. This use always and constitutively involves possession or acquisition of repeatable perceptual abilities. “Such repeatable abilities include a systematic ability to connect, from moment to moment, successive perceptions to one another and to the standpoint from which they represent” (2003b, 415). The activity necessarily involved in perception, too, involves retention of de se contents in purely preservative memory. Inference, finally, requires this same sort of retention for reasons alluded to above. If reliance on a content used earlier in a piece of reasoning is not ego-centrically indexed to the reasoner, then simple reliance on the content cannot epistemically support one’s conclusion. The warrant would have to be re-acquired whenever use was made of a given step in the process of reasoning—making reasoning over time impossible.

It follows from these arguments that attempts to reduce personal identity to memory-involving stretches of consciousness cannot be successful. Locke is commonly read as attempting to carry out such a reduction. Butler pointed out a definitional circularity—memory cannot be used in defining personal identity because genuine memories presuppose such identity. Philosophers such as Derek Parfit and Sydney Shoemaker utilized a notion of “quasi-memory”—a mental state just like memory but which does not presuppose personal identity—in an attempt to explain personal identity in more fundamental terms. Burge’s argumentation shows that this strategy involves an explanatory circularity. Only a creature with a representational mind could have quasi-memories. However, for reasons set out in the previous two paragraphs, having a representational mind requires de se representational elements that themselves presuppose personal identity over time. Hence, quasi-memory presupposes genuine memory, and cannot therefore be used to define or explain it (2003b, sections VI-XI).

As noted in the previous section, critical reasoning is meta-representational reasoning that characterizes propositional attitudes and reasons as such. One of Burge’s most important discussions of critical reasoning explains how fully understanding such reasoning requires use and understanding of the full, first-person singular concept “I” (1998).

Descartes famously inferred his existence from the fact that he was thinking. He believed that this reasoning was immune to serious skeptical challenges. Some philosophers, most notably Lichtenberg, questioned this. They reasoned that while one can know one is thinking, simply by reflecting on the matter, the ontological move from thinking to a thinker seems at best unsupported and at worst dubious. Burge argues, using only premises that Lichtenberg was himself doubtless committed to—such as that it is a worthwhile philosophical project to understand reason and reasoning—that the first-person singular concept is not dispensable in the way that Lichtenberg and others have thought. Among other things, Burge’s argument vindicates Descartes’s reasoning about the cogito: it shows that Descartes’s inference from the cogito to his existence as a thinker is not rationally unsupported, as Lichtenberg and others had suggested.

All reasons that thinkers have are, in Burge’s terminology, “reasons-to”. That is, they are not merely recognitions of (for example) logical entailments among propositions—they enjoin one to change or maintain one’s system of beliefs or actions. This requires not merely recognition of the relevance of a rational review, but also acting upon it. “In other words, fully understanding the concept of reason involves not merely mastering an evaluative system for appraising attitudes … [but also] mastering and conceptualizing the application of reasons in actual reasoning” (1998, 389). Furthermore, reasons must sometimes exert their force immediately. Their implementational relevance, that is to say, is sometimes not subject to further possible rational considerations. Instead, the reasons carry “a rationally immediate incumbency to shape [attitudes] in accordance with the evaluation” of which the reasons are part (1998, 396). Burge argues that full understanding of reasoning in general, and this rational immediacy in particular, requires understanding and employing the full “I”-concept. If correct, this refutes Lichtenberg’s contention that the “I”-concept is only practically necessary; and it supports Descartes’s view that understanding and thought alone are sufficient to establish one’s existence as a thinker. Only by adverting to the “I” concept can we fully explain the immediate rational relevance that reasons sometimes enjoy in a rational activity.

10. Reflection

Burge has also discussed the epistemology of intellection (that is, reason and understanding) and reflection. He argues that classical rationalists maintained three principles concerning reflection. One is that reflection in an individual is always, at least in principle, sufficient to bring to conscious articulation steps or conclusions of the reflection. Another is that reflection is capable of yielding a priori warranted belief and knowledge of objective subject matters. The final classical principle about reflection is that success in reflection requires skillful reasoning and is frequently difficult—it is not a matter simply of attaining immediate understanding or knowledge from a “flash” of insight (2013a, 535-537).

Burge accepts the second and third principles about reflection but rejects the first. He argues that anti-individualism together with advances in psychology show the first principle to be untenable. Anti-individualism shows that “the representational states one is in are less a matter of cognitive control and internal mastery, even ‘implicit’ cognitive control and mastery, than classical views assumed” (2013a, 538). Advances in psychology cast doubt on the first thesis primarily because it seems that many nonhuman animals, as well as human infants, think thoughts (and thus have concepts) despite lacking the ability to reflect on them; and because it has become increasingly clear that much cognition is modular and therefore inaccessible to conscious reflection, even in normal, mature human beings.

Burge has also carried out extensive work on how reflection can (and sometimes, unaided, cannot) “yield fuller understanding of our own concepts and conceptual abilities” (2007d, 165); on the emergence of logical truth and logical consequence as the key notions in understanding logic and deductive reasoning (which discussion includes an argument that fully understanding reasoning commits one ontologically to an infinite number of mathematical entities) (2003a); and on the nature and different forms of incomplete understanding (2012, section III). Finally, a substantial portion of Burge’s other work makes extensive use of a priori reflection—an excellent example being “Memory and Persons” (see section 9).

11. Perception

Burge’s writing on perception is voluminous. Most historically important is Origins of Objectivity (2010). (This book is not centrally about perception, as some commentators have suggested, but about what its title indicates: the conditions necessary and sufficient for objective psychological reference. A much more complete treatment of perception is to be found in the successor volume to Origins, Perception: First Form of Mind (2021).) The first part of the present section deals with Burge’s work on the structure and content of perception. The second part briefly describes his 2020 article on perceptual warrant.

Origins is divided into three parts. Part I provides an introduction, a detailed discussion of terminology, and consideration of the bearing of anti-individualism on the rest of the volume’s contents. Part II is a wide-ranging discussion of conceptions of the resources necessary for empirical reference and representation, covering both the analytic and the continental traditions, and spanning the entire 20th century. Part III develops in some detail Burge’s conception of perceptual representation: including biological and methodological backgrounds; the nature of perception as constitutively associated with perceptual constancies; discussion of some of the most basic perceptual representational categories; and a few “glimpses forward”, one of which is mentioned below.

Part I characterizes a view that Burge calls “Compensatory Individual Representationalism” (CIR). With respect to perception, this is the view that the operation of the perceptual system, even when taken in tandem with ordinary relevant causal relations, is insufficient for objective reference to and representation of the empirical world. The individual perceiver must herself compensate for this insufficiency in some way if objective reference is to be possible. This view is then contrasted with Burge’s own view of the origins of objective reference and representation, which is partly grounded in anti-individualism as well as the sciences of perceptual psychology, developmental psychology, and ethology.

Part II of Origins critically discusses all the major versions of CIR. The discussion is comprehensive, including analyses of several highly influential 20th-century philosophers (and some prominent psychologists) who reflected upon the matter in print. There are two families of CIR. The first family holds that a more primitive level of representation is needed, underlying ordinary empirical representation, without which representation of prosaic entities in the environment is not possible. Bertrand Russell is an example of one who held a first-family version of CIR. Representation of the physical world, on his view, was parasitic upon being acquainted with, and thereby representing, sense data (2010, 119). Second-family forms of CIR did not require a more primitive level of representation. They did require, however, that certain advanced competencies be in place if objective reference and empirical representation are to be possible. Peter Strawson, for example, held that objective representation requires the use of a comprehensive spatial framework, as well as representation of one’s own position in this allocentric space (2010, 160).

Both families of CIR share a negative and a positive claim. The negative claim is that the normal functioning of a perceptual system, together with regular causal relations, is insufficient for objective empirical representation. The positive claim is that such representation requires that an individual in some way herself represents necessary conditions upon objective representation. Burge argues that all versions of CIR are without serious argumentative or empirical support. This includes even versions of CIR that are compatible with anti-individualism. Burge’s detailed discussion of Quine’s version of the syndrome was extracted and published as a separate article (2009b).

The central chapter of Part III of Origins, chapter 9, discusses Burge’s conception of the nature of perceptual representation, including what distinguishes perception from other sensory systems. It argues that perception is paradigmatically attributable to individuals; is sensory and representational; is a form of objectification; and involves perceptual constancies. All perception must occur in the psychology of an individual with perceptual capacities, and in normal cases some individual perceptions must be attributable to the individual (as opposed to its subsystems). Perception is a special sort of sensory system—a system that functions to represent through the sort of objectification that perceptual constancies consist in. Perception is constitutively a representational competence, for Burge. Objectification involves, inter alia, marking an important divide between mere sensory responses, on one hand, and representational capacities that include such responses, but which cannot be explained solely in terms of them, on the other (2010, 396). Finally, perceptual constancies “are capacities to represent environmental attributes, or environmental particulars, as the same, despite radically different proximal stimulations” (2010, 114).

Burge argues that genuine objective perception begins, for human beings, nearly at birth, and is achieved in dozens or hundreds of other animal species, including some arthropods. The final chapter of the book includes the “glimpses forward” mentioned above. It points, perhaps most importantly, toward Burge’s work—thus far unpublished—explaining the origins of propositional thought, including what constitutively distinguishes propositional representation from perceptual and other forms of representation. (Burge has published, in addition to the discussion in Origins of Objectivity, some preparatory work in this direction (2010a).)

The remainder of this section briefly discusses Burge’s 2020 work on perceptual warrant. This lengthy article is divided into five substantial sections. The first consists in a largely or wholly a priori discussion of the nature of epistemic warrant, including discussion of the distinction between justification and entitlement; and the nature of representational and epistemic functions and goods. Two of the most important theses defended in the first section are the following: (i) the thesis that, setting aside certain probabilistic cases and beliefs about the future, epistemic warrant certifies beliefs as knowledge—that is, if a perceptual belief (say) is warranted, true, and does not suffer from Gettier-like problems, then the belief counts as knowledge; and (ii) the thesis that epistemic warrant cannot “block” knowledge. That is to say, whatever epistemic warrant is, it cannot be such that it prevents a relevantly warranted belief from becoming knowledge. Burge uses these theses to argue for the inadequacy of various attempts at describing the nature of epistemic warrant.

The second section uses the a priori connections between warrant, knowledge, and reliability to argue against certain (internalist) conceptions of empirical warrant. The central move in the argument against epistemic internalism about empirical warrant is the thesis that warrant and knowledge require reliability in normal circumstances, but that nothing in perceptual states or beliefs taken in themselves ensures such reliability. Burge argues for the reliability requirement on epistemic warrant by an appeal to the “no-blockage” thesis—any unreliable way of forming beliefs would block those beliefs from counting as knowledge. So the argument against epistemic internalism has two central steps. First, the “no-blockage” thesis shows that reliability, at least in certain circumstances, is required for an epistemic warrant. And second, nothing that is purely “internal” to a perceiver ensures that her perceptual state-types are reliably veridical; or, therefore, that her perceptual belief-types are reliably true. Hence, internalism cannot be a correct conception of perceptual warrant.

The third section discusses differences between refuting skeptical theses, on one hand, and providing a non-question-begging response to a skeptical challenge, on the other. (In section VI of “Perceptual Entitlement” (2003c), for example, Burge explains perceptual warrant but does not purport to answer skepticism.) Burge argues that many epistemologists have conflated these two projects, with the result (inter alia) that the nature of epistemic warrant has been obscured. The fourth section argues that a common line of reasoning concerning “bootstrapping” is misconceived. Some have held that if, as on Burge’s view, empirical warrants do not require justifying reasons, then there is the unwelcome consequence that we can infer inductively from the most mundane pieces of empirical knowledge, or warranted empirical beliefs, that our perceptual belief-forming processes are reliable. Burge argues that it is not the nature of epistemic warrant that yields this unacceptable conclusion but instead a misunderstanding concerning the nature of adequate inductive inference. Finally, the fifth section argues at length against the view that conceptions of warrant like Burge’s imply unintuitive results in Bayesian confirmation theory (2020).

12. History of Philosophy

Finally, Burge has done sustained and systematic work on Frege. The work tends to be resolutely historical in focus. All but two of his articles on Frege are collected in Truth, Thought, Reason (2005). The others are Burge (2012) and (2013c). The latter article contains Burge’s fullest discussion of the relation between philosophy and history of philosophy.

The substantial introduction to Burge (2005) is by far the best overview of Burge’s work on Frege. The introduction contains not only a discussion of Frege’s views and how the collected essays relate to them, but also Burge’s most complete explanation of wherein his own views differ from Frege’s. The first essay provides a valuable, quite brief introduction to Frege and his work (2005a). The remaining essays are divided into three broad categories. The first discusses Frege’s views on truth, representational structure, and Frege’s philosophical methodology. The second category deals with Frege’s views on sense and cognitive value. Included in this category is what Burge regards as his philosophically most important article on Frege (1990). Finally, the third section of Burge’s collection of essays on Frege treats aspects of Frege’s rationalist epistemology. Of the two articles on Frege that do not appear in Burge (2005), one critically discusses an interpretation of Frege’s notion of sense advanced by Kripke; it also provides an extended discussion of the nature of incomplete understanding (2012). The other discusses respects in which Frege has influenced subsequent philosophers and philosophy (2013c).

Burge has also done historical work on Descartes, Leibniz, and Kant. Much of this work remains unpublished, save three articles. One traces the development and use of the notion of apriority through Leibniz, Kant, and Frege (2000). The other two discuss Descartes’s notion of mental representation, especially including evidence for and against the view that Descartes was an anti-individualist about representational states and events (2003d; 2007c).

13. Psychology

Much of Burge’s work on perception is also a contribution to the philosophy of psychology or even to the science of psychology itself (for example, 1991a; 2010; 2014a; 2014b). He was the first to introduce David Marr’s groundbreaking work on perception into philosophical discussion (1986c). Burge has also published a couple of shorter pieces in psychology journals (2007g; 2011b).

In addition, Burge published a long article in Psychological Review (2018) that is not focused on perception. This article criticizes in detail the view, common among psychologists and some philosophers, that infants and nonhuman animals attribute mental states to others. The key to Burge’s argument is recognizing and developing a non-mentalistic and non-behavioristic explanatory scheme that centers on explaining action and action targets, but which does not commit itself to the view that relevant subjects represent psychological subject matters. The availability of this teleological, conative explanatory scheme shows that it does not follow, other things equal, from the fact that some infants and nonhuman animals represent actions and actors that they attribute mental states to these actors.

14. References and Further Reading

a. Primary Literature

i. Books

  • (2005). Truth, Thought, Reason: Essays on Gottlob Frege: Philosophical Essays, Volume 1 (Oxford: Oxford University Press).
  • (2007). Foundations of Mind: Philosophical Essays, Volume 2 (Oxford: Clarendon Press).
  • (2010). Origins of Objectivity (Oxford: Clarendon Press).
  • (2013). Cognition Through Understanding: Self-Knowledge, Interlocution, Reasoning, Reflection: Philosophical Essays, Volume 3 (Oxford: Clarendon Press).
  • (2021). Perception: First Form of Mind (Oxford: Oxford University Press).

ii. Articles

  • (1972). ‘Truth and Mass Terms’, The Journal of Philosophy 69, 263-282.
  • (1973). ‘Reference and Proper Names’, The Journal of Philosophy 70, 425-439.
  • (1974a). ‘Demonstrative Constructions, Reference, and Truth’, The Journal of Philosophy 71, 205-223.
  • (1974b). ‘Truth and Singular Terms’, Noûs 8, 309-325.
  • (1975). ‘On Knowledge and Convention’, The Philosophical Review 84, 249-255.
  • (1977). ‘Belief De Re’, The Journal of Philosophy 74, 338-362. Reprinted in Foundations of Mind.
  • (1979a). ‘Individualism and the Mental’, Midwest Studies in Philosophy 4, 73-121. Reprinted in Foundations of Mind.
  • (1979b). ‘Semantical Paradox’, The Journal of Philosophy 76, 169-198.
  • (1982). ‘Other Bodies’, in A. Woodfield (ed.) Thought and Object (Oxford: Oxford University Press, 1982). Reprinted in Foundations of Mind.
  • (1984). ‘Epistemic Paradox’, The Journal of Philosophy 81, 5-29.
  • (1986a). ‘Intellectual Norms and Foundations of Mind’, The Journal of Philosophy 83, 697-720. Reprinted in Foundations of Mind.
  • (1986b). ‘Cartesian Error and the Objectivity of Perception’, in P. Pettit and J. McDowell (eds.) Subject, Thought, and Context (Oxford: Oxford University Press). Reprinted in Foundations of Mind.
  • (1986c). ‘Individualism and Psychology’, The Philosophical Review 95, 3-45. Reprinted in Foundations of Mind.
  • (1986d). ‘Individualism and Self-Knowledge’, The Journal of Philosophy 85, 649-663. Reprinted in Cognition Through Understanding.
  • (1990). ‘Frege on Sense and Linguistic Meaning’, in D. Bell and N. Cooper (eds.) The Analytic Tradition (Oxford: Blackwell). Reprinted in Truth, Thought, Reason.
  • (1991a). ‘Vision and Intentional Content’, in E. LePore and R. Van Gulick (eds.) John Searle and His Critics (Oxford: Blackwell).
  • (1991b). ‘Frege’, in H. Burkhardt and B. Smith (eds.) Handbook of Ontology and Metaphysics (Munich: Philosophia Verlag). Reprinted in Truth, Thought, Reason.
  • (1992). ‘Philosophy of Language and Mind: 1950-1990’, The Philosophical Review 101, 3-51. The portion on mind is reprinted, in expanded form, in Foundations of Mind.
  • (1993a). ‘Content Preservation’, The Philosophical Review 102, 457-488. Reprinted in Cognition Through Understanding.
  • (1993b). ‘Mind-Body Causation and Explanatory Practice’, in J. Heil and A. Mele (eds.) Mental Causation (Oxford: Oxford University Press, 1993). Reprinted in Foundations of Mind.
  • (1996). ‘Our Entitlement to Self-Knowledge’, Proceedings of the Aristotelian Society 96, 91-116. Reprinted in Cognition Through Understanding.
  • (1997a). ‘Interlocution, Perception, and Memory’, Philosophical Studies 86, 21-47. Reprinted in Cognition Through Understanding.
  • (1997b). ‘Two Kinds of Consciousness’, in N. Block, O. Flanagan, and G. Güzeldere (eds.) The Nature of Consciousness (Cambridge, MA: MIT Press). Reprinted in Foundations of Mind.
  • (1998). ‘Reason and the First Person’, in C. Wright, B. Smith, and C. Macdonald (eds.) Knowing Our Own Minds (Oxford: Clarendon Press). Reprinted in Cognition Through Understanding.
  • (1999). ‘Comprehension and Interpretation’, in L. Hahn (ed.) The Philosophy of Donald Davidson (Chicago, IL: Open Court Press). Reprinted in Cognition Through Understanding.
  • (2000). ‘Frege on Apriority’, in P. Boghossian and C. Peacocke (eds.) New Essays on the A Priori (Oxford: Oxford University Press). Reprinted in Truth, Thought, Reason.
  • (2003a). ‘Logic and Analyticity’, Grazer Philosophische Studien 66, 199-249.
  • (2003b). ‘Memory and Persons’, The Philosophical Review 112, 289-337. Reprinted in Cognition Through Understanding.
  • (2003c). ‘Perceptual Entitlement’, Philosophy and Phenomenological Research 67, 503-548.
  • (2003d). ‘Descartes, Bare Concepts, and Anti-individualism’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
  • (2003e). ‘Mental Agency in Authoritative Self-Knowledge’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
  • (2005a). ‘Frege’, in Truth, Thought, Reason.
  • (2007a). ‘Disjunctivism and Perceptual Psychology’, Philosophical Topics 33, 1-78.
  • (2007b). ‘Predication and Truth’, The Journal of Philosophy 104, 580-608.
  • (2007c). ‘Descartes on Anti-individualism’, in Foundations of Mind.
  • (2007d). ‘Postscript: “Individualism and the mental”’, in Foundations of Mind.
  • (2007e). ‘Reflections on Two Kinds of Consciousness’, in Foundations of Mind.
  • (2007f). ‘Postscript: “Belief De Re”’, in Foundations of Mind.
  • (2007g). ‘Psychology Supports Independence of Phenomenal Consciousness: Commentary on Ned Block’, Behavioral and Brain Sciences, 30, 500-501.
  • (2009a). ‘Five Theses on De Re States and Attitudes’, in J. Almog and P. Leonardi (eds.) The Philosophy of David Kaplan (New York: Oxford University Press).
  • (2009b). ‘Perceptual Objectivity’, The Philosophical Review 118, 285-324.
  • (2010a). ‘Steps toward Origins of Propositional Thought’, Disputatio 4, 39-67.
  • (2010b). ‘Modest Dualism’, in R. Koons and G. Bealer (eds.) The Waning of Materialism (New York: Oxford University Press). Reprinted in Cognition Through Understanding.
  • (2011a). ‘Self and Self-Understanding’: The Dewey Lectures. Presented in 2007. Published in The Journal of Philosophy 108, 287-383. Reprinted in Cognition Through Understanding.
  • (2011b). ‘Border-Crossings: Perceptual and Post-Perceptual Object Representation’, Behavioral and Brain Sciences 34, 125.
  • (2012). ‘Living Wages of Sinn’, The Journal of Philosophy 109, 40-84. Reprinted in Cognition Through Understanding.
  • (2013a). ‘Reflection’, in Cognition Through Understanding.
  • (2013b). ‘Postscript: Content Preservation’, in Cognition Through Understanding.
  • (2013c). ‘Frege: Some Forms of Influence’, in M. Beaney (ed.) The Oxford Handbook of the History of Analytic Philosophy. Oxford: Oxford University Press.
  • (2014a). ‘Adaptation and the Upper Border of Perception: Reply to Block’, Philosophy and Phenomenological Research 89, 573-583.
  • (2014b). ‘Perceptual Content in Light of Perceptual Consciousness and Biological Constraints: Reply to Rescorla and Peacocke’, Philosophy and Phenomenological Research 88, 485-501.
  • (2018). ‘Do Infants and Nonhuman Animals Attribute Mental States?’ Psychological Review 125, 409-434.
  • (2019). ‘Psychological Content and Ego-Centric Indexes’, in A. Pautz and D. Stoljar (eds.) A. Pautz, Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness (Oxford: Oxford University Press).
  • (2020). ‘Entitlement: The Basis for Empirical Warrant’, in N. Pederson and P. Graham (eds.) New Essays on Entitlement (Oxford: Oxford University Press).

b. Secondary Literature

  • Two volumes of essays have been published on Burge’s work: M. Frápolli and E. Romero (eds.) Meaning, Basic Self-Knowledge, and Mind: Essays on Tyler Burge (Stanford, CA: CSLI Publications, 2003); and M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press, 2003). The second volume is nearly unique, among Festschriften, in that Burge’s responses make up nearly half of the book’s 470 pages. Further pieces include the following:
  • An article on Burge in The Oxford Companion to Philosophy, Ted Honderich (ed.) Oxford: Oxford University Press, 1995.
  • An article on Burge, in Danish Philosophical Encyclopedia. Politikens Forlag, 2010.
  • Interview with Burge. Conducted by James Garvey, The Philosophers’ Magazine, 2013—a relatively wide-ranging yet short discussion of Burge’s views.
  • Interview with Burge. Conducted by Carlos Muñoz-Suárez, Europe’s Journal of Psychology, 2014—a discussion focused on anti-individualism and perception.
  • Article on Burge, in the Cambridge Dictionary of Philosophy, Peter Graham, 2015.
  • Article on Burge, in the Routledge Encyclopedia of Philosophy, Mikkel Gerken and Katherine Dunlop, 2018—provides a quick overview of some of Burge’s philosophical contributions.
  • Article on Burge, in Oxford Bibliographies in Philosophy, Brad Majors, 2018—contains brief summaries of most of Burge’s work, together with descriptions of a small portion of the secondary literature.

 

Author Information

Brad Majors
Email: bradmajors9@gmail.com
Baker University
U. S. A.

Persistence in Time

No person ever steps into the same river twice—or so goes the Heraclitean maxim. Obscure as it is, the maxim is often taken to express two ideas. The first is that everything always changes, and nothing remains perfectly similar to how it was just one instant before. The second is that nothing survives this constant flux of change. Where there appears to be a single river, a single person or, more generally, a single thing, there in fact is a series of different instantaneous objects succeeding one another. No person ever steps into the same river twice, for it is not the same river, and not the same person.

Is the Heraclitean maxim correct? Is it true that nothing survives change, and that nothing persists through time? These ancient questions are still at the center of contemporary metaphysics. This article surveys the main contemporary theories of persistence through time, such as three-dimensionalism, four-dimensionalism and the stage view (§ 1), and reviews the main objections proposed against them (§ 2, 3, 4).

Theories of persistence are an integral part of the more general field of the metaphysics of time. Familiarity with other debates in the metaphysics of time, universals, and mereology is here presupposed and can be acquired by studying the articles ‘Time’, ‘Universals’, ‘Properties’, and ‘Material Constitution’ in this encyclopedia.

Table of Contents

  1. Theories of Persistence
    1. The Basics
    2. Locative Theories of Persistence
    3. Non-Locative Theories of Persistence
    4. What is a Temporal Part?
    5. Theories of Persistence and Theories of Time
    6. The Persistence of Events
  2. Arguments against Endurantism
    1. The Argument from Change, a.k.a. from Temporary Intrinsics
    2. The Argument from Coincidence
    3. The Argument from Vagueness
    4. The Unintelligibility Objection
    5. Arguments against Specific Versions of Endurantism
  3. Arguments against Perdurantism
    1. The Argument from Intuition
    2. The No-Change Objection
    3. The Crazy Metaphysic Objection
    4. The Objection from Ontological Commitment
    5. The Category Mistake Argument
    6. The Unintelligibility Objection
    7. The Objection from Counting
  4. Arguments against Stage View
    1. The Argument from Intuition
    2. The No-Change Objection
    3. The Crazy Metaphysic Objection
    4. The Objection from Ontological Commitment
    5. The Objection from Temporal Gunk
    6. The Objection from Mental Events
    7. The Objection from Counting
  5. What Is Not Covered in this Article
  6. References and Further Reading

1. Theories of Persistence

This section presents contemporary theories of persistence, from their most basic (§ 1a) to their most advanced forms (§ 1b and § 1c). It then discusses some ways of making sense of temporal parts (§ 1d), the relation between theories of persistence and theories of time (§ 1e), and the topic of the persistence of events (§ 1f).

a. The Basics

While the Heraclitean maxim denies that anything survives change and persists through time, we normally assume that some things do survive change and do persist through time. This bottle of sparkling water, for example, was here five minutes ago, and still is, despite its being now half empty. This notepad, for another example, will still exist tonight, even though I will have torn out some of its pages. In other words, we normally assume some things to persist through time. But before asking whether this assumption is right or wrong, we should ask: what is it for something to persist? Here is an influential definition, first introduced by David Lewis (1986, 202):

Persistence  Something persists through time if and only if it exists at various times.

So, the bottle persists through time, if it does at all, because it exists at various times—such as now as well as five minutes ago, and the notepad persists through time because it exists at various times—such as now as well as later tonight.

Lewis’ definition makes use of the notion of existence at a time. The notion is technical, but its intended meaning should be clear enough. The following intuitive gloss might help clarify it. Something exists at, and only at, those times at which it is, in some sense, present, or to be found. So, Socrates existed in 400 B.C.E. but not in 1905, while I exist in 2019, at all instants that make up 2019, but at no time before the date of my birth (on temporal existence: Sider 2001: 58-59).

Persistence through time is sometimes also called ‘diachronic identity’—literally, ‘identity across time’. The reason for this name is simple enough. If this notepad exists now and will also exist later, then there is a sense in which the notepad that exists now and the notepad that will exist later are one and the same. In which sense are they identical? What kind of identity is involved here?

It is useful to introduce here a fundamental distinction between numerical and qualitative identity. On the one hand, numerical identity is the binary relation that anything bears to itself, and to itself alone (Noonan and Curtis 2018). For example, I, like everything else, am numerically identical to myself and to nothing else. Superman, for another example, is numerically identical to Clark Kent, and Augustus is numerically identical to the first Roman emperor. This relation is called ‘numerical identity’ because it is related in an important way to the number of entities that exist. If Superman is numerically identical to Clark Kent, then they are one entity, and not two. And if Superman is numerically different from Batman, then they are two entities, and not one. On the other hand, qualitative identity is nothing other than perfect similarity (Noonan and Curtis 2018). If two water molecules could have exactly the same mass, electrical charge, spatial configuration, and so on, so as to be perfectly similar, then they would be qualitatively identical. (It is controversial whether two entities can ever be perfectly similar—more on this later. Still, it is not difficult to find cases of perfect similarity. For example, an entity at a time is perfectly similar to itself at the same time.)

Having distinguished qualitative and numerical identity, what is, again, the sense of identity that is involved in diachronic identity? It is numerical identity. For recall: the question was whether, say, a river is a single—thus one—entity existing at different times, or rather a series of—thus many—instantaneous entities existing one after another.

Here is a second outstanding question that concerns persistence. Suppose that the Heraclitean maxim is wrong, and things persist through time. Do all things that persist through time persist in the same way? Or are there different ways of persisting through time? The consensus is that there are in fact several ways of persisting through time. In order to appreciate this fact, it is useful to contrast two kinds of entities that are supposed to persist, in one sense or another, through time: events and material objects. On the one hand, consider events. An event is here taken to be anything that is said to occur, happen, or take place (Cresswell 1986, Hacker 1982). Examples include a football match, a war, the spinning of a sphere, the collision of two electrons, the life of a person. Changes, processes, and prolonged states, if any, are notable examples of events. On the other hand, a material object can be thought of as the subject of those events, such as the football players, the soldiers, the sphere, the electrons and the person who lives. (For more on events see: What is an Event?)

Both material objects and events, or at least some of them, seem to persist through time. We have already discussed some examples involving objects, and it is equally easy to find examples of persisting events—basically, any temporally extended event would do. However, even if both objects and events seem to persist through time, they seem to do that in two different ways. An event persists through time by having different parts at different times. For example, a football match has two halves. These halves are parts of the match. But clearly enough they are not spatial parts of the match: they are not spread across different places, but across different times. That is why such parts are called ‘temporal parts’. The way of persisting of an event, by having different temporal parts at different times, is called ‘perdurance’ (Lewis 1986: 202).

Perdurance  Something perdures if and only if it persists by having different temporal parts at different times.

Throughout this article, ‘part’ means ‘proper part’, unless otherwise specified.

On the other hand, an object seems to persist in a different way. If an object persists through time, what is present of an object at different times is not a part of it, but rather the object itself, in its wholeness or entirety. This way of persisting, whereby something persists by being wholly present at different times, is called ‘endurance’ (Lewis 1986: 202). (‘Wholly present’ here clearly contrasts with the ‘partial’ presence of an event at different times—more on this later.)          

Endurance  Something endures if and only if it persists by being wholly present at different times.

That being said, the contemporary debate on persistence focuses on material objects. In which way do they persist, if at all? A first theory, which takes the intuitions presented so far at face value, says that objects do indeed persist by being wholly present at different times, and so endure. (Endurantists include Baker (1997, 2000); Burke (1992, 1994); Chisholm (1976); Doepke (1982); Gallois (1998); Geach (1972a); Haslanger (1989); Hinchliff (1996); Johnston (1987); Lombard (1994); Lowe (1987, 1988, 1995); Mellor (1981, 1998); Merricks (1994, 1995); Oderberg (1993); Rea (1995, 1997, 1998); Simons (1987); Thomson (1983, 1998); van Inwagen (1981, 1990a, 1990b); Wiggins (1968, 1980); Zimmerman (1996).)               

Endurantism Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.

Endurantism is usually taken to be closer to common sense and favored by our intuitions. However, as we see later, endurantism does not come without problems. Due to those problems, and inspired by the spatiotemporal worldview suggested by modern physics, contemporary philosophers have also taken seriously the idea that objects are four-dimensional entities spread out in both space and time, entities which divide into parts just as their spatiotemporal location does, and which thus persist through time by having different temporal parts at different times, just as events do. This view is called perdurantism. (Perdurantists include Armstrong (1980); Balashov (2000); Broad (1923); Carnap (1967); Goodman (1951); Hawley (1999); Heller (1984, 1990); Le Poidevin (1991); Lewis (1986, 1988); McTaggart (1921, 1927); Quine (1953, 1960, 1970, 1981); Russell (1914, 1927); Smart (1963, 1972); Whitehead (1920).)

Perdurantism Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.

Perdurantism is also known as ‘four-dimensionalism’, for perdurantism has it that objects are extended in four dimensions. This contrasts with endurantism, according to which objects are extended at most in the three spatial dimensions; endurantism is hence also called ‘three-dimensionalism’.

Under perdurantism, what exists of me at each moment of my persistence is, strictly speaking, a temporal part of me. And each of my temporal parts is numerically different from all others.

One might be tempted to think that, as a consequence, perdurantism denies that I persist through time. This would be a mistake. While my instantaneous temporal parts do not persist—they exist at one time only—I am not any of those parts. I, as a whole person, am the temporally extended collection, or mereological sum, of all those parts. Hence, I, as a whole person, exist at different times, and thus persist. Compare this with the spatial case. I occupy an extended region of space by having different spatial parts at different places. But I am not numerically identical to those parts. I, as a whole, exist at different places in the sense that in those different places there is a part of me. That is why perdurance implies persistence through time.

We started this article with the question of whether objects persist through time. We have so far presented two theories, and both of them affirm that objects do persist through time. It is now time to introduce a third theory of persistence, the one that consists in the denial of this claim, and that has it that, in place of seemingly persisting objects, there really is a series of instantaneous stages. This theory is called the ‘stage view’, or also ‘exdurantism’. (Stage viewers include Hawley (2001), Sider (1996, 2001), Varzi (2003).)

Stage view   Ordinary material objects do not persist through time; in place of a single persisting object there really is a series of instantaneous stages, each numerically different from the others.

The stage view is often confused with perdurantism. The reason is that many contemporary stage viewers believe in a mereological doctrine called ‘universalism’, also known as ‘unrestricted fusion’. According to mereological universalism, given a series of entities, no matter how scattered and unrelated, there is an object composed of those entities (see Compositional Universalism). If we combine the stage view with universalism, we get an ontology in which the stages compose four-dimensional objects which are just like the four-dimensional objects of the perdurantist.

However, the two views are clearly distinct. Here are a few crucial differences. (i) There is, first, a semantic difference: under perdurantism, singular terms referring to ordinary objects, such as ‘Socrates’, usually refer to persisting, four-dimensional objects, whereas under the stage view, such terms refer to one instantaneous stage (which particular stage is referred to is determined by the context). So, while under the stage view there might be four-dimensional objects, so-called ordinary objects (such as Socrates) are not identified with them, but rather with the stages (Sider 2001, Varzi 2003). (It should be pointed out that significant work is here done by the somewhat elusive notion of ‘ordinary object’; see Brewer and Cumpa 2019.) (ii) A second crucial difference has to do with the metaphysical commitment to four-dimensional entities. While perdurantism is by definition committed to four-dimensional entities, the stage view is by definition committed only to the existence of instantaneous stages. If the stage viewer also comes to believe in four-dimensional collections of those stages—and she might well not—such a commitment is not an essential part of her theory of persistence. (iii) A third interesting difference has to do with the metaphysical commitment to the instantaneous stages. While this commitment is built into the stage view, it is not built into four-dimensionalism (Varzi 2003). A four-dimensionalist might believe her temporal parts to be always temporally extended and deny the existence of instantaneous temporal parts (for example, because she believes that time is gunky). Incidentally, it is worth noting that, from a historical point of view, the guiding intuition of the stage view—namely, that objects do not persist through time or change—emerged much earlier than the guiding intuition of four-dimensionalism. While the former can be traced back to, if not Heraclitus, at least the academic skeptics (Sedley 1982), the latter, as far as we know, emerged no earlier than the end of the nineteenth century (Sider 2001).

b. Locative Theories of Persistence

Here are, again, the definitions of endurantism and perdurantism that we introduced above:           

Endurantism Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.
Perdurantism Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.

One can appreciate that these definitions mix together two aspects of persisting objects (Gilmore 2008). First, there is a mereological aspect: the question of whether persisting objects have temporal parts or not. Second, there is an aspect that concerns the shape and size of persisting objects: the question of whether persisting objects have a four-dimensional shape, and are temporally extended, or a three-dimensional shape, and are not extended in time. How can we make sense of these two aspects? What is it for something to be three- or four-dimensional? And how can we make sense of what a temporal part really is? While the latter question is tackled in § 1d, we shall now focus on the former question, concerning shape and extension.

So, what is it for something to be three- or four-dimensional? An illuminating approach to this question—an approach that everyone who wants to work on persistence must be familiar with—comes from location theory (Casati and Varzi 1999, Parsons 2007). We shall thus focus on location first, and then come back to persistence.

Location is here taken to be a binary relation between an entity and a region of a dimension—be it space, time, or spacetime—where the entity is in some sense to be found (Casati and Varzi 1999). ‘Location’, so understood, is ambiguous. There is a weak sense, in which you are located at any region that is not completely free of you. In that sense, for example, reaching an arm inside a room would be enough to be weakly located in that room. But there is also a more exact sense, in which you are located at the region of space that has your shape and size, and that is exactly as distant from everything else as you are—roughly, the region that is determined by your boundaries (Gilmore 2006, Parsons 2007). We shall here follow standard practice and call these modes of location ‘weak location’ and ‘exact location’, respectively.

The intuitive gloss on exact location suggests that it is interestingly linked to shape, and thus offers us a way of making more precise sense of what it is for something to be three- or four-dimensional. To be four-dimensional simply is to be exactly located at a four-dimensional spacetime region, while to be three-dimensional is to be exactly located only at spacetime regions that are at most three-dimensional. The same gloss helps us make sense of what it is for something to be extended or unextended in time. To be extended in time is to be exactly located at a temporally extended spacetime region, while to be temporally unextended is to be exactly located at temporally unextended spacetime regions only (Gilmore 2006).

At this point, it might be useful to sum up the two aspects mixed together in the definitions of endurantism and perdurantism offered above. We should distinguish: (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only.          

Mereological endurantism Ordinary persisting objects do not have temporal parts.
Mereological perdurantism Ordinary persisting objects have temporal parts.
Locative three-dimensionalism Ordinary persisting objects are exactly located at temporally unextended regions only.
Locative four-dimensionalism Ordinary persisting objects are exactly located at the temporally extended region of their persistence only.

Let us explore locative three-dimensionalism further. In particular, we explore here two consequences of the view. First, locative three-dimensionalism has it that objects persist, thus covering a temporally extended region. But they persist by being exactly located at temporally unextended regions. This requires the persisting object to be located at more than one unextended region; more precisely, at all those unextended regions that collectively make up the spacetime region covered during its persistence. Hence, locative three-dimensionalism implies multi-location, that is, the claim that a single entity has more than one exact location (Gilmore 2007). This contrasts with the unique, four-dimensional, spatiotemporal location of an object under locative four-dimensionalism. (Two remarks are in order. First, there is logical space for other locative views as well, but we shall not consider them here. Second, these definitions make use of the notion of persistence, which can now be defined in locative terms as well. Here is a simple way of doing this. Let us define the path of an entity as the mereological sum of its exact locations (Gilmore 2006). An entity persists if its path is temporally extended.)
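
For ease of reference, the locative definitions tucked into this last parenthetical remark can be set out in the display style used above. The labels are ours, but the content follows Gilmore (2006, 2007):

Path   The path of an entity is the mereological sum of its exact locations.
Locative persistence   An entity persists if and only if its path is temporally extended.
Multi-location   An entity is multi-located if and only if it has more than one exact location.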

A second interesting consequence of the view is that, under plausible assumptions, persisting objects will not have temporal parts, for what exists of an entity at a time is the entity itself, exactly located at that time, and not a temporal part thereof. So, under plausible assumptions, locative three-dimensionalism implies mereological endurantism: if something is three-dimensional it does not have temporal parts.

Interestingly, however, being multi-located at instants is not the only way to persist without temporal parts. In principle, something might be exactly located at a four-dimensional, temporally extended spacetime region without dividing into temporal parts. This is the case if the persisting, four-dimensional object is also an extended simple, that is, an entity that is exactly located at an extended region but is mereologically simple, in that it lacks any parts (for more on the definition, possibility, and actuality of extended simples, see Hudson 2006, Markosian 1998, McDaniel 2003, 2007a, 2007b, Simons 2004). Lacking any parts at all, the persisting object will also lack temporal parts, and will thus be mereologically enduring. We shall call this combination of mereological endurantism and locative four-dimensionalism ‘simplism’ (Costa 2017, Parsons 2000, 2007).

Simplism Ordinary persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence only.

To sum up, making use of some conceptual tools borrowed from location theory allowed us to make better sense of perdurantism and its claim that persisting objects are four-dimensional, temporally extended entities. Moreover, it allowed us to distinguish two forms of endurantism, namely locative three-dimensionalism according to which persisting objects are exactly located at instantaneous, three-dimensional regions of spacetime, and thus lack temporal parts, and simplism, according to which persisting objects are four-dimensional, temporally extended, mereological simples, and thus lack temporal parts.

c. Non-Locative Theories of Persistence

The previous section described two radically different ways of capturing endurantism. Interestingly enough, both of them seem to be committed to controversial claims, such as the actuality of multi-location or of extended simples. Of course, any objection against the actuality of multi-location or of extended simples counts de facto as an objection against the corresponding form of endurantism. We cover some of these objections below. For the time being, suffice it to say that both forms of endurantism are controversial.

Some scholars have taken this result as evidence that endurantism is hopeless (Hofweber and Velleman 2011). But others have taken it as a reason to look for other ways of making sense of endurantism (Fine 2006, Hawthorne 2008, Hofweber and Velleman 2011, Costa 2017, Simons 2000a). So far, we have worked under the standard assumption that it is useful and correct to try to make sense of endurantism in locative terms, that is, under the assumption that the relation between objects and times is the one described in location theory. Some scholars take this assumption to be fundamentally misguided.

Why do they believe this assumption to be fundamentally misguided? One reason might come from intuitions embedded in natural language. Fine (2006), for instance, provides linguistic data in support of the idea that objects and events are in time in fundamentally different ways, which he calls ‘existence’ and ‘extension/location’, respectively (he also offers linguistic data in support of the idea that objects and events are in space in the same way in which events are in time). Moreover, he suggests that the two radically different forms of presence might come with different mereological requirements: if something is extended/located at a region, it divides into parts throughout that region, while if something merely exists at an extended region, it need not so divide. Since objects are taken to exist at times rather than being extended/located at times, they will not divide into temporal parts.

Another source of evidence from natural language comes from the attribution of temporal relations (van Fraassen 1970). The intuitive gloss on exact location required any temporally located entity to stand in temporal relations. However, it is awkward to attribute temporal relations to objects (consider ‘Alexander is 15 years after Socrates’), and we would naturally lean towards reinterpreting such attributions as attributions of temporal relations to events (‘Alexander’s birth is 15 years after Socrates’ death’). These linguistic data might suggest two intuitions. The first is that the relation between objects and times is not the location of location theory. The second is that the way in which objects are in time is derivative with respect to their events: for an object to exist at a time is for it to be the subject of an event located at that time. Under such a view, the possibility of endurantism coincides with the possibility of a single object participating in numerically different events (Costa 2017, Simons 2000a).

A different non-locative approach consists in trying to make sense of the endurantism/perdurantism distinction in terms of what is intrinsic to a time (Hawthorne 2006, Hofweber and Velleman 2011). According to this approach, something is wholly present at a time if it is intrinsic to how things are that that very object exists at it (Hawthorne 2006) or if the identity of that object is intrinsic to that time (Hofweber and Velleman 2011). These definitions of wholly present are then plugged into the classic definition of endurance: something endures if it is wholly present at each time of its persistence.

Apart from being grounded in natural language and intuition, such views have been motivated by the controversial commitments of their alternatives. Since both locative forms of endurantism are controversial, these non-locative views should be taken seriously.

d. What is a Temporal Part?

A notion that plays a fundamental role in the definition of perdurantism is the notion of a temporal part. Endurantists have sometimes complained that the notion is substantially unintelligible (van Inwagen 1981, Lowe 1987, Simons 1987). Hence, it is in the interest of perdurantists to try to clarify it (as it is in the interest of those endurantists who believe that events perdure).

What is a temporal part, such as my present temporal part, supposed to be? First of all, it should be clear that a temporal part is not simply a part that is in time. A spatial part of me, such as my left hand, is certainly not outside time, but it is not a temporal part of mine. It is not, because it is not, in a sense, big enough: a temporal part of mine at a given time must be as big as I am at that time. So, one might be tempted to define a temporal part as a part that is as big as the whole is at the time at which the part is supposed to exist. Moreover, the notion of ‘being as big as’ might be spelled out in terms of spatial location. However, this definition would not do if there are perduring entities that are not in space (such as, for example, a Cartesian mind, or a mental state conceived of as a non-spatial event) or if there are parts of objects that are as big as the object is at a time without being temporal parts of it, such as, for example, the shape trope of my body, conceived of as something spatially located and as a part of me (Sider 2001). (For tropes and for located properties, see: The Ontological Basis of Properties.)

Sider (2001) offers a standard definition of a temporal part. It reads:

Temporal part x is a temporal part of y at t if (i) x is a part of y at t; (ii) x exists at, and only at, t; and (iii) x overlaps at t everything that is a part of y at t.

Let us have a look at each clause in turn. The first simply says that temporal parts must be parts. The second ensures that the temporal part exists at the relevant time only. The third ensures that it includes all of y that exists at that time. (The reader might have noticed that Sider is here using the temporary, three-place notion of parthood—x is part of y at t—instead of the familiar, binary, timeless notion—x is part of y. Here, by ‘timeless’ we simply mean that the notion is not relativized to a time, and not that what exemplifies the notion is in any sense timeless, or outside time. The use of the temporary notion is conceived as a friendly gesture towards the endurantist, who usually relativizes the exemplification of properties to times—more on this in § 2a. However, temporal parts might be defined by means of the binary, timeless notion as well. One just needs to replace every instance of the temporary notion in the previous definition with the binary one, and to replace the third clause with (iii*): x overlaps every part of y that exists at t. A second note concerns the fact that Sider’s definition is supposed to work for instantaneous temporal parts. A crucial question then is how, and whether, this definition could be adapted to a metaphysics in which time is gunky (see Kleinschmidt 2017).)
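
For readers who prefer symbols, the timeless variant just described can be written out as follows. The notation is introduced purely for illustration and is not Sider’s own: ‘P(x, y)’ stands for timeless parthood, ‘E(x, t)’ for existence at t, and ‘O(x, y)’ for overlap, that is, the sharing of at least one part.

Temporal part (symbolic)   x is a temporal part of y at t ↔ P(x, y) ∧ E(x, t) ∧ ∀t′ (E(x, t′) → t′ = t) ∧ ∀z ((P(z, y) ∧ E(z, t)) → O(x, z))

The first conjunct corresponds to clause (i), the second and third spell out the two halves of clause (ii), and the fourth is clause (iii*).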

e. Theories of Persistence and Theories of Time

One of the central debates of contemporary metaphysics is the debate as to whether only the present exists, or rather past, present and future all equally exist (Sider 2001). The former view is called ‘presentism’, whereas the latter is called ‘eternalism’ (for more on presentism and eternalism, as well as further alternatives, see: Presentism, the Growing-Past, Eternalism, and the Block-Universe). What are the logical relations between endurantism/perdurantism and presentism/eternalism?

While the combination of endurantism with presentism, and of perdurantism with eternalism, has usually been accepted as possible (for example, Tallant 2018), for a long time it was supposed that endurantism and eternalism were incompatible with each other. The reasons for this supposed incompatibility are difficult to track down. Briefly, here are two possible reasons. In part, the supposed incompatibility has to do with the so-called problem of temporary intrinsics. In part, it has to do with the idea that eternalism, when combined with spacetime unitism, yields a view in which persisting objects cover a four-dimensional region of spacetime, and thus are four-dimensional and divide into temporal parts (Quine 1960, Russell 1927). Such reasons are now usually discarded. We focus on temporary intrinsics later, in § 2a. And we have already explained that there are at least two ways in which an object might cover a four-dimensional region of spacetime: by being four-dimensional while lacking temporal parts (simplism), or without even being four-dimensional itself (locative three-dimensionalism). Apart from these locative options, we have also remarked that there are non-locative theories of persistence, and that such theories require the rejection of spacetime unitism. If unitism is successfully rejected, then the problem, if there is one at all, does not seem to present itself in the first place.

Can one be a perdurantist and also a presentist? A few publications have been devoted to this question, though no conclusive answer has been reached (Benovsky 2009, Brogaard 2000, Lombard 1999, Merricks 1995). On the one hand, one might believe that nothing can be composed of temporal parts if all but one of those parts (namely, the past and future ones) do not exist. On the other hand, it has been suggested that one might solve the problem by an accurate use of tense operators: while past temporal parts are not presently part of our ontological catalogue, they were, and maybe their past existence is enough to entitle them to be parts of a perduring whole.

f. The Persistence of Events

Although contemporary metaphysicians focus mainly on the persistence of objects, there are also parallel debates concerning the persistence of other kinds of entities, such as tropes, facts, dimensions and, in particular, events (Galton 2006, Stout 2016). Events are traditionally taken to perdure, for it is intuitively the case that events—such as a football match—divide into temporal parts, such as the match’s two halves. This claim is also accepted by several endurantists, who believe that while objects endure, events perdure. Such a view traces back at least to medieval scholasticism (Costa 2017a). But, once again, the traditional view does not come without dissenters. Contemporary scholars have defended the idea that events or, more precisely, processes endure (Galton 2006, Galton and Mizoguchi 2009, Stout 2016). One reason to believe that at least some entities that are said to be happening endure comes from the fact that we attribute change to them, and that, allegedly, genuine change requires the endurance of its subject (Galton and Mizoguchi 2009: 78-81). For example, the very same process of walking might have different speeds at different times. But for change to occur, the numerically same subject, and not temporal parts thereof, must have incompatible properties at different times (heterogeneity of parts is not enough for change to occur). Hence, changing processes must endure. Defenders of enduring processes usually tend to believe that alongside enduring processes there are also perduring events, and sometimes claim that enduring processes are picked out by descriptions that use imperfective verbs (such as the walking that is/was/will be happening) while perduring events are picked out by descriptions that use perfective verbs (such as the walking that happened/will happen) (Stout 1997: 19). To learn more about the question of whether change requires the endurance of its subject, see the No-Change objection against perdurantism, discussed below in § 3b. To learn more about the alleged distinction between processes and events and the related use of (im)perfective verbs, see Steward 2013 and Stout 1997.

2. Arguments against Endurantism

Endurantism has it that objects persist by being wholly present at each instant of their persistence. Thus conceived, objects persist without having temporal parts. Endurantism is usually recognized as the theory of persistence that is closest to common sense and intuition, and thus has sometimes been described as the default view, that is, the view to be held until or unless it is convincingly shown to be hopelessly problematic. So, is endurantism hopelessly problematic?

a. The Argument from Change, a.k.a. from Temporary Intrinsics

A first serious objection against endurantism, one which traces back to ancient philosophy (Sedley 1982), comes from change. In its simplest form, the objection runs as follows. Change seems to require difference: if something has changed, it is different from how it was. But if it is different, it cannot be identical, on pain of contradiction. Now, endurantism requires a changing thing to be identical across change; hence, the objection goes, endurantism is false. In this simple form, the objection has a simple answer, one that relies on the distinction between qualitative and numerical identity outlined in § 1a. The kind of difference required by change is qualitative difference (not being perfectly similar), and not numerical difference (being two instead of one). Hence, in a change, you might be the same as before (numerical identity) as well as different from before (qualitative difference) without any contradiction.

This basic argument from change can evolve into two slightly more sophisticated forms. The first aims to show that even if change is analyzed as numerical identity plus qualitative difference, it still results in a contradiction. For change requires a single object—Socrates, say—to have incompatible properties, such as being healthy and being sick. But, of course, the exemplification of incompatible properties leads to a contradiction. For whoever is sick is not healthy, and hence the numerically same Socrates must be both healthy and not healthy (Sider 2001).

The second slightly more sophisticated form aims to show that change is incompatible with Leibniz’s law, also called the Indiscernibility of Identicals. Leibniz’s law says that numerically identical entities must share all properties. But change thus described is incompatible with Leibniz’s law, for it requires the numerically same entity—such as Socrates at one time and Socrates at another time—not to share all properties: while Socrates at one time is sick, at a later time he is not (Merricks 1994, Sider 2001).

One way to block these two more sophisticated forms consists in rejecting the guiding principles they rely on. But while this might perhaps be done with a relatively light heart in the case of Leibniz’s law, rejecting the Law of Non-Contradiction, though not impossible (see Paraconsistent Logic), is certainly not an obviously promising move.

A second way to block these two more sophisticated forms consists in bringing time into the picture. A veritable contradiction, and a veritable violation of Leibniz’s law, would result only from the possession of incompatible properties at the same time. But the incompatible properties involved in a change are had at the two ends of the change, and hence at two different times.

While this move certainly sounds promising, it is not obvious how time really comes into the picture. Here are two outstanding questions. The first has to do with the Law of Non-Contradiction and Leibniz’s law. When we first introduced them, we did not mention time at all. And in contemporary logic and metaphysics, the two laws are expressed by formulas in which time seems to play no role:

Law of Non-Contradiction (LNC)  ¬(p ∧ ¬p)
Leibniz’s law (LL)  x = y → ∀P (Px ↔ Py)

Do such principles require a modification in light of the claim that incompatible properties are had at different times?

The second outstanding question has to do with the claim that a changing object has incompatible properties at different times. This seems to require objects to exemplify properties at times. But how is this temporary, or temporally relative, notion of exemplification to be understood (for example, Socrates is sick at time t), especially as opposed to the timeless notion of exemplification (for example, Socrates is sick) (Lewis 1986)? (Once again, here, by “timeless” we simply mean that the notion is not relativized to a time, and not that what exemplifies the notion is in any sense timeless, or outside time.)

Let us begin with the latter question. What is it for an object to have a property at a time—what is it, say, for Socrates to be sick at time t? To have a look at the other side of the barricade first, perdurantism and the stage view seem to have very simple answers to this question. Under the stage view, temporary exemplification is to be analyzed as timeless exemplification by an instantaneous stage: Socrates is sick at t if and only if the instantaneous stage we call Socrates that exists at t is sick (Hawley 2001, Sider 1996, Varzi 2003). Under perdurantism, temporary exemplification is to be analyzed as timeless exemplification by a temporal part: Socrates is sick at t if and only if the temporal part of Socrates that exists at t is sick (Lewis 1986: 203-204, Russell 1914, Sider 2001: 56). So, under perdurantism and the stage view, temporary exemplification is analyzed as timeless exemplification, and therefore there is no need to adapt LNC or LL in any way: the original timeless readings will do.
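
Schematically, and in a regimentation of our own rather than the authors’, the two analyses can be displayed as follows, where F is any temporary property and the right-hand occurrence of ‘F’ expresses timeless exemplification:

Stage view   a is F at t ↔ F(the a-stage that exists at t)
Perdurantism   a is F at t ↔ F(the temporal part of a that exists at t)

Since the stage (or temporal part) existing at t and the one existing at a distinct time t′ are numerically different objects, being F and being non-F are never timelessly exemplified by one and the same thing, and the timeless formulations of LNC and LL apply without modification.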

How would an endurantist make sense of temporary exemplification—of, say, Socrates being sick at time t? We shall here consider a few options. First, notice that if presentism is true, the endurantist too might analyze it in terms of timeless exemplification (Merricks 1995). If t were present, then ‘Socrates is sick at t’ would simply reduce to ‘Socrates is sick’, full stop. If t were past or future, then ‘Socrates is healthy at t’ would reduce to ‘Socrates is healthy’ under the scope of an appropriate tense operator, such as: ‘it was the case 5 years ago that: Socrates is healthy’ (for tense operators, see: The Syntax of Tempo-Modal Logic). Moreover, since we cannot infer from ‘it was the case 5 years ago that: Socrates is healthy’ that ‘Socrates is healthy’, no contradiction or violation of LL follows. However, this solution requires the endurantist to buy presentism.

Second, an endurantist might interpret ‘Socrates is sick at t’ as involving a binary relation—the relation of being sick at—linking Socrates and time t (van Inwagen 1990a, Mellor 1981). This solution does not require us to make any change to the timeless formulations of LNC and LL (it just follows that the relevant instances of LNC and LL will involve relations rather than properties). And, of course, no violation of LNC or LL would follow, insofar as Socrates’ being sick and being healthy would be two incompatible relations borne to different relata (compare: no contradiction follows from the fact that I love Sam and I do not love Maria). However, this requires a certain amount of metaphysical revisionism. To put it in Lewis’ words, if we know what health is, we know that it is a monadic property and not a relation, and that it is intrinsic and not extrinsic (Lewis 1986: 204) (for intrinsic properties, see: Intrinsic and Extrinsic Properties).

Third, an endurantist might interpret “at t” as an adverbial modifier: when Socrates is sick at t, he exemplifies the property in a certain way, namely t-ly (Johnston 1987, Haslanger 1989, Lowe 1988). If this view of temporary exemplification is accepted, we should also consider more carefully how the original formulations of LNC and LL should be adapted, for the exemplification they involve is temporally unmodified. The task might be more complicated than one might expect (Hawley 2001, 21f). In any case, under certain assumptions, this adverbialist solution makes it the case that change implies no violation of LNC or LL: Socrates is sick and healthy, but in two different ways—t-ly and t’-ly (compare: the fact that I am actually sitting and possibly standing does not imply a contradiction). But, once again, this involves a certain amount of revisionism. For while adverbial modifiers correspond to different ways of exemplifying an attribute, temporal modifiers seem not to correspond to different ways of exemplifying an attribute: for example, standing on Monday and standing on Tuesday seem not to be two different ways of standing.
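
By way of summary, the three endurantist readings of ‘Socrates is sick at t’ just canvassed can be regimented as follows; once again, the symbols are ours, introduced only for comparison:

Presentist reading   WAS-t: Sick(Socrates)   (a tensed claim embedded under a tense operator)
Relational reading   Sick-at(Socrates, t)   (a binary relation linking Socrates and the time t)
Adverbial reading   Sick-t-ly(Socrates)   (sickness exemplified in the t-ish way)

On each reading, ‘Socrates is sick at t’ and ‘Socrates is healthy at t′’ fail to form a pair of claims of the form p and not-p, so no violation of LNC or LL straightforwardly follows.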

There are other strategies that the endurantist might use to make sense of temporary exemplification. This is not the place to go through all of them. However, it is worth noting that even if all of them require a bit of revisionism, the endurantist might argue that the kind of revisionism they involve is less nefarious than the revisionism required to reject endurantism itself (Sider 2001, 98).

b. The Argument from Coincidence

A second objection against endurantism comes from cases in which material objects seem to mereologically coincide—that is, share all parts—and locatively coincide—that is, share the same location—without being numerically identical. If there are such cases, the objection goes, endurantists have a hard time making sense of them, while their alleged problematicity simply disappears if perdurantism or the stage view is assumed (Sider 2001).

What is so bad about mereological and locative coincidence? To start with locative coincidence, it just seems wrong that two numerically different material objects could fit exactly into a single region of space: instead of occupying the same place, they would just bump into each other. It might be the case that some particular kinds of microphysical particles, such as bosons, allow for this kind of co-location (Hawthorne and Uzquiano, 2011). It might also be the case that in some other possible world, with a different set of laws of nature, objects would not bump into each other, but rather pass through each other unaffected, and thus allow for co-location (Sider 2001). However, the ordinary middle-sized objects that populate our everyday life simply do not: they cannot share the same exact location.

Let us now turn to mereological coincidence. What is so bad about it? Suppose x and y share all parts at the same time. If they do, they will surely also happen to be spatially co-located. But if that is the case and they are numerically different, what could account for their numerical difference? What makes them different one from the other, if they have the same parts and the same location? Moreover, contemporary standard mereology—that is, classical extensional mereology—implies that no two objects can share all parts, a principle called ‘extensionality’ (Simons 1987; Varzi 2016).

Let us now consider two possible examples of mereological and locative coincidence. The first is the case of a statue of Socrates and the lump of clay it is made of. As long as the statue exists, the statue and the lump of clay coincide both mereologically and locatively: they are exactly located at the same spatial region and they share all parts. And yet, there are reasons to believe that they are numerically different, for they seem to have different properties. They have different temporal properties: the clay, but not the statue, has the property of existing at times before the statue was created. And they seem to have different modal properties as well: only the clay, and not the statue, can continue to exist if the clay gets substantially reshaped into, say, a statue of Plato. Since the statue and the lump of clay have different properties, we must conclude, in virtue of Leibniz’s law, that they are numerically different.
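
Spelled out step by step, the temporal version of this argument runs roughly as follows:

  1. The lump of clay exists at times before the statue’s creation.
  2. The statue does not exist at any time before its creation.
  3. By Leibniz’s law, if the statue were numerically identical to the lump of clay, they would share all properties.
  4. By 1 and 2, they do not share all properties.
  5. Therefore, the statue is not numerically identical to the lump of clay.

The modal version runs in parallel, with the property of being able to survive radical reshaping in place of the temporal property in premises 1 and 2.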

A second case of coincidence without identity involves Tibbles the cat. Like any other cat, Tibbles has a long furry tail. The tail is a part of Tibbles, just as the rest of Tibbles—call it Tib—is. Tib is a part of Tibbles, and hence they are numerically different. But suppose that Tibbles loses her tail. It seems that both Tibbles and Tib would survive the accident. After all, cats do not die when losing their tails; and nothing actually happened to Tib when Tibbles lost her tail, so there is no reason to believe that Tib stopped existing. However, after the accident, Tibbles and Tib end up sharing the same exact location and all parts. Hence, the case of Tibbles and Tib is yet another case of coincidence without identity.

Is it really the case that the statue is not the lump of clay, and Tibbles is not Tib? These claims might be resisted. For example, if identity is temporary—if x might be identical with y at one time and different at another—then one might say that even if before the accident Tibbles and Tib were different, after the accident they are identical (Gallois 1998, Geach 1980, Griffin 1977). However, this move does not come for free. Serious arguments have been offered to the effect that identity is not a temporary relation (Sider 2001: 165ff, Varzi 2003: 395).

A different option consists in saying that the statue is nothing other than the lump of clay as long as it possesses the property of being arranged statue-of-Socrates-wise (just as Socrates the philosopher is nothing other than Socrates possessing the property of being a philosopher, and certainly not a second person on top of Socrates). In that case, the statue and the lump of clay would not be numerically different (Heller 1990). However, unlike in the case of Socrates becoming a philosopher, it seems that when we create a statue, we have not merely changed something that existed before. Rather, it seems that we have created something that did not exist before.

How do perdurantism and the stage view solve the problem of coincidence? Let us start with perdurantism. According to perdurantism, the statue and the piece of clay are four-dimensional objects composed of temporal parts. During the existence of the statue, they might well mereologically and locatively coincide. But since the lump of clay existed before, and will exist after, the statue, the lump has some temporal parts that the statue does not have. Hence, mereologically speaking, they do not coincide overall (in fact, from the perdurantist, four-dimensional perspective, the 4D statue is a part of the 4D lump of clay). Moreover, from a locative point of view, since the lump exists at times at which the statue does not, their spatiotemporal location is not the same. For sure, their spatial location might sometimes be the same; but this is as it should be: if you consider the exact spatial location of your hand, at that location, you and your hand coincide locatively. The same holds for Tibbles and Tib, for they do not mereologically coincide: Tibbles’ tail is a four-dimensional object that only Tibbles, and not Tib, contains as a part (Varzi 2003: 398).

The stage viewer, on the other hand, who identifies ordinary objects with stages, will claim that after the creation of the statue, the statue and the piece of clay are numerically identical. She will then benefit from the flexibility of the temporal counterpart relation to make sense of the allegedly different properties of the statue and the clay. The present clay will outlast the statue not because it will persist for a longer time—being an instantaneous object, it does not persist—but because it has clay-counterparts at times later than the times at which it has its last statue-counterpart. The stage viewer will probably adopt a similar answer in the modal case as well. To illustrate, the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay has a clay-counterpart but not a statue-counterpart (Sider 2001: 194).

What can an endurantist say about cases of coincidence without identity? A first option could be to just bite the bullet: the statue and the piece of clay are indeed numerically different and indeed mereologically and locatively coincident. However, the endurantist will not want to accept without qualification that different objects can thus coincide. Of course, she will agree, in normal circumstances different objects cannot thus coincide. She will then try to tell apart, in a principled way, the special cases that allow for coincidence from the normal cases that do not. One popular attempt to trace this difference in a principled way appeals to the notion of constitution. There is a sense, the idea goes, in which the clay constitutes the statue, and in which, after the accident, Tib constitutes Tibbles. These selected cases in which constitution is in play warrant the possibility—if not the necessity—of mereological and locative coincidence. This endurantist solution to the problem of coincidence is sometimes called the ‘standard account’ (Burke 1992, Lowe 1995).

Of course, the standard account does not come for free. It requires one to adopt a theory of mereology different from classical extensional mereology, and a theory of location that allows for co-location, and this might seem to be a drawback in itself. Moreover, a proponent of such a view still has to tell a story about what she takes constitution to be. A much-discussed option is to make sense of constitution in terms of mutual parthood: the statue is part of the clay, and the clay is part of the statue (we are here using the technical notion of proper or improper part, which has numerical identity as a limit case; see Mereological Technicalities). Apart from requiring a substantial revision of even the most endurantist-friendly theories of mereology, appealing to mutual parthood is not yet enough to make sense of constitution. Mutual parthood is symmetrical, while friends of constitution take constitution to be asymmetrical: the statue is constituted by the clay, but not vice versa (Sider 2001: 155-156). Contemporary neo-Aristotelianism might come to the rescue in answering this question (Fine 1999; Koslicki 2008): constitution might be defined in terms of grounding (for example, one might say that the existence or nature of the clay grounds the existence or nature of the statue) or in hylomorphic terms (the statue is a compound of matter and form, and the clay is its matter).

Further endurantist solutions, to mention a few, include taking identity to be temporary (Gallois 1998, Geach 1980), embracing mereological essentialism (namely, the view that objects have their parts essentially, so that changing parts results in the end of persistence; this would help with the case of Tibbles, but not with the case of the clay, which does not necessarily change its parts when arranged into a statue; see Burke 1994, Chisholm 1973, 1975, van Cleve 1986, Sider 2001, Wiggins 1979), or embracing mereological nihilism (namely, the view that only mereologically atomic—that is, partless—objects exist, so that most if not all of the entities involved in the cases are not part of one’s ontological catalogue; see van Inwagen 1981, 1990a, Rosen and Dorr 2002, Sider 2013).

Apart from trying to respond to the objection, an endurantist could also throw the ball back into the opposite camp and argue that the solution proposed by the perdurantist does not apply in all cases. In the original cases, coincidence was only temporary: there were times at which the two objects did not coincide, either because one did not yet exist (the statue) or because one had a part that the other did not have (Tibbles and her tail). But what about cases in which coincidence is permanent? Consider, for example, a case in which an artist creates both the statue and the lump of clay at the same time and later destroys them at the same time. In such a case, the perdurantist’s solution seems to be precluded, for the statue and the piece of clay will share all their temporal parts, and so will end up mereologically and spatiotemporally coinciding (Gibbard 1975, Hawley 2001, Mackie 2008, Noonan 1999). When confronted with such a case, a perdurantist might be forced to accept one of the endurantist’s solutions, and thus can no longer declare her position better off than endurantism. Notice, though, that the perdurantist might actually reply that permanent coincidence does indeed result in numerical identity. After all, if coincidence is permanent, we have lost one of the two reasons to believe that the statue and the piece of clay are numerically different—namely, that they existed at different times. Moreover, as regards the difference in modal properties, the perdurantist might just adopt the aforementioned solution: the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay, numerically identical to the statue, has a clay-counterpart but not a statue-counterpart (Hawley 2001). Finally, notice that the problem of permanent coincidence is no problem at all for the stage viewer, who did not appeal to a difference in temporal parts between the statue and the piece of clay to explain coincidence away (Sider 2001).

c. The Argument from Vagueness

A third objection against endurantism comes from the phenomenon of temporal vagueness. Suppose a table is gradually mereologically decomposed: slowly, from top to bottom, one by one, each of the atoms composing it is taken away until, finally, nothing of the table remains. At the end of the process, the table does not exist anymore. So, it must have ceased to exist at some time. But which time? Even if we might have a rough idea of when it happened, it is much more difficult to tell the precise moment at which the table ceased to exist. Recall that we are removing from the table one atom after another. The removal of which atom is responsible for the disappearance of the table? And how far away must the atom be to count as removed? It seems very hard to give a precise answer to these questions. The disappearance of the table seems to be somehow vague or indeterminate.

How should we make sense of these ubiquitous cases of temporal vagueness or indeterminacy? One option is to say that the kind of indeterminacy involved here is merely epistemic. This amounts to saying that there is a clear-cut instant at which the table stops existing, and that our inability to determine which one it is stems from our ignorance of the facts. There is a definite atom which, once removed, is responsible for the disappearance of the table. Our puzzlement comes simply from the fact that we do not know which one it is. Though some scholars are happy to defend this epistemic option, others find it odd to insist that there must be a precise atom the removal of which results in the disappearance of the table, and that there is a precise distance of the atom from the rest of the table that makes it count as removed. Why is it really that atom, as opposed to, say, the immediately previous one? What is so special about that atom that makes the table stop existing? And what is so special about the given distance that makes it enough for the atom to count as removed? After all, if you look at what remains of the table after the removal of that atom, you would probably be unable to tell any significant difference from what was there before the removal.

A second option is to say that the kind of indeterminacy involved here has to do not with our epistemic profile, but rather with the world itself. The reason why it is so difficult to identify a sharp cut-off point at which the table stops existing is that there is no fact of the matter about what this point is. While at some earlier and later times the table definitely does or does not exist, there are some times at which it simply is indeterminate whether the table still exists. Philosophers have always had a hard time trying to understand ontic or worldly indeterminacy. For a long time, the standard option has been simply to reject it as impossible (Dummett 1974; Russell 1923; Sider 2001).

However, if the indeterminacy involved here is neither epistemic nor ontic, what is it? Interestingly enough, perdurantism offers a clear way out of this dilemma. The perdurantist will believe that there is a series of four-dimensional entities involved in the case of the disappearing table. A first four-dimensional entity includes temporal parts up to the point at which the first atom is removed, a second four-dimensional entity includes temporal parts up to the point at which the second atom is removed, and so on, until we get to a four-dimensional entity that includes temporal parts up to the point at which only one atom of the table remains. Given this metaphysical picture, the question of the instant at which the table stops existing translates into the question of which of those four-dimensional entities is picked out by the term “table”. While a perdurantist might still say that the kind of indeterminacy involved here is epistemic or ontic, she could also say that it has to do neither with our epistemic limitations nor with the world itself. Rather, she could say that the problem arises because the term “table” is vague. Although the term works well in everyday circumstances, we simply have not made a decision as to how it should work in special circumstances such as the one that we are discussing here. That is where our puzzlement comes from. This kind of indeterminacy results from a mismatch between our language and the world and is therefore semantic in nature.
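The perdurantist picture can be displayed schematically; the labels below are ours, introduced only for illustration. Let \(t_i\) be the time at which the \(i\)-th atom is removed. The candidates are then the nested four-dimensional objects

\[ T_1, T_2, \ldots, T_n, \qquad \text{where } T_i \text{ comprises all the table-like temporal parts located before } t_i. \]

Each \(T_i\) is a perfectly determinate object; the question of when “the table” ceases to exist is just the question of which \(T_i\) the word “table” picks out, and our linguistic conventions simply do not settle that question.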

The endurantist might accept the alleged oddity that comes with interpreting these cases of indeterminacy as either epistemic or ontic and try to live with it. While endurantists have traditionally had a preference for the epistemic option, renewed interest in ontic indeterminacy—due for example to attempts to take canonical interpretations of quantum mechanics at face value—might make the second option a live one as well (Barnes and Williams 2011, Wilson and Calosi 2018). It has also been remarked that the endurantist might in principle mimic the perdurantist solution, along the following lines. The endurantist might posit, in place of a single enduring table, a series of coinciding enduring objects, each of which ceases to exist slightly later than the previous one. Such objects will have temporal boundaries that coincide with those of the nested temporal parts of the perdurantist solution, but unlike them they will endure instead of perduring. With this series of enduring objects in place, the question of the instant at which the table stops existing might translate into the question of which of those enduring entities is picked out by the term “table”. Thus, for the endurantist too, this kind of indeterminacy will turn out to be semantic (Haslanger 1994). What can be said about this mimicking strategy? At first, one might be baffled by the sheer number of enduring, coinciding, table-like entities that the solution requires. However, an endurantist might respond that the number of entities is no greater than that required by the perdurantist solution. In any case, while for the perdurantist the positing of this series of entities is part of the view itself, for the endurantist it seems to be a mere strategy to solve the problem of vagueness, and thus it would not be surprising if perdurantists considered it ad hoc.

d. The Unintelligibility Objection

Endurantism has it that persisting objects are wholly present at each time of their persistence. But what is it for something to be wholly present at a time? If no account of this crucial notion is given, endurantism itself remains ill-defined. Moreover, if no account of the notion is possible at all—that is, if we cannot make sense of whole presence—then endurantism itself will turn out to be an unintelligible doctrine. And admittedly, endurantists have no easy time spelling out what whole presence really amounts to (Sider 2001).

Hence, again, what is it for x to be wholly present at time t? It might mean that:

(1)        at time t, x has all of its parts.

But what does it mean to say that x has all of its parts? Are we talking about all the parts that x has at t? Or rather about all the parts that x had, has, and will ever have? In both cases, the endurantist is in trouble. In the former case, (1) becomes

(2)        at time t, x has all the parts that it has at t.

However, this hardly distinguishes endurantism. The perdurantist too will believe that at any given time, a four-dimensional entity has all the parts it has at that time. Given that the endurantist intended her view to be different from the perdurantist one, this cannot be what the endurantist had in mind when saying that persisting objects are wholly present at different times. In the latter case, (1) becomes:

(3)        at time t, x has all the parts that it had, has, or will ever have.

However, recall that according to endurantism, persisting objects are supposed to be wholly present at each time of their persistence. If whole presence is defined as in (3), this will imply that objects can never gain or lose parts. This again seems to mischaracterize endurantism, which was supposed to be compatible with mereological change.
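The dilemma can be displayed compactly in symbols. As a sketch (with “P(z, x, t)” for “z is a part of x at t” and “WP(x, t)” for “x is wholly present at t”; the abbreviations are ours), the two readings become:

\[ (2^*)\quad WP(x,t) \;=_{df}\; \forall z\,\big(P(z,x,t) \rightarrow P(z,x,t)\big) \]
\[ (3^*)\quad WP(x,t) \;=_{df}\; \forall z\,\big(\exists t'\,P(z,x,t') \rightarrow P(z,x,t)\big) \]

\((2^*)\) is a logical truth, satisfied by perduring objects as much as by enduring ones, and so fails to characterize endurantism; \((3^*)\), if required to hold at every time of x’s existence, entails that x has the same parts at all times, and so rules out mereological change.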

We should point out that in interpreting (1) as (2) or (3) we have switched from an apparently timeless notion of parthood (x is part of y) to a temporary one (x is part of y at time t). The move is a straightforward one for an endurantist to make. Usually, endurantists want their properties or relations—or at least the contingent ones—to be exemplified temporarily. However, at least some endurantists, those who are also presentists, might resist this switch and stick to the timeless notion of parthood. They might simply say that x is wholly present just in case it has all the parts it has, full stop (Merricks 1999). Whether or not this solution works in a presentist setting, it can hardly be applied in a non-presentist one.

Another option might be to argue that to be wholly present simply means to lack proper temporal parts. This move sounds promising. However, it is not totally uncontroversial, for it has been argued that in special cases an endurantist might want her enduring objects to have proper temporal parts. Suppose, for instance, that an artist creates a bronze statue of Socrates by mixing copper and tin into the mold and then, unsatisfied with the result, destroys the statue by separating tin and copper again, so that the statue and the bronze begin and cease to exist at the same times. Suppose, further, that the bronze and the statue are numerically different from each other (for reasons why they should be, see § 2b). The bronze might be taken to be a part of the statue (a proper part, insofar as it is different from the whole), but it will mereologically coincide with it during its existence. In this somewhat tortuous scenario, even if the bronze and the statue might be conceived as enduring, the bronze will count as a temporal part of the statue at the interval of their persistence. To see this, have a look back at the definition of temporal parts given before:

Temporal part    x is a temporal part of y at t if (i) x is a part of y at t; (ii) x exists at, and only at, t; (iii) x overlaps at t everything that is part of y at t. 

Indeed, (i) the piece of bronze is a part of the statue, (ii) it exists at, and only at, the interval of the statue’s persistence, and (iii) it overlaps everything that is part of the statue at that interval.
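For reference, the definition can be rendered in symbols. This is only a sketch: “P(x, y, t)” stands for “x is a part of y at t”, “E(x, t)” for “x exists at t”, and “O(x, z, t)” for “x overlaps z at t”, with t ranging over instants or intervals, as the bronze case requires; the abbreviations are ours:

\[ TP(x,y,t) \;=_{df}\; P(x,y,t) \;\wedge\; E(x,t) \;\wedge\; \forall t'\,\big(E(x,t') \rightarrow t' = t\big) \;\wedge\; \forall z\,\big(P(z,y,t) \rightarrow O(x,z,t)\big) \]

On this rendering, the bronze satisfies all the conjuncts with respect to the statue and the interval of their common existence, which is just the result described above.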

What lesson should we learn from this particular case? According to Sider (2001), a defender of the unintelligibility charge against endurantism, the conclusion to be drawn is that an endurantist might want her enduring objects to have, at least sometimes, proper temporal parts, and that, consequently, endurantism cannot simply be the doctrine that objects persist without having proper temporal parts. In principle, one might be tempted to draw a different lesson, namely that Sider’s definition of temporal parts is unsuccessful and that the notion of a temporal part should be defined in a different way.

In any case, it should be noted that, so far, we have tried to characterize the notion of whole presence in mereological terms. However, the reader will recall that in § 1b we distinguished two aspects which are mixed together in the canonical definition of endurantism offered above. Once again, we should distinguish (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only. So far, in trying to define whole presence in mereological terms, we have assumed that the notion pertained to the mereological question rather than the locative one. On the other hand, if whole presence is to be characterized in locative terms, the task does not seem too difficult (Gilmore 2008, Parsons 2007, Sattig 2006). For example, under the view that we called locative three-dimensionalism, whole presence simply translates as exact location: a persisting object is wholly present at each instant of its persistence in the sense that it is exactly located at each instantaneous time or spacetime region of its persistence.

e. Arguments against Specific Versions of Endurantism

In § 1b and § 1c, we characterized several different versions of locative and non-locative endurantism. Each of them helped to characterize better what the endurantist might have in mind. However, each of them is subject to specific objections, which we review here summarily.

First, we defined locative three-dimensionalism, according to which persisting objects are exactly located at temporally unextended regions only. This form of endurantism is committed to the possibility of multi-location, that is, to the possibility of a single entity having more than one exact location. Multi-location has been put to work in several contexts, helping to make sense not only of endurantism, but also of Aristotelian universals and property exemplification, to mention only a few cases. Still, several scholars take multi-location to be problematic, either because it implies contradictions (Ehring 1997a), or because it is at odds with the very notion of an exact location (Parsons 2007), or because it creates specific problems when applied to the case of persistence (Barker and Dowe 2005). Moreover, locative three-dimensionalism is prima facie committed to the existence of instants of time, and there are no such instants if time is gunky (see Leonard 2018).

Second, we defined simplism, according to which persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence. Simplism is committed to the possibility of extended simples, that is, the possibility that something without any proper parts is located at an extended region. Extended simples have enjoyed a fair share of popularity, and they have been argued to be a possibility that flows from recombinatorial considerations (McDaniel 2007b, Saucedo 2011, Sider 2007), from quantum mechanics (Barnes and Williams 2011) and from string theory (McDaniel 2007a). Still, some scholars view extended simples with distrust, because they think that dividing into parts is part of the nature of extension (Hofweber and Velleman 2011), because extended simples are excluded by our best theories of location (Casati and Varzi 1999), or because the specific reasons given in favor of the possibility of extended simples are unsuccessful.

Third, we introduced non-locative versions of endurantism. These versions usually assume that there are two radically different ways of being in a dimension, that objects are in space in a radically different way from the way in which they are in time, and that these two different ways explain why objects divide into spatial but not into temporal parts. Such views are immune to the specific problems of locative three-dimensionalism and of simplism. Still, they have been argued to come with specific drawbacks of their own. In particular, they seem to be at odds with spacetime unitism (see § 1e). Indeed, under spacetime unitism, regions of time and regions of space are simply spatiotemporal regions of some sort. So, it seems that if anything holds a relation to a region of space, it cannot fail to hold the same relation to some region of time as well (Hofweber and Lange 2017).

3. Arguments against Perdurantism

Perdurantism has become a popular option. However, it does not come without drawbacks of its own. This section briefly reviews arguments to the effect that it offends against our intuitions (§ 3a), that it makes change impossible (§ 3b), that it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 3c), that it is ontologically inflationary (§ 3d), that it involves a category mistake (§ 3e), that it does not make sense (§ 3f), and that it has a problem with counting (§ 3g).

a. The Argument from Intuition

Endurantists and their foes alike often agree that endurantism is closer to common sense beliefs, or more intuitive, than perdurantism. Moreover, some philosophers believe that common sense beliefs or intuitions should be taken seriously when doing philosophy. This often translates into the idea that such intuitions or beliefs should be preserved as much as possible, that is, until proven false or at least significantly problematic (Sider 2001). Presumably, this is also why endurantism is sometimes considered the champion view, and why the burden of proof in the persistence debate is taken to lie on the perdurantist side (Rea 1998). Now, has endurantism been proven false or significantly problematic? The previous section reviewed several arguments to this effect and registered that several endurantists remain unconvinced. They would therefore conclude that perdurantism is unmotivated and, since it is the challenger view, should be rejected.

We shall not here tackle the question of whether endurantism has been proven false (see § 2 for this). Rather, we focus on other possible ways in which the perdurantist might respond to this specific challenge.

First of all, though, we should ask: why is endurantism supposed to be more intuitive than perdurantism? Which aspects of perdurantism are supposed to be so counter-intuitive? Perdurantism implies that when seeing a tree or talking with a friend, what you have in front of you is not a whole tree or a whole person, but rather only parts of them. It also implies that objects are extended in time just as they are extended in space, and a bit like an event is supposed to be. These mereological and locative consequences of perdurantism are supposed to be counter-intuitive: intuitively, we would say that what we have in front of us in the cases described is a whole tree and a whole person, and that we are not extended in time as we are in space, or as events are supposed to be.

Clearly enough, one option for the perdurantist is simply to reject the idea that in philosophy intuitions or common sense should have the weight the endurantist is here proposing. What an endurantist calls “intuitions” a perdurantist might insist are nothing more than unwarranted biases. However, we do not discuss this option here. Tackling the general question of the role of intuition in philosophy goes beyond the scope of this article (for an introduction to the topic, see Intuition).

A second option consists in pointing out that while perdurantism does indeed have counter-intuitive consequences, endurantism is not immune from counter-intuitiveness either. For example, we have already mentioned that several popular versions of endurantism are committed to claims—such as the claim that things can have more than one exact location, or that extended simples are possible (see § 2e)—which might arguably be taken to be counter-intuitive.

A third option consists in pointing out that even if intuition should play a role in philosophy, the kind of evidence that it offers might be biased, for it might be based on our misleading vantage point on reality. In particular, it might be argued that our endurantist intuitions are based on the fact that human beings commonly experience reality one time after another. However, if spacetime unitism and eternalism are true, a more faithful perspective would be one that allowed us to perceive the whole of spacetime in a single bird’s-eye view. From that perspective, our intuitions might be different: we might rather be led to believe persisting objects to be spatiotemporally extended, and to see the instantaneous “sections” of them with which human beings are usually acquainted as parts of them. In that case, our usual condition would be reminiscent of that of the inhabitants of Flatland, who perceive the passage of a three-dimensional sphere through their plane of perception as the sudden expansion and contraction of a two-dimensional circle. Once again, we shall not tackle here the question of whether eternalism and spacetime unitism are true (for an introduction to the topic, see Gilmore, Costa, and Calosi 2016).

b. The No-Change Objection

A second objection traditionally marshalled against perdurantism is that it makes change impossible (Geach 1972, Lombard 1986, Mellor 1998, Oderberg 2004, Sider 2001, Simons 1987; 2000a). But change quite obviously occurs everywhere and everywhen. Hence, perdurantism is false.

Why would perdurantism make change impossible? Change requires difference and identity. In order for a change to occur, the argument goes, something must be different, that is, must have incompatible properties, but must also be identical, that is, must be one and the same thing. The identity condition is important, for we would not normally call a situation in which two numerically different things have incompatible properties a change. For example, we would not call a situation in which an apple is red and a chair is blue a change. However, the perdurantist account of change (§ 2b) seems committed to invariably violating the identity condition. Under perdurantism, when a change occurs, it is not the numerically same thing which has the incompatible properties. Rather, the incompatible properties are had by numerically different temporal parts of that thing. For example, if a hot poker becomes cold, it is not the persisting poker itself which is hot and cold. Rather, two numerically different temporal parts of it are hot and cold.
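The two conditions can be stated a bit more explicitly. As a rough sketch (the formulation is ours, offered only for illustration): a change occurs just in case, for some subject x, some property F, and distinct times \(t_1\) and \(t_2\),

\[ F(x, t_1) \;\wedge\; \neg F(x, t_2). \]

The identity condition is the requirement that the same x figure in both conjuncts; the difference condition is the requirement that what is had at \(t_1\) and lacked at \(t_2\) be one and the same property F. The perdurantist account offers instead something of the form \(F(p_1) \wedge \neg F(p_2)\), where \(p_1\) and \(p_2\) are numerically distinct temporal parts, and this is what the objector takes to violate the identity condition.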

Is it really the case that perdurantism violates the identity condition? To be sure, under perdurantism, the incompatible properties are had by numerically different temporal parts: an earlier part of the poker is hot, a later one is cold. However, can we not say that the persisting thing has them too, that the perduring poker itself is hot and cold? After all, we call a thing red even if not all, but only some, of its parts are red. It is crucial here to stop and ask what we might mean by saying that the perduring poker itself is hot and cold. One straightforward option would be to say that the poker itself literally is hot and cold, just as its different temporal parts are. However, this is implausible. After all, one of the main motivations for being a perdurantist consists in saying that it is impossible for the numerically same poker to be hot and cold, for this would violate Leibniz’s Law or even the Law of Non-contradiction (§ 2b). Hence, when a perdurantist says that the perduring poker itself is hot and cold, she must mean something different. Presumably, she means that the poker is hot insofar as it has hot parts and is cold insofar as it has cold parts. However, if this is what the perdurantist really means, she would presumably be violating the difference condition, for change requires the same subject to have incompatible properties, whereas having hot parts and having cold parts are not incompatible properties.

A second and popular move consists in rejecting the identity condition. Change does not require one and the same thing to have incompatible properties. At least in some cases, different things would do too (Sider 2001). However, foes of perdurantism would insist that it is not possible to give up the identity condition so lightly. They would insist, for example, that having parts with incompatible properties is insufficient for change. For example, a single poker would not change for the simple fact of having hot parts and cold parts: mereological heterogeneity is not change. Perdurantists might concede that mereological heterogeneity is not always change, but specify that under certain circumstances, it is. In particular, mereological heterogeneity is change in cases where incompatible properties are had by different temporal parts of a single thing.

Some endurantists remain unconvinced by this proposed amendment to the identity condition. They would say, for example, that since temporal parts are numerically different from each other, under perdurantism there is no change, but only replacement. At this point, perdurantists have at least two options. The first is simply to disagree: change is a particular kind of replacement. The second consists in giving up on change: if change really requires the original identity condition, then so be it; philosophy has taught us that where we believed there to be change, there really only is replacement (Simons 2000b; Lombard 1994).

c. The Crazy Metaphysic Objection

A third objection against perdurantism is that it is a “crazy metaphysic”, for it involves systematic and yet mysterious cases of coming into existence. The objection refers here to the fact that, under perdurantism, new temporal parts of a single thing come into (and go out of) existence continuously. As Thomson famously puts it:

[perdurantism] seems to me a crazy metaphysic (…). [It] yields that if I have had exactly one bit of chalk in my hand for the last hour, then there is something in my hand which is white, roughly cylindrical in shape, and dusty, something which also has a weight, something which is chalk, which was not in my hand three minutes ago, and indeed, such that no part of it was in my hand three minutes ago. As I hold the bit of chalk in my hand, new stuff, new chalk keeps constantly coming into existence ex nihilo. That strikes me as obviously false (Thomson 1983, 213).

Under perdurantism, these cases of coming into being really are systematic. But what does it mean to say that they are crazy or mysterious? It might mean that they do not make sense (for this option, see the unintelligibility objection in § 3f). But there is another option worth exploring. According to this option, the mystery has to do with the absence of an indispensable explanation. If perdurantism is true, the objection goes, there are systematic cases of coming into existence. These cases cry out for an explanation: how is it that these new things come into existence? Where do they come from? However, perdurantism seems unable to offer an explanation for these cases. Under perdurantism, the systematic coming into being of ever new temporal parts is a brute fact, of which there is no explanation.

First, we should ask: is perdurantism really unable to offer an explanation for these cases of coming into existence? Thomson seems to be persuaded that it cannot. If perdurantism is true, these temporal parts do not come from a source which might explain their appearance. In her words, they come into existence ex nihilo. But is this really the case? What does it mean to say that a new temporal part of a thing comes into existence ex nihilo, from nothing? Does it mean that nothing existed before the temporal part? Certainly not: other temporal parts of the thing existed before the appearance of that particular temporal part. Does it mean that the coming into existence of the temporal part is an event which has no cause? Again, this seems implausible. If perdurantists take causation seriously (and if they do not, the objection does not apply in the first place; see Russell 1913), some of them would say that there is a causal connection between the temporal parts of a single thing: the later ones are caused to exist by the earlier ones (Heller 1990, Oderberg 1993, Sider 2001). Endurantists might disagree here. For example, they might believe that later temporal parts cannot be caused to exist by earlier ones, for (immediate) causation requires simultaneity of cause and effect (see Huemer and Kovitz 2003, Kant 1965).

We have discussed how a perdurantist might try to offer an explanation for the continuous coming into existence of new temporal parts. But does the coming into existence of new temporal parts really require an explanation? In this connection, perdurantists usually follow two lines of reasoning. First, they argue that the succession of new temporal parts as we move through time is analogous to the succession of new spatial parts as we move through space. And since we do not think there is anything mysterious in the latter case, we should have the same attitude in the former case as well (Heller 1990, Varzi 2003). This argument from analogy gains plausibility especially under a unitist view of spacetime. However, one might argue that the analogy fails, for example because causation unfolds diachronically over time and not synchronically through space, so that we have a reason not to expect an explanation in the spatial case while requiring one in the temporal case. The second line of reasoning takes the form of a tu quoque. Thomson believes that the continuous coming into existence of new temporal parts requires an explanation. But is the continuous existence of an enduring object not equally mysterious? How is it that an enduring object continues to exist instead of ceasing to exist? If the endurantist’s continuous existence is no mystery, then neither is the continuous coming into existence of new temporal parts proposed by the perdurantist (Sider 2001, Varzi 2003).

d. The Objection from Ontological Commitment

One criterion that has sometimes been employed in order to evaluate metaphysical doctrines is Ockham’s razor, according to which a theory should refrain from making commitments if such commitments are not necessary to its theoretical success. One particular kind of commitment is ontological commitment, that is, the commitment of a theory to the existence of entities or kinds thereof. According to Ockham’s razor, this commitment is to be avoided if possible, and any theory which is less ontologically committed is, ceteris paribus, preferable to one which is more ontologically committed (see The Razor).

Now, it might be noted that perdurantism is committed to the existence of a greater number of entities than both endurantism and the stage view. Perdurantism is more ontologically committed than endurantism, for on top of a single persisting thing, it is committed to the existence of a series of numerically different temporal parts thereof. Perdurantism is also more ontologically committed than the stage view. Indeed, unlike perdurantism, the stage view is not necessarily committed to the existence of the perduring mereological sums of the instantaneous stages. If perdurantism is indeed more ontologically committed than endurantism and the stage view, the question is whether this commitment is really necessary. This question is of course discussed in § 2 and § 4. More generally, however, the perdurantist might wish to reject Ockham’s razor—for what reasons do we have to believe that the world is not more complex than our simplest theories?—or to ride the wave of those contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009). Yet another response on behalf of the perdurantist is based on the distinction between quantitative and qualitative parsimony (Lewis 1973; 1986). A metaphysical system is more quantitatively parsimonious the fewer entities it acknowledges, while it is more qualitatively parsimonious the fewer ontological categories it introduces. Offending against quantitative parsimony is often considered to be less problematic, if problematic at all, than offending against qualitative parsimony. And indeed, one might say, perdurantism offends against quantitative, but not qualitative, parsimony, for each temporal part of a material object is itself a material object.

e. The Category Mistake Argument

Perdurantism has it that persisting objects have temporal parts. This makes objects similar to events, for events too are usually taken to have temporal parts. Because of this similarity, perdurantists have sometimes presented it as a consequence of their view that objects and events are entities of the same kind, and that the difference between events and objects is, at best, one of degree of stability (Broad 1923, Quine 1970). In the words of Nelson Goodman (1951, 357): “a thing is a monotonous event, an event is an unstable thing”.

Are events and objects entities of the same kind? Critics of perdurantism have sometimes argued that they are not, and that conflating objects and events would result in a serious category mistake. Perdurantism, which is committed to this mistake, would therefore need to be rejected. This is the category mistake argument against four-dimensionalism (Hacker 1982, Mellor 1981, Strawson 1959, Wiggins 1980).

What reasons are there to believe that events and objects belong to different ontological categories? For example, it has been pointed out that while objects are said to exist, events are said to happen, or take place (Cresswell 1986, Hacker 1982). This linguistic difference is sometimes said to be the reflection of an ontological one, namely that objects and events enjoy different modes of being. Moreover, while objects exist at times and are located at places, events are supposed to occur at places and times. Once again, this linguistic difference is supposed to reflect an ontological one, namely that objects and events relate to space and time in radically different ways (Fine 2006). Furthermore, objects do not usually allow for co-location, at least not to the extent to which events do (Casati and Varzi 1999, Hacker 1982). Finally, it is sometimes said that the spatial boundaries of events are usually vaguer than those of objects (what are the spatial boundaries of a football match?), whereas the temporal boundaries of events are usually less vague than those of objects (Varzi 2014).

A first way to resist this argument is to insist that conflating objects and events is no category mistake. Putative differences between objects and events will then either be considered irrelevant when it comes to metaphysics—for example because they are merely linguistic differences which do not reflect any underlying significant difference in reality—or in any case not enough to imply that objects and events belong to different ontological categories. After all, presumably, not all differences between kinds of entities are supposed to make them entities of a different kind (Sider 2001).

On the other hand, if a perdurantist is persuaded that conflating objects and events would be a category mistake, she could simply reject the claim that perdurantism implies that objects are events or vice versa. Perdurantism is the claim that objects have one feature that is usually—though not universally—attributed to events, namely, having temporal parts. And sharing some features is not a sufficient condition for belonging to the same ontological category. After all, entities of other kinds, such as time intervals or spacetime regions, are usually taken to have temporal parts without being events.

f. The Unintelligibility Objection

Some endurantists believe that perdurantism is not (only) false, but utterly unintelligible. According to this possible objection, perdurantism is a “mirage based on confusion” (Sider 2001, 54), a doctrine which makes “no sense” (Simons 1987, 175) or which is, at best, “scarcely intelligible” (Lowe 1987, 152). In the trenchant words of Peter van Inwagen:

I simply do not understand what [temporal parts of ordinary objects] are supposed to be, and I do not think this is my fault. I think that no one understands what they are supposed to be, though of course plenty of philosophers think they do. (van Inwagen 1981, 133)

In response to this objection, David Lewis (1986) famously stated that if one is unable to understand a view, one should not debate about it. Colorful as it is, Lewis’ stance misfires. The point of the objection is not that the objector has not understood perdurantism, but rather that perdurantism itself is unintelligible. Lewis’ point would apply in a case where the objector was simply admitting her own epistemic limitations. But the objector is not making a point about herself. Rather, she is making a point about the view itself, saying that it does not make sense. (Is it possible for something to be false and also not to make sense? Several scholars have indeed endorsed the view that some claims, such as contradictions or category mistakes, are both false and nonsensical. But this view might be attacked.)

What is it precisely that is supposed not to make sense in perdurantism? Is it the notion of a temporal part itself? This is hardly the crux of the problem, since many endurantists claim that the notion itself, when applied to events, makes perfect sense (Lowe 1987). The unintelligibility of the view should rather come from some other aspect of it. But if so, from where? One option consists in saying that the unintelligibility comes from the fact that perdurantism is committed to a category mistake, and category mistakes, or at least some of them, are unintelligible (for a discussion, see § 3e). A second option might have to do with mereology. Indeed, Sider (2001), who takes the objection seriously, considers that the problem might lie in the fact that the notion of a temporal part is usually defined in terms of the timeless notion of parthood—x is part of y—whereas endurantists tend to use the temporary notion of parthood—x is part of y at t. Sider suggests that the sense of unintelligibility might come from the fact that perdurantists tend to use a mereological notion that endurantists take to be unintelligible—or to yield unintelligible claims—when applied to everyday material objects. If Sider’s diagnosis is correct, then his definition of temporal parts in terms of temporary parthood discussed before (§ 1d) seems to take care of the problem.

g. The Objection from Counting

The objection from counting is traditionally presented as an objection against perdurantism and in favor of the stage view. The semantic difference between the two views is of particular importance here. Recall that the two views disagree about the reference of expressions referring to ordinary objects. Under perdurantism, such expressions, for example “Socrates”, refer to persisting, four-dimensional objects, whereas under the stage view, they refer to a single instantaneous stage (which particular stage is referred to is determined by the context).

Let us consider again the case of the statue and the piece of clay (§ 2b). Under perdurantism, both of them are four-dimensional entities, and their apparent coincidence boils down to their sharing some temporal parts. In particular, at any time in which the statue exists, there is an instantaneous statue-shaped entity that is both a temporal part of the statue and a temporal part of the piece of clay. Now suppose that at that particular time someone asks the question: how many statue-shaped objects are there? Intuitively, we would like to answer that there is only one. And this is the answer given by the stage view. For the stage view takes ordinary expressions such as “statue-shaped object” to refer to instantaneous stages, and there is only one of them that exists at that time. On the other hand, perdurantism counts by four-dimensional entities. And since that particular instantaneous stage is a temporal part of two ordinary objects, the statue and the piece of clay, perdurantism implies that there are in fact two statue-shaped objects there at that time. Hence, perdurantism, unlike the stage view, yields unwelcome results as regards the number of entities involved in such cases. This is the argument from counting against perdurantism (Sider 2001).

A possible answer consists in saying that in that particular context the predicate “statue-shaped object” does indeed apply to two four-dimensional entities, the statue and the piece of clay, but that we count them as one because they are, in a sense, identical at the time of the counting (Lewis 1976). In saying so, we are using an apparently time-relative notion of identity—x is identical to y at t—instead of the usual timeless one—x is identical to y. What does that mean? A four-dimensionalist would define the time-relative notion in terms of the timeless one: x is identical to y at t if the temporal part of x at t is identical to the temporal part of y at t. Stage theorists will probably remain unconvinced by this move for, they would insist, counting can only be done by identity. In Sider’s words: “I doubt that this procedure of associating numbers with objects is really counting. Part of the meaning of ‘counting’ is that counting is by identity; ‘how many objects’ means ‘how many numerically distinct objects’ (…). Moreover, the intuition that [there is just one statue-shaped object at that time] arguably remains even if one stipulates that counting is to be identity” (Sider 2001, 189).
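The dialectic can be summarized in symbols. As a sketch (the notation is ours, introduced only for illustration), the Lewisian definition reads:

\[ x =_t y \;=_{df}\; \text{the temporal part of } x \text{ at } t \;=\; \text{the temporal part of } y \text{ at } t. \]

For the statue S and the lump L we then have \(S =_t L\) at every time t of the statue’s existence, even though \(S \neq L\); counting by the relation \(=_t\) therefore yields one statue-shaped object at t, while counting by strict identity yields two. Sider’s complaint is precisely that only the latter procedure deserves the name “counting”.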

4. Arguments against the Stage View

This section reviews arguments against the stage view, to the effect that it goes against our intuitions (§ 4a), that it makes change impossible (§ 4b), that it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 4c), that it is ontologically inflationary (§ 4d), that it is incompatible with temporal gunk (§ 4e), that it is incompatible with our mental life (§ 4f), and that it has problems with counting (§ 4g).

a. The Argument from Intuition

In § 3a we discussed the argument from intuition against perdurantism. A similar argument has been proposed against the stage view as well. While the details of the present argument are somewhat different from those of the previous one, its general structure remains the same. The general idea is that closeness to intuitions or common sense constitutes a theoretical advantage for a view. And, the objector says, both endurantism and perdurantism are closer to intuitions than the stage view.

Why is the stage view supposed to be especially counter-intuitive? Presumably, the aspect of the stage view which offends the most against our intuitions is its denial of persistence. Indeed, while endurantism and perdurantism agree that some ordinary objects persist, whether by enduring or by perduring, the stage view denies that ordinary objects persist at all. In place of a single persisting object, the stage view posits a series of numerically different instantaneous stages.

In order to tackle this objection, the stage viewer might deploy some of the generic strategies outlined in § 3a. First, the stage viewer might insist that intuitions are no more than biases and thus deny that intuitions place any disadvantage on the stage view. Second, the stage viewer might grant that the disadvantage exists, but hold that it is nevertheless outweighed, either by the fact that the other views are counter-intuitive too (see again § 3a), or by the fact that the other views have been proven false or at least significantly problematic.

Here, however, we focus on a further and more specific strategy available to the stage viewer. The strategy consists in arguing that the intuition that is supposed to disfavor the stage view does not really disfavor it. It is indeed true, the stage viewer would say, that we commonly have beliefs such as “I was once a child”. The critic of the stage view takes them to imply the persistence of the self, for how could I have been a child without existing in the past? But this, the stage viewer says, is a mistake. In fact, we can make sense of beliefs such as “I was once a child” just as well by means of the counterpart relation: “I was once a child” is true if a past counterpart of mine is a child. In other words, such beliefs are noncommittal with respect to the question of whether things exist at more than one time (Sider 2001).

A possible reply is that this strategy cannot be applied to all putative cases of commonsensical beliefs involving the past. Consider for example a tenseless statement of cross-time identity such as “I am identical to a young child”, in which I affirm my identity with my previous self. This statement cannot be taken care of in terms of counterparts. The stage viewer’s rejoinder might be that such beliefs are perhaps too technical to count as common sense or that, in any case, what really matters is that the stage viewer is able to make sense of, and to validate, cognate statements that are framed in much more mundane terms, such as “I was once a child” (Sider 2001).

b. The No-Change Objection

In § 3b we discussed the no-change objection against perdurantism. The objection was that change requires the numerical identity of the subject of change before and after the process of change. There, we discussed the option of amending this identity requirement. Change does not require that the subject before and the subject after the change be identical. They just need to be temporal parts of a single thing. The stage viewer might adopt this strategy to suit her needs. Change does not require that the subject before and the subject after the change be identical. They just need to be related by the counterpart relation. Some endurantists remain unconvinced by the perdurantist amendment. We might reasonably expect them to be unconvinced by the amendment proposed by the stage viewer too. Since the relevant stages are numerically different from each other, under the stage view there is no change, but only replacement. The stage viewer’s rejoinder might be either to insist that change is a particular kind of replacement or to give up on change and insist that there is nothing bad in saying that where we believed there to be change, there really is replacement.

c. The Crazy Metaphysic Objection

Section 3c reviewed an argument against perdurantism to the effect that it involves systematic and yet mysterious cases of coming into existence. The stage view is subject to a similar objection. Just as perdurantism requires the systematic coming into existence of new temporal parts, so the stage view requires the systematic coming into existence of new instantaneous stages. And if perdurantism has no plausible explanation for this systematic coming into existence, neither does the stage view.

However, it should also be noted that the stage viewer can apply the very same strategies proposed there on behalf of the perdurantist. The stage viewer might insist that there is indeed an explanation for the coming into existence of new stages: their coming into existence is caused by the previous stages (Varzi 2003). Or she might argue that the systematic coming into existence is not mysterious after all, for it is no more mysterious than the succession of spatial parts through space, and no more mysterious than the continuous existence of an enduring object through time (Sider 2001; Varzi 2003).

d. The Objection from Ontological Commitment

Section 3d reviewed an argument from ontological commitment against perdurantism. Its guiding principle was that unnecessary ontological commitments should be avoided and that, therefore, any theory which is less ontologically committed is, ceteris paribus, preferable to one which is more ontologically committed.

This kind of argument seems to disfavor the stage view with respect to endurantism: instead of a single enduring thing, the stage view posits a myriad of numerically different instantaneous stages. However, this kind of argument does not disfavor the stage view with respect to perdurantism. Often, in fact, the ontological commitments of the stage view and of perdurantism are perfectly aligned: because of their commitment to mereological universalism, many stage viewers believe in the existence of four-dimensional aggregates on top of instantaneous stages (see § 1a). However, the stage view is not committed to four-dimensional aggregates by definition. So, depending on further metaphysical parameters, it might turn out that a stage viewer’s overall metaphysical view ends up being less ontologically committed than perdurantism.

In order to block this kind of argument, the stage viewer might adopt the usual strategies already described on behalf of the perdurantist. In particular, she might argue that the further ontological commitments of the stage view are fully justified by the failures of endurantism (§ 2), or she might argue that a philosopher should not be afraid to make all the ontological commitments that she sees fit, for what reasons do we have to believe that the world is not more complex than our simplest theories? Finally, she could ride the wave of those contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009).

e. The Objection from Temporal Gunk

When introducing the stage view, we pointed out that, unlike perdurantism and endurantism, its very definition commits it to the existence of instantaneous entities. This might be a drawback of the stage view if time turns out to be gunky, that is, if it turns out that every temporal region can be divided into smaller temporal regions, so that temporal instants do not exist (Arntzenius 2011, Whitehead 1927, Zimmerman 2006).

We do not focus here on the question of whether temporal instants exist at all. Instead, we shall briefly remark that, as it stands, the argument implicitly assumes that if there are no instants, there cannot be instantaneous entities. This assumption might be taken to follow from a series of principles of location, most notably the principle of exactness, according to which anything that is in some sense in a dimension must also have an exact location in that dimension. Now, located entities share shape and size with their exact locations. Hence, if exactness is true, instantaneous entities do indeed require the existence of instants in order to exist (see the sketch after this paragraph). However, exactness has come under attack on different grounds, one of which indeed concerns the possibility of pointy entities in gunky dimensions (Gilmore 2018, Parsons 2001). Hence, it seems in principle possible for a stage viewer to uphold her view even if she takes time to be gunky.
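The implicit inference can be reconstructed as follows; the rendering, and the predicate labels “WL” (weakly located) and “XL” (exactly located), are our gloss on the principle, used here only for illustration:

\[ \text{Exactness:}\quad \forall x\,\big(\exists r\,\mathit{WL}(x,r) \rightarrow \exists r'\,\mathit{XL}(x,r')\big) \]

Since exact locations match their occupants in shape and size, the exact temporal location of an instantaneous entity could only be an instant. If time is gunky, there are no instants, hence no such exact location; by exactness, then, there are no instantaneous entities in time at all. Rejecting exactness blocks this argument at the second step.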

f. The Objection from Mental Events

There is an objection often proposed against the stage view which concerns, in particular, the persistence of the subjects of mental states. The stage view implies that ordinary objects, persons included, do not persist through time. However, some mental processes and states seem to be such that they cannot be performed or possessed by instantaneous entities. For example, we say that people reflect on ideas, make decisions, ponder means, act, fall in love, change their minds. All these mental events take time, and thus cannot possibly be possessed by instantaneous stages (Brink 1997, Hawley 2001, Sider 2001, Varzi 2003).

The stage viewer will typically reply that acting, reflecting, pondering, making decisions, and so on do not require a persisting subject. For example, she might insist that acting is something that a stage can possess in virtue of its instantaneous state together with its relations to previous stages, provided that the previous stages possess the appropriate mental properties (Hawley 2001, Sider 2001, Varzi 2003). Alternatively, the stage viewer might insist that there are indeed extended mental events such as acting or pondering, but that such mental events do not have one single subject, but rather a series of subjects which succeed one another. For each of them, to be acting is to be the subject of an instantaneous temporal part of a temporally extended event of acting. In any case, the stage viewer will concede that her view, unlike endurantism and perdurantism, is incompatible with the idea that such mental events are temporally extended and possessed by a single subject.

g. The Objection from Counting

Section 3g discussed an objection against perdurantism to the effect that it delivers the wrong counting results in cases of mereological coincidence. To the question “how many statue-shaped objects are there?”, asked at a time at which the piece of clay and the statue mereologically coincide, the perdurantist has to answer that there are two, whereas the stage viewer can give the intuitive answer that there is only one. However, while considerations about counting in cases of coincidence seem to favor the stage view, similar considerations in far more common cases seem to disfavor the stage view with respect to its rivals, and endurantism in particular. Suppose Sam remains alone in a room for an hour. How many people have been in that room during that hour? Intuitively, we would like to answer that there has been only one. And this is the answer given by endurantism, for endurantism takes Sam to be one single persisting object that exists throughout the hour. On the other hand, the stage view takes ordinary expressions such as “person” to refer to instantaneous stages, and there is one instantaneous stage of Sam for each instant making up the hour. Hence, the stage view, unlike endurantism, yields unwelcome results as regards the number of entities involved in such cases (Sider 2001). (How does perdurantism fare with respect to this objection? It depends on whether it counts temporal parts of persons as persons. If it does, and it usually does (see § 2b), perdurantism is subject to the same objection.)

Suppose that the stage viewer is concerned with this problem and takes intuitions about counting seriously (as she arguably should, if she endorses the argument from counting in favor of her view presented in § 3g). In that case, the stage viewer has at least three options. The first consists in saying that in that particular context the predicate “person” does indeed apply to several instantaneous stages, but that we count them as one because they are counterparts of each other. This option is subject to a rejoinder already employed in § 3g against the Lewisian solution to the problem of counting: the present option suggests that sometimes we count by counterparthood and not by identity, which offends against the view that “part of the meaning of ‘counting’ is that counting is by identity” (Sider 2001, 189). A second option is available to the stage viewer who believes in the existence of four-dimensional sums of instantaneous stages. This stage viewer might claim that in the present context the predicate “person” applies to one single four-dimensional object rather than to the instantaneous stages. In so doing, the stage viewer adopts an unorthodox view which mixes the stage view and perdurantism, in which the reference of ordinary terms such as “person” is flexible: sometimes they pick out instantaneous stages (as in the stage view), sometimes they pick out four-dimensional sums thereof (as in perdurantism). A third and final option consists in taking the domains of counting to be restricted to entities existing at the time of utterance, or restricted in some other suitable way (Viebahn 2013).

5. What Is Not Covered in this Article

This section lists several aspects and issues concerning the metaphysics of persistence that are not covered in this article. Each is accompanied by references meant to guide readers in their further exploration.

When it comes to the characterization of the views and of the debate, it is worth noting that some philosophers have tried to characterize the endurantism/perdurantism dispute in terms of explanation (Donnelly 2016; Wasserman 2016), while others have argued that the dispute is not substantial, but rather merely verbal (Benovsky 2016; Hirsch 2007; McCall and Lowe 2006; Miller 2005). It is also worth noting that, apart from a few introductory words, this article says little about the history of the metaphysics of persistence (Carter 2011; Costa 2017; 2019; Cross 1999; Helm 1979).

When it comes to arguments for and against the various metaphysics of persistence, a couple of traditional arguments against perdurantism have not been covered in § 3, namely the modal argument (Heller 1990; Jubien 1993; van Inwagen 1990a; Shoemaker 1988; Sider 2001) and the rotating disk argument (Sider 2001). Moreover, several arguments for and against the theories of persistence presented in this article have been drawn from physics, among which figure several arguments against endurantism, namely the shrinking chair argument (Balashov 2014; Gibson and Pooley 2006; Gilmore 2006; Sattig 2006), the explanatory argument (Balashov 1999; Gibson and Pooley 2006; Gilmore 2008; Miller 2004; Sattig 2006), the location argument (Gibson and Pooley 2006; Gilmore 2006; Rea 1998; Smart 1972), the superluminal objects argument (Balashov 2003; Gilmore 2006; Hudson 2005; Torre 2015), and the invariance argument (Balashov 2014; Calosi 2015; Davidson 2014), as well as an argument from quantum mechanics against perdurantism (Pashby 2013; 2016).

6. References and Further Reading

  • Armstrong, D. M., 1980, “Identity Through Time”, in Peter van Inwagen (ed.), Time and Cause. Dordrecht: D. Reidel, 67–78.
  • Arntzenius, F., 2011, “Gunk, Topology, and Measure”, The Western Ontario Series in Philosophy of Science, 75: 327–343.
  • Arntzenius, F., 2011, “The CPT theorem”, in The Oxford Handbook of Philosophy of Time, ed. Craig Callender, Oxford: Oxford University Press, 634-646.
  • Baker, L. R., 1997, “Why Constitution is not Identity”, Journal of Philosophy, 94: 599–621.
  • Baker, L. R., 2000, Persons and Bodies, Cambridge: Cambridge University Press.
  • Balashov, Y., 1999, “Relativistic Objects”, Noûs, 33(4): 644-662.
  • Balashov, Y., 2000, “Enduring and Perduring Objects in Minkowski Space-Time”, Philosophical Studies, 99: 129–166.
  • Balashov, Y., 2003, “Temporal Parts and Superluminal Motion”, Philosophical Papers, 32: 1-13.
  • Balashov, Y., 2014, “Relativistic Parts and Places: A Note on Corner Slices and Shrinking Chairs”, in Calosi, C. and Graziani, P. (eds.), Mereology and the Sciences, Springer, 35-51.
  • Barker, S. and P. Dowe, 2003, “Paradoxes of Multi-Location”, Analysis, 63: 106–114.
  • Barker, S. and P. Dowe, 2005, “Endurance is Paradoxical”, Analysis, 65: 69–74.
  • Barnes, E. J. and J. R. G. Williams, 2011, “A Theory of Metaphysical Indeterminacy”, Oxford Studies in Metaphysics, vol. 6.
  • Benovsky, J., 2009, “Presentism and persistence”, Pacific Philosophical Quarterly, 90 (3):291-309.
  • Benovsky, J., 2016, Meta-metaphysics. On Metaphysical Equivalence, Primitiveness and Theory Choice, Springer.
  • Braddon-Mitchell, D. and K. Miller, 2006, “The Physics of Extended Simples”, Analysis, 66: 222–226.
  • Brewer, B. and Cumpa, J. (eds.), 2019, The Nature of Ordinary Objects, Cambridge: Cambridge University Press.
  • Brink, D. O., 1997, “Rational Egoism and the Separateness of Persons”, in J. Dancy (ed.) Reading Parfit, Oxford: Blackwell: 96–134.
  • Broad, C. D., 1923, Scientific Thought, London: Routledge and Kegan Paul.
  • Brogaard, B., 2000, “Presentist Four-Dimensionalism”, Monist, 83: 341–56.
  • Burke, M., 1992, “Copper statues and pieces of copper: a challenge to the standard account”, Analysis, 52: 12-17.
  • Burke, M., 1994, “Preserving the Principle of One Object to a Place: A Novel Account of the Relations among Objects, Sorts, Sortals and Persistence Conditions”, Philosophy and Phenomenological Research, 54: 591–624.
  • Calosi, C., 2015, “The Relativistic Invariance of 4D shapes”, Studies in History and Philosophy of Science 50, 1-4.
  • Carnap, R., 1967, The Logical Structure of the World, (trans.) George, R. A., Berkeley: University of California Press.
  • Carter, J., 2011, “St. Augustine on Time, Time Numbers, and Enduring Objects”, Vivarium, 49: 301–323.
  • Casati, R. and Varzi, A., 1999, Parts and Places, Cambridge, MA: MIT Press.
  • Casati, R. and Varzi, A., 2014, “Events”, The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), Edward N. Zalta (ed.).
  • Chisholm, R. M., 1973, “Parts as Essential to their Wholes”, Review of Metaphysics, 26: 581–603.
  • Chisholm, R. M., 1975, “Mereological Essentialism: Some Further Considerations”, The Review of Metaphysics, 28 (3):477-484.
  • Chisholm, R. M., 1976, Person and Object, La Salle (IL): Open Court.
  • van Cleve, J., 1986, “Mereological Essentialism, Mereological Conjunctivism, and Identity Through Time”, Midwest Studies in Philosophy, 11 (1): 141-156.
  • Costa, D., 2017, “The Transcendentist Theory of Persistence”, Journal of Philosophy, 114 (2):57-75.
  • Costa, D. 2017a, “The Limit Decision Problem and Four-dimensionalism”, Vivarium 55, 199-216.
  • Costa, D. 2019, “Was Bonaventure a Four-dimensionalist?”, British Journal for the History of Philosophy 28(2), 393-404.
  • Cresswell, M. J., 1986, “Why Objects Exist but Events Occur”, Studia Logica, 45: 371–375; reprinted in Events, pp. 449–453.
  • Crisp, T. M., and D. P. Smith, 2005, “‘Wholly Present’ Defined”, Philosophy and Phenomenological Research, 71(2): 318-344.
  • Cross, R., 1999, “Four-dimensionalism and Identity Across Time: Henry of Ghent vs. Bonaventure”, Journal of the History of Philosophy 37: 393–414.
  • Davidson, M., 2014, “Special Relativity and the Intrinsicality of Shape”, Analysis, 74, 57-58.
  • Donnelly, M., 2016, “Three-Dimensionalism”, in Davis, M. (ed.), Oxford Handbook of Philosophy Online, Oxford University Press.
  • Dummett, M., 1975, “Wang’s Paradox”, Synthese, 30: 301–24.
  • Ehring, D., 1997, Causation and Persistence, New York: Oxford University Press.
  • Ehring, D., 2002, “Spatial Relations between Universals”, Australasian Journal of Philosophy, 80(1): 17–23.
  • Fine, K., 1999, “Things and their Parts”, Midwest Studies in Philosophy 23 (1), 61-74.
  • Fine, K., 2006, “In Defense of Three-Dimensionalism”, The Journal of Philosophy, 103 (12): 699–714.
  • Gallois, A., 1998, Occasions of Identity, Oxford: Clarendon Press.
  • Galton, A. P., 2006, “Processes as continuants”, in J. Pustejovsky & P. Revesz (eds.), 13th International Symposium on Temporal Representation and Reasoning (TIME 2006: 187), Los Alamitos, CA: IEEE Computer Society.
  • Galton, A. P., and Mizoguchi, R., 2009, “The water falls but the waterfall does not fall: New perspectives on objects, processes and events”, Applied Ontology, 4 (2): 71-107.
  • Geach, P. T., 1972, Logic Matters, Oxford: Blackwell.
  • Geach, P. T., 1980, Reference and Generality, Ithaca, NY: Cornell University Press.
  • Gibbard, A., 1975, “Contingent Identity”, Journal of Philosophical Logic, 4(2):187-221.
  • Gibson, I. and Pooley, O., 2006, “Relativistic Persistence”, Philosophical Perspectives, 20 (1): 157-198.
  • Gilmore, C., 2006, “Where in the Relativistic World Are We?”, Philosophical Perspectives, 20: 199–236.
  • Gilmore, C., 2007, “Time Travel, Coinciding Objects, and Persistence,” in Dean Zimmerman, ed., Oxford Studies in Metaphysics, vol. 3, New York: Oxford University Press, pp. 177–98.
  • Gilmore, C., 2008, “Persistence and Location in Relativistic Spacetime”, Philosophy Compass, 3.6: 1224–1254
  • Gilmore, C., Costa, D., and Calosi, C., 2016, “Relativity and Three Four-Dimensionalisms”. Philosophy Compass 11, no. 2: 102–120.
  • Goodman, N., 1951, The Structure of Appearance, Cambridge (MA): Harvard University Press.
  • Griffin, N., 1977, Relative Identity, New York: Oxford University Press.
  • Hacker, P. M. S., 1982, “Events, Ontology and Grammar”, Philosophy, 57:477–486; reprinted in Events, pp. 79–88.
  • Haslanger, S., 1989, “Endurance and Temporary Intrinsics”, Analysis, 49: 119–25.
  • Haslanger, S., 1994, “Humean Supervenience and Enduring Things”, Australasian Journal of Philosophy 72, 339-59.
  • Haslanger, S., 2003, “Persistence Through Time”, in Loux, M. and Zimmerman, D. (eds.) The Oxford Handbook of Metaphysics, Oxford: Oxford University Press
  • Hawley, K., 1999, “Persistence and Non-Supervenient Relations”, Mind, 108: 53–67.
  • Hawley, K., 2001, How Things Persist, Oxford: Oxford University Press.
  • Hawthorne, J. and G. Uzquiano, 2011, “How Many Angels Can Dance on the Point of a Needle? Transcendental Theology Meets Modal Metaphysics”, Mind, 120: 53–81.
  • Heller, M., 1984, “Temporal Parts of Four Dimensional Objects”, Philosophical Studies, 46: 323-34.
  • Heller, M., 1990, The Ontology of Physical Objects, Cambridge: Cambridge University Press.
  • Helm, P., 1979, “Jonathan Edwards and the Doctrine of Temporal Parts”, Archiv für Geschichte der Philosophie, 61: 37–51.
  • Hinchliff, M., 1996, “The Puzzle of Change”, Philosophical Perspectives, 10: 119-136.
  • Hirsch, E., 2007, “Physical-object ontology, verbal disputes, and common sense”, Philosophy and Phenomenological Research 70(1), 67-97.
  • Hofweber, T. and D. Velleman, 2011, “How to Endure”, Philosophical Quarterly, 61: 37–57.
  • Hofweber, T., & Lange, M., 2017, “Fine’s fragmentalist interpretation of special relativity” Nous, 51(4), 871–883.
  • Hudson, H., 2000, “Universalism, Four-Dimensionalism and Vagueness”, Philosophy and Phenomenological Research, 60: 547–60.
  • Hudson, H., 2005, The Metaphysics of Hyperspace, Oxford: Oxford University Press.
  • Hudson, H., 2006, “Simple Statues”, Philo 9: 40-46.
  • Huemer, M. and Kovitz, B., 2003, “Causation as simultaneous and continuous”, The Philosophical Quarterly, 53: 556–565.
  • Johnston, M., 1987, “Is There a Problem about Persistence?”, Proceedings of the Aristotelian Society, Supplementary Volume 61: 107-135.
  • Jubien, M., 1993, Ontology, Modality, and the Fallacy of Reference, Cambridge: Cambridge University Press.
  • Kant, I., 1965, Critique of Pure Reason, orig. 1781, trans. N. Kemp Smith, New York: Macmillan Press.
  • Kleinschmidt, S., 2017, “Refining Four-Dimensionalism”, Synthese, 194: 4623-4640.
  • Koslicki, K., 2008, The Structure of Objects, Oxford: Oxford University Press.
  • Leonard, M., 2018, “Enduring Through Gunk”, Erkenntnis, 83: 753-771.
  • Le Poidevin, R., 1991, Change, Cause and Contradiction, Basingstoke: Macmillan.
  • Lewis, D. K., 1986, On the Plurality of Worlds, Oxford: Blackwell.
  • Lewis, D. K., 1988, “Re-arrangement of Particles: Reply to Lowe”, Analysis, 48: 65–72.
  • Lewis, D., 1976, “Survival and Identity”, in Amelie Rorty (ed.), The Identities of Persons, Berkeley, CA: University of California Press, 117–40. Reprinted with significant postscripts in Lewis’s Philosophical Papers vol. I, Oxford: Oxford University Press.
  • Lombard, L. B., 1986, Events: A Metaphysical Study, London: Routledge.
  • Lombard, L. B., 1994, “The Doctrine of Temporal Parts and the ‘No-Change’ Objection”, Philosophy and Phenomenological Research, 54.2: 365–72.
  • Lombard, L. B., 1999, “On the alleged incompatibility of presentism and temporal parts”, Philosophia, 27 (1-2): 253-260.
  • Lowe, E. J., 1987, “Lewis on Perdurance versus Endurance”, Analysis, 47: 152–4.
  • Lowe, E. J., 1988, “The Problems of Intrinsic Change: Rejoinder to Lewis”, Analysis, 48: 72-7.
  • Lowe, E. J., 1995, “Coinciding objects: in defence of the ‘standard account’”, Analysis, 55(3), 171–178.
  • Mackie, P., 2008, “Coincidence and Identity”, Royal Institute of Philosophy Supplement, 62: 151-176.
  • Markosian, N., 1998, “Brutal Composition”, Philosophical Studies, 92(3): 211-249.
  • Markosian, N., 2004, “Simples, Stuff and Simple People”, The Monist, 87: 405-428.
  • McCall, S. and Lowe, E. J., 2006, “The 3D/4D controversy: a storm in a teacup”, Noûs, 40(3): 570-578.
  • McDaniel, K., 2003, “Against MaxCon Simples”, Australasian Journal of Philosophy, 81: 265-275.
  • McDaniel, K., 2007a, “Brutal Simples”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, 3: 233–265.
  • McDaniel, K., 2007b, “Extended Simples”, Philosophical Studies, 133: 131–141.
  • McTaggart, J. M. E., 1921, The Nature of Existence, I, Cambridge: Cambridge University Press.
  • McTaggart, J. M. E., 1927, The Nature of Existence, II, Cambridge: Cambridge University Press.
  • Mellor, D. H., 1981, Real Time, Cambridge: Cambridge University Press.
  • Mellor, D. H., 1998, Real Time II, London: Routledge.
  • Merricks, T., 1994, “Endurance and Indiscernibility”, Journal of Philosophy, 91: 165–84.
  • Merricks, T., 1995, “On the incompatibility of enduring and perduring entities”, Mind, 104 (415): 521-531.
  • Merricks, T., 1999, “Persistence, Parts, and Presentism”, Noûs 33, 421-438.
  • Miller, K., 2004, “Enduring Special Relativity”, Southern Journal of Philosophy 42, 349-70.
  • Miller, K., 2005, “The Metaphysical Equivalence of Three and Four Dimensionalism”, Erkenntnis 62 (1), 91-117.
  • Noonan, H., 1999, “Identity, Constitution and Microphysical Supervenience”, Proceedings of the Aristotelian Society, 99: 273-288.
  • Noonan, H. and Curtis, B., 2018, “Identity”, The Stanford Encyclopedia of Philosophy (Summer 2018 Edition), Edward N. Zalta (ed.).
  • Oderberg, D., 1993, The Metaphysics of Identity over Time. London/New York: Macmillan/St Martin’s Press.
  • Oderberg, D., 2004, “Temporal Parts and the Possibility of Change”, Philosophy and Phenomenological Research, 69.3: 686–703.
  • Parsons, J., 2000, “Must a Four-Dimensionalist Believe in Temporal Parts?” The Monist, 83(3): 399–418.
  • Parsons, J., 2007, “Theories of Location”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, pp. 201-32.
  • Pashby, T., 2013, “Do Quantum Objects Have Temporal Parts?”, Philosophy of Science 80(5), 1137-47.
  • Pashby, T., 2016, “How Do Things Persist? Location in Physics and the Metaphysics of Persistence”, Dialectica 70(3), 269-309.
  • Quine, W. V. O., 1953, “Identity, Ostension and Hypostasis”, in his From a Logical Point of View, Cambridge, MA: Harvard University Press, 65–79.
  • Quine, W. V. O., 1960, Word and Object, Cambridge, Mass.: MIT Press.
  • Quine, W. V. O., 1970, Philosophy of Logic, Englewood Cliffs (NJ): Prentice-Hall.
  • Quine, W. V. O., 1981, Theories and Things, Cambridge, MA: Harvard University Press.
  • Rea, M., 1995, “The Problem of Material Constitution”, Philosophical Review, 104: 525–52.
  • Rea, M., (ed.), 1997, Material Constitution, Lanham, MD: Rowman & Littlefield.
  • Rea, M., 1998, “Temporal Parts Unmotivated”, Philosophical Review, 107: 225–60.
  • Rosen, G. and Dorr, C., 2002, “Composition as Fiction”, in Gale, R., (ed.), The Blackwell Guide to Metaphysics, Oxford: Blackwell, pp. 151-174.
  • Russell, B., 1914, Our Knowledge of the External World, London: Allen & Unwin Ltd.
  • Russell, B., 1923, “Vagueness”, Australasian Journal of Philosophy and Psychology, 1: 84–92.
  • Russell, B., 1927, The Analysis of Matter, New York: Harcourt, Brace & Company.
  • Sattig, T., 2006, The Language and Reality of Time, Oxford: Oxford University Press.
  • Saucedo, R., 2011, “Parthood and Location”, in K. Bennett and D. Zimmerman (eds.), Oxford Studies in Metaphysics, 6: 223–284.
  • Schaffer, J., 2009, “On What Grounds What”, in Chalmers, D., D. Manley, and R. Wasserman (eds.), Metametaphysics, pp. 347–383.
  • Sedley, D., 1982, “The Stoic Criterion of Identity”, Phronesis, 27 (3): 255-275.
  • Shoemaker, S., 1988, “On What There Are”, Philosophical Topics, 16: 201-23.
  • Sider, T., 1996, “All the World’s a Stage”, Australasian Journal of Philosophy, 74: 433–53.
  • Sider, T., 2001, Four-Dimensionalism, Oxford: Oxford University Press.
  • Sider, T., 2007, “Parthood”, The Philosophical Review, 116: 51–91
  • Sider, T., 2013, “Against Parthood”, in Bennett, K. and Zimmerman, D.W., (ed.), Oxford Studies in Metaphysics, vol. 8, Oxford: Oxford University Press, pp. 237-93.
  • Simons, P., 1987, Parts: A Study in Ontology, Oxford: Clarendon Press.
  • Simons, P., 2000a, “How to Exist at a Time When You Have No Temporal Parts,” The Monist, 83 (3): 419–36.
  • Simons, P., 2000b, “Continuants and Occurrents”, Proceedings of the Aristotelian Society, Supplementary Volume 74: 59–75.
  • Simons, P., 2004, “Extended Simples: A Third Way Between Atoms and Gunk”, The Monist, 87: 371-84.
  • Smart, J. J. C., 1963, Philosophy and Scientific Realism, London: Routledge & Kegan Paul.
  • Smart, J. J. C., 1972, “Space-Time and Individuals”, in Richard Rudner and Israel Scheffler, eds., Logic and Art: Essays in Honor of Nelson Goodman, New York: Macmillan Publishing Company, pp. 3–20.
  • Steen, K. I., 2010, “Three-Dimensionalist Semantic Solution to the Problem of Vagueness”, Philosophical Studies, 150 (1): 79-96.
  • Steward, H., 2013, “Processes, Continuants, and Individuals”, Mind, 122 (487): 781-812.
  • Stout, R., 1997, “Processes”, Philosophy, 72: 19–27.
  • Stout, R., 2016, “The Category of Occurrent Continuants”, Mind, 125 (497): 41-62.
  • Strawson, P. F., 1959, Individuals: An Essay in Descriptive Metaphysics, London: Methuen.
  • Thomson, J. J., 1983, “Parthood and Identity Across Time”, Journal of Philosophy, 80: 201–20.
  • Thomson, J. J., 1998, “The Statue and the Clay”, Noûs, 32: 148–73.
  • Torre, S., 2015, “Restricted Diachronic Composition and Special Relativity”, British Journal for the Philosophy of Science, 66(2), 235-55.
  • van Fraassen, B. C., 1970, An Introduction to the Philosophy of Time and Space, Columbia University Press.
  • van Inwagen, P., 1981, “The Doctrine of Arbitrary Undetached Parts”, Pacific Philosophical Quarterly, 62: 123–137.
  • van Inwagen, P., 1988, “How to Reason about Vague Objects”, Philosophical Topics, 16: 255–84.
  • van Inwagen, P., 1990a, “Four-Dimensional Objects”, Noûs, 24 (2): 245-255.
  • van Inwagen, P., 1990b, Material Beings, Ithaca, NY: Cornell University Press.
  • van Inwagen, P., 2000, “Temporal parts and identity through time”, Monist, 83 (3): 437-459.
  • Varzi, A., 2003, “Naming the Stages”, Dialectica, 57: 387–412.
  • Varzi, A., 2007, “Promiscuous Endurantism and Diachronic Vagueness”, American Philosophical Quarterly, 44: 181-189.
  • Varzi, A., 2016, “Mereology”, The Stanford Encyclopedia of Philosophy (Spring 2019 Edition), Edward N. Zalta (ed.).
  • Viebahn, E., 2013, “Counting Stages”, Australasian Journal of Philosophy, 91: 311-324.
  • Wasserman, R., 2016, “Theories of Persistence”, Philosophical Studies, 173: 243-50.
  • Whitehead, A. N., 1920, The Concept of Nature, Cambridge: Cambridge University Press.
  • Wiggins, D., 1968, “On Being in the Same Place at the Same Time”, Philosophical Review, 77: 90–5.
  • Wiggins, D., 1979, “Mereological Essentialism”, Grazer Philosophische Studien, 7: 297-315.
  • Wiggins, D., 1980, Sameness and Substance, Oxford: Basil Blackwell.
  • Wilson, J. M., 2013, “A determinable-based account of metaphysical indeterminacy”, Inquiry, 56: 359-385.
  • Wilson, J., Calosi, C., 2019, “Quantum metaphysical indeterminacy”, Philosophical Studies, 176: 2599–2627.
  • Zimmerman, D., 1996, “Could Extended Objects Be Made Out of Simple Parts? An Argument for ‘Atomless Gunk’”, Philosophy and Phenomenological Research, 56 (1): 1–29.
  • Zimmerman, D., 2006, Oxford Studies in Metaphysics, Volume 2, New York: Oxford University Press.


Author Information

Damiano Costa
Email: damiano.costa@usi.ch
University of Italian Switzerland (Università della Svizzera italiana, University of Lugano)
Switzerland

Associationism in the Philosophy of Mind

Association dominated theorizing about the mind in the English-speaking world from the early eighteenth century through the mid-twentieth and remained an important concept into the twenty-first. This endurance across centuries and intellectual traditions means that it has manifested in many different ways in different views of mind. The basic idea, though, has been constant: Some psychological states come together more easily than others, and one factor in explaining this connection is prior pairing.

Authors sometimes trace the idea back to Aristotle’s brief discussion of memory and recollection. Association got its name—“the association of ideas”—in 1700, in John Locke’s Essay Concerning Human Understanding. British empiricists following Locke picked up the concept and built it into a general explanation of thought. In the resulting associationist tradition, association was a relation between imagistic “ideas” in the trains of conscious thought. The rise of behaviorism in the early twentieth century brought with it a reformulation of the concept. Behaviorists treated association as a link between physical stimuli and motor responses, omitting any intervening “mentalistic” processes. However, they still treated association just as centrally as the empiricist associationists. In later twentieth-century and early twenty-first-century work, association is variously treated as a relation between functionally defined representational mental states such as concepts, “subrepresentational” states (in connectionism), and activity in parts of the brain such as neurons, neural circuits, or brain regions. As a relation between representational states, association is viewed as one process among many in the mind; however, as a relation between subrepresentational or neural activities, it is again often viewed as a general explanation of thought.

Given this variety of theoretical contexts, associationism is better viewed as an orientation or research program rather than as a theory or collection of related theories. Nonetheless, there are several shared themes. First, there is a shared interest in sequences of psychological states. Second, though the laws of association vary considerably, association by contiguity has been a constant. The idea of association by contiguity is that each pairing of psychological states strengthens the association between them, increasing the ease with which the second state follows the first. In its simplest form, this can be thought of as akin to a footpath: Each use beats and strengthens the path. Third, this carries with it a more general emphasis on learning and a tendency to posit minimal innate cognitive structure.

The term “association” can refer to the sequences of thoughts themselves, to some underlying connection or disposition to sequence, or to the principle or learning process by which these connections are formed. This article uses the term to refer to underlying connections unless otherwise specified, as this is the most common use and the one that unites the others.

This article traces these themes as they developed over the years by presenting the views of central historical figures in each era, focusing specifically on their conception of the associative relation and how it operates in the mind.

Table of Contents

  1. The Empiricist Heyday (1700-1870s)
    1. John Locke (1632-1704)
    2. David Hume (1711-1776)
    3. David Hartley (1705-1757)
    4. The Scottish School: Reid, Stewart, and Hamilton
    5. Thomas Brown (1778-1820)
    6. James Mill (1773-1836) and John Stuart Mill (1806-1873)
    7. Alexander Bain (1818-1903)
    8. Themes and Lessons
  2. Fractures in Associationism (1870s-1910s)
    1. Herbert Spencer (1820-1903)
    2. Early Experimentalists: Galton, Ebbinghaus, and Wundt
    3. William James (1842-1910)
    4. Mary Whiton Calkins (1863-1930)
    5. Sigmund Freud (1856-1939)
    6. G. F. Stout (1860-1944)
    7. Themes and Lessons
  3. Behaviorism (1910s-1950s)
    1. Precursors: Pavlov, Thorndike, and Morgan
    2. John B. Watson (1878-1958)
    3. Edward S. Robinson (1893-1937)
    4. B. F. Skinner (1904-1990)
    5. Edwin Guthrie (1886-1959)
    6. Themes and Lessons
  4. After the Cognitive Revolution (1950s-2000s)
    1. Semantic Networks
    2. Associative Learning and the Rescorla-Wagner Model
    3. Connectionism
  5. Ongoing Philosophical Discussion (2000s-2020s)
    1. Dual-Process Theories and Implicit Bias
    2. The Association/Cognition Distinction
  6. Conclusion
  7. References and Further Reading

1. The Empiricist Heyday (1700-1870s)

Associationism as a general philosophy of mind arguably reached its pinnacle in the work of the British Empiricists. These authors were explicit in their view of association as the key explanatory principle of the mind. Associationism also had a massive impact across the intellectual landscape of Britain in this era, influencing, for instance, ethics (through Reverend John Gay, Hume, and John Stuart Mill), literature, and poetry (see Richardson 2001).

Association in this tradition was called upon to solve two problems. The first was to explain the sequence of states in the conscious mind. The thought here is that the sequences exhibit reliable patterns, and these patterns were what the “laws of association” were meant to capture. The basic procedure was, first, to identify sequences or patterns in sequence. Hobbes’s discussion of “mental discourse” demonstrates this interest, inspiring later associationist theories of mind and providing a famous example:

For in a discourse of our present civil war, what could seem more impertinent than to ask (as one did) what was the value of a Roman penny? Yet the coherence to me was manifest enough. For the thought of the war introduced the thought of the delivering up the king to his enemies; the thought of that brought in the thought of the delivering up of Christ; and that again the thought of the 30 pence which was the price of treason; and thence easily followed that malicious question; and all this in a moment of time, for thought is quick. (Leviathan, chapter 3)

Once the sequences have been identified, the next step is to classify them by the relations between their elements. For example, two ideas may be related by having been frequently paired, or may be similar in some way. (This section and the next use “suggestion” to refer to particular incidents of sequence and “association” to refer to the underlying disposition.) The second problem was to explain the generation of “complex” ideas out of “simple” ideas, the latter often viewed as a kind of psychological atom. The empiricist project requires explaining how all knowledge could be generated from experience, and association was perhaps the most common way of doing so, though it was not universal.

Associationist authors then show how associations of the various sorts that they posit can or cannot explain various phenomena. For example, belief may be treated as simply a strong association. Abilities like memory, imagination, or even sometimes reason can be treated as simply different kinds of associative sequence. As empiricists, most eschew innate knowledge and tend to limit innate mental structure relative to competing traditions, though the claim that the mind is truly a blank slate would be an oversimplification. Their opponents in the Scottish school, for example, treat each of these as manifesting distinct, innate faculties, and posit innate beliefs in the form of “principles of common sense.”

a. John Locke (1632-1704)

John Locke laid the groundwork for empiricist associationism and coined the term “association of ideas” in a chapter he added to the fourth edition of his Essay Concerning Human Understanding (1700). He sets up the Cartesian notion of innate ideas as a primary opponent and asserts that experience can be the only source of ideas, through two “fountains” (book 2, chapter 1): “sensation,” or experience of the outside world, and “reflection,” or experience of the internal operations of our mind. He distinguishes between “simple” ideas, such as the idea of a particular color, or of solidity, and “complex” ideas, such as the idea of beauty or of an army. Simple ideas are “received” in experience through sensation or reflection. Complex ideas, on the other hand, are created in the mind by combining two or more simple ideas into a compound.

In his chapter on association of ideas (book 2, chapter 33), Locke emphasizes the ways that different ideas come together. As he puts it:

Some of our ideas have a natural correspondence and connexion one with another: it is the office and excellency of our reason to trace these . . . Besides this, there is another connexion of ideas wholly owing to chance or custom. Ideas that in themselves are not all of kin, come to be so united in some men’s minds, that . . . the one no sooner at any time comes into the understanding, but its associate appears with it.

His discussion in this chapter focuses on the connections based on chance or custom and describes them as the root of madness. Associations as described here are formed by prior pairings and strengthened passively as habitual actions or lines of thought are repeated.

Thus, despite the significance of his work in setting the stage for later associationists, Locke does not treat association as explaining the mind in general. He treats it as a failure to reason properly, and his interest in it is not only explanatory but normative. For these reasons, some have questioned whether one ought to treat Locke as an associationist, on the grounds that associationists viewed association as the central explanatory posit in the mind (for example, see Tabb 2019). Where one lands on this question seems to depend on the use of the term. After all, Locke’s description of the formation of complex ideas by combining simple ideas was counted as a kind of association by many later associationists. The key, for Locke, is that association is a passive process, while the mind is more active in other processes. The passive nature of association will return as a criticism of associationism; see also Hoeldtke (1967) for a discussion of the history of this line of thought in British psychiatry.

b. David Hume (1711-1776)

David Hume presented arguably the first attempt to understand thought generally in associative terms. He first lays out these views in A Treatise of Human Nature (1739) and then reiterates them in An Enquiry Concerning Human Understanding (1748). According to Hume, the trains of thought are made up of ideas, which are basically images in the mind. Simple ideas, such as a specific color, taste, or smell, are copies of sensory impressions. Thoughts in general are built from these simple ideas by association.

He begins his discussion of association in the Enquiry:

It is evident that there is a principle of connexion between the different thoughts or ideas of the mind, and that, in their appearance to the memory or imagination, they introduce each other with a certain degree of method and regularity. (Enquiry, section III)

His use of the term is not limited to irrationality and madness, as Locke’s was, but it is applied to the trains of thought generally. He questions what relations might explain the observed regularities and claims that there are three: resemblance, contiguity in time or place, and cause or effect. He mentions contrast or contrariety as another candidate in a footnote (footnote 4, section III), but rejects it, arguing it is a mixture of causation and resemblance. Association also explains the combination of simple ideas into complex ideas.

Hume’s inclusion of “cause or effect” as one of the primary categories of association might be thought incongruous with his general view on causality. While the best understanding of association by cause or effect has been controversial, Hume treats it as an independent principle of association, and it can be understood as such, and not, for example, as just a strong association by contiguity. He argues that we gain the impression of a causal power by coming to expect, in the imagination, the effect with the presentation of the cause. As a general matter, he suggests that we cannot feel the relations between sequential ideas, but we can uncover them with imagination and reasoning, though these relations may be different from the factors responsible for association.

Just how generally Hume applied his conception of association may also be subject to interpretation. On the one hand, his discussions of induction, probability, and miracles in the Enquiries suggest that he views association, or habit, as the sole basis of our reasoning about the world, and as such, a normatively adequate means for doing so. On the other hand, he arguably posits several other principles of mind throughout his work. For example, he often treats the imagination as a separate capacity, and he discusses several moral sentiments that would seem to require separate principles. He also expresses uncertainty about the completeness of his list of laws of association. Moreover, he characteristically avoids claims about the ultimate foundation of human nature. In the Treatise, he says: “as to its [association’s] causes, they are mostly unknown, and must be resolv’d into original qualities of human nature which I pretend not to explain” (pg. 13). It may be that, despite its centrality in his philosophy, Hume did not view association as a bedrock principle or cause of thought, though that view later became common, due in large part to the work of David Hartley.

c. David Hartley (1705-1757)

Hartley’s Observations on Man (1749) was published just after Hume’s Enquiry, though he claimed to have been thinking about the power of association for about 18 years. Hartley’s discussion of association is more focused and sustained than Hume’s because of his explicitly programmatic goals. Following Newton’s axiomatization of physics, Hartley sought to axiomatize psychology on the twin principles of association and vibration. Vibrations, in Hartley’s system, are the physiological counterpart of associations. As association carries the mind from idea to idea, vibrations in the nerves carry sensations to the brain and through it. He references physical vibrations as causing mental associations (pg. 6), but then expresses dissatisfaction with this framing and uncertainty about the exact association-vibration relation (pp. 33-34).

The idea is that external stimuli act on nerves, inciting infinitesimally small vibrations in invisible particles of the nerve. These vibrations travel up the nerves, and upon reaching the brain, cause our experience of sensations. If a particular frequency or pattern of vibration is repeated, the brain gains the ability to incite new vibrations like them. This is, effectively, storing a copy of the idea for later thought. These ‘ideas of sensation’ are the elements from which all others are built. Ideas become associated when they are presented at the same time or in immediate succession, meaning that the first idea will bring the second to mind, and, correspondingly, their vibrations in the brain will follow in sequence.

Hartley, like Hume, viewed association as both the principle by which ideas came to follow one another and by which simple ideas were combined into complex ideas: A complex idea is the end point of the process of strengthening associations between simple ideas. Unlike Hume, though, Hartley only posited association by contiguity and did not allow for any other laws of association.

He was also, as noted, explicit in his goal of capturing psychology with the principle. He argues that supposed faculties like memory, imagination, and dreaming, as well as emotional capacities like sympathy, are merely applications of the associative principle. He also emphasized associations between sensations, ideas, and motor responses. For instance, the tuning of motor responses by association explains how we get better at skilled activities with practice. He recognizes that the resulting picture is a mechanical picture, but he does not see this as incompatible with free will, appropriately conceived.

Hartley’s most important contribution is the very project of describing an entire psychology in associative terms. This animated the associationist tradition for the next hundred years or so. In setting up his picture, he was also the first to connect association to physiological mechanisms. This became important in the work of the later empiricist associationists, and in reformulations of associative views after the cognitive revolution discussed in section 4.

d. The Scottish School: Reid, Stewart, and Hamilton

The Scottish Common Sense School, led by Thomas Reid (1710-1796) and subsequently Dugald Stewart (1753-1828) and William Hamilton (1788-1856), was the main competition to associationism in Britain. Their views are instructive in articulating the role and limits of the concept, as well as in setting up Brown’s associationism, discussed below. The Scottish School differed from the associationists in two main ways. Firstly, they took humans to be born with innate knowledge, which Reid called “principles of common sense.” Secondly, they argued for a faculty psychology: They took the mind to be endowed with a collection of distinct “powers” or capacities such as memory, imagination, conception, and judgment. The associationists, in contrast, usually treated these as different manifestations of the single principle of association. Nevertheless, the Scottish School did provide a role for associations.

Reid takes the train of conscious thoughts to be an aggregate effect of the perhaps numerous faculties active at any given time (Essays on the Intellectual Powers of Man, Essay IV, chapter IV). He does allow that frequently repeated trains might become habitual. He treats habit, then, as another faculty that makes these sequences easier to repeat. Associations, or dispositions for certain trains to repeat, are an effect of the causally prior faculty of habit.

Stewart reverses the causal order between association and habit (see Mortera 2005). For Stewart, association is a distinct operation of the mind, which produces mental habits. Association plays a more important role in his system than in Reid’s. He does retain other mental faculties, though, which are responsible for at least the first appearance of any particular sequence in thought. The mistake the associationists make, on his view, is in thinking that they have traced all mental phenomena to a single principle (1855, pp. 11-12). He admits it is possible that philosophers may someday discover the ultimate principle of psychology but doubts that the associationists have done so. Stewart is responding specifically to Joseph Priestley, who edited a famous abridged edition of Hartley’s work.

William Hamilton’s contributions to the concept of association are less direct. He provides the first history of the concept of association of ideas in his notes on The Works of Thomas Reid (1872, Supplemental Dissertation D). Hamilton’s own views also inspired later work by John Stuart Mill in his Examination of Sir William Hamilton’s Philosophy (1878).

e. Thomas Brown (1778-1820)

Thomas Brown occupies a unique position in the history of associationism. His main work, Lectures on the Philosophy of the Human Mind (1820), was published after his death at the age of 43. On the one hand, he is a student of the Scottish School, having studied under Dugald Stewart. On the other hand, he was an ardent associationist, reducing all of the supposed faculties to association. Brown explicitly casts his project as one of identifying and classifying the sequences of “feelings” in the mind, which was his general term for mental states, including ideas, emotions, and sensations.

Arguably, his philosophy of mind is more Humean than Hume’s, in that he extends Hume’s arguments against necessary connections between cause and effect in the world to the mind. He argues that an association is not a “link” between ideas that explains their sequence; it is the sequence itself. The idea of an associative link is vacuous and explanatorily circular. Brown actually argues for the term “suggestion” over “association,” though he uses the terms interchangeably when he fears no misinterpretation (Lecture 40). He differentiates two kinds of suggestion: simple suggestion, in which feelings simply follow in sequence, and relative suggestion, in which the relationship between sequential ideas is felt as well. Simple suggestion is responsible for capacities like memory and imagination, while relative suggestion allows capacities like reason and judgment.

Brown also differs from the standard associationist picture in that he, like Reid, embraces innate knowledge, which he calls “intuitive beliefs.” His prime example is belief in personal identity over time. Another is that “like follows like,” which can serve as the basis for the associating principle. He expresses an expectation that all associations will eventually be shown to be instances of association by contiguity, but does not think this has been shown yet. He thus finds it best to “avail ourselves of the most obvious categories” of contiguity, similarity, and contrast (Lecture 35).

Brown introduces several “secondary” laws of association, which can help predict which of any particular associations are likely to be followed in any given case (Lecture 37). He lists nine, including liveliness of feelings associated, frequency with which they had paired, recency, and differences arising from emotional context. While members of subsequent lists changed, the introduction of secondary laws of association may have been Brown’s most enduring legacy.

In common with those associationists above, Brown emphasizes a role for association in the formation of complex ideas out of simple ideas. However, he views ideas as states of the mind itself, not objects in the mind—a mistake he attributes primarily to Locke. As a result, he argues that it is metaphysically impossible that complex ideas are literally built of simple ideas, since the mind can only occupy one state at a time. He does argue that it is useful to think of simple ideas as continuing in a “virtual coexistence” in complex ideas, but the focus here is an historical/etiological story of how complex ideas came to be, rather than a literal decomposition.

Despite his idiosyncratic views, Brown identified his position as associationist, and it was accepted as such by the tradition. Though his work has been largely forgotten, it was very influential in the United Kingdom and United States in the years following its publication. Brown’s place in the associationist tradition strains standard interpretations of the tradition and what, if anything, unites it. After all, he denies the central associationist posit, the associative link, and allows innate knowledge.

f. James Mill (1773-1836) and John Stuart Mill (1806-1873)

James Mill’s view rivals Hartley’s as a candidate prototypical associationist picture of mind. Mill presents his views in his Analysis of the Phenomena of the Human Mind (originally published 1829, cited here from 1869; this edition includes comments from John Stuart Mill and Alexander Bain).

Like Hartley, James Mill argues that contiguity is the only law of association. Specifically, James Mill argues that similarity is just a kind of contiguity. The claim is that we are used to seeing similar objects together, as sheep tend to be found in a flock, and trees in a forest. In his editorial comments in the 1869 edition, John Stuart Mill calls this “perhaps the least successful attempt at a generalisation and simplification of the laws of mental phenomena, to be found in the work” (pg. 111). For his part, James Mill does not attribute much significance to the question, saying: “Whether the reader supposes that resemblance is, or is not, an original principle of association, will not affect our future investigations” (pg. 114).

In discussing the associative relation itself, James Mill distinguishes synchronous and successive association. Some stimuli are experienced simultaneously, as in those emanating from a single object, and others successively, as in a sequence of events. The resulting ideas are associated correspondingly. Synchronous ideas arise together and themselves constitute complex ideas. Thus, a complex idea, in James Mill’s system, is a literal composite of simpler ideas. Successively associated ideas will arise successively. Of successive association, James Mill remarks that it is not a causal relation, though he does not elaborate on what he means by this (pg. 81). He describes three different ways that the strength of an association can manifest: “First, when it is more permanent than another: Secondly, when it is performed with more certainty: Thirdly when it is performed with more facility” (pg. 82). Adapting some of Brown’s secondary laws, he argues that strength is caused by the vividness of the associated feelings and frequency of the association.

James Mill reduces the various “active” and “intellectual” powers of the mind to association. He limits his discussion of association to mental phenomena, though he recognizes the significance of physiology for motor movements and reflexes. For instance, conception, consciousness, and reflection simply refer to the train of conscious ideas itself. Memory and imagination are particular segments of the trains. Motives are associations between actions and positive or negative sensations which they produce. The will is also reduced to an association between various ideas and muscular movements. Thus, even the active powers are mechanistic. Belief is just a strong association. Ratiocination, as in syllogistic reasoning, simply chains associations. Consider the syllogism: “All men are animals: kings are men : therefore kings are animals” (pg. 424). This produces the compound association “kings – men – animals.” For James Mill, this compound association includes an intermediate that remains in place, but is simply passed over so quickly it can be imperceptible and appear to simply be “kings – animals”; much in the same way that complex ideas still include all of the simpler ideas. This sets up a noteworthy disagreement between James and his son, John Stuart Mill.

John Stuart Mill argues, against his father, that complex ideas are new entities, not mere aggregates of simple ideas, and that intermediate ideas can drop out of sequences like that above. In general, John Stuart Mill analogizes the association of ideas to a kind of chemistry, where a new compound has new properties separate from its constituent elements (A System of Logic, chapter IV). In James Mill’s view of association, ideas retain their identity in combination, like bricks in a wall.

John Stuart Mill’s views on association are spread through several texts (see Warren 1928, pp. 95-103, for a summary of his views), and his psychological aspirations are not as imperial or systematic as his father’s. This is evident partly in his lack of a sustained treatment, but also in the phenomena he does not attribute to association. For instance, he does not treat induction as an associative phenomenon, breaking with Hume (see A System of Logic). Similarly, breaking with his father, he does not view belief as simply a strong association, arguing that it must include some other irreducible element (notes in James Mill’s Analysis, pg. 404). When John Stuart Mill does allude to a systematic development of association, he usually defers to our next subject, Alexander Bain.

g. Alexander Bain (1818-1903)

Alexander Bain presents a sophisticated version of empiricist associationism. His main work on the topic comes in The Senses and the Intellect (originally published 1855, cited here from 3rd ed., 1868). Bain’s early work was developed and published with significant help from his close friend and mentor, J. S. Mill, and it became a standard text of the tradition.

Bain differs most from previous associationists in the role he grants to instincts. By “instincts,” he means reflex actions, basic coordinated movement patterns such as walking and simple vocalization, and the seeds of volition (the potential for spontaneous action). This discussion is unique, first, in that he separates these out from the domain explained by association. He takes instincts to be “primordial,” inborn, and unlearned. Second, he opens his text with a discussion of basic neuroanatomy and function and explains instincts largely by appeal to the structure of the nervous system and the flow of nervous energy. This discussion was aided in part by recent progress in physiology, but also by an avowed interest in bringing physiology and psychology in contact.

Bain, nonetheless, takes association to be the central explanatory principle for phenomena belonging to the intellect. By “intellect,” he has in mind phenomena one might call thought, such as learning, memory, reasoning, judgment, and imagination. When he switches to his discussion of the intellect, his physiological discussions drop out, and his method is entirely introspective. As Robert Young notes: “his work points two ways: forward to an experimental psychophysiology, and backward to the method of introspection” (1970, pg. 133).

Bain never makes any distinction between simple and complex ideas, and he discusses association in successive terms. He also does not restrict association to ideas and argues that the same principles can combine, sequence, and modify patterns of movement, emotions, sensations, and the instincts generally.

He admits three fundamental principles of association: similarity, contiguity, and contrast. Contiguity is the basic principle of memory and learning, while similarity is the basic principle of reasoning, judgment, and imagination. Nonetheless, the three are interdependent in complex ways. For instance, similarity is required for contiguity to be possible: we must recognize that the present sequence is similar enough to a former sequence for both to strengthen the same association by contiguity. The principle of contrast has a more complex role. On the one hand, it is fundamental to the stream of consciousness in the first place. We would not recognize changes in consciousness as changes without this principle, and we cannot be conscious of anything as something without recognizing that there is something else it is not: If red were the only color, we would simply not be conscious of color. The other principles would be impossible. Nonetheless, contrast can also drive sequences, but only when properly scaffolded by similarity or contiguity. Similarity is necessary for association by contrast because contrast is always within a kind, and similarity is necessary for recognition of that original kind; he notes, “we oppose a long road to a short road, we do not oppose a long road to a loud sound” (1868, pg. 567). In many particular cases, contrast can be driven by contiguity, as contrasting concepts are frequently paired: up and down, pain and pleasure, true and false, and so on. Experiences of contrast themselves, he notes, often arouse emotional responses, as in works of poetry and literature. In other work, however, Bain does not seem to find the question of whether contrast is a separate principle of association all that interesting, since transitions based on contrast are very rare, and many instances of contrast-based association are in fact based in contiguity (1887).

He discusses two other kinds of association: compound association and constructive association (in his first edition, he lists these as additional principles of association, but drops that categorization by the third). Compound association includes the ways associations can interact. For instance, if there are several features present that all remind us of a friend, all of those associative strengths can combine to make it more likely that we think of the friend. He groups imagination and creativity under “constructive association,” an active process of combining ideas, as in the formation of novel sentences.

h. Themes and Lessons

Surveying these views uncovers significant diversity, even among the “pure” associationists found in the empiricist tradition. Most abstractly, the authors differed in their metaphysics. Brown was an avowed dualist. Hartley expresses uncertainty on the mind/brain relation but posited a physiological counterpart to association. Hume and Reid refused to speculate on metaphysics. Precursors include George Berkeley, an idealist, and Thomas Hobbes, a materialist.

The topics of debate within associationism itself included, first, the proposed list of laws of association. While all of the authors mentioned took association by contiguity to be among them, Hume included resemblance and cause or effect, Brown and Bain included similarity and contrast, and Hartley and James Mill included no others. It is common to view associationism as defined by its reliance on association by contiguity. While contiguity was generally posited, this is an oversimplification: It misses not only the diversity in the laws posited, but also the attitude authors take towards those laws. Many central associationists, including Hume, Brown, James Mill, and Bain, either describe their classifications as provisional or express some willingness to defer on the question. Overall, Stewart’s discussion of the question of how far one traces the causal/explanatory thread captures the general situation. The starting point is observed sequences of conscious thought, and the question is how far one can go in finding the principles that explain those sequences.

Authors also disagreed on whether the process, force, or principle combining simple ideas into complex ideas (simultaneous association) was the same as that producing the sequences of ideas through the mind (successive association). All of the theorists discussed here accept successive association, while simultaneous association is more controversial. Brown disavows simultaneous association, while Bain simply ignores it. Even proponents of simultaneous association disagree on how it operates, as evidenced in John Stuart Mill’s disagreement with his father on “mental chemistry.” Questions like this, about how more complex ideas are formed, remain at issue (for example, see Fodor and Pylyshyn 1988 and Fodor 1998). The formation of abstract ideas was a particularly difficult version of this problem through much of the tradition; it is much easier to see how ideas formed through sensory impressions can refer to concrete objects. Simultaneous association could provide an answer according to which abstract ideas include all of the particulars, while others take abstract ideas to include only a particular feature, or merely a name for a feature, arrived at by, for instance, examining a feeling of similarity between two ideas of particulars.

Finally, there is disagreement in what psychological elements associations are supposed to hold between. Discussion of association often latches onto Locke’s term “association of ideas,” ignoring views that take stimuli and motor movements (most of the authors above, including arguably Locke himself as he describes a visual context improving a dance; Essay, book 2, chapter 33, section 16), reflexes, and instincts (Bain) to be associable in just the same way. Even when discussing association as a relation between ideas, there is disagreement on the nature of ideas and their relationship to mind. For instance, Brown criticizes Locke for treating ideas as independent objects in the mind, rather than states of the mental substance.

The diversity in associationist views suggests that associationism is better viewed as a research program with shared questions and methods, rather than a shared theory or set of theories (Dacey 2015). Such an approach makes better sense of similarities and differences in the views. Hume, Hartley, and James Mill make good prototypes for associationism, but one misses much if one takes any particular author to speak for the tradition as a whole.

2. Fractures in Associationism (1870s-1910s)

In the late nineteenth and early twentieth centuries, the associationist tradition began to fracture. Several factors combined to shape this overall trend. Important changes in the intellectual landscape included the arrival of evolutionary theory, the rise of experimental psychology—bringing with it psychology’s separation from philosophy as a field—and increasing understanding of neurophysiology. At the same time, several criticisms of the pure associationist philosophies became salient. Through this era, the basic conception of association was still largely preserved from the previous one: It is a relation between internal elements of consciousness. By this time, materialism had largely taken over, and most authors here view association as having some neural basis, even if association itself is a psychological relation.

Associationism fractured in this era because the trend was to disavow the general, purely associationist program described in the last section, even if authors still saw association as a central concept. Thus, while associationism lost a shared outlook and purpose, there was still much progress made in testing the possibilities and limits of the concept of association.

a. Herbert Spencer (1820-1903)

Herbert Spencer’s philosophy was framed by a systematic worldview that placed evolutionary progress, as he conceived it, at its core. His psychology was no different. His Principles of Psychology was first published in 1855, four years before On the Origin of Species, but was substantially rewritten by the third edition, which is the focus here (1880, cited here from 1899). By this point, the work had been folded into his System of Synthetic Philosophy, a ten-volume set treating everything from physics to psychology to social policy (Principles of Psychology became volumes 4 and 5). Spencer’s conception of evolution was quite different from later views. Firstly, Spencer believed in the inheritance of acquired traits. Secondly, and partly as a result of this, Spencer viewed evolution as a universal force for progress; species literally get better as they evolve.

The basic units of consciousness for Spencer are nervous shocks, or individual bursts of nervous activity. Thus, the atoms in his picture are much smaller than what we might usually call thoughts or ideas, and all of the psychological activities he describes are assumed to be localizable within the nervous system. Spencer distinguishes between “feelings” proper and relations between feelings. Feelings include what would previously have been called sensations and ideas, as well as emotions. They can exist in the mind independently. Relations are felt, in that they are present in consciousness, but they can only exist between two feelings. For instance, we might feel a relation of relative spatial position between objects in a room as we scan or imagine the scene. Both feelings and relations are associable.

The primary kind of association is that between a particular feeling and other members of its kind. Thus, similarity is the fundamental law of association, for feelings and relations alike. A particular experience of red will revive feelings corresponding to other experiences of red. Spencer seems to think that the resulting “assemblages” do not constitute new feelings, effectively siding with James Mill over John Stuart. “Revivability” varies with the vividness of the reviving feeling, the frequency with which feelings have occurred together, and with the general “vigor” of the nervous tissues. This last variable includes the particular claim that a long time spent contemplating one subject will deplete resources in the corresponding bits of brain tissue, making related ideas temporarily less revivable. Relations are generally more revivable, and so more associable, than feelings. Relations can, themselves, aggregate into classes, and revive members of the class. As a result, many relations may arise in mind between two feelings, though some, perhaps most, of these will pass too quickly to be noticed.

Spencer takes the laws of association to simply be manifestations of certain relations between feelings, which are actually associated based on similarity. For instance, he takes association by contiguity to be a relation of “likeness of relation in Time or in Space or in both” (267), which is just a kind of similarity. He does not seem to see any problem in making this claim, while still asserting frequency of co-occurrence as an independent law of revivability above. Moreover, when two feelings arrive in sequence in the mind, they are always mediated by at least two relations: one of difference, as the feelings must not be identical, and one of coexistence or sequence.

Spencer claims to have squared empiricist and rationalist views of mind using evolution (pg. 465). He combines the law of frequency with his view on the heritability of acquired traits to argue that associations learned by members of one generation can be passed on to the next. The empiricists are right that knowledge comes from learning, but the rationalists are right that we are individually born with certain frameworks for understanding the world. In early animals, simple reflexes were combined in this way to create more flexible instincts. Some relations in the world, like those of space and time, are so reliably encountered that their inner mental correspondents are fixed through evolutionary history. Thus, human beings are born with certain basic ideas, like those of space and time. The resulting view is one in which thought is structured by association, but associations are accrued across generations (see Warren 1928, pg. 132).

b. Early Experimentalists: Galton, Ebbinghaus, and Wundt

Francis Galton (1822-1911), Darwin’s polymath cousin, published the first experiments on association under the title Psychometric Experiments in 1879. He ran his experiments on himself; the method was to work through a list of 75 words, one by one, and record the thoughts each suggested and the time it took to form each associated thought clearly. He did so four different times, in different contexts at least a month apart. He reports 505 associations over 600 seconds total, for a rate of about 50 associations per minute. Of the 505 ideas formed, 289 were unique, with the rest repetitions. He emphasizes that this demonstrates how habitual associations are. He notes that ideas connected to memories from early in his life were more likely to be repeated across the four presentations of the relevant word. This he takes to show that older associations have achieved greater fixity.

Among his pioneering studies on memory, Hermann Ebbinghaus (1850-1909) tested the capacity for learning sequences of nonsense syllables, arguably the first test of the learning of associations (1885). He found, using himself as his subject, that the number of repetitions required to learn a sequence increased with the length of the sequence. He also found that rehearsing a sequence 24 hours before learning it to criterion brought savings in learning: fewer repetitions were needed in the later session. The savings increased with the number of prior rehearsal repetitions.
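Savings of this kind are standardly quantified as the proportional reduction in learning effort between sessions. The following minimal Python sketch illustrates that measure; the function name and repetition counts are illustrative assumptions, not Ebbinghaus’s own figures.

    # Standard "savings" measure: the percent of the original learning
    # effort saved on a later learning attempt. All values are made up.
    def savings_percent(original_reps: int, later_reps: int) -> float:
        """Percent of the original repetitions saved by prior rehearsal."""
        return 100.0 * (original_reps - later_reps) / original_reps

    # If a fresh sequence takes 30 repetitions to learn, but only 18 after
    # rehearsing it the day before, the rehearsal yielded a 40% savings.
    print(savings_percent(30, 18))  # 40.0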

Though the first experimental psychology labs were established in Germany, where the concept of association never reached the significance it had in Britain, association remained a target of early experiments, directly or indirectly (see Warren 1928, chapter 8 for a fuller survey; see also sections on Calkins and Thorndike below). These studies established association as a controllable, measurable target for experiment, even among those who did not subscribe to associationism as a general view. This role arguably sustained association as a central concept of psychology into the twenty-first century.

Wilhelm Wundt (1832-1920) provides perhaps the most complete theoretical treatment of association among the early experimentalists (1896, section 16; 1911, chapter 3). While association plays an important role in his system, he objects that associationists leave no place for the will among the passive processes of association. Thus, he distinguishes the passive process of combination he calls association from an active process of combination he calls “apperception.” These ideas were developed into structuralism in America by Wundt’s student E. B. Titchener.

c. William James (1842-1910)

William James is not generally considered an associationist, and he attacks the associationists at several points in his Principles of Psychology (originally published 1890, cited here from 1950). However, at the close of his chapter on association (chapter XIV), he professes to have maintained the body of the associationist psychology under a different framing. His framing is captured as follows:

Association, so far as the word stands for an effect, is between THINGS THOUGHT OF—it is THINGS, not ideas, which are associated in the mind. We ought to talk of the association of objects, not of the association of ideas. And so far as association stands for a cause, it is between processes in the brain—it is these which, by being associated in certain ways, determine what successive objects shall be thought. (pg. 554)

James notes here an ambiguity in the term “association”: that between association as an observed sequence of states in the conscious mind (an effect) and association as the causal process driving those sequences. His handling of each side of the ambiguity highlights, in turn, his major criticisms of the associationist psychologies before him.

His claim that we ought to talk of association of objects rather than association of ideas stems from his criticism of the associationist belief that the stream of consciousness is made up of discrete “ideas.” James shares with the associationists an emphasis on the stream of consciousness: He takes it to be the first phenomenon that psychology must analyze introspectively (chapter IX). However, his introspective analysis of the stream of consciousness reveals it to be too complicated to be broken up into ideas. There are two main reasons for this: First, he notes, ideas are standardly treated as particular entities that are repeatedly revived across time: My idea of “blue” is the same entity now as it was 5 years ago. In contrast, James notes that the totality of our conscious state is always varied. Some of these differences come from external conditions, such as the current illumination of a blue object, or different sounds present, temperatures, and so on. Other differences arise internally, including particular moods, varying emotional significance of a particular object, and previous thoughts fading away. He even suggests that organic differences in the brain, like blood flow, might influence our experience of some thought at different times.

His second concern is that consciousness does not present breaks, as one would expect when transitioning between discrete ideas. Rather, consciousness is continuous. Thoughts arise and fade, but they overlap, sometimes attitudes persist in the background, and he insists there is always a feeling present, even if some are transient and difficult to name. Thus, he prefers the term “streams of consciousness” to “trains of thought.”

The association of ideas presents a false view because conscious states are not discrete, and they are never revived in exactly the same way. Both mistakes share one major cause: the fact that we name and identify representational states by the objects that they represent. It is the common referent in the world that makes us think that the idea itself is the same each time, ignoring the nuance of particular experiences. Similarly, we focus on these ideas, ignoring the feelings that bridge them and persist through them. Thus, these problems are solved by shifting to association of objects. This, however, is just a description of the stream of consciousness, and cannot explain it.

James believes that looking at association as a brain process can explain the streams of thought while still respecting the nuances of consciousness just discussed. This claim depends on his view of habit, which he treats as a physiological, even generally physical, fact (chapter IV). Actions often repeated become easier. He explains that channels for nerve discharge become worn with use, just as a path is worn with use, or a paper creased in folding.

Thus, brain processes become associated in the sense that processes frequently repeated in sequence will tend to come in sequence. At any given moment, there are many processes operating behind a particular conscious state: Some processes will have to do with a thought we are considering, some with moods, some with emotional states, and some with ongoing perception as we think. Each of these will, in some way, contribute to the set of thoughts and feelings that come next. This, James held, could explain the various, multifaceted sequences of thought. The various feelings present are not literal “parts” of any conscious state, as in the common associationist picture of complex ideas. Even so, different feelings can potentially influence the direction of the stream of consciousness at any given point because each is attended by brain processes which are separable, and which actually direct the stream. This also allows active processes, like attention and interest, to contribute to guiding the stream of consciousness, even if they are, in effect, operating through habit.

A natural question would be how we know which of any candidate set of thoughts will come next. James discusses some factors much like Brown’s “secondary laws” above, including interest, recency, vividness, and emotional congruity. This is the question taken up by Mary Whiton Calkins.

d. Mary Whiton Calkins (1863-1930)

Mary Whiton Calkins was both the first woman president of the American Psychological Association and the first woman president of the American Philosophical Association. She was a student of James, and despite his enthusiastic support, she was refused her PhD from Harvard because of her gender. This did not prevent her from building an influential career and spending many years as a faculty member at Wellesley College. Her description of association in her textbook (1901) largely follows James’s. However, Calkins was much more interested in experimental methods than he was.

She was particularly interested in the question, “What one of the numberless images which might conceivably follow upon the present percept or image will actually be associated with it?” (1896, pg. 32), taking this to be the key to making concrete predictions about the stream of consciousness, and even perhaps to control problematic sequences. In so doing, she targets what had elsewhere been called the secondary laws: frequency, vividness, recency, and primacy. In a paired-associate memory task, she finds frequency to be by far the most significant factor. She finds this surprising, as she takes introspection to indicate that recency and vividness are just as important. She sees this result as significant for training and correcting associative sequences.

e. Sigmund Freud (1856-1939)

Sigmund Freud’s relationship to associationism is most evident in two aspects of his work. First, Freud outlined a thoroughly associationist picture of the mind and brain in his early and unpublished Project for a Scientific Psychology (written 1895, published posthumously in 1950). Second is his invention of the method of free association.

In the Project, Freud conceives of the nervous system as a network of discrete, but contacting, neurons, through which flows a nervous energy he calls “Q.” As neurons become “cathected” (filled) with Q, they eventually discharge to the next downstream neurons. The ultimate discharge of Q results in motor movements, which is how we actually release Q energy. In the central neurons, responsible for memory and thought, there is a resistance at the contact barrier. There is no such resistance at the barriers of sensory neurons. Learning occurs because frequent movements of Q through a barrier will lower its resistance. He identifies this as association by contiguity (pg. 319). Thus, the neurophysiological picture is also a psychological picture, and these basic processes are associative.

In addition, Freud adds two other systems. First is a class of neurons that respond to the period of activity in other neurons. These are able to track which perceptions are real and which are fantasy or hallucination, because stimuli coming in through the senses have characteristic periods. Second is the ego. In this work, the ego is simply a pattern of Q levels distributed across the entire network. By maintaining this distribution, the ego prevents any one neuron or area from becoming too heavily cathected with Q, which would result in erratic thought and action because of the resulting violent discharge. The role of the ego is thus inhibitory. Together, these additional systems control the underlying associative processes in ways that allow rational thought.

Freud never published this work and abandoned most of the details. Nonetheless, it arguably previews the basic underlying theories of much of his later work (as noted by the editor of the standard edition of Freud [Vol 1. pp. 290-292] and Kitcher 1992; see Sulloway 1979 for discussion of continental associationist influences on Freud). The thinking would go that breakdowns in rationality, as in dreaming or pathology, come when basic processes like association operate uncontrolled.

Regardless of exactly how it fits in his overall theoretical framework, his invention of the method of free association deserves note as well. Freud began using free association in the 1890s as an alternative to hypnosis. The patient would lie in a relaxed but waking state and simply discuss thoughts as they came freely to mind. The therapist would then analyze the sequence of thoughts and attempt to determine what unconscious thoughts or desires might be directing them. In later versions, patients are asked to keep in mind a starting point of interest or are presented a particular word or image to respond to. Free association was massively influential, and it remains the core psychoanalytic method (and has also been used in mapping semantic networks; see section 4). It also takes associative processes to operate in the unconscious, another view that would be revived later (see section 5).

f. G. F. Stout (1860-1944)

G. F. Stout continues the trend of criticizing associationism while allowing a significant role for association in his Manual of Psychology (1899). A prominent British philosopher and psychologist at the turn of the century, Stout taught, at different times, at Cambridge (including students G. E. Moore and Bertrand Russell), Aberdeen, Oxford, and St. Andrews. He accepts association as a valuable story for the reproduction of particular elements of consciousness, but he argues that there is an independent capacity for generating new elements. He specifically attacks John Stuart Mill and his analogy of mental chemistry (1899, book I, chapter III). According to Stout, Mill was right that complex ideas are not mere aggregates of simple ideas, but failed to recognize that this means that a new idea must be generated: The new idea had aggregates of associated simple ideas as precursors, not as parts—previewing the work of the Gestalt psychologists. He claims that Mill’s attempt to include the simple ideas in complex ideas as in chemical combination is a desperate attempt to save the theory from a fatal flaw.

Stout does grant association a significant part in the reproduction of ideas in the train of thought. There, as well, he provides a novel interpretation (book IV, chapter II). Specifically, he argues that association by contiguity should be rephrased as “contiguity of interest.” This means that only those elements that are interesting—at the time, based on goals, intentions, and other states—will be associated, and uninteresting elements will be dropped. He takes this to be the sole law of association. Apparent associations by similarity are in fact associations by contiguity of interest, because similar objects will have some aspects that are identical, and these aspects drive the suggestion. He also addresses the question of which of several competing associations will actually lead thought. He mentions Brown’s secondary laws as factors, but he takes the most important to be the “total mental state,” or the “general trend of psychical activity,” such that factors like intentions or background desires are usually decisive.

Finally, he argues that the process of ideational construction is active at all times and does not merely generate new ideas. It also modifies ideas as they are revived. Ideas take on new relations to other ideas. They may be seen in a different light, with different aspects emphasized based on differences in context, as well as in mental state and interests. Ideas are, in a real sense, remade as they are revived.

g. Themes and Lessons

The proliferation of interpretations of association through this era demonstrates the decline of the pure empiricist versions of the view. Nonetheless, the empiricist conception remains prominent. Authors who disavow that position still hold views substantially similar to it. Those working to refine the concept are still working from an empiricist starting point: Associations hold between conscious states, and contiguity and similarity remain the most common laws of association. Compared with the associationists described in the previous section, the diversity of views here is greater, but the difference is one of degree rather than of kind.

Nonetheless, these authors do not proclaim their adherence to associationism, and many expressly disavow it. Worries about the theory itself center on its atomism—treating simple ideas as discrete indivisible units that are reified in thought—and its passive, mechanical depiction of mind. More general trends include increasing knowledge in related fields such as evolutionary theory, neurophysiology, and experimental psychology. Evolutionary theory poses a challenge to associationist empiricism, as it allows a mechanism for innate ideas. Neurophysiology and experimental psychology both contributed to the fracturing of associationism, partly because progress on each came at the time from the continent, where there was less interest in a general associationist picture than in the United Kingdom. Nonetheless, each development supported a role for association. At least superficially, the network of neural connections looks a lot like the network of associated ideas. And associations make good experimental targets because they are easy to induce and test.

It does not seem that associationism must stand or fall with any of these challenges or developments singly, as there are views broadly consistent with each in the previous section. Rather, these problems persisted and accumulated at the same time as new ideas from other fields allowed researchers to step out of the old paradigm and cast about for new formulations of the old idea. The general picture, then, is of a concept losing its role as the single core concept of psychology and philosophy of mind, but nonetheless retaining several important roles. The development that finally brought this particular associationist tradition to an end, the rise of behaviorism, returned association to its central position.

3. Behaviorism (1910s-1950s)

Behaviorism arose in America as a reaction to the introspective methods that had dominated psychology to that point. Most of the authors listed above built their systems entirely from introspection. Even the experimentalists mostly recorded introspective reports, often using themselves as the only subject. The behaviorists did not see this as a reliable basis for a scientific psychology. Science, as they saw it, only succeeded when it studied public, observable phenomena that could be recorded, measured, and independently verified. Introspection is a private process, which is not independently verifiable or objectively measurable.

The result of adopting this viewpoint was a complete change in the conceptual basis of psychology, as well as in its methodology and theory. Behaviorists abandoned concepts like “ideas” and “feelings,” and the notion that the stream of consciousness was the primary phenomenon of psychology. Some even denied the phenomenon of consciousness itself. What they did not abandon, however, was the concept of association. In fact, association regained its role as the central concept of psychology, now reimagined as a relation between external stimuli and responses rather than internal conscious states. Even the law of association by contiguity was co-opted.

a. Precursors: Pavlov, Thorndike, and Morgan

Ivan Pavlov’s (1849-1936) famous work provided what would be a core phenomenon and some of the basic language of the behaviorists. Pavlov (1902) was interested in the physiology of the digestive system of dogs and the particular stimuli which elicit salivation. In the course of his studies, he observed that salivation would occur as the attendant who usually fed the animal approached. He noted a difference between “unconditional reflex,” as when salivation occurs due to a taste stimulus, and a “conditional reflex,” as when salivation occurs due to the approaching attendant (1902, pg. 84). Pavlov was able to show that a stimulus as arbitrary as a musical note or a bright color could cause salivation if paired frequently with food. He notes that the effect is only caused when the animal is hungry, and that it seems important that the unconditional reflex is tied to a basic life process. His account of the phenomenon is characteristically physiological:

It would appear as if the salivary centre, when thrown into action by the simple reflex, became a point of attraction for influences from all organs and regions of the body specifically excited by other qualities of the object. (pg. 86)

This phenomenon came to be known as “classical conditioning.” As Pavlov presciently remarks: “An immeasurably wide field for new investigation is opened up before us” (pg. 85). In subsequent work, Pavlov (1927) further explores these processes, including inhibitory processes such as extinction, conditioned inhibition, and delay.

Edward Thorndike (1874-1949) explicitly targeted the processes of association in animals (1898). He laments that existing work tells us that a cat will associate hearing the phrase “kitty kitty” with milk, but does not tell us the actual sequence of associated thoughts, or “what real mental content is present” (pp. 1-2). To test this objectively, he placed animals in a series of puzzle boxes with food visible outside. Most were cats, but he also experimented with dogs and chicks. Escape, and thus food, required unlocking the door using one or more actions such as pulling a string, pressing a lever, or depressing a paddle. If they did not escape within a certain time limit, they would be removed without food.

As Thorndike describes it, animals placed in the box first perform “instinctive” actions like clawing at the bars and attempting to squeeze through the gaps. Eventually, the animal will happen upon the actual mechanism and accidentally manipulate it. Once some action is successful, the animal will associate it with the stimulus of the inside of the box. This association gradually strengthens with repeated trials, as shown by learning curves in which animals escape more and more rapidly over successive trials. This form of learning came to be known as operant, or instrumental, conditioning. He argues that this must be explained with associations between an idea or sense impression and an impulse to a particular action, rather than the “association of ideas,” as ideas themselves are inert (pg. 71). He expresses the belief that animals have conscious ideas but remains officially agnostic, and he emphasizes that humans are not merely animals plus reason; human associations are different from animal associations as well. Thus, he arrives at the basic idea that he later restated under the name “the law of effect”:

Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond. (1911, pg. 244)

While the name “law of effect” has stuck, it is worth noting that in his dissertation (1898) and his textbook (1905 pp. 199-203), Thorndike simply calls it the “law of association.”

Lloyd Morgan (1852-1936) also discusses “the association of ideas” in nonhuman animals. However, his most significant contribution to the use of the concept is indirect, through a methodological principle that came to be known as his “Canon”:

In no case may we interpret an action as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale. (Morgan 1894, pg. 53)

The behaviorists took Morgan’s Canon to encourage positing minimal mental processes. More generally, associative processes are usually thought to be among the “lowest,” or “simplest,” processes available. This means that an associative explanation will be preferred until it can be ruled out, a practice that remains today (see sections 4 and 5).

b. John B. Watson (1878-1958)

Watson rang in the behaviorist era with his paper Psychology as the Behaviorist Views It (1913). In that work, he attacks the introspective method and claims about conscious feelings or thoughts. As he develops the view (1924/1930), he says that all of psychology can be reframed in terms of stimulus and response. The connection between them is a “reflex arc” of neural connections running from the sense organ to the muscles and glands necessary for a response. Watson thus identifies each stimulus with specific physical features, and each response with specific physiological changes or movements. This came to be known, following Tolman (1932), as the “molecular” definition of behavior, distinct from the “molar” definition, which characterizes behaviors more abstractly and purposively (intentionally), rather than as a pattern of specific excitations and movements.

Watson applies the same system to humans and to nonhuman animals. He takes infants to be born with only a small stock of simple reflexes, or “unconditioned” stimulus-response pairs—nothing that could properly be called instinct. These basic reflex patterns are modified by conditioning. In conditioning, the new conditioned stimulus either “replaces” the original unconditioned stimulus as a cause of the response, like the musical notes in Pavlov’s experiments, or a new response is conditioned to an existing stimulus, as when one becomes afraid of a dog that had previously been seen as friendly. As these conditioned changes compound, stimulus-response sets can be coordinated in ways that allow sophisticated behaviors in humans. He backs this up using experiments with infants, such as his ethically fraught Little Albert experiment: Watson conditioned a fear response to a white rat in 11-month-old Albert by making a loud noise every time the rat was presented (1924/1930, pp. 158-164).

Though Watson does not cast his own view in associative terms, his stimulus-response psychology effectively places association back at the center of psychology, and offhand references to association suggest he recognizes some connection. Even setting aside the specific points that S-R connections operate like associations, and classical conditioning like association by contiguity, Watson’s behaviorism shares with associationism an empiricist, anti-nativist orientation and an ideal of basing psychology on a single principle.

c. Edward S. Robinson (1893-1937)

Edward S. Robinson’s work Association Theory To-Day (1932) argues that associations themselves are the same in both behaviorism and the older associationist tradition. The difference is what answer one gives to the question, “What is associated?” Associationism had been rejected in large part because it was taken to be a relation between mentalistic ideas. Robinson takes this to be unfair, pointing to the diversity of views in earlier associationists. Robinson was far from the first to note the role of association in behaviorism (the earliest paper he cites as arguing along these lines is Hunter 1917; see also Guthrie 1930, discussed below), but he presents a systematic attempt to import previously existing associationist machinery to behaviorism.

An association is still an association, according to Robinson, whether it holds between ideas, stimuli and responses, or neural pathways. He adopts the generic term “psychological activities” to capture all of these, saying that association is a disposition of some activities to instigate particular others. He tentatively adopts a “molar” view of psychological activities over Watson’s molecular view because he does not think existing research has actually shown associations between particular physiological activities. Thus, he argues that the relevant activities must be described at a more abstract level. Robinson does rely on behavioral evidence but does not proclaim the behaviorist rejection of all mentalistic postulates. He takes it to be an open empirical question which activities will be associated in the most effective version of the theory.

Robinson goes on to discuss several laws of association, describing how each should be viewed and summarizing relevant experimental findings. Contiguity, the first, is apparent in conditioning. He attributes the second, assimilation, to Thorndike’s observation that a person will give the same response when presented with sufficiently similar situations (pp. 81-82). Robinson denies this is the same as association by similarity proper, but it is the same basic role Bain gives similarity. Others include frequency, duration, context, acquaintance, composition, and individual differences. He takes the actual associative strength to be a sum of all of these features, lamenting the overemphasis on contiguity itself.

d. B. F. Skinner (1904-1990)

Skinner, like Watson, does not frame his understanding of behaviorism in terms of association. Nonetheless, his work is noteworthy for placing reinforcement at the center of learning. The focus here is on his early career. Skinner studied operant conditioning using an apparatus in which a rat would press a lever to receive food. The food, in this case, reinforces the action of pressing the lever. In Skinner’s view, reinforcement is necessary for operant learning. While this basic idea was already present in Thorndike’s law of effect, it was not widely accepted before Skinner that effects could reinforce the behaviors that caused them. He went on to study reinforcement itself, especially the effects of various schedules of reinforcement (1938).

Skinner differentiated operant conditioning from Pavlovian, or classical, conditioning based on the sequences of stimulus and response (1935). Operant conditioning requires a four-step chain involving two reflexes: from a stimulus (sight of the lever) to an action (pressing the lever), which then causes another stimulus (food, the reinforcer) leading to a final action (eating/salivating). In Pavlovian-style experiments, a stimulus (for example, a light) switches from triggering an arbitrary reflex (such as orienting towards the light) to triggering a reflex relevant to the reinforcer (such as salivation if food is the reinforcer). Reinforcement is necessary for both; it simply plays a different role. Thinking in associative terms, the different types of conditioning are differentiable by the structure of the associations involved. But this again modifies the conception of the process of association: Simple contiguity is not enough; one of the stimuli involved must also play the role of a reinforcer.

Later, Skinner abandoned the stimulus-response framing of operant conditioning, arguing that the action (lever press) need not be viewed as a direct response to a stimulus (seeing the lever). To explain behavior in such a case, one must look back to the history of reinforcement, rather than any particular eliciting stimulus (1978). Skinner generally opposed private mentalistic posits, but his views on this were not always clear or consistent. He did, like Watson, treat behavior as the only legitimate target of study, retain a generally empiricist picture of mind, and take the view to apply generally. He was able to show that “shaping” techniques based on operant conditioning could train animals to complete sophisticated tasks, and he took this to apply to humans as well (1953), including with regard to language (1957) and even society (1976).

e. Edwin Guthrie (1886-1959)

Edwin Guthrie argues that the core phenomenon of conditioning is just association by contiguity, which he views as the single principle of learning. He states the principle as follows: “Stimuli acting at a given instant tend to acquire some effectiveness toward the eliciting of concurrent responses, and this effectiveness tends to last indefinitely” (1930, pg. 416). He goes on to argue that various empirical phenomena of learning, including even forgetting and insight, “may all be understood as instances of a very simple and very familiar principle, the ancient principle of association by contiguity in time” (1930, pg. 428). He later builds on this conception by arguing that stimuli to which animals pay attention will become associated. He takes this to be the actual mechanism by which reinforcers work, being dissatisfied with Skinner’s seemingly circular definition of the term “reinforcer.” He presents the new version in simplified form as follows: “What is being noticed becomes a signal for what is being done” (1959, pg. 186).

Guthrie takes the focus on behavior to be an abstraction intended to make psychology empirically tractable, in the same way that physics models frictionless planes. As such, his behaviorism could be seen as less extreme than Watson’s or Skinner’s, but perhaps more extreme than Robinson’s.

f. Themes and Lessons

Across behaviorist views, association remains the core concept. As in the previous section, though, some authors explicitly take on the associationist mantle while others ignore it. Also as above, there is a diversity in views on the actual structure of associations, how they develop, and what is taken to be associated. Skinner (1945) captured perhaps the largest division: that between the radical behaviorists and the methodological behaviorists. This division is easily cast in terms of their views on association. The radical behaviorists, exemplified by Watson and Skinner, aim to eliminate mentalistic concepts; association can allow this, via the minimal connection between stimulus and response. The methodological behaviorists, exemplified here by Guthrie and Robinson, take the emphasis on behavior to be a methodological abstraction or simplification necessary for scientific progress. By implication, association itself is an abstract relation, which in principle can subsume various possible mechanisms, rather than excluding them.

4. After the Cognitive Revolution (1950s-2000s)

As cognitivism came to dominate in the mid-twentieth century, association took up various roles in different literatures. The rise of cognitivism brought two key changes in psychology generally. First, internal mental states returned. However, these states were generally viewed as functionally defined representational states rather than as imagistic ideas, as they had been for the empiricist associationists. Second, cognitivism views the mind in broadly computational terms. Cognitivists take many psychological processes, called “cognitive processes,” to be algorithms that operate by applying formal rules to symbolic representational states, perhaps in a manner similar to language. Cognitive processes are often contrasted with associative processes, setting up a general view in which association is one kind of psychological process among many. Association is thought to be limited, in particular, because it is too simple to account for complex, rational thought (see Dacey 2019a). Learning by contiguity cannot differentiate which experienced sequences reflect real-world relations and which are mere accidents. Associative sequences in thought do not allow flexible application; they must be rigidly followed. Thus, associative processes are usually posited in simpler systems, like nonhuman animals, or in the human unconscious. However, as connectionist computational strategies began to bear fruit, some treated these as a new, revitalized form of general associationism.

This section discusses three research programs that each treat associations in different ways and collectively capture the main threads of late twentieth- and early twenty-first-century thought on association.

a. Semantic Networks

The first program represents semantic memory—memory for facts—as a network of linked concepts. Retrieval or recall of information in such a model is described by activation spreading through this network. When activation reaches some critical level, the information is retrieved and available for use or report. This program got its formal start in the late 1960s with work by Ross Quillian and Allen Collins (Collins and Quillian 1969), and subsequently John R. Anderson (1974) and Elizabeth Loftus (Collins and Loftus 1975). The general idea is that different patterns of association explain facts about information retrieval, such as when it succeeds or fails, and how long it takes. John Anderson generalized the basic idea as part of his Human Associative Memory (HAM) model (Anderson and Bower 1973) and his Adaptive Control of Thought (ACT) model and its descendants (Anderson 1996). In more specific circumstances, this basic strategy has been applied to a number of phenomena in which information is accessed automatically, including cued recall, priming (McNamara 2005), word association task responses, false memory (Gallo 2013), reading comprehension (Ericsson and Kintsch 1995), creativity (Runco 2014), and implicit social bias (Fazio 2007, Gawronski and Bodenhausen 2006; see also section 5).

Spreading activation in a network manifests one side of the standard associative story. The difference from previous traditions is that associations relate concepts or propositions, and these networks usually include a possibility of subcritical activation of a concept that can facilitate later retrieval. These models rarely say anything explicitly about learning, but they sometimes carry implications for learning. Often, links are not taken to represent any particular relation, signifying only the disposition to spread activation. This is taken to indicate that the links are learned through a process like association by contiguity, which cannot encode meaningful real-world information. However, sometimes links are labeled with a meaningful relationship between concepts, which would imply a learning process capable of tracking that relation. In addition, some models that emerged out of related research, such as Latent Semantic Analysis (LSA) (Landauer and Dumais 1997) and Bound Encoding of the Aggregate Language Environment (BEAGLE) (Jones and Mewhort 2007), extract semantic information (for example, semantic similarity) about words in a linguistic corpus based on clustering patterns with other words.
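To make the retrieval story concrete, the following minimal Python sketch spreads activation outward along weighted links from a probed concept and counts as retrieved any concept whose activation crosses a threshold; the toy network, weights, decay factor, and threshold are illustrative assumptions rather than values from any published model.

    # Toy semantic network: each concept maps to linked concepts, with
    # weights standing in for associative strength.
    network = {
        "canary": {"bird": 0.9, "yellow": 0.7},
        "bird": {"animal": 0.8, "wings": 0.9, "canary": 0.4},
        "yellow": {}, "animal": {}, "wings": {},
    }

    def spread(source, steps=2, decay=0.5):
        """Spread activation from a probed concept, weakening with
        every link crossed."""
        activation = {source: 1.0}
        frontier = {source: 1.0}
        for _ in range(steps):
            next_frontier = {}
            for node, act in frontier.items():
                for neighbor, weight in network[node].items():
                    boost = act * weight * decay
                    next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + boost
                    activation[neighbor] = activation.get(neighbor, 0.0) + boost
            frontier = next_frontier
        return activation

    # Concepts crossing the threshold count as retrieved; subcritical
    # activation (for example, "yellow" here) can still facilitate later
    # retrieval, as in priming.
    activations = spread("canary")
    retrieved = {c for c, a in activations.items() if a >= 0.4}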

b. Associative Learning and the Rescorla-Wagner Model

Work on learning proceeded largely separately from the work on semantic networks just described. After the cognitive revolution, conditioning effects remained a representative phenomenon of basic learning processes. They were, again, re-described. Since the associations were taken to be formed between internal mental representations, conditioning was subsumed under the heading of “contingency learning” or “associative learning”: the learning of relations between events that tend to co-occur. “Associative learning” is sometimes used in this literature to refer to this phenomenon, regardless of what mechanism is taken to produce it. In this literature, human and nonhuman animal research have long informed one another. However, the orientation can depend on the subjects. It has long been accepted that humans have complex cognitive processes running in parallel with any simple associative processes (Shanks 2007). The question in the human literature is often whether purely associative models can explain any human learning. Research on animal minds is still heavily influenced by Morgan’s Canon (section 3.a). As a result, associative explanations have been heavily favored. Thus, the question is often whether nonhuman animals have any processes that cannot be described in associative terms.

The Rescorla-Wagner model (1972) has dominated much of this research, either by itself or through its various modifications and descendants. This model includes a “prediction” that is made when the antecedent cue is presented. Associative strength is either increased or decreased based on whether that prediction is borne out. For instance, if an animal has a strong association between a cue and a target, the animal will expect the target once the cue is presented. If the target does not follow, the associative strength is reduced. This presents a different conception of association from those encountered so far, as a prediction-error process, contrasted with the worn-path notion of contiguity and with reinforcement (Rescorla 1988; see also Danks 2014, pg. 20, arguing that the prediction itself is not usually taken realistically). It also makes the Rescorla-Wagner model more successful at predicting various phenomena in contingency learning than previous conceptions of association. For instance, it predicts the fact that existing associations can block new associations from forming (Miller, Barnet, and Graham 1995). The computational precision and simplicity of associative models like the Rescorla-Wagner model are a major draw, and they have been further supported by neural evidence of prediction-error tracking in the brain (Schultz, Dayan, and Montague 1997).
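The model’s core update can be stated compactly: on every trial, each cue present changes its associative strength in proportion to a single shared prediction error. The minimal Python sketch below shows how this shared error yields blocking; the learning-rate values and trial counts are illustrative assumptions.

    # Rescorla-Wagner updating: all cues present share one prediction error.
    def rw_update(V, cues, lam, alpha=0.3, beta=1.0):
        """One trial. V maps cue -> associative strength; lam (lambda) is
        the maximum strength the outcome supports (0 if it is absent)."""
        prediction = sum(V[c] for c in cues)  # summed prediction of all cues
        error = lam - prediction              # prediction error
        for c in cues:
            V[c] += alpha * beta * error      # shared error drives each cue

    V = {"A": 0.0, "B": 0.0}
    for _ in range(50):             # phase 1: A alone is paired with the outcome
        rw_update(V, ["A"], lam=1.0)
    for _ in range(50):             # phase 2: A and B together with the outcome
        rw_update(V, ["A", "B"], lam=1.0)
    # V["A"] ends near 1.0 while V["B"] stays near 0.0: A already predicts
    # the outcome, leaving almost no error for B to absorb ("blocking").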

However, one can also complicate models like this in various ways. Some models allow interactions between existing associations during learning (Dickinson 2001). Others allow interactions between association and other processes, like attention or background knowledge (Pearce and Mackintosh 2010, Dickinson 2012, Thorwart and Livesey 2016). Finally, one can also model interference between associations at retrieval, as in the SOCR (Sometimes-Competing Retrieval) model (Stout and Miller 2007).

Even with these more complicated types of models, critics have argued that simple associative stories cannot capture the complexity of associative learning. For instance, some argue that the processes responsible for human associative learning must be propositional (Mitchell, De Houwer, and Lovibond 2009). Gallistel has been perhaps the most prominent opponent of associative theories of learning in animals generally, arguing that the processes responsible must be symbolic (Gallistel 1990, Gallistel and Gibbon 2002).

c. Connectionism

The arrival of connectionism as a major theory of mind in the 1980s was hailed as a revolution by many of its proponents (Rumelhart, McClelland, and PDP research group 1986). Connectionist models perform especially well in various kinds of categorization tasks. They are a kind of spreading activation model in which activation spreads through sequential layers of nodes. Though there were important precursors, especially Hebb (1949) and Rosenblatt (1962), connectionism came into its own when new techniques allowed much more computationally powerful three-layer networks. These networks include a “hidden” layer between “input” and “output” layers. The revolutionary claims of connectionism are usually based on the idea that the hidden layer represents information in a distributed manner, as a pattern of activation across multiple nodes. Thus, nodes are treated as “subrepresentational” units of information that also presumably correspond to something in the brain, such as neurons, sets or assemblies of neurons, or brain regions (Smolensky 1988). This is also thought to be a realistic view of representation in the brain, which is likely distributed. Unlike the other research programs discussed in this section, which take association to describe one kind of processing among many, connectionism, at least initially, purported to provide a general model of mind.

Connectionism has been treated as a version of associationism by both proponents (Bechtel and Abrahamsen 1991, Clark 1993) and opponents (Fodor and Pylyshyn 1988). This is because it implements a kind of spreading activation, as well as the fact that connectionist networks are able to learn—something symbolic systems struggle with. While the emphasis on learning aligns with a generally empiricist approach, the specific mechanism matters for what, exactly, to make of this. Perhaps the most common process, backpropagation, is not usually thought to be realistic. Another common process, Hebbian learning, implements a version of association by contiguity (Hebb 1949). This is treated as more biologically plausible, but models implementing it are less powerful.
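For contrast with backpropagation, here is a minimal Python sketch of a Hebbian update, in which weights between co-active units are strengthened; the network size, learning rate, input patterns, and the crude normalization added to curb Hebbian learning’s otherwise unbounded weight growth are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_out = 4, 2
    W = rng.normal(scale=0.01, size=(n_out, n_in))  # small random initial weights
    eta = 0.05                                      # learning rate

    patterns = [np.array([1.0, 1.0, 0.0, 0.0]),     # recurring input patterns
                np.array([0.0, 0.0, 1.0, 1.0])]

    for _ in range(20):
        for x in patterns:
            y = W @ x                    # postsynaptic activity
            W += eta * np.outer(y, x)    # Hebb: co-active pairs strengthen
            W /= np.linalg.norm(W)       # crude check on runaway weight growth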

These networks modify the treatment of association by providing another set of answers to the question of what is associated. In this case, it is subrepresentational units or parts of the brain. While neural level stories have attended association throughout its history (see above sections on Hartley, Freud, and Watson; see also Sutton 1998 for discussion of similarities between connectionism and these historical views), they are usually secondary to a psychological-level story. Connectionists, in contrast, actually attempt to model neural-level phenomena.

In many networks, the number of hidden-layer nodes is chosen somewhat arbitrarily, and the network is tuned in whatever way gets the input-output mappings right. The question of what each node might represent in the brain is secondary, complicating their interpretation as actual models of the mind/brain. Arguably, later work during this period split between two approaches. Many researchers simply explore the framework as a computational tool, up to and including deep learning. These researchers are not primarily concerned with accurate modeling of brain processes, though they may view their models as “how-possibly” models (see Buckner 2018 for such a discussion of deep learning models and abstraction). Computational neuroscientists, on the other hand, generally start with neural information like single unit recordings, and model specific neural circuits, networks, or regions.

5. Ongoing Philosophical Discussion (2000s-2020s)

This section briefly surveys two debates that brought the concept of association back under philosophical scrutiny. These debates take place largely in the frameworks outlined in the last section.

a. Dual-Process Theories and Implicit Bias

One of the most philosophically important implications of early twenty-first-century work in psychology, especially social psychology, was the finding that much of our behavior is driven, or heavily influenced, by unconscious processes. Theorists generally captured these findings with dual-process theories, which separate the mind into two systems or processing types. Type 1 processing is fast, effortless, uncontrolled, and unconscious, while Type 2 processing is slow, effortful, controlled, and conscious. Association is often considered to be among the Type 1 processes, but Type 1 processing is also sometimes treated as associative in general (Kahneman 2011, Uhlmann, Poehlman, and Nosek 2012). This stronger claim is controversial (Mandelbaum 2016), but it is often implicit in discussions of unconscious processing.

The conception of association involved largely stems from the semantic network program described above. These authors, however, tend to emphasize the simplicity of associative processing, and so take on board an associative account of learning as well. Thus, at stake is not just how one thinks about the mechanisms of unconscious processing, but how those mechanisms relate to one’s agency and responsibility. It is often thought that unconscious processes cannot produce responsible action because they are associative, and as such too inflexible (Levy 2014). How one understands and attributes associative models and associative processes is, as a result, significant for the conclusions one draws from this work (Dacey 2019b).

b. The Association/Cognition Distinction

The second discussion has occurred in relation to work in comparative animal psychology. In that literature, many debates center on whether the process responsible for a behavior is associative or cognitive, with association gaining a default status due to Morgan’s Canon. As a result, associative processes are usually thought to be ubiquitous and can even potentially explain seemingly complex behavior (see Heyes 2012). Some authors have attacked the associative/cognitive framing as unproductive (Buckner 2011, Smith, Couchman, and Beran 2014, Dacey 2016). It remains an empirical question whether psychological processes cluster in ways that support a distinction between associative and cognitive processes. In the meantime, there are reasons to reframe associative models as operating at either a lower, neural level (Buckner 2017) or a higher, more abstract level (Dacey 2016). Either move would, in principle, allow associative models and cognitive models to be applied to the same process, dissolving the problematic dichotomy.

6. Conclusion

Association is one of the most enduring concepts in the history of theorizing about the mind because it is one of the most flexible and one of the most powerful. The basic phenomena seem clear and indisputable: Some thoughts follow easily in sequence, and frequency of repetition is one reason for this. The models that formalize and articulate this insight seem capable of capturing many psychological phenomena. What this means is disputed and much less clear. There are questions pertaining to the specific mechanisms behind these phenomena, how many phenomena can be explained in these terms, what the associations are, and what is associated. The various views discussed above present very different answers to these questions.

7. References and Further Reading

  • Anderson, J. R. (1974). Retrieval of Propositional Information from Long-Term Memory. Cognitive Psychology, 6(4), 451-474.
  • Anderson, J. R. (1996). ACT: A Simple Theory of Complex Cognition. American Psychologist, 51(4), 355.
  • Anderson, J. R., and Bower, G. H. (1973). Human Associative Memory. Washington, D.C.: V. H. Winston and Sons.
  • Aristotle (2001). Aristotle’s On the Soul and On Memory and Recollection. J. Sachs (Trans.). Santa Fe: Green Lion Press.
  • Bain, A. (1868). The Senses and the Intellect. 3rd ed. London: Longmans, Green, and Co.
  • Bain, A. (1887). On ‘Association’-Controversies. Mind, 12(46), 161-182.
  • Bechtel, W., and Abrahamsen, A. (1991). Connectionism and the Mind: Parallel Processing, Dynamics, and Evolution in Networks. Oxford: Blackwell Publishing.
  • Brown, T. (1820). Lectures on the Philosophy of the Human Mind. Edinburgh: W. and C. Tait.
  • Buckner, C. (2011). Two Approaches to the Distinction between Cognition and ‘Mere Association’. International Journal of Comparative Psychology, 24(4).
  • Buckner, C. (2017). Understanding Associative and Cognitive Explanations in Comparative Psychology. The Routledge Handbook of Philosophy of Animal Minds. Oxford: Routledge, 409-419.
  • Buckner, C. (2018). Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks. Synthese, 195(12), 5339-5372.
  • Calkins, M. W. (1896). Association (II.). Psychological Review, 3(1), 32.
  • Calkins, M. W. (1901). An Introduction to Psychology. London: The Macmillan Company.
  • Clark, A. (1993). Associative Engines: Connectionism, Concepts, and Representational Change. Cambridge, MA: MIT Press.
  • Collins, A. M., and Loftus, E. F. (1975). A Spreading-Activation Theory of Semantic Processing. Psychological Review, 82(6), 407.
  • Collins, A. M., and Quillian, M. R. (1969). Retrieval Time From Semantic Memory. Journal of Verbal Learning and Verbal Behavior, 8(2), 240-247.
  • Dacey, M. (2015). Associationism without Associative Links: Thomas Brown and the Associationist Project. Studies in History and Philosophy of Science Part A, 54, 31–40.
  • Dacey, M. (2016). Rethinking Associations in Psychology. Synthese, 193(12), 3763-3786.
  • Dacey, M. (2019a). Simplicity and the Meaning of Mental Association. Erkenntnis, 84(6), 1207-1228.
  • Dacey, M. (2019b). Association and the Mechanisms of Priming. Journal of Cognitive Science, 20(3), 281-321.
  • Danks, D. (2014). Unifying the Mind: Cognitive Representations as Graphical Models. Cambridge, MA: MIT Press.
  • Dickinson, A. (2001). Causal Learning: An Associative Analysis. The Quarterly Journal of Experimental Psychology, 54B(1), 3-25.
  • Dickinson, A. (2012). Associative Learning and Animal Cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2733–2742.
  • Ebbinghaus, H. (1885). 1913. Memory: A Contribution to Experimental Psychology.
  • Ericsson, K. A., and Kintsch, W. (1995). Long-Term Working Memory. Psychological Review, 102(2), 211.
  • Fazio, R. (2007). Attitudes as Object-Evaluation Associations of Varying Strength. Social Cognition, 25(5), 603–637.
  • Fodor, J. A. (1998). Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
  • Fodor, J. A., and Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28(1-2), 3-71.
  • Freud, S. (1953-1964). The Standard Edition of the Complete Psychological Works of Sigmund Freud (J. Strachey and A. Freud Eds.), 24 vols. London: The Hogarth Press and the Institute of Psycho-Analysis.
  • Includes the Project for a Scientific Psychology in Volume 1.
  • Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: The MIT Press.
  • Gallistel, C. R., and Gibbon, J. (2002). The Symbolic Foundations of Conditioned Behavior. n. p.: Psychology Press.
  • Gallo, D. (2013). Associative Illusions of Memory: False Memory Research in DRM and Related Tasks. n. p.: Psychology Press.
  • Galton, F. (1879). Psychometric Experiments. Brain, 2(2), 149-162.
  • Gawronski, B., and Bodenhausen, G. V. (2006). Associative and Propositional Processes in Evaluation: An Integrative Review of Implicit and Explicit Attitude Change. Psychological Bulletin, 132(5), 692.
  • Guthrie, E. R. (1930). Conditioning as a Principle of Learning. Psychological Review, 37(5), 412.
  • Guthrie, E. (1959). Association by Contiguity. in Psychology: A Study of a Science. Vol. 2: General Systematic Formulations, Learning, and Special Processes. S. Koch (ed.). New York: McGraw Hill Book Company.
  • Hartley, D. (1749/1966). Observations on Man. Gainesville, FL: Scholars’ Facsimiles and Reprints.
  • Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
  • Heyes, C. (2012). Simple Minds: A Qualified Defence of Associative Learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2695-2703.
  • Hobbes, T. (1651/1991). Leviathan, R. Tuck (ed.). Cambridge: Cambridge University Press.
  • Hoeldtke, R. (1967). The History of Associationism and British Medical Psychology. Medical History, 11(1), 46-65.
  • A history of associationism focusing on psychiatric applications.
  • Hume, D. (1739/1978). A Treatise of Human Nature. L. A. Selby-Bigge, and P. H. Niddich (eds.), Oxford: Clarendon Press.
  • Hume, D. (1748/1974), Enquiries concerning Human Understanding and concerning the Principles of Morals. L. A. Selby-Bigge (ed.). Oxford: Clarendon Press.
  • Hunter, W. S. (1917). A Reformulation of the Law of Association. Psychological Review, 24(3), 188.
  • James, W. (1890/1950). The Principles of Psychology. New York: Dover Publications.
  • Jones, M. N., and Mewhort, D. J. (2007). Representing Word Meaning and Order Information in a Composite Holographic Lexicon. Psychological Review, 114(1), 1.
  • Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
  • Kitcher, P. (1992). Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: MIT Press.
  • Landauer, T. K., and Dumais, S. T. (1997). A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. Psychological Review, 104(2), 211.
  • Levy, N. 2014. Consciousness and Moral Responsibility. New York: Oxford University Press.
  • Locke, J. (1700/1974). An Essay concerning Human Understanding. Peter H. Nidditch (ed.). Oxford: Clarendon Press.
  • Mandelbaum, E. (2016). Attitude, Inference, Association: On the Propositional Structure of Implicit Bias. Noûs, 50(3), 629-658.
  • McNamara, T. P. (2005). Semantic Priming: Perspectives from Memory and Word Recognition. n. p.: Psychology Press.
  • Mill, J. (1869) An Analysis of the Phenomena of the Human Mind. (A. Bain and J. S. Mill Eds.). London: Longmans, Green and Dyer.
    • This edition includes comments from both Alexander Bain and John Stuart Mill.
  • Mill, J. S. (1963-91). The Collected Works of John Stuart Mill. J. M. Robson. (Gen. Ed.) 33 vols. Toronto: University of Toronto Press.
  • Miller, R. R., Barnet, R. C., and Grahame, N. J. (1995). Assessment of the Rescorla–Wagner Model. Psychological Bulletin, 117(3), 363–386.
  • Mitchell, C. J., De Houwer, J., and Lovibond, P. F. (2009). The Propositional Nature of Human Associative Learning. Behavioral and Brain Sciences, 32(2), 183-198.
  • Morgan, C. Lloyd. (1894). An Introduction to Comparative Psychology. London: Walter Scott.
  • Mortera, E. L. (2005). Reid, Stewart and the Association of Ideas. Journal of Scottish Philosophy, 3(2), 157-170.
  • Pavlov, I. P. (1897/1902). The Work of the Digestive Glands. W. H. Thompson (Trans.). London: Charles Griffin and Company.
  • Pavlov, I. P. (1927). Conditional Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex. G. V. Anrep (Trans.). London: Oxford.
  • Pearce, J. M., and Mackintosh, N. J. (2010). Two Theories of Attention: A Review and a Possible Integration. Attention and Associative learning: From Brain to Behaviour. Oxford: Oxford University Press.
  • Rapaport, D. (1974). The History of the Concept of Association of Ideas. New York: International Universities Press, Inc.
    • This history focuses on the prehistory of the idea of association, applying the term somewhat more broadly than the authors themselves do.
  • Reid, T. (1872). The Works of Thomas Reid, D. D. W. Hamilton (ed.). Edinburgh: MacLaghlan and Stewart.
    • Includes Essays on the Intellectual Powers of Man and William Hamilton’s history of association, discussed here.
  • Rescorla, R. A. (1988). Pavlovian Conditioning: It’s Not What You Think it Is. American Psychologist, 43(3), 151.
  • Rescorla, R. A., and Wagner, A. R. (1972). A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement. In A. H. Black and W. F. Prokasy (eds.), Classical Conditioning II (pp. 64–99). New York: Appleton-Century-Crofts.
  • Richardson, A. (2001) British Romanticism and the Science of the Mind. Cambridge: Cambridge University Press.
  • Robinson, E. S. (1932). Association Theory To-day: An Essay in Systematic Psychology. New York: The Century Co.
  • Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington: Spartan Books.
  • Rumelhart, D. E., McClelland, J. L., and PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations. Cambridge, MA: MIT Press.
  • Runco, M.A. (2014). Creativity: Theories and Themes: Research, Development, and Practice. Amsterdam: Academic Press.
  • Schultz, W., Dayan, P., and Montague, P. R. (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593-1599.
  • Shanks, D. R. (2007). Associationism and Cognition: Human Contingency Learning at 25. The Quarterly Journal of Experimental Psychology, 60(3), 291-309.
  • Skinner, B. F. (1935). Two Types of Conditioned Reflex and a Pseudo Type. Journal of General Psychology, Vol. 13, 1: 66-77.
  • Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts, Inc.
  • Skinner, B. F. (1945). The Operational Analysis of Psychological Terms. Psychological Review, 52, 270-277, 291-294.
  • Skinner, B. F. (1953). Science and Human Behavior. London: Collier Macmillan Publishers.
  • Skinner, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts, Inc.
  • Skinner, B. F. (1976). Walden two. Indianapolis: Hackett Publishing.
  • Skinner, B. F. (1978). The Experimental Analysis of Behavior (A History). In B. F. Skinner (ed.), Reflections on Behaviorism and Society (pp.113-126). Englewood Cliffs, NJ: Prentice-Hall.
  • Smith, J. D., Couchman, J. J., and Beran, M. J. (2014). Animal Metacognition: A Tale of Two Comparative Psychologies. Journal of Comparative Psychology, 128(2), 115.
  • Smolensky, P. (1988). On the Proper Treatment of Connectionism. Behavioral and Brain Sciences, 11(1), 1-23.
  • Spencer, H. (1898). Principles of Psychology Vol 1. New York: D. Appelton and Company.
    • The substantially revised 3rd edition was first published in 1880 and also serves as Volume 4 of his System of Synthetic Philosophy.
  • Stewart, D. (1855). Philosophical Essays. In W. Hamilton (ed.), The Collected Works of Dugald Stewart (Vol. V) Edinburgh: Thomas Constable and Co.
  • Stout, G. F. (1899) A Manual of Psychology. New York: University Correspondence College Press.
  • Stout, S. C., and Miller, R. R. (2007). Sometimes-Competing Retrieval (SOCR): A Formalization of the Comparator Hypothesis. Psychological Review, 114(3), 759.
  • Sulloway, F. J. (1979) Freud, Biologist of the Mind: Beyond the Psychoanalytic Legend. New York: Basic Books, Inc.
  • Sutton, J. (1998). Philosophy and Memory Traces: Descartes to Connectionism. Cambridge: Cambridge University Press.
  • Tabb, K. (2019). Locke on Enthusiasm and the Association of Ideas. Oxford Studies in Early Modern Philosophy Vol 9. DOI: 10.1093/oso/9780198852452.003.0003
  • Thorndike, E. L. (1898). Animal Intelligence: An Experimental Study of the Associative Processes in Animals. Psychological Monographs: General and Applied, 2(4), i-109.
  • Thorndike, E. L. (1905). The Elements of Psychology. New York: A. G. Seiler.
  • Thorndike, E. L. (1911). Animal Intelligence: Experimental Studies. New York: The MacMillan Company
  • Thorwart, A., and Livesey, E. J. (2016). Three Ways that Non-Associative Knowledge May Affect Associative Learning Processes. Frontiers in Psychology, 7, 2024.
  • Tolman, E. C. (1932/1967). Purposive Behavior in Animals and Men. New York: Irvington Publishers, Inc.
  • Uhlmann, E. L., Poehlman, T. A., and Nosek, B. (2012). Automatic Associations: Personal Attitudes or Cultural Knowledge? In Jon D. Hanson (ed.), Ideology, Psychology, and Law. New York: Oxford University Press, 228-260.
  • Warren, H. C. (1916). Mental Association from Plato to Hume. Psychological Review, 23(3), 208.
  • Warren, H. C. (1928) A History of the Association Psychology. New York: Charles Scribner’s Sons.
    • The most complete history of associationism in existence, covering the period up to its publication. Includes more detail on views of most authors covered here, and many others.
  • Watson, J. B. (1913). Psychology as the Behaviorist Views it. Psychological Review, 20(2), 158.
  • Watson, J. B. (1924/1930). Behaviorism. Chicago: The University of Chicago Press.
  • Wundt, W. (1901/1902). Outlines of Psychology 4th ed. C. H. Judd (Trans.). Leipzig: Wilhelm Engelmann
  • Wundt, W. (1911/1912). An Introduction to Psychology. R. Pintner (Trans.). London: George Allen and Company.
  • Young, R. M. (1970). Mind, Brain and Adaptation in the Nineteenth Century: Cerebral Localization and Its Biological Context from Gall and Ferrier. Oxford: Clarendon Press.

 

Author Information

Mike Dacey
Email: mdacey@bates.edu
Bates College
U. S. A.

The Philosophy of Climate Science

Climate change is one of the defining challenges of the 21st century. But what is climate change, how do we know about it, and how should we react to it? This article summarizes the main conceptual issues and questions in the foundations of climate science, as well as in those parts of decision theory and economics that have been brought to bear on issues of climate in the wake of public discussions about an appropriate reaction to climate change.

We begin with a discussion of how to define climate. Even though “climate” and “climate change” have become ubiquitous terms, both in the popular media and in academic discourse, the correct definitions of both notions are hotly debated topics. We review different approaches and discuss their pros and cons. Climate models play an important role in many parts of climate science. We introduce different kinds of climate models and discuss their uses in detection and attribution, roughly the tasks of establishing that the climate of the Earth has changed and of identifying specific factors that cause these changes. The use of models in the study of climate change raises the question of how well-confirmed these models are and of what their predictive capabilities are. All this is subject to considerable debate, and we discuss a number of different positions. A recurring theme in discussions about climate models is uncertainty. But what is uncertainty and what kinds of uncertainties are there? We discuss different attempts to classify uncertainty and to pinpoint their sources. After these science-oriented topics, we turn to decision theory. Climate change raises difficult questions such as: What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? What form should adaptation take? We discuss the framing of climate decision problems and then offer an examination of alternative decision rules in the context of climate decisions.

Table of Contents

  1. Introduction
  2. Defining Climate and Climate Change
  3. Climate Models
  4. Detection and Attribution of Climate Change
  5. Confirmation and Predictive Power
  6. Understanding and Quantifying Uncertainty
  7. Conceptualising Decisions Under Uncertainty
  8. Managing Uncertainty
  9. Conclusion
  10. Glossary
  11. References and Further Reading

1. Introduction

Climate science is an umbrella term referring to scientific disciplines studying aspects of the Earth’s climate. It includes, among others, parts of atmospheric science, oceanography, and glaciology. In the wake of public discussions about an appropriate reaction to climate change, parts of decision theory and economics have also been brought to bear on issues of climate. Contributions from these disciplines that can be considered part of the application of climate science fall under the scope of this article. At the heart of the philosophy of climate science lies a reflection on the methodology used to reach various conclusions about how the climate may evolve and what we should do about it. The philosophy of climate science is a new sub-discipline of the philosophy of science that began to crystallize at the turn of the 21st century when philosophers of science started having a closer look at methods used in climate modelling. It comprises a reflection on almost all aspects of climate science, including observation and data, methods of detection and attribution, model ensembles, and decision-making under uncertainty. Since the devil is always in the detail, the philosophy of climate science operates in close contact with science itself and pays careful attention to the scientific details. For this reason, there is no clear separation between climate science and the philosophy thereof, and conferences in the field are often attended by both scientists and philosophers.

This article summarizes the main problems and questions in the foundations of climate science. Section 2 presents the problem of defining climate. Section 3 introduces climate models. Section 4 discusses the problem of detecting and attributing climate change. Section 5 examines the confirmation of climate models and the limits of predictability. Section 6 reviews classifications of uncertainty and the use of model ensembles. Section 7 turns to decision theory and discusses the framing of climate decision problems. Section 8 introduces alternative decision rules. Section 9 offers a few conclusions.

Two qualifications are in order. First, we review issues and questions that arise in connection with climate science from a philosophy of science perspective, and with special focus on epistemological and decision-theoretic problems. Needless to say, this is not the only perspective. Much can be said about climate science from other points of view, most notably science studies, sociology of science, political theory, and ethics. For want of space, we cannot review contributions from these fields.

Second, to guard against possible misunderstandings, it ought to be pointed out that engaging in a critical philosophical reflection on the aims and methods of climate science is in no way tantamount to adopting a position known as climate scepticism. Climate sceptics are a heterogeneous group of people who do not accept the results of ‘mainstream’ climate science, encompassing a broad spectrum from those who flat out deny the basic physics of the greenhouse effect (and the influence of human activities on the world’s climate) to a small minority who actively engage in scientific research and debate and reach conclusions at the lowest end of climate impacts. Critical philosophy of science is not the handmaiden of climate scepticism; nor are philosophers ipso facto climate sceptics. So, it should be stressed here that we do not endorse climate scepticism. We aim to understand how climate science works, reflect on its methods, and understand the questions that it raises.

2. Defining Climate and Climate Change

Climate talk is ubiquitous in the popular media as well as in academic discourse, and climate change has become a familiar topic. This veils the fact that climate is a complex concept and that the correct definitions of climate and climate change are a matter of controversy. To gain an understanding of the notion of climate, it is important to distinguish it from weather. Intuitively speaking, the weather at a particular place and a particular time is the state of the atmosphere at that place and at the given time. For instance, the weather in central London at 2 pm on 1 January 2015 can be characterised by saying that the temperature is 12 degrees centigrade, the humidity is 65%, and so forth. By contrast, climate is an aggregate of weather conditions: it is a distribution of particular variables (called the climate variables) arising for a particular configuration of the climate system.

The question is how to make this basic idea precise, and this is where different approaches diverge. 21st-century approaches to defining climate can be divided into two groups: those that define climate as a distribution over time, and those that define climate as an ensemble distribution. The climate variables in both approaches include those that describe the state of the atmosphere and the ocean, and sometimes also variables describing the state of glaciers and ice sheets [IPCC 2013].

Distribution over time. The state of the Earth depends on external conditions of the system such as the amount of energy received from the sun and volcanic activity. Assume that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. One can then define the climate over this time period as the distribution of the climate variables over that period under constant external conditions c [for example, Lorenz 1995]. Climate change then amounts to successive time periods being characterised by different distributions. However, in reality the external conditions are not constant and even when there are just slight fluctuations around c, the resulting distributions may be very different. Hence this definition is unsatisfactory [Werndl 2015].

This problem can be avoided by defining climate as the empirically observed distribution over a specific period of time, where external conditions are allowed to vary. Again, climate change amounts to different distributions for successive time periods. This definition is popular because it is easy to estimate from the observations, for example, from the statistics taken over thirty years that are published by the World Meteorological Organisation [Hulme et al. 2009]. A major problem of this definition can be illustrated by the example in which, in the middle of a period of time, the Earth is hit by a meteorite and becomes a much colder place. Clearly, the climates before and after the meteorite strike differ. Yet this definition has no resources to recognize this because all it says is that climate is a distribution arising over a specific time period.

To circumvent this problem, Werndl [2015] introduces the idea of regimes of varying external conditions and suggests defining climate as the distribution over time of the climate variables arising under a specific regime of varying external conditions. The challenge for this account is to spell out what exactly is meant by a regime of varying external conditions.

Ensemble Distribution. An ensemble of climate systems (not to be confused with a model ensemble) is an infinite collection of virtual copies of the climate system. Consider the sub-ensemble of these that satisfy the condition that the present values of the climate variables lie in a specific interval around the values measured in the actual climate system (that is, the values compatible with the measurement accuracy). Now assume again that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. Then climate at future time t is defined as the distribution of values of the climate variables that arises when all systems in the ensemble evolve from now to t under constant external conditions c [for example, Lorenz 1995]. In other words, the climate in the future is the distribution of the climate variables over all possible climates that are consistent with current observations under the assumption of constant external conditions c.

As we have seen previously, in reality, external conditions are not constant and even small fluctuations around a mean value can lead to different distributions [Werndl 2015]. This worry can be addressed by tracing the development of the initial condition ensemble under actual external conditions. The climate at future time t then is the distribution of the climate variables that arises when the initial conditions ensemble is evolved forward for the actual path taken by the external conditions [for example, Daron and Stainforth 2013].

This definition faces a number of conceptual challenges. First, it makes the world’s climate dependent on our knowledge (via measurement accuracy), but this is counterintuitive because we think of climate as something objective that is independent of our knowledge. Second, the above definition is a definition of future climate, and it is difficult to see how the present and past climate should be defined. Yet without a notion of the present and past climate one cannot define climate change. A third problem is that ensemble distributions (and thus climate) do not relate in a straightforward way to the past time series of observations of the actual Earth and this would imply that the climate cannot be estimated from them [compare, Werndl 2015].

These considerations show that defining climate is nontrivial and there is no generally accepted or uncontroversial definition of climate.

3. Climate Models

A climate model is a representation of particular aspects of the climate system. One of the simplest climate models is an energy-balance model, which treats the Earth as a flat surface with one layer of atmosphere above it. It is based on the simple principle that in equilibrium the incoming and outgoing radiation must be equal (see Dessler [2011], Chapters 3-6, for a discussion of such models). This model can be refined by dividing the Earth into zones, allowing energy transfer between zones, or describing a vertical profile of the atmospheric characteristics. Despite their simplicity, these models provide a good qualitative understanding of the greenhouse effect.
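To make the energy-balance idea concrete, here is a minimal sketch in Python of a zero-dimensional energy-balance calculation. The numerical values (solar constant, albedo, effective emissivity) are standard textbook approximations rather than outputs of any particular model, and treating the greenhouse effect as a single effective emissivity is a deliberate simplification.

```python
# A minimal sketch of a zero-dimensional energy-balance model (all
# values are textbook approximations, not outputs of a real model).
# In equilibrium, absorbed solar radiation equals outgoing longwave
# radiation: (S0 / 4) * (1 - albedo) = emissivity * sigma * T^4.

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
S0 = 1361.0       # solar constant (W m^-2), approximate
ALBEDO = 0.3      # planetary albedo, approximate

def equilibrium_temperature(emissivity: float) -> float:
    """Temperature (K) at which outgoing radiation balances absorbed sunlight."""
    absorbed = (S0 / 4.0) * (1.0 - ALBEDO)
    return (absorbed / (emissivity * SIGMA)) ** 0.25

# Emissivity 1 (no greenhouse effect) gives roughly 255 K; an effective
# emissivity below 1 crudely mimics a one-layer greenhouse atmosphere.
print(equilibrium_temperature(1.0))    # about 255 K
print(equilibrium_temperature(0.61))   # about 288 K, near the observed mean
```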

Modern climate science aims to construct models that integrate as much as possible of the known science (for an introduction to climate modelling see [McGuffie and Henderson-Sellers 2005]). Typically, this is done by dividing the Earth (both the atmosphere and ocean) into grid cells. In 2020, global climate models have a horizontal grid scale of around 150 km. Climatic processes can then be conceptualised as flows of physical quantities such as heat or vapour from one cell to another. These flows are mathematically described by equations. These equations form the ‘dynamical core’ of a global circulation model (GCM). The equations typically are intractable with analytical methods, and powerful supercomputers are used to solve them. For this reason, such models are often referred to as ‘simulation models’. To solve the equations numerically, time is discretised. Current state-of-the-art simulations use time steps of approximately 30 minutes, taking weeks or months in real time on supercomputers to simulate a century of climate evolution.

In order to compute a single hypothetical evolution of the climate system (a ‘model run’), we also require an initial condition and boundary conditions. The former is a mathematical description of the state of the climate system (projected into the model’s own domain) at the beginning of the period being simulated. The latter are values for any variables which affect the system, but which are not directly output by the calculations. These include, for instance, the concentration of greenhouse gases, the amount of aerosols in the atmosphere at a given time, and the amount of solar radiation received by the Earth. Since these are drivers of climatic change, they are often referred to as external forcings or external conditions.

Where processes occur on a smaller scale than the grid, they may be included via parameterisation, whereby the net effect of the process is separately calculated as a function of the grid variables. For instance, cloud formation is a physical process that cannot be directly simulated because typical clouds are much smaller than the grid. So, the net effect of clouds is usually parameterised (as a function of temperature, humidity, and so forth) in each grid cell and fed back into the calculation. Sub-grid processes are one of the main sources of uncertainty in climate models.
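The following toy sketch illustrates the structure just described: grid cells, discretised time steps, flows between neighbouring cells, and a parameterised sub-grid term. Everything in it (grid size, coefficients, the form of the sub-grid function) is invented for illustration and bears no relation to a real GCM’s dynamical core.

```python
import numpy as np

# Toy illustration of a discretised 'dynamical core': a quantity
# (here temperature) flows between neighbouring grid cells, and an
# unresolved sub-grid process is parameterised as a function of the
# resolved grid variables. All numbers and functional forms are
# invented for illustration.

N_CELLS = 12        # a real GCM has millions of three-dimensional cells
DT = 0.1            # time step (arbitrary units)
EXCHANGE = 0.5      # exchange coefficient between neighbouring cells

def subgrid_forcing(temp: np.ndarray) -> np.ndarray:
    """Hypothetical parameterisation: the net effect of an unresolved
    process written as a function of the grid-scale state."""
    return -0.01 * (temp - temp.mean())

def step(temp: np.ndarray) -> np.ndarray:
    """Advance the toy model by one time step (periodic domain)."""
    flux = EXCHANGE * (np.roll(temp, 1) - 2 * temp + np.roll(temp, -1))
    return temp + DT * (flux + subgrid_forcing(temp))

state = np.linspace(250.0, 300.0, N_CELLS)  # initial condition
for _ in range(1000):                       # a single 'model run'
    state = step(state)
print(state.round(1))                       # initial gradient smoothed out
```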

There are now dozens of global climate models under continuous development by national modelling centres like NASA, the UK Met Office, and the Beijing Climate Center, as well as by smaller institutions. An exact count is difficult because many modelling centres maintain multiple versions based on the same foundation. As an indication, in 2020 there were 89 model versions submitted to CMIP6 (Coupled Model Intercomparison Project phase 6), from 35 modelling groups, though not all of these should be thought of as being “independent” models since assumptions and algorithms are often shared between institutions. In order to be able to compare outputs of these different models, the Coupled Model Intercomparison Project (CMIP) defines a suite of standard experiments to be run for each climate model. One standard experiment is to run each model using the historical forcings experienced during the twentieth century so that the output can be directly compared against real climate system data.

Climate models are used in many places in climate science, and their use gives rise to important questions. These questions are discussed in the next three sections.

4. Detection and Attribution of Climate Change

Every empirical study of climate has to begin by observing the climate. Meteorological observatories measure a number of variables such as air temperature near the surface of the Earth using thermometers. However, more or less systematic observations are only available since about 1750, and hence to reconstruct the climate before then scientists have to rely on proxy data: data for climate variables that are derived from observing other natural phenomena such as tree rings, ice cores, and ocean sediments.

The use of proxy data raises a number of methodological problems centred around the statistical processing of such data, which are often sparse, highly uncertain, and several inferential steps away from the climate variable of interest. These issues were at the heart of what has become known as the Hockey Stick controversy, which broke at the turn of the century in connection with a proxy-based reconstruction of the Northern Hemisphere temperature record [Mann, Bradley and Hughes, 1998]. The sceptics pursued two lines of argument. They cast doubt on the reliability of the available data, and they argued that the methods used to process the data are such that they would produce a hockey-stick-shaped curve from almost any data [for example, McIntyre and McKitrick 2003]. The papers published by the sceptics raised important issues and stimulated further research, but they were found to contain serious flaws undermining their conclusions. There are now more than two dozen reconstructions of this temperature record using various statistical methods and proxy data sources. Although there is indeed a wide range of plausible past temperatures, due to the constraints of the data and methods, these studies do robustly support the consensus that, over the past 1400 years, temperatures during the late 20th century are likely to have been the warmest [Frank et al. 2010].

Do rising temperatures indicate that there is climate change, and if so, can the change be attributed to human action? These two problems are known as the problems of detection and attribution. The Intergovernmental Panel on Climate Change (IPCC) defines these as follows:

Detection of change is defined as the process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change. An identified change is detected in observations if its likelihood of occurrence by chance due to internal variability alone is determined to be small […]. Attribution is defined as ‘the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.’ [IPCC 2013]

These definitions raise a host of issues. The root cause of the difficulties is the clause that climate change has been detected only if an observed change in the climate is unlikely to be due to internal variability. Internal variability is the phenomenon that the values of climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of a change in external conditions, because of fluctuations in the frequency of storms, ocean currents, and so on.

Taken at face value, this definition of detection has the consequence that there cannot be internal climate change. The ice ages, for instance, would not count as climate change if they occurred because of internal variability. This is not only at odds with basic intuitions about climate and with the most common definitions of climate as a finite distribution over a relatively short time period (where internal climate change is possible); it also leads to difficulties with attribution: if detected climate change is ipso facto change not due to internal variability, then from the very beginning it is excluded that particular factors (namely, internal climate dynamics) can lead to a change in the climate, which seems to be an unfortunate conclusion.

For the case of the ice ages, many researchers would stress that internal variability is different from natural variability. Since, say, orbital changes explain the ice ages, and orbital changes are natural but external, this is a case of external climate change. While this move solves some of the problems, there remains the problem that there is no generally accepted way to separate internal and external factors, and the same factor is sometimes classified as internal and sometimes as external. For instance, glaciation processes are sometimes treated as internal factors and sometimes as prescribed external factors. Likewise, sometimes the biosphere is treated as an external factor, but sometimes it is also internally modelled and treated as an internal factor. One could even go so far as to ask whether human activity is an external forcing on the climate system or an internally-generated Earth system process. Research studies usually treat human activity as an external forcing, but it could consistently be argued that human activities are an internal dynamical process. The appropriate definition simply depends on the research question of interest. For a discussion of these issues, see Katzav and Parker [2018].

The effects of internal variability are present on all timescales, from the sub-daily fluctuations experienced as weather to the long-term changes due to cycles of glaciation. Since internal variability stems from processes in a highly complex nonlinear system, it is also unlikely that the statistical properties of internal variability are constant over time, which further compounds methodological difficulties. State-of-the-art climate models run with constant forcing show significant disagreements both on the magnitude of internal variability and on the timescale of variations. (On http://www.climate-lab-book.ac.uk/2013/variable-variability/#more-1321 the reader finds a plot showing the internal variability of all CMIP5 models. The plot indicates that models exhibit significantly different internal variability, leaving considerable uncertainty.) A model must be deemed to simulate pre-industrial climate (including variability) sufficiently well before it can be used for such detection and attribution studies, but we do not have thousands of years of detailed observations upon which to base that judgement. Estimates of internal variability in the climate system are produced from climate models themselves [Hegerl et al. 2010], leading to potential circularity. This underscores the difficulties in making attribution statements based on the above definition, which recognises an observed change as climate change only if it is unlikely to be due to internal variability.

Since the IPCC’s definitions are widely used by climate scientists, the discussion about detection and attribution in the remainder of this section is based on these definitions. Detection relies on statistical tests, and detection studies are often phrased in terms of the likelihood of a particular event or sequence of events happening in the absence of climate change. In practice, the challenge is to define an appropriate null hypothesis (the expected behaviour of the system in the absence of changing external influences), against which the observed outcomes can be tested. Because the climate system is a dynamical system with processes and feedbacks operating on all scales, this is a non-trivial exercise. An indication of the importance of the null hypothesis is given by the results of Cohn and Lins [2005], who compare the same data against alternative null hypotheses, with significance levels differing by 25 orders of magnitude. This does not in itself show which null is more appropriate, but it demonstrates the sensitivity of the result to the null hypothesis chosen, and hence the difficulty of making any such choice when the underlying processes are poorly understood.
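The sensitivity to the choice of null can be illustrated with a small Monte Carlo experiment. This is our own illustrative construction, not a reproduction of Cohn and Lins’s analysis: the same hypothetical trend is tested once against a white-noise null and once against a strongly autocorrelated ‘red-noise’ null.

```python
import numpy as np

# Illustrative Monte Carlo (our construction, not Cohn and Lins's own
# analysis): how significant a fixed trend looks depends heavily on
# the null hypothesis assumed for internal variability.

rng = np.random.default_rng(0)
N_YEARS, N_SIMS = 100, 2000
OBSERVED_TREND = 0.01   # hypothetical observed trend (units per year)

def fitted_trend(series: np.ndarray) -> float:
    """Least-squares slope of a time series."""
    return np.polyfit(np.arange(series.size), series, 1)[0]

def p_value(simulate_null) -> float:
    """Fraction of null runs whose trend is at least as large as observed."""
    trends = [abs(fitted_trend(simulate_null())) for _ in range(N_SIMS)]
    return np.mean(np.array(trends) >= OBSERVED_TREND)

def white_noise() -> np.ndarray:
    return rng.normal(0.0, 0.5, N_YEARS)

def red_noise(phi: float = 0.9) -> np.ndarray:
    """AR(1) noise: strongly autocorrelated internal variability."""
    x = np.zeros(N_YEARS)
    for i in range(1, N_YEARS):
        x[i] = phi * x[i - 1] + rng.normal(0.0, 0.5)
    return x

print("p-value under white-noise null:", p_value(white_noise))
print("p-value under red-noise null:  ", p_value(red_noise))
# The same trend that looks highly significant against white noise can
# be entirely unremarkable against strongly autocorrelated variability.
```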

In practice, the best available null hypothesis is often the best available model of the behaviour of the climate system, including internal variability, which for most climate variables usually means a state of the art GCM. This model is then used to perform long control runs with constant forcings in order to quantify the internal variability of the model (see discussion above). Climate change is then said to have been detected when the observed values fall outside a predefined range of the internal variability of the model. The difficulty with this method is that there is no single “best” model to choose: many such models exist, they are similarly well developed, but, as noted above, they have appreciably different patterns of internal variability.

The differences between different models are relatively unimportant for the clearest detection results such as recent increases in global mean temperature. Here, as stressed by Parker [2010], detection is robust across different models (for a discussion of robustness see Section 6), and, moreover, there is a variety of different pieces of evidence all pointing to the conclusion that the global mean temperature has increased beyond that which can be attributed to internal variability. However, the issues of which null hypothesis to use and how to quantify internal variability can be important for the detection of subtler local climate change.

If climate change has been detected, then the question of attribution arises. This might be an attribution of any particular change (either a direct climatic change such as increased global mean temperature, or an impact such as the area burnt by forest fires) to any identified cause (such as increased CO2 in the atmosphere, volcanic eruptions, or human population density). Where an impact is considered, a two-step or multi-step approach may be appropriate. An example of this, taken from the IPCC Good Practice Guidance paper [Hegerl et al. 2010], is the attribution of coral reef calcification impacts to rising CO2 levels, in which an intermediate stage is used by first attributing changes in the carbonate ion concentration to rising CO2 levels, then attributing calcification processes to changes in the carbonate ion concentration. This also illustrates the need for a clear understanding of the physical mechanisms involved, in order to perform a reliable multi-step attribution in the presence of many potential confounding factors.

In the interpretation of attribution results, in particular those framed as a question of whether human activity has influenced a particular climatic change or event, there is a tendency to focus on whether the confidence interval of the estimated anthropogenic effect crosses zero. The absence of such a crossing indicates that change is likely to be due to human factors. This results in conservative attribution statements, but it reflects the focus of the present debate where, in the eyes of the public and media, “attribution” is often understood as confidence in ruling out non-human factors, rather than as giving a best estimate or relative contributions of different factors.

Statistical analysis quantifies the strength of the relationship, given the simplifying assumptions of the attribution framework, but the level of confidence in the simplifying assumptions must be assessed outside that framework. This level of confidence is standardised by the IPCC into discrete (though subjective) categories (“very high”, “high”, and so forth), which aim to take account of the process knowledge, data limitations, adequacy of models used, and the presence of potential confounding factors. The conclusion that is reached will then have a form similar to the IPCC’s headline attribution statement:

It is extremely likely [≥95% probability] that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together. [IPCC 2013; Summary for Policymakers, section D.3].

One attribution method is optimal fingerprinting. The method seeks to define a spatio-temporal pattern of change (fingerprint) associated with each potential driver (such as the effect of greenhouse gases or of changes in solar radiation), normalised relative to the internal variability, and then perform a statistical regression of observed data with respect to linear combinations of these patterns. The residual variability after observations have been attributed to each factor should then be consistent with the internal variability; if not, this suggests that an important source of variability remains unaccounted for. Parker [2010] notes that fingerprint studies rely on several assumptions. Chief among them is linearity, that is, that the response of the climate system when several forcing factors are present is equal to a linear combination of the effects of the forcings. Because the climate system is nonlinear, this is clearly a source of methodological difficulty, although for global-scale responses (in contrast to regional-scale responses) additivity has been shown to be a good approximation.
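As a schematic illustration of the regression step in fingerprinting, the sketch below generates two invented fingerprint patterns, builds synthetic ‘observations’ as a scaled sum of them plus noise, and recovers the scaling factors by least squares. Real studies use spatio-temporal patterns from model experiments and a noise covariance estimated from control runs; everything here is fabricated for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # flattened space-time points (illustrative)

# Invented 'fingerprints': the assumed response patterns to greenhouse
# forcing and to solar forcing (in practice taken from model runs).
x_ghg = np.linspace(0.0, 1.0, n)
x_solar = np.sin(np.linspace(0.0, 6.0, n))
X = np.column_stack([x_ghg, x_solar])

# Synthetic 'observations': a scaled combination of the fingerprints
# plus internal variability (white noise here for simplicity).
beta_true = np.array([0.8, 0.2])
y = X @ beta_true + rng.normal(0.0, 0.1, n)

# Regress observations onto the fingerprints. Detection asks whether a
# scaling factor is distinguishable from zero; attribution compares it
# with the physically expected value.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = y - X @ beta_hat
print("estimated scaling factors:", beta_hat.round(2))
print("residual std (should be consistent with internal variability):",
      residual.std().round(3))
```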

Levels of confidence in these attribution statements are primarily dependent on physical understanding of the processes involved.  Where there is a clear, simple, well-understood mechanism, there should be greater confidence in the statistical result; where the mechanisms are loose, multi-factored or multi-step, or where a complex model is used as an intermediary, confidence is correspondingly lower.  The Guidance Paper cautions that,

Where models are used in attribution, a model’s ability to properly represent the relevant causal link should be assessed. This should include an assessment of model biases and the model’s ability to capture the relevant processes and scales of interest. [Hegerl et al. 2010, 5]

As Parker [2010] argues, there is also higher confidence in attribution results when the results are robust and there is a variety of evidence. For instance, the finding that late twentieth-century temperature increase was mainly caused by greenhouse gas forcing is found to be robust given a wide range of different models, different analysis techniques, and different forcings; and there is a variety of evidence all of which supports this claim. Thus our confidence that greenhouse gases explain global warming is high. (For further useful extended discussion of detection and attribution methods in climate science, see pages 872-878 of IPCC [2013] and the Good Practice Guidance paper by Hegerl et al. [2010]; for a discussion of how such hypotheses are tested see Katzav [2013].)

In addition to the large-scale attribution of climate change, attribution of the degree to which individual weather events have become either more likely or more extreme as a result of increasing atmospheric greenhouse gas concentrations is now common. It is of particular public interest as it is perceived both as a way to communicate that climate impacts are happening already, perhaps quantifying risk numerically to price insurance, and as a motivation for climate mitigation. There is therefore also an incentive to conduct these studies quickly, to inform timely news articles, and some groups have formed to respond quickly to reports of extreme weather and conduct attribution studies immediately. This relies on the availability of data, may suffer from unclear definitions of exactly what category of event is being analysed, and is open to criticism for publicity prior to peer review. There are also statistical implications of choosing to analyse only those events which have happened and not those that did not happen. For a discussion of event attribution see Lloyd and Oreskes [2019] and Lusk [2017].

5. Confirmation and Predictive Power

Two questions arise in connection with models: how are models confirmed and what is their predictive power? Confirmation concerns the question of whether, and to what degree, a specific model is supported by the data. Lloyd [2009] argues that many climate models are confirmed by past data. Parker [2009] objects to this claim. She argues that the idea that climate models per se are confirmed cannot be seriously entertained because all climate models are known to be wrong and empirically inadequate. Parker urges a shift in thinking from confirmation to adequacy for purpose: models can only be found to be adequate for specific purposes, but they cannot be confirmed wholesale. For example, one might claim that a particular climate model adequately predicts the global temperature increase that will occur by 2100 (when run from particular initial conditions and relative to a particular emission scenario). Yet, at the same time, one might hold that the predictions of global mean precipitation by 2100 by the same model cannot be trusted.

Katzav [2014] cautions that adequacy for purpose assessments are of limited use. He claims that these assessments are typically unachievable because it is far from clear which of the model’s observable implications can possibly be used to show that the model is adequate for the purpose. Instead, he argues that climate models can at best be confirmed as providing a range of possible futures. Katzav is right to stress that adequacy for purpose assessments are more difficult than appears at first sight. But the methodology of adequacy for purpose cannot be dismissed wholesale; in fact, it is used successfully across the sciences (for example, when ideal gas models are confirmed to be useful for particular purposes). Whether or not adequacy for purpose assessment is possible depends on the case at hand.

If one finds that one model predicts specific variables well and another model doesn’t, then one would like to know the reasons why the first model is successful and the second not. Lenhard and Winsberg [2010] argue that this is often very difficult, if not impossible: For complex climate models a strong version of confirmation holism makes it impossible to tell where the failures and successes of climate models lie. In particular, they claim that it is impossible to assess the merits and problems of sub-models and the parts of models. There is a question, though, whether this confirmation holism affects all models and whether it is here to stay. Complex models have different modules for the atmosphere, the ocean, and ice. These modules can be run individually and also together. The aim of the many new Model Intercomparison Projects (MIPs) is, by comparing individual and combined runs, to obtain an understanding of the performance and physical merits of separate modules, which it is hoped will identify areas for improvement and eventually result in better performance of the entire model.

Another problem concerns the use of data in the construction of models. The values of model parameters are often estimated using observations, a process known as calibration. For example, the magnitude of the aerosol forcing is sometimes estimated from data. When data have been used for calibration, the question arises whether the same data can be used again to confirm the model. If data are used for confirmation that have not already been used for calibration, they are use-novel. If data are used for both calibration and confirmation, this is referred to as double-counting.

Scientists and philosophers alike have argued that double-counting is illegitimate and that data have to be use-novel to be confirmatory [Lloyd 2010; Shackley et al. 1998; Worrall 2010]. Steele and Werndl [2013] oppose this conclusion and argue that on Bayesian and relative-likelihood accounts of confirmation double-counting is legitimate. Furthermore, Steele and Werndl [2015] argue that model selection theory presents a more nuanced picture of the use of data than the commonly endorsed positions. Frisch [2015] cautions that Bayesian as well as other inductive logics can be applied in better and worse ways to real problems such as climate prediction. Nothing in the logic prevents facts from being misinterpreted and their confirmatory power exaggerated (as in ‘the problem of old evidence’ which Frisch [2015] discusses). This is certainly a point worth emphasising. Indeed, Steele and Werndl [2013] stress that the same data cannot inform a prior probability for a hypothesis and also further (dis)confirm the hypothesis. But they do not address all the potential pitfalls in applying Bayesian or other logics to the climate and other settings. Their argument must be understood as a limited one: there is no univocal logical prohibition against the same data serving for calibration and confirmation. As far as non-Bayesian methods of model selection go, there are two cases. First, there are methods such as cross-validation where the data are required to be use-novel. For cross-validation, the data are split up into two groups: the first group is used for calibration and the second for confirmation. Second, there are methods such as the Akaike Information Criterion for which the data need not be use-novel, although information criteria methods are hard to apply in practice to climate models because the number of degrees of freedom is poorly defined.
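Here is a minimal sketch of the use-novel requirement as it appears in cross-validation, using an invented data record and a simple trend model: the parameters are calibrated on one segment of the data, and the model is then confirmed (or not) on the held-out remainder.

```python
import numpy as np

# Minimal sketch of cross-validation with use-novel confirmation data.
# The record and the simple trend 'model' are invented for illustration.

rng = np.random.default_rng(2)
years = np.arange(60)
record = 0.02 * years + rng.normal(0.0, 0.1, years.size)  # synthetic data

# Split: the first segment is used for calibration only, the second
# for confirmation only, so the confirmation data remain use-novel.
t_cal, t_conf = years[:40], years[40:]
d_cal, d_conf = record[:40], record[40:]

coeffs = np.polyfit(t_cal, d_cal, 1)          # calibration
predictions = np.polyval(coeffs, t_conf)      # out-of-sample prediction
rmse = np.sqrt(np.mean((predictions - d_conf) ** 2))
print(f"calibrated trend: {coeffs[0]:.3f} per year, "
      f"out-of-sample RMSE: {rmse:.3f}")
# Evaluating the fit on d_cal itself instead would be double-counting
# in the sense discussed above.
```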

This brings us to the second issue: prediction. In the climate context this is typically framed as the issue of projection. ‘Projection’ is a technical term in the climate modelling literature and refers to a prediction that is conditional on a particular forcing scenario and a particular initial conditions ensemble. The forcing scenario is specified either by the amount of greenhouse gas emissions and aerosols added to the atmosphere or directly by their atmospheric concentrations, and these in turn depend on future socioeconomic and technological developments.

Much research these days is undertaken with the aim of generating projections about the actual future evolution of the Earth system under a particular emission scenario, upon which policies are made and real-life decisions are taken. In these cases, it is necessary to quantify and understand how good those projections are likely to be. It is doubtful that this question can be answered along traditional lines. One such line would be to refer to the confirmation of a model against historical data (Chapter 9 of IPCC [2013] discusses model evaluation in detail) and argue that the ability of a model to successfully reproduce historical data should give us confidence that it will perform well in the future too. It is unclear at best whether this is a viable answer. The problem is that climate projections for high forcing scenarios take the system well outside any previously experienced state, and at least prima facie there is no reason to assume that success in low forcing contexts is a guide to success in high-forcing contexts; for example, a model calibrated on data from a world with the Arctic Sea covered in ice might no longer perform well when the sea ice is completely melted and the relevant dynamical processes are quite different. For this reason, calibration to past data has at most limited relevance for the assessment of a model’s predictive success [Oreskes et al. 1994; Stainforth et al. 2007a, 2007b, Steele and Werndl 2013].

This brings into focus the fact that there is no general answer to the question of the trustworthiness of model outputs. There is widespread consensus that predictions are better for longer time averages, larger spatial averages, low specificity and better physical understanding; and, all other things being equal, shorter lead times (nearer prediction horizons) are easier to predict than longer ones. Global mean temperature trends are considered trustworthy, and it is generally accepted that the observed upward trend will continue [Oreskes 2007], although the basis of this confidence is usually a physical understanding of the greenhouse effect with which the models are consistent, rather than a direct reliance on the output of models themselves. A 2013 IPCC report [IPCC 2013, Summary for Policymakers, section D.1] states that modelled surface temperature patterns and trends are trustworthy on the global and continental scale, but, even in making this statement, assigns a probability of at least 66% (‘likely’) to the range within which 90% of model outcomes fall. In plainer terms, this is an expert acknowledgement that the probability that the models are substantially wrong even about global mean temperature may be as high as a few tens of percent.

There are still interesting questions about the epistemic grounds on which such assertions are made (and we return to them in the next section). A harder problem, however, concerns the use of models as providers of detailed information about the future local climate. The United Kingdom Climate Impacts Programme produces projections that aim to make high-resolution probabilistic projections of the local climate up to the end of the century, and similar projects are run in many other countries [Thompson et al. 2016]. The Programme’s set of projections known as UKCP09 [Sexton et al. 2012, Sexton and Murphy 2012] produces projections of the climate up to 2100 based on HadCM3, a global climate model developed at the UK Met Office Hadley Centre. Probabilities are given for events on a 25km grid for finely defined specific events such as changes in the temperature of the warmest day in summer, the precipitation of the wettest day in winter, or the change in summer-mean cloud amount, with projections blocked into overlapping thirty-year segments which extend to 2100. It is projected, for instance, that under a medium emission scenario the probability of a 20-30% reduction in summer mean precipitation in central London in 2080 is 0.5. There is a question of whether these projections are trustworthy and policy relevant. Frigg et al. urge caution on the grounds that many of the UKCP09’s foundational assumptions seem to be questionable [2013, 2015] and that structural model error may have significant repercussions on small scales [2014]. Winsberg [2018] and Winsberg and Goodwin [2016] criticise these cautionary arguments as overstating the limitations of such projections. In 2019, the Programme launched a new set of projections, known as UKCP18 (https://www.metoffice.gov.uk/research/collaboration/ukcp). It is an open question whether these projections are open to the same objections, and, if so, how severe the limitations are.

6. Understanding and Quantifying Uncertainty

Uncertainty features prominently in discussions about climate models, and yet it is a concept that is poorly understood and that raises many difficult questions. In the most general terms, uncertainty is a lack of knowledge. The first challenge is to circumscribe more precisely what is meant by ‘uncertainty’ and what the sources of uncertainty are. A number of proposals have been made, but the discussion is still in a ‘pre-paradigmatic’ phase. Smith and Stern [2011] identify four relevant varieties of uncertainty: imprecision, ambiguity, intractability and indeterminacy. Spiegelhalter and Riesch [2011] consider a five-level structure with three within-model levels (event, parameter, and model uncertainty) and two extra-model levels concerning acknowledged and unknown inadequacies in the modelling process. Wilby and Dessai [2010] discuss the issue with reference to what they call the cascade of uncertainty, studying how uncertainties magnify as one goes from assumptions about future global emissions of greenhouse gases to the implications of these for local adaptation. Petersen [2012, Chapters 3 and 6] introduces a so-called uncertainty matrix listing the sources of uncertainty in the vertical and the sorts of uncertainty in the horizontal direction. Lahsen [2005] looks at the issue from a science studies point of view and discusses the distribution of uncertainty as a function of the distance from the site of knowledge production. And these are but a few of the many proposals.

The next problem is the one of measuring and quantifying uncertainty in climate predictions. Among the approaches that have been devised in response to this challenge, ensemble methods occupy centre stage. Current estimates of climate sensitivity and increase in global mean temperature under various emission scenarios, for instance, include information derived from ensembles containing multiple climate models. Multi-model ensembles are sets of several different models which differ in mathematical structure and physical content. Such an ensemble is used to investigate how predictions of relevant climate variables vary (or do not vary) according to model structure and assumptions. A special kind of multi-model ensemble is known as a “perturbed parameter ensemble”. It contains models with the same mathematical structure in which particular parameters assume different values, thereby effectively conducting a sensitivity analysis on a single model by systematically varying some of the parameters and observing the effect on the outcomes. Early analyses such as the climateprediction.net simulations and the UKCP09 results relied on perturbed parameter ensembles only, owing to resource limitations; international projects such as the Coupled Model Intercomparison Projects (CMIP) and the work that goes into the IPCC assessments are based on multi-model ensembles containing different model structures. The reason for using ensembles is the acknowledged uncertainty in individual models, which concerns both the model structure and the values of parameters in the model. It is a common assumption that ensembles help understand the effects of these uncertainties either by producing and identifying “robust” predictions, or by providing estimates of this uncertainty about future climate change. (Parker [2013] provides an excellent discussion of ensemble methods and the problems that attach to them.)
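In the spirit of a perturbed parameter ensemble, the sketch below re-runs the simple energy-balance calculation sketched in Section 3 while systematically varying one uncertain parameter (the effective emissivity); the spread of outcomes then amounts to a crude sensitivity analysis. The parameter range is invented for illustration.

```python
import numpy as np

# Toy perturbed-parameter ensemble: one model structure, one parameter
# varied systematically. The parameter range is purely illustrative.

SIGMA, S0, ALBEDO = 5.67e-8, 1361.0, 0.3

def equilibrium_temperature(emissivity: float) -> float:
    """Equilibrium temperature (K) of the zero-dimensional model."""
    return ((S0 / 4.0) * (1.0 - ALBEDO) / (emissivity * SIGMA)) ** 0.25

emissivities = np.linspace(0.58, 0.64, 7)     # perturbed parameter values
ensemble = np.array([equilibrium_temperature(e) for e in emissivities])

print("ensemble temperatures (K):", ensemble.round(1))
print("ensemble spread (K):", (ensemble.max() - ensemble.min()).round(1))
```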

A model result is robust if all or most models in the ensemble show the same result; for a general discussion of robustness analysis see Weisberg [2006]. If, for instance, all models in an ensemble show more than a 4°C increase in global mean temperature by the end of the century when run under a specific emission scenario, this result is robust across the specified ensemble. Does robustness justify increased confidence? Lloyd [2010, 2015] argues that robustness arguments are powerful in connection with climate models and lend credibility at least to core claims such as the claim that there was global warming in the 20th century. Parker [2011], by contrast, reaches a more sober conclusion: ‘When today’s climate models agree that an interesting hypothesis about future climate change is true, it cannot be inferred […] that the hypothesis is likely to be true or that scientists’ confidence in the hypothesis should be significantly increased or that a claim to have evidence for the hypothesis is now more secure’ [ibid. 579]. One of the main problems is that if today’s models share the technological constraints posed by current computer architecture and by our current understanding of the climate system, then they inevitably share some common errors. Indeed, such common errors have been widely acknowledged (see, for instance, Knutti et al. [2010]), and studies have demonstrated and discussed the lack of model independence [Bishop and Abramowitz 2013; Jun et al. 2008a; 2008b]. But if models are not independent, then there is a question about how much epistemic weight agreement between them carries.
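Continuing in the same toy setting, checking robustness in this sense is simply a matter of checking agreement across ensemble members; the projections below are invented.

```python
# A result is robust (in the sense above) if all ensemble members agree
# on it. Here we check whether every member projects more than a 4 degree
# increase; the numbers are hypothetical end-of-century projections.
ensemble = [4.3, 4.6, 5.1, 4.2, 4.8]
robust_above_4 = all(x > 4.0 for x in ensemble)
print(robust_above_4)  # True: the result is robust across this ensemble
```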

When ensembles do not yield robust predictions, the spread of results within the ensemble is sometimes used to estimate quantitatively the uncertainty of the outcome. There are two main approaches to this. The first approach aims to translate the histogram of model results directly into a probability distribution: in effect, the guiding principle is that the probability of an outcome is proportional to the fraction of models in the ensemble which produce that result. The thinking behind this method seems to be to invoke some sort of frequentist approach to probabilities. The appeal to frequentism presupposes that models can be treated as exchangeable sources of information (in the sense that there is no reason to trust one ensemble member any more than any other). However, as we have seen, the assumption that models are independent has been questioned. There is a further problem: multi-model ensembles are ‘ensembles of opportunity’, grouping together existing models. Even the best ensembles, such as CMIP6, are not designed to systematically explore the space of possibilities. It is therefore not clear why the frequency of ensemble projections should double as a guide to probability. The IPCC acknowledges this limitation (see the discussion in Chapter 12 of IPCC [2013]) and accordingly downgrades the assessed likelihood of ensemble-derived ranges, deeming it only ‘likely’ (≥66%) that the real-world global mean temperature will fall within the 90% model range (for a discussion of this case see Thompson et al. [2016]).
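The first approach can be sketched in a few lines: bin the ensemble results and read relative frequencies as probabilities, which is precisely the step whose legitimacy is questioned above. The projections are invented.

```python
from collections import Counter

# Hypothetical warming projections (degrees C) from a 10-member
# ensemble, binned to the nearest degree; the numbers are invented.
projections = [3, 3, 4, 4, 4, 4, 5, 5, 5, 6]

counts = Counter(projections)
n = len(projections)
# The frequentist reading: probability of an outcome is the fraction of
# ensemble members producing it.
probabilities = {outcome: count / n for outcome, count in counts.items()}
print(probabilities)  # {3: 0.2, 4: 0.4, 5: 0.3, 6: 0.1}
```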

A more modest approach regards ensemble outputs as a guide to possibility rather than probability. On this view, the spread of an ensemble presents the range of outcomes that cannot be ruled out. The bounds of this set of results, often referred to as a ‘non-discountable envelope’, provide a lower bound on the uncertainty [Stainforth et al. 2007b]. In this spirit Katzav [2014] argues that a focus on prediction is misguided and that models ought to be used to show that particular scenarios are real possibilities.
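By contrast, the possibilist reading extracts only the envelope of the ensemble, as in this invented example.

```python
# The non-discountable envelope of a toy ensemble: the range of outcomes
# that cannot be ruled out, read as a lower bound on the uncertainty
# rather than as a probability distribution. Numbers invented.
projections = [3.1, 3.8, 4.4, 5.2, 6.0]
envelope = (min(projections), max(projections))
print(envelope)  # (3.1, 6.0): outcomes in this range are non-discountable
```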

While undoubtedly less committal than the probability approach, non-discountable envelopes raise questions of their own. The first concerns the relation between non-discountability and possibility. Non-discountable results are ones that cannot be ruled out. How is this judgment reached? Do results which cannot be ruled out indicate possibilities? If not, what is their relevance for estimating lower bounds? And could the model, if pushed more deliberately towards ‘interesting’ behaviours, actually make the envelope wider? Furthermore, it is important to keep in mind that the envelope represents only some possibilities. Hence it does not indicate the complete range of possibilities, which makes certain types of formalised decision-making procedures impossible. For a further discussion of these issues see Betz [2009, 2010].

Finally, a number of authors emphasise the limitations of model-based methods (such as ensemble methods) and submit that any realistic assessment of uncertainties will also have to rely on other factors, most notably expert judgement. Petersen [2012, Chapter 4] outlines the approach of the Netherlands Environmental Assessment Agency (PBL), which sees expert judgment and problem framings as essential components of uncertainty assessment. Aspinall [2010] suggests using methods of structured expert elicitation.

In light of the issues raised above, how should uncertainty in climate science be communicated to decision-makers? The most prominent framework for communicating uncertainty is the IPCC’s, which is used throughout the Fifth Assessment Report (AR5), is set out in the ‘Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties’ and is further explicated in [Mastrandrea et al. 2011]. The framework appeals to two measures for communicating uncertainty. The first, a qualitative ‘confidence’ scale, depends on both the type of evidence and the degree of agreement amongst experts. The second measure is a quantitative scale for representing statistical likelihoods (or, more accurately, fuzzy likelihood intervals) for relevant climate/economic variables. The following statement exemplifies the use of these two measures for communicating uncertainty in AR5: ‘The global mean surface temperature change for the period 2016–2035 relative to 1986–2005 is similar for the four RCPs and will likely be in the range 0.3°C to 0.7°C (medium confidence)’ [IPCC 2013]. Discussions of this framework can be found in Adler and Hirsch Hadorn [2014], Budescu et al. [2014], Mach et al. [2017], and Wüthrich [2017].
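The likelihood scale lends itself to a simple computational paraphrase. The sketch below encodes a subset of the AR5 calibrated likelihood terms as probability intervals; both the encoding and the helper function are our own illustration, not an official IPCC data structure.

```python
# A subset of the AR5 calibrated likelihood language, paraphrased as
# probability intervals (in %). This encoding is our own illustration,
# not an official IPCC data structure.
LIKELIHOOD_SCALE = {
    "virtually certain":      (99, 100),
    "very likely":            (90, 100),
    "likely":                 (66, 100),
    "about as likely as not": (33, 66),
    "unlikely":               (0, 33),
    "very unlikely":          (0, 10),
    "exceptionally unlikely": (0, 1),
}

def likelihood_terms(probability_percent):
    """Return all calibrated terms whose interval contains the value."""
    return [term for term, (lo, hi) in LIKELIHOOD_SCALE.items()
            if lo <= probability_percent <= hi]

print(likelihood_terms(95))  # ['very likely', 'likely']
```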

At this point it should also be noted that the role of ethical and social values in relation to uncertainties in climate science is a matter of controversy. Winsberg [2012] appeals to complex simulation modelling to argue that it is infeasible for climate scientists to produce results that are not influenced by their ethical and social values. More specifically, he argues that assignments of probabilities to hypotheses about future climate change are influenced by ethical and social values because of the way these values come into play in the building and evaluation of climate models. Parker [2014] contends that pragmatic factors, rather than social or ethical values, often play a role in resolving these modelling choices. She further objects that Winsberg’s focus on precise probabilistic uncertainty estimates is misguided; coarser estimates like those used by the IPCC better reflect the extent of uncertainty and are less influenced by values. She concludes that Winsberg has exaggerated the influence of ethical and social values here, but suggests that a more traditional challenge to the value-free ideal of science fits the climate case. Namely, one could argue that estimates of uncertainty are themselves always somewhat uncertain, and that the decision to offer a particular estimate of uncertainty thus might appropriately involve value judgments [compare Douglas 2009].

7. Conceptualising Decisions Under Uncertainty

What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? And what form should adaptation take? Should we build larger water reserves? Should we adapt houses, and our social infrastructure more generally, to a higher frequency of extreme weather events like droughts, heavy rainfall, floods, and heatwaves, as well as to the increased incidence of extremely high sea levels? The decisions that we make in response to these questions have consequences affecting both individuals and groups at different places and times. Moreover, the circumstances of many of these decisions involve uncertainty and disagreement that is sometimes both severe and wide-ranging, concerning not only the state of the climate (as discussed above) and the broader social consequences of any action or inaction on our part, but also the range of actions available to us and what significance we should attach to their possible consequences. These considerations make climate decision-making both important and hard. The stakes are high, and so too are the difficulties for standard decision theory—plenty of reason for philosophical engagement with this particular application of decision theory.

Let us begin by looking at the actors in the climate domain and the kinds of decision problems that concern them. When introducing decision theory, it is common to distinguish three main domains: individual decision theory (which concerns the decision problem of a single agent who may be uncertain of her environment), game theory (which focuses on cases of strategic interaction amongst rational agents), and social choice theory (which concerns procedures by which a number of agents may ‘think’ and act collectively). All three realms are relevant to the climate-change predicament, whether the concern is adapting to climate change or mitigating climate change or both.

Determining the appropriate agential perspective and type of engagement between agents is important, because otherwise decision-modelling efforts may be in vain. For instance, it may be futile to focus on the plight of individual citizens when the power to effect change really lies with states. It may likewise be misguided to analyse the prospects for collective action on climate policy if the supposed members of the group do not see themselves as contributing to a shared decision that is good for the group as a whole. It would also be misleading to exclude from an individual agent’s decision model the impact of others, when the agent perceives that she is acting in a strategic environment. This is not, however, to recommend a narrow view of the role of decision models, namely that they must always represent the decisions of agents as they see them and can never be aspirational; the point is rather that we should not employ decision models with particular agential framings in a naïve way.

Getting the agential perspective right is just the first step in framing a decision problem so that it presents convincing reasons for action. There remains the task of representing the details of the decision problem from the appropriate epistemic and evaluative perspective. Our focus is individual decision theory, for reasons of space, and because most decision settings ultimately involve the decision of an individual, whether this be a single person or a group acting as an individual.

The standard model of (individual) decision-making under uncertainty used by decision theorists derives from the classic work of von Neumann and Morgenstern [1944] and Leonard Savage [1954]. It treats actions as functions from possible states of the world to consequences, these being the complete outcomes of performing the action in question in that state of the world. All uncertainty is taken to be uncertainty about the state of the world and is quantified by a single probability function over the possible states, where the probabilities in question measure either objective risk or the decision maker’s degrees of belief (or a combination of the two). The relative value of consequences is represented by an interval-scaled utility function over these consequences. Decision-makers are advised to choose the action with maximum expected utility (EU), where the EU of an action is the sum of the probability-weighted utilities of its possible consequences.
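To fix ideas, here is a minimal sketch of the standard model in Python; the states, probabilities and utilities are invented for illustration and are not drawn from any actual assessment.

```python
# A minimal sketch of the standard model: states with probabilities,
# actions as mappings from states to (utilities of) consequences.
# All numbers are invented for illustration.
states = {"low_warming": 0.3, "medium_warming": 0.5, "high_warming": 0.2}

actions = {
    "business_as_usual": {"low_warming": 10, "medium_warming": 2, "high_warming": -20},
    "mitigate":          {"low_warming": 6,  "medium_warming": 5, "high_warming": 1},
}

def expected_utility(action):
    # sum of probability-weighted utilities of the possible consequences
    return sum(p * actions[action][s] for s, p in states.items())

best = max(actions, key=expected_utility)
print({a: expected_utility(a) for a in actions}, "->", best)
# {'business_as_usual': 0.0, 'mitigate': 4.5} -> mitigate
```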

It is our contention that this model is inadequate for many climate-oriented decisions, because it fails to properly represent the multidimensional nature and severity of the uncertainty that decision-makers face. To begin with, not all the uncertainty that climate decision-makers face is empirical uncertainty about the actual state of the world (state uncertainty). There may be further empirical uncertainty about what options are available to them and about the consequences of exercising each option in each respective state (option uncertainty). In what follows we use the term ‘empirical uncertainty’ to cover both state uncertainty and option uncertainty. Furthermore, decision-makers face a non-empirical kind of uncertainty, ethical uncertainty, about what values to assign to possible consequences.

Let us now turn to empirical uncertainty. As noted above, standard decision theory holds that all empirical uncertainty can be represented by a probability function over the possible states of the world. There are two issues here. The first is that confining all empirical uncertainty to the state space is rather unnatural for complex decision problems such as those associated with climate change. In fact, decision models are less convoluted if we allow the uncertainty about states to depend on the actions that might be taken (compare Richard Jeffrey’s [1965] expected utility theory), and if we also permit further uncertainty about what consequence will arise under each state, given the action taken (an aspect of option uncertainty). For instance, consider a crude version of the mitigation decision problem faced by the global planner: it may be useful to depict the decision problem with a state-space partition in terms of possible increases in average global temperature over a given time period. In this case, our beliefs about the states (how likely each of them is) would be conditional on the mitigation option taken. Moreover, for each mitigation option, the consequence arising in each of the states depends on further uncertain features of the world, for instance the extent to which, on average, regional conditions would be favourable to food production and whether social institutions would facilitate resilience in food production.
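The Jeffrey-style amendment just described can be sketched by making the state probabilities depend on the action taken; again, all numbers are invented for illustration.

```python
# Jeffrey-style variant: state probabilities conditional on the action
# taken. Expected utility then uses p(state | action). Numbers invented.
p_given_action = {
    "business_as_usual": {"low": 0.1, "medium": 0.4, "high": 0.5},
    "mitigate":          {"low": 0.4, "medium": 0.5, "high": 0.1},
}
utilities = {
    "business_as_usual": {"low": 10, "medium": 2, "high": -20},
    "mitigate":          {"low": 6,  "medium": 5, "high": 1},
}

def eu(action):
    return sum(p * utilities[action][s]
               for s, p in p_given_action[action].items())

print({a: round(eu(a), 2) for a in p_given_action})
# {'business_as_usual': -8.2, 'mitigate': 5.0}
```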

The second issue is that using a precise probability function to represent uncertainty about states (and consequences) can misrepresent the severity of this uncertainty. For instance, even if one assumes that the position of the scientific community can be reasonably well represented by a precise probability distribution over the state space, conditional on the mitigation option, precise probabilities over the possible food-production levels and other economic consequences, given this option and average global temperature rise, are less plausible. Note that the global social planner’s mitigation decision problem is typically analysed in terms of a so-called Integrated Assessment Model (IAM), which does indeed involve dependencies between mitigation strategies and both climate and economic variables. There is some disparity in the representation of empirical uncertainty: Nordhaus’s [2008] reliance on ‘best estimates’ for parameters like climate sensitivity can be compared with Stern’s [2007] use of ‘confidence intervals’. But these are relatively minor differences. Critics argue that all extant IAMs inadequately represent the uncertainty surrounding projections of future wealth under the status quo and alternative mitigation strategies [see Weitzman 2009, Frisch 2013, Stern 2013]. In particular, both Nordhaus [2008] and Stern [2007] controversially assume increasing wealth over time (that is, a positive consumption growth rate) even for the status quo in which nothing is done to mitigate climate change.

Popular among philosophers is the use of sets of probability functions to represent severe uncertainty surrounding decision states/consequences, whether the uncertainty is due to evidential limitations or to evidential/expert disagreement. This is a minimal generalisation of the standard decision model, in the sense that probability measures still feature: roughly, the more severe the uncertainty, the more probability measures over the space of possibilities are needed to conjointly represent the epistemic situation (see, for instance, Walley [1991]). Under maximal uncertainty all possibilities are on a par: they are effectively assigned the whole probability interval [0, 1]. Indeed, it is a strength of the imprecise probability representation that it generalises the two extreme cases, that is, the precise probabilistic as well as the possibilistic frameworks. (See Halpern [2003] for a thorough treatment of frameworks, both qualitative and quantitative, for representing uncertainty.) In some contexts it may be suitable to weight the possible probability distributions in terms of plausibility (as required for some of the decision rules discussed below). The weighting approach may in fact match the IPCC’s representation of the uncertainty surrounding decision-relevant climate and economic variables. Indeed, an important question is whether and how the IPCC’s representation of uncertainty can be translated into an imprecise probabilistic framework, as discussed here and in the next section. An alternative proposal is that the IPCC’s confidence and likelihood measures for relevant variables should be combined to form an unweighted imprecise set of probability distributions, or even a precise probability distribution, suitable for input into an appropriate decision model.
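A minimal sketch of the imprecise approach: a set of probability functions over the same states induces, for a fixed action, an interval of expected utilities rather than a single number. All numbers are invented.

```python
# Severe uncertainty represented by a set of probability functions over
# the same three states. Each action is then associated with an interval
# of expected utilities rather than a point value. Numbers invented.
probability_set = [
    {"low": 0.5, "medium": 0.4, "high": 0.1},
    {"low": 0.3, "medium": 0.4, "high": 0.3},
    {"low": 0.1, "medium": 0.4, "high": 0.5},
]
utilities = {"low": 8, "medium": 3, "high": -10}  # for one fixed action

eus = [sum(p[s] * utilities[s] for s in utilities) for p in probability_set]
print((min(eus), max(eus)))  # (-3.0, 4.2): the induced EU interval
```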

Decision makers face uncertainty not only about what will or could happen, but also about what value to attach to these possibilities; in other words, they face ethical uncertainty. Such value or ethical uncertainty can have a number of different sources. The most important ones arise in connection with judgments about how to distribute the costs and benefits of mitigation and adaptation amongst different regions and countries, about how to take account of persons whose existence depends on what actions are chosen now, and about the degree to which future wellbeing should be discounted. (For discussion and debate about the ethical significance of various climate outcomes, particularly at the level of global rather than regional or national justice, see the articles in Gardiner et al.’s [2010] edited collection, Climate Ethics.) Of these, the last has been the subject of the most debate, because of the extent to which (the global planner’s) decisions about how drastically to cut carbon emissions are sensitive to the discount rate used in evaluating the possible outcomes of doing so (as highlighted in Broome [2008]). Discounting thus provides a good illustration of the importance of ethical uncertainty.

In many economic models, a discount rate is applied to a measure of total wellbeing at different points in time (the ‘pure rate of time preference’), with a positive rate implying that future wellbeing carries less weight in the evaluation of options than present wellbeing. Note that the overall ‘social discount rate’ in economic models is the sum of the pure rate of time preference and a second term pertaining to the discounting of goods or consumption rather than wellbeing per se. See Broome [1992] and Parfit [1984] for helpful discussions of the reasons for discounting goods that do not imply discounting wellbeing. (The consumption growth rate is an important component of this second discounting term that is subject to empirical uncertainty, as discussed above; see Greaves [2017] for an examination of all the assumptions underlying the ‘social discount rate’ and its role in the standard economic method for evaluating policy options.) Many philosophers regard any pure discounting of future wellbeing as completely unjustified from an objective point of view. This is not to deny that temporal location may nonetheless correlate with features of the distribution of wellbeing that are in fact ethically significant. If people will be better off in the future, for instance, it is reasonable to be less concerned about their interests than about those of the present generation, much as one might prioritise the less well-off within a single generation. But the mere fact of a benefit occurring at a particular time cannot be relevant to its value, at least from an impartial perspective.
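A small worked example shows how much turns on the pure rate of time preference; the two rates below echo those discussed in the next paragraph, and the computation is ordinary exponential discounting.

```python
# How the pure rate of time preference affects the weight given to
# wellbeing 100 years from now, under standard exponential discounting.
def discount_factor(rate, years):
    return 1.0 / (1.0 + rate) ** years

for rate in (0.005, 0.03):   # 0.5% (Stern-like) vs 3% (market-derived)
    print(f"{rate:.1%}: wellbeing in 100 years weighted at "
          f"{discount_factor(rate, 100):.3f} of present wellbeing")
# 0.5%: ~0.607 of present wellbeing; 3.0%: ~0.052 of present wellbeing
```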

Economists do nonetheless often discount wellbeing in their policy-oriented models, although they disagree considerably about what pure rate of time preference should be used. One view, exemplified by the Stern Review and representing the impartial perspective described above, is that only a very small rate (in the order of 0.5%) is justified, and this on the grounds of the small probability of the extinction of the human population. Other economists, however, regard a partial rather than an impartial point of view as more appropriate in their models. A view along these lines, exemplified by Nordhaus [2007] and Arrow [1995a], is that the pure rate of time preference should be determined by the preferences of current people. But pure time discount rates derived from observed market behaviour are typically much higher than the rate used by Stern (around 3% by Nordhaus’s estimate). Although the use of such data has been criticised for providing an inadequate measure of people’s reasoned preferences (see, for example, Sen [1982], Drèze and Stern [1990], Broome [1992]), the point remains that any plausible method for determining the current generation’s attitude to the wellbeing of future generations is likely to yield a rate higher than that advocated by the Stern Review. To the extent that this debate about the ethical basis for discounting remains unresolved, there will be ethical uncertainty about the discount rate in climate policy decisions. This ethical uncertainty may be represented analogously to empirical uncertainty: by replacing the standard precise utility function with a set of possible utility functions.

8. Managing Uncertainty

How should a decision-maker choose amongst the courses of action available to her when she must make the choice under conditions of severe uncertainty? The problem that climate decision-makers face is that, in these situations, the precise utility and probability values required by standard EU theory may not be readily available.

There are, broadly speaking, three possible responses to this problem.

(1) The decision-maker can simply bite the bullet and try to settle on precise probability and utility judgements for the relevant contingencies. Orthodox decision theorists argue that rationality requires that decisions be made as if they maximise the decision maker’s subjective expectation of benefit relative to her precise degrees of belief and values. Broome [2012, 129] gives an unflinching defence of this approach: “The lack of firm probabilities is not a reason to give up expected value theory […] Stick with expected value theory, since it is very well-founded, and do your best with probabilities and values.” This approach may seem rather bold, not least in the context of environmental decision making. Weitzman [2009], for instance, argues that whether or not one assigns non-negligible probability to catastrophic climate consequences radically changes the assessment of mitigation options. Moreover, in many circumstances there remains the question of how to follow Broome’s advice: How should the decision-maker settle, in a non-arbitrary way, on a precise opinion on decision-relevant issues in the face of an effectively ‘divided mind’? There are two interrelated strategies: she can deliberate further and/or aggregate conflicting views. The former aims for convergence in opinion, while the latter aims for an acceptable compromise in the face of persisting conflict. (For a discussion of deliberation see Fishkin and Luskin [2005]; for more on aggregation see, for instance, Genest and Zidek [1986], Mongin [1995], Sen [1970], List and Puppe [2009]. There is a comparatively small formal literature on deliberation, a seminal contribution being Lehrer and Wagner’s [1981] model for updating probabilistic beliefs.)
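As a minimal illustration of the aggregation strategy, the following sketch implements a linear opinion pool, which simply takes a weighted average of the experts’ probability functions; the experts’ opinions and the weights are invented (see Genest and Zidek [1986] for this and alternative pooling rules).

```python
# A linear opinion pool: a weighted average of expert probability
# functions over the same states. Experts and weights are invented.
experts = [
    {"low": 0.6, "medium": 0.3, "high": 0.1},
    {"low": 0.2, "medium": 0.5, "high": 0.3},
]
weights = [0.5, 0.5]   # assumed equal reliability

pooled = {s: round(sum(w * p[s] for w, p in zip(weights, experts)), 3)
          for s in experts[0]}
print(pooled)  # {'low': 0.4, 'medium': 0.4, 'high': 0.2}
```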

(2) The decision-maker can try to delay making a decision, or at least postpone parts of it, in the hope that her uncertainty will become manageable as more information becomes available, or as disagreements resolve themselves through a change in attitudes. The basic motive for delaying a decision is to maintain flexibility at zero cost (see Koopmans [1962], Kreps and Porteus [1978], Arrow [1995b]). Suppose that we must decide between building a cheap but low sea wall or a high but expensive one, and that the relative desirability of these two courses of action depends on unknown factors, such as the extent to which sea levels will rise. In this case it would be sensible to build a low wall first while leaving open the possibility of raising it in the future. If this can be done at no additional cost, then it is clearly the best option; a toy calculation is sketched below. In many adaptation scenarios, the analogue of the ‘low sea wall’ may in fact be social-institutional measures that enable a delayed response to climate change, whatever the details of this change turn out to be. In many cases, however, the prospect of cost-free postponement of a decision (or part thereof) is simply a mirage, since delay often decreases rather than increases opportunities, owing to changes in the background environment. This is often true of climate-change adaptation decisions, not to mention mitigation decisions.
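Here is the toy calculation of the sea-wall example; all costs, damages and probabilities are invented. Building low now while retaining the option to raise later beats committing up front, provided raising later costs no more than the initial cost difference between the two walls.

```python
# Toy sea-wall example. Numbers are invented for illustration.
p_high_sea_rise = 0.4
cost = {"low": 10, "high": 25, "raise_later": 15}
flood_damage_if_unprotected = 50

# Commit now to one wall or the other:
eu_low_only  = -cost["low"] - p_high_sea_rise * flood_damage_if_unprotected
eu_high_only = -cost["high"]

# Build low, then raise only if sea levels turn out to rise:
eu_flexible = -cost["low"] - p_high_sea_rise * cost["raise_later"]

print(eu_low_only, eu_high_only, eu_flexible)  # -30.0 -25 -16.0
# The flexible option wins: the extra cost is paid only when needed.
```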

(3) The decision-maker can employ a decision rule different from that prescribed by EU theory, one that is much less demanding in terms of the information it requires. A great many different proposals for such rules exist in the literature, involving more or less radical departures from the orthodox theory and varying in the informational demands they make. It should be noted from the outset that there is one widely agreed rationality constraint on these non-standard decision rules: ‘EU-dominated options’ are not admissible choices. That is, if an option has lower expected utility than another option according to all permissible pairs of probability and utility functions, then the dominated option is not an admissible choice (the sketch below illustrates such a filter). This is a relatively minimal constraint, but it may well yield a unique choice of action in some decision scenarios. In such cases, the severe uncertainty is not in fact decision relevant. For example, it may be the case that, from the global planner’s perspective, a given mitigation option is better than continuing with business as usual, whatever the uncertain details of the climate system. This is all the more plausible to the extent that the mitigation option counts as a ‘win-win’ strategy [Maslin and Austin 2012], that is, to the extent that it has other positive impacts, say, on air quality or energy security, regardless of mitigation results. In many more fine-grained or otherwise difficult decision contexts, however, the non-EU-dominance constraint may exclude only a few of the available options, leaving many candidates as choice-worthy.
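The dominance filter can be sketched as follows; the probability set, options and utilities are invented, and in this toy case the filter happens to leave a unique admissible option, illustrating how severe uncertainty can fail to be decision relevant.

```python
# Filtering out EU-dominated options: an option is excluded if some other
# option has higher expected utility under every permissible probability
# function. All numbers are invented.
probability_set = [
    {"low": 0.6, "high": 0.4},
    {"low": 0.2, "high": 0.8},
]
options = {
    "business_as_usual": {"low": 5, "high": -30},
    "mitigate":          {"low": 4, "high": 0},
    "mitigate_plus":     {"low": 6, "high": 2},   # a 'win-win' option
}

def eu(option, p):
    return sum(p[s] * options[option][s] for s in p)

def dominated(option):
    return any(all(eu(other, p) > eu(option, p) for p in probability_set)
               for other in options if other != option)

print([o for o in options if not dominated(o)])  # ['mitigate_plus']
```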

A consideration that is often appealed to in order to further discriminate between options is caution. Indeed, this is an important facet of the popular but ill-defined Precautionary Principle. (The Precautionary Principle is referred to in the IPCC [2014] AR5 WGII report. See, for instance, Gardiner [2006] and Steele [2006] for discussion of what the Precautionary Principle does or could stand for.) Cautious decision rules give more weight to the ‘down-side’ risks, that is, the possible negative implications of a choice of action. The Maxmin-EU rule, for instance, recommends picking the action with the greatest minimum expected utility (see Gilboa and Schmeidler [1989], Walley [1991]). The rule is simple to use, but arguably much too cautious, paying no attention at all to the full spread of possible expected utilities. The α-Maxmin rule, in contrast, recommends taking the action with the greatest α-weighted sum of the minimum and maximum expected utilities associated with it. The relative weights for the minimum and maximum expected utilities can be thought of as reflecting either the decision maker’s pessimism in the face of uncertainty or else their degree of caution (see Binmore [2009]). (For a comprehensive survey of non-standard decision theories for handling severe uncertainty in the economics literature, see Gilboa and Marinacci [2012]. For applications to climate policy see Heal and Millner [2014].)
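The following sketch implements both rules over the expected-utility intervals induced by a set of probability functions; the options, probabilities and utilities are invented for illustration.

```python
# Maxmin-EU and alpha-Maxmin over the EU intervals induced by a set of
# probability functions. All numbers are invented.
probability_set = [
    {"low": 0.6, "high": 0.4},
    {"low": 0.2, "high": 0.8},
]
options = {
    "adapt_modestly":    {"low": 6, "high": -2},
    "adapt_extensively": {"low": 2, "high": 1},
}

def eus(option):
    return [sum(p[s] * options[option][s] for s in p)
            for p in probability_set]

def maxmin(option):
    return min(eus(option))

def alpha_maxmin(option, alpha=0.7):  # alpha = weight on the worst case
    return alpha * min(eus(option)) + (1 - alpha) * max(eus(option))

print(max(options, key=maxmin))        # 'adapt_extensively'
print(max(options, key=alpha_maxmin))  # 'adapt_extensively' at alpha=0.7;
                                       # a lower alpha can flip the choice
```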

A more informationally demanding class of rules comprises those that draw on considerations of confidence and/or reliability. The thought here is that an agent is more or less confident about the various probability and utility functions that characterise her uncertainty. For instance, when the estimates derive from different models or experts, the decision maker may regard some models as better corroborated by the available evidence than others, or some experts as more reliable than others in their judgments. In these cases, it is reasonable, ceteris paribus, to favour actions that one is more confident will have beneficial consequences. One (rather sophisticated) way of doing this is to weight each of the expected utilities associated with an action in accordance with how confident you are about the judgements supporting them, and then choose the action with the maximum confidence-weighted expected utility (see Klibanoff et al. [2005]). This rule is not very different from maximising expected utility, and indeed one could regard confidence weighting as an aggregation technique rather than an alternative decision rule. But considerations of confidence may be appealed to even when precise confidence weights cannot be provided. Gärdenfors and Sahlin [1982/1988], for instance, suggest simply excluding from consideration any estimates that fall below a reliability threshold and then picking cautiously from the remainder. Similarly, Hill [2013] uses an ordinal measure of confidence that allows for stake-sensitive thresholds of reliability, which can then be combined with varying levels of caution. This rule has the advantage of allowing decision-makers to draw on the confidence grading of scientific claims adopted by the IPCC (see Bradley et al. [2017]).
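In rough outline, confidence weighting can be sketched as follows. This is a crude linear version in the spirit of, not a faithful implementation of, Klibanoff et al.’s [2005] smooth model; all numbers are invented.

```python
# A simplified confidence-weighted rule: weight the expected utility an
# option receives under each model/expert by one's confidence in that
# source, then maximise the weighted sum. Numbers invented.
sources = [
    {"p": {"low": 0.6, "high": 0.4}, "confidence": 0.8},
    {"p": {"low": 0.2, "high": 0.8}, "confidence": 0.2},
]
options = {
    "option_a": {"low": 6, "high": -2},
    "option_b": {"low": 2, "high": 1},
}

def weighted_eu(option):
    return sum(src["confidence"] *
               sum(src["p"][s] * options[option][s] for s in src["p"])
               for src in sources)

print(max(options, key=weighted_eu))  # 'option_a'
```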

One might finally distinguish decision rules that are cautious in a slightly different way: they compare options in terms of ‘robustness’ to uncertainty, relative to a problem-specific satisfactory level of expected utility. Better options are those that are more assured of having an expected utility that is good enough, or regret-free, in the face of uncertainty. The ‘information-gap theory’ developed by Ben-Haim [2001] provides one formalisation of this basic idea that has proved popular in environmental management theory. Another prominent approach to robust decision-making is that developed by Lempert, Popper and Bankes [2003]. These two frameworks are compared in Hall et al. [2012]. Recall that the uncertainty in question may be multi-faceted, concerning probabilities of states/outcomes or the values of final outcomes. Most decision rules that appeal to robustness assume that a best estimate for the relevant variables is available, and then consider deviations away from this estimate. A robust option is one that has a satisfactory expected utility relative to a class of estimates that deviate from the best one to some degree; the wider the class in question, the more robust the option. Much depends on what expected utility level is deemed satisfactory. For mitigation decision-making, one salient satisfactory level of expected utility is that associated with a 50% chance of an average global temperature rise of 2 degrees Celsius or less. Note that one may otherwise interpret any such mitigation temperature target in a different way, namely as a constraint on what counts as a feasible option. In other words, mitigation options that do not meet the target are simply prohibited options, not suitable for consideration. For adaptation decisions, the satisfactory level would depend on local context, but roughly speaking, robust options are those that yield reasonable outcomes for all the inopportune climate scenarios that have non-negligible probability given some range of uncertainty. These are plausibly adaptation options that focus on resilience to any and all of the aforesaid climate scenarios, perhaps via the development of social institutions that can coordinate responses to variability and change. (Robust decision-making is endorsed, for instance, by Dessai et al. [2009] and Wilby and Dessai [2010], who indeed associate this kind of decision rule with resilience strategies. See also Linkov and others [2014] for a discussion of resilience strategies vis-à-vis risk management.)
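An info-gap-style robustness comparison can be sketched as follows: for each option, ask how far the probability of the bad state may deviate from a best estimate before expected utility drops below the satisfactory level, and prefer the option with the larger horizon. Everything here, from the numbers to the one-parameter uncertainty model, is invented for illustration.

```python
# Rough info-gap-style robustness: the widest deviation h from the best
# estimate of p(bad) such that expected utility stays satisfactory.
# The larger h, the more robust the option. All numbers invented.
best_estimate_p_bad = 0.3
satisfactory_eu = 0.0
options = {
    "option_a": {"good": 6, "bad": -8},
    "option_b": {"good": 2, "bad": -1},
}

def eu(option, p_bad):
    u = options[option]
    return (1 - p_bad) * u["good"] + p_bad * u["bad"]

def robustness(option, step=0.01):
    h = 0.0
    while h + step <= 1.0:
        lo = max(0.0, best_estimate_p_bad - (h + step))
        hi = min(1.0, best_estimate_p_bad + (h + step))
        if min(eu(option, lo), eu(option, hi)) < satisfactory_eu:
            break
        h += step
    return h

print({o: round(robustness(o), 2) for o in options})
# {'option_a': 0.12, 'option_b': 0.36}: option_b has the lower
# best-estimate EU but is far more robust to misestimated probabilities.
```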

9. Conclusion

This article has reviewed, from a philosophy of science perspective, issues and questions that arise in connection with climate science. Most of these issues are the subject matter of ongoing research, and they deserve further attention. Rather than repeating these points, we would like to mention a topic that has not received the attention it deserves: the epistemic significance of consensus in the acceptance of results. As the controversy over the Cook et al. [2013] paper shows, many people do seem to think that the level of expert consensus is an important reason for non-experts to believe in climate change; conversely, attacking the consensus and sowing doubt is a classic tactic of those who seek to undermine that belief. The role of consensus in the context of climate change deserves more attention than it has received hitherto; for some discussion of consensus see de Melo-Martín and Intemann [2014].

10. Glossary

Attribution (of climate change): The process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.

Boundary conditions: Values of variables that affect the system but that are not directly output by the calculations.

Calibration: The process of estimating values of model parameters which are most consistent with observations.

Climate model: A representation of certain aspects of the climate system.

Detection (of climate change): The process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change.

Double counting: The use of data for both calibration and confirmation.

Expected utility (for an action): The sum of the probability-weighted utility of the possible consequences of the action.

External conditions (of the climate system): Conditions that influence the state of the Earth such as the amount of energy received from the sun.

Initial conditions: A mathematical description of the state of the climate system at the beginning of the period being simulated.

Internal variability: The phenomenon that climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of changing external conditions.

Null hypothesis: The expected behaviour of the climate system in the absence of changing external influences.

Projection: The prediction of a climate model that is conditional on a certain forcing scenario.

Proxy data: Data for climate variables that are derived from observing natural phenomena such as tree rings, ice cores and ocean sediments.

Robustness (of a result): A result is robust if separate (ideally independent) models or lines of evidence lead to the same conclusion.

Use-novel data: Data that are used for confirmation and have not been used for calibration.

11. References and Further Reading

  • Adler C. E. and G. Hirsch Hadorn. (2014). The IPCC and treatment of uncertainties: topics and sources of dissensus. Wiley Interdisciplinary Reviews: Climate Change 5.5, 663-676.
  • Arrow K. J. (1995b). A Note on Freedom and Flexibility. Choice, Welfare and Development. (eds. K. Basu, P. Pattanaik, and K. Suzumura), 7-15. Oxford: Oxford University Press.
  • Arrow K. J. (1995a). Discounting Climate Change: Planning for an Uncertain Future. Lecture given at Institut d’Économie Industrielle, Université des Sciences Sociales, Toulouse. <http://idei.fr/doc/conf/annual/paper_1995.pdf>
  • Aspinall W. (2010).  A route to more tractable expert advice. Nature 463, 294-295.
  • Ben-Haim Y. (2001). Information-Gap Theory: Decisions Under Severe Uncertainty, 330 pp. London: Academic Press.
  • Betz G. (2009). What range of future scenarios should climate policy be based on? Modal falsificationism and its limitations. Philosophia Naturalis 46, 133-158.
  • Betz G. (2010). What’s the worst case?. Analyse und Kritik 32, 87-106.
  • Binmore K. (2009). Rational Decisions, 216 pp. Princeton, NJ: Princeton University Press.
  • Bishop C. H. and G. Abramowitz. (2013). Climate model dependence and the replicate Earth paradigm. Climate Dynamics 41, 885-900.
  • Bradley R., C. Helgeson and B. Hill. (2017). Climate Change Assessments: Confidence, Probability and Decision. Philosophy of Science 84.3, 500-522.
  • Bradley R., C. Helgeson and B. Hill. (2018). Combining Probability with Qualitative Degree-of-Certainty Assessment. Climatic Change 149.3-4, 517-525.
  • Broome J. (2012). Climate Matters: Ethics in a Warming World, 192 pp. New York: Norton.
  • Broome J. (1992). Counting the Cost of Global Warming, 147 pp. Cambridge: The White Horse Press.
  • Broome J. (2008). The Ethics of Climate Change. Scientific American 298, 96-102.
  • Budescu, D. V., H. Por, S. B. Broomell and M. Smithson. (2014). The interpretation of IPCC probabilistic statements around the world. Nature Climate Change 4, 508-512.
  • Cohn T. A. and H. F. Lins. (2005). Nature’s style: naturally trendy. Geophysical Research Letters 32, L23402.
  • Cook J. et al. (2013). Quantifying the consensus on the anthropogenic global warming in the scientific literature. Environmental Research Letters 8, 1-7.
  • Daron J. D. and D. Stainforth. (2013). On predicting climate under climate change. Environmental Research Letters 8, 1-8.
  • de Melo-Martín I. and K. Intemann. (2014). Who’s afraid of dissent? Addressing concerns about undermining scientific consensus in public policy developments. Perspectives on Science 22.4, 593-615.
  • Dessai S. et al. (2009). Do We Need Better Predictions to Adapt to a Changing Climate? Eos 90.13, 111-112.
  • Dessler A. (2011). Introduction to Modern Climate Change. Cambridge: Cambridge University Press.
  • Drèze J., and Stern, N. (1990). Policy reform, shadow prices, and market prices. Journal of Public Economics 42.1, 1-45.
  • Douglas H. (2009). Science, Policy, and the Value-Free Ideal. Pittsburgh: Pittsburgh University Press.
  • Fishkin J. S., and R. C. Luskin. (2005). Experimenting with a Democratic Ideal: Deliberative Polling and Public Opinion. Acta Politica 40, 284-298.
  • Frank D., J. Esper, E. Zorita and R. Wilson. (2010). A noodle, hockey stick, and spaghetti plate: A perspective on high-resolution paleoclimatology. Wiley Interdisciplinary Reviews: Climate Change 1.4, 507-516.
  • Frigg R. P., D. A. Stainforth and L. A. Smith. (2013). The Myopia of Imperfect Climate Models: The Case of UKCP09. Philosophy of Science 80.5, 886-897.
  • Frigg R. P., D. A. Stainforth and L. A. Smith. (2015). An Assessment of the Foundational Assumptions in High-Resolution Climate Projections: The Case of UKCP09. Draft under review.
  • Frigg R. P., S. Bradley, H. Du and L. A. Smith. (2014a). Laplace’s Demon and the Adventures of His Apprentices. Philosophy of Science 81.1, 31-59.
  • Frisch M. (2013). Modeling Climate Policies: A Critical Look at Integrated Assessment Models. Philosophy and Technology 26, 117-137.
  • Frisch, M. (2015). Tuning climate models, predictivism, and the problem of old evidence. European Journal for Philosophy of Science 5.2, 171-190.
  • Gärdenfors P. and N.-E. Sahlin. [1982] (1988). Unreliable probabilities, risk taking, and decision making. Decision, Probability and Utility, (eds. P. Gärdenfors and N.-E. Sahlin), 313-334. Cambridge: Cambridge University Press.
  • Gardiner S. (2006). A Core Precautionary Principle. The Journal of Political Philosophy 14.1, 33-60.
  • Gardiner S., S. Caney, D. Jamieson, H. Shue (2010). Climate Ethics: Essential Readings. Oxford: Oxford University Press
  • Genest C. and J. V. Zidek. (1986). Combining Probability Distributions: A Critique and Annotated Bibliography. Statistical Science 1.1, 113-135.
  • Gilboa I. and M. Marinacci. (2012). Ambiguity and the Bayesian Paradigm. Advances in Economics and Econometrics: Theory and Applications, Tenth World Congress of the Econometric Society (eds. D. Acemoglu, M. Arellano and E. Dekel), 179-242. Cambridge: Cambridge University Press.
  • Gilboa I. and D. Schmeidler. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18, 141-153.
  • Greaves, H. (2017). Discounting for public policy: A survey. Economics and Philosophy 33.3, 391-439.
  • Hall J. W., Lempert, R. J., Keller, K., Hackbarth, A., Mijere, C., McInerney, D. J. (2012). Robust Climate Policies Under Uncertainty: A Comparison of Robust Decision-Making and Info-Gap Methods. Risk Analysis 32.10, 1657-1672.
  • Halpern J. Y. (2003). Reasoning About Uncertainty, 483 pp. Cambridge, MA: MIT Press.
  • Heal G. and A. Millner. (2014). Uncertainty and Decision Making in Climate Change Economics. Review of Environmental Economics and Policy 8, 120-137.
  • Hegerl G. C., O. Hoegh-Guldberg, G. Casassa, M. P. Hoerling, R. S. Kovats, C. Parmesan, D. W. Pierce, P. A. Stott. (2010). Good Practice Guidance Paper on Detection and Attribution Related to Anthropogenic Climate Change. Meeting Report of the Intergovernmental Panel on Climate Change Expert Meeting on Detection and Attribution of Anthropogenic Climate Change (eds. T. F. Stocker, C. B. Field, D. Qin, V. Barros, G.-K. Plattner, M. Tignor, P. M. Midgley and K. L. Ebi. Bern). Switzerland: IPCC Working Group I Technical Support Unit, University of Bern.
  • Held I. M. (2005). The Gap between Simulation and Understanding in Climate Modeling. Bulletin of the American Meteorological Society 80, 1609-1614.
  • Hill B. (2013). Confidence and Decision. Games and Economic Behavior 82, 675-692.
  • Hulme M., S. Dessai, I. Lorenzoni and D. Nelson. (2009). Unstable Climates: exploring the statistical and social constructions of climate. Geoforum 40, 197-206.
  • IPCC. (2013). Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
  • IPCC. (2014). Climate Change 2014: Impacts, Adaptation, and Vulnerability. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
  • Jeffrey R. (1965). The Logic of Decision, 231 pp. Chicago: University of Chicago Press.
  • Jun M., R. Knutti and D. W. Nychka. (2008). Local eigenvalue analysis of CMIP3 climate model errors. Tellus A 60.5, 992-1000.
  • Katzav J. (2013). Severe testing of climate change hypotheses. Studies in History and Philosophy of Modern Physics 44.4, 433-441.
  • Katzav J. (2014). The epistemology of climate models and some of its implications for climate science and the philosophy of science. Studies in History and Philosophy of Modern Physics 46, 228-238.
  • Katzav, J. & W. S. Parker (2018). Issues in the Theoretical Foundations of Climate Science. Studies in History and Philosophy of Modern Physics 63, 141-149.
  • Klibanoff P., M. Marinacci and S. Mukerji. (2005). A smooth model of decision making under ambiguity. Econometrica 73, 1849-1892.
  • Klintman M. (2019). Knowledge Resistance: How We Avoid Insight From Others. Manchester: Manchester University Press.
  • Knutti R., R. Furrer, C. Tebaldi, J. Cermak, and G. A. Meehl. (2010). Challenges in Combining Projections from Multiple Climate Models. Journal of Climate 23.10, 2739-2758.
  • Koopmans T. C. (1962). On flexibility of future preference. Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 150.
  • Kreps D. M. and E. L. Porteus. (1978). Temporal resolution of uncertainty and dynamic choice theory. Econometrica 46.1, 185-200.
  • Lahsen M. (2005). Seductive Simulations? Uncertainty Distribution Around Climate Models. Social Studies of Science 35.6, 895-922.
  • Lehrer K. and Wagner, C. (1981). Rational Consensus in Science and Society, 165 pp. Dordrecht: Reidel.
  • Lempert R. J., Popper, S. W., Bankes, S. C. (2003). Shaping the Next One Hundred Years: New Methods for Quantitative Long-Term Policy Analysis, 208 pp. Santa Monica, CA: RAND Corporation, MR-1626-RPC.
  • Lenhard J. and E. Winsberg. (2010). Holism, entrenchment, and the future of climate model pluralism. Studies in History and Philosophy of Modern Physics 41, 253-262.
  • Linkov I. et al. (2014). Changing the resilience program. Nature Climate Change 4, 407-409.
  • List C. and C. Puppe. (2009). Judgment aggregation: a survey. Oxford Handbook of Rational and Social Choice (eds. P. Anand, C. Puppe and P. Pattanaik). Oxford: Oxford University Press.
  • Lorenz E. (1995). Climate is what you expect. Prepared for publication by NCAR. Unpublished, 1-33.
  • Lloyd E. A. (2010). Confirmation and robustness of climate models. Philosophy of Science 77, 971-984.
  • Lloyd E. A. (2015). Model robustness as a confirmatory virtue: The case of climate science. Studies in History and Philosophy of Science 49, 58-68.
  • Lloyd E. A. (2009). Varieties of Support and Confirmation of Climate Models. Proceedings of the Aristotelian Society Supplementary Volume LXXXIII, 217-236.
  • Lloyd, E., N. Oreskes (2019). Climate Change Attribution: When Does it Make Sense to Add Methods? Epistemology & Philosophy of Science 56.1, 185-201.
  • Lusk, G. (2017). The Social Utility of Event Attribution: Liability, Adaptation, and Justice-Based Loss and Damage. Climatic Change 143, 201–12.
  • Mach, K. J., M. D. Mastrandrea, P. T. Freeman, and C. B. Field (2017). Unleashing Expert Judgment in Assessment. Global Environmental Change 44, 1–14.
  • Mann M. E., R. S. Bradley and M.K. Hughes (1998). Global-scale temperature patterns and climate forcing over the past six centuries. Nature 392, 779-787.
  • Maslin M. and P. Austin. (2012). Climate models at their limit?. Nature 486, 183-184.
  • Mastrandrea M. D., K. J. Mach, G.-K. Plattner, O. Edenhofer, T. F. Stocker, C. B. Field, K. L. Ebi, and P. R. Matschoss. (2011). The IPCC AR5 guidance note on consistent treatment of uncertainties: a common approach across the working groups. Climatic Change 108, 675-691.
  • McGuffie K. and A. Henderson-Sellers. (2005). A Climate Modelling Primer, 217 pp. New Jersey: Wiley.
  • McIntyre S. and R. McKitrick. (2003). Corrections to the Mann et al. (1998) proxy data base and northern hemispheric average temperature series. Energy & Environment 14.6, 751-771.
  • Mongin P. (1995). Consistent Bayesian Aggregation. Journal of Economic Theory 66.2, 313-51.
  • Nordhaus W. D. (2007). A Review of the Stern Review on the Economics of Climate Change. Journal of Economic Literature 45.3, 686-702.
  • Nordhaus W. D. (2008). A Question of Balance, 366 pp. New Haven, CT: Yale University Press.
  • Oreskes N. and E. M. Conway. (2012). Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming, 355 pp. New York: Bloomsbury Press.
  • Oreskes N. (2007) The Scientific Consensus on Climate Change: How Do We Know We’re Not Wrong? Climate Change: What It Means for Us, Our Children, and Our Grandchildren (eds. J. F. C. DiMento and P. Doughman), 65-99. Boston: MIT Press.
  • Oreskes N., K. Shrader-Frechette and K. Belitz. (1994). Verification, validation, and confirmation of numerical models in the Earth Sciences. Science, New Series 263.5147, 641-646.
  • Parfit D. (1984). Reasons and Persons, 560 pp. Oxford: Clarendon Press.
  • Parker W. S. (2009). Confirmation and Adequacy for Purpose in Climate Modelling. Aristotelian Society Supplementary Volume 83.1 233-249.
  • Parker W. S. (2010). Comparative Process Tracing and Climate Change Fingerprints.  Philosophy of Science 77, 1083-1095.
  • Parker W. S. (2011). When Climate Models Agree: The Significance of Robust Model Predictions. Philosophy of Science 78.4, 579-600.
  • Parker W. S. (2013). Ensemble modeling, uncertainty and robust predictions. Wiley Interdisciplinary Reviews: Climate Change 4.3, 213-223.
  • Parker W. S. (2014). Values and Uncertainties in Climate Prediction, Revisited. Studies in History and Philosophy of Science Part A 46, 24-30.
  • Petersen A. C. (2012). Simulating Nature: A Philosophical Study of Computer-Simulation Uncertainties and Their Role in Climate Science and Policy Advice, 210 pp. Boca Raton, Florida: CRC Press.
  • Resnik M. (1987). Choices: an introduction to decision theory, 221 pp. Minneapolis: University of Minnesota Press.
  • Savage L. J. (1954). The Foundations of Statistics, 310 pp. New York: John Wiley & Sons.
  • Sen A. (1982). Approaches to the choice of discount rate for social benefit–cost analysis.  Discounting for Time and Risk in Energy Policy (ed. R. C. Lind), 325-353. Washington, DC: Resources for the Future.
  • Sen A. (1970). Collective Choice and Social Welfare. San Francisco: Holden-Day Inc.
  • Sexton D. M. H., J. M. Murphy, M. Collins and M. J. Webb. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part I: Outline of Methodology. Climate Dynamics 38, 2513-2542.
  • Sexton D. M. H., and J. M. Murphy. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part II: Robustness of Methodological Choices and Consequences for Climate Sensitivity. Climate Dynamics 38, 2543-2558.
  • Shackley S., P. Young, S. Parkinson and B. Wynne. (1998). Uncertainty, Complexity and Concepts of Good Science in Climate Change Modelling: Are GCMs the Best Tools? Climatic Change 38, 159-205.
  • Smith L. A. and N. Stern. (2011). Uncertainty in science and its role in climate policy. Phil. Trans. R. Soc. A 369.1956, 4818-4841.
  • Spiegelhalter D. J. and H. Riesch. (2011). Don’t know, can’t know: embracing deeper uncertainties when analysing risks. Phil. Trans. R. Soc. A 369, 4730-4750.
  • Stainforth D. A., M. R. Allen, E. R. Tredger and L. A. Smith. (2007a). Confidence, Uncertainty and Decision-support Relevance in Climate Predictions. Philosophical Transactions of the Royal Society A 365, 2145-2161.
  • Stainforth D. A., T. E. Downing, R. Washington, A. Lopez and M. New. (2007b). Issues in the Interpretation of Climate Model Ensembles to Inform Decisions. Philosophical Transactions of the Royal Society A 365, 2163-2177.
  • Steele K. (2006). The precautionary principle: a new approach to public decision-making?. Law Probability and Risk 5, 19-31.
  • Steele K. and C. Werndl.  (2013). Climate Models, Confirmation and Calibration. The British Journal for the Philosophy of Science 64, 609-635.
  • Steele K. and C. Werndl. (2015). The Need for a More Nuanced Picture on Use-Novelty and Double-Counting. Philosophy of Science, forthcoming.
  • Stern N. (2007). The Economics of Climate Change: The Stern Review, 692 pp. Cambridge: Cambridge University Press.
  • Stern, N. (2013). The Structure of Economic Modeling of the Potential Impacts of Climate Change: Grafting Gross Underestimation of Risk onto Already Narrow Scientific Models. Journal of Economic Literature 51.3, 838-859.
  • Thompson E., R. Frigg and C. Helgeson. (2016). Expert Judgment for Climate Change Adaptation. Philosophy of Science 83.5, 1110-1121.
  • von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behaviour, 739 pp. Princeton: Princeton University Press.
  • Walley P. (1991). Statistical Reasoning with Imprecise Probabilities, 706 pp. New York: Chapman and Hall.
  • Weitzman M. L. (2009). On Modeling and Interpreting the Economics of Catastrophic Climate Change. The Review of Economics and Statistics 91.1, 1-19.
  • Werndl C. (2015). On defining climate and climate change. The British Journal for the Philosophy of Science, doi:10.1093/bjps/axu48.
  • Wilby R. L. and S. Dessai. (2010). Robust adaptation to climate change. Weather 65.7, 180-185.
  • Weisberg M. (2006). Robustness Analysis. Philosophy of Science 73, 730-742.
  • Winsberg E. (2012). Values and Uncertainties in the Predictions of Global Climate Models. Kennedy Institute of Ethics Journal 22, 111-127.
  • Winsberg E. (2018). Philosophy and Climate Science. Cambridge: Cambridge University Press.
  • Winsberg E. and W. M. Goodwin. (2016). The Adventures of Climate Science in the Sweet Land of Idle Arguments. Studies in History and Philosophy of Modern Physics 54, 9-17.
  • Worrall J. (2010). Error, Tests, and Theory Confirmation. Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science (eds. D. G. Mayo and A. Spanos), 125-154. Cambridge: Cambridge University Press.
  • Wüthrich, N. (2017). Conceptualizing Uncertainty: An Assessment of the Uncertainty Framework of the Intergovernmental Panel on Climate Change. In EPSA15 Selected Papers, 95-107. Cham: Springer.

 

Author Information

Richard Bradley
London School of Economics and Political Science
UK

Roman Frigg
London School of Economics and Political Science
UK

Katie Steele
Australian National University
Australia

Erica Thompson
London School of Economics and Political Science
UK

Charlotte Werndl
University of Salzburg
Austria
and
London School of Economics and Political Science
UK

Causation

The question, “What is causation?” may sound like a trivial one—it is as sure as common knowledge can ever be that some things cause others, that there are causes and that they necessitate certain effects. We say that we know that what caused the president’s death was an assassin’s shot. But when asked why, we will most likely reply that it is because the latter was necessary for the former—an answer that, upon close philosophical examination, falls short. In a less direct way, the president’s grandmother’s giving birth to his mother was also necessary for his death. That, however, we would not describe as the death’s cause.

The first section of this article states the reasons why we should care about causation, including those that are non-philosophical. Sections 2 and 3 define the axis of the division into ontological and semantic analyses, with the Kantian and skeptical accounts as two alternatives. Set out there is also Hume’s pessimistic framework for thinking about causation—since before we ask what causation is, it is vital to consider whether we can come to know it at all.

Section 4 examines the semantic approaches, which analyze what it means to say that one thing causes another. The first of these, the regularity theories, turn out to be problematic when dealing with, among other cases, unrepeatable and implausibly enormous ones. Some of these problems also limit the ambitions of Lewis’s theory of causation as a chain of counterfactual dependence, which in addition faces the causal redundancy and causal transitivity objections. The scientifically-minded interventionists try to reconnect causal talk with our agency, while probabilistic theories accommodate the indeterminacy of quantum physics and relax the strictness of the exceptions-unfriendly regularity accounts. Yet the latter risk falling into the trap of conflating causation and probability.

The next section brings us back to ontology. Since causation is hardly a particular entity, nominalists define it in terms of its recurring instances, while realists bring forward a relation of necessitation, over and above those instances, that is seemingly in play whenever causation occurs. Dispositionalism claims that to cause is to dispose to happen. Process theories base their analysis on the notions of process and transmission (for instance, of energy), which might capture well the nature of causation in the most physical sense.

Another historically significant family of approaches is the concern of Section 6, which examines how Kant removes causation from the domain of things-in-themselves and locates it in the structure of consciousness. Kant’s approach has also inspired the agency views, which claim that agency is inextricably tied up with causal reasoning.

The last, seventh section deals with the most skeptical work on causation. Some, following Bertrand Russell, have tried to get rid of the concept altogether, believing it a relic of timeworn metaphysical speculation. Pluralism and thickism locate the failure of any attempt to define causation in the fact that what the word means is really a bundle of different concepts, or no single meaningful concept at all.

Table of Contents

  1. What Is Causation and Why Do People Care?
  2. Hume’s Challenge
  3. A Family Tree of Causal Theories
  4. Semantic Analyses
    1. Regularity Theories
      1. The Problem of Implausibly Enormous Cases
      2. The Problem of the Common Cause
      3. The Problem of Overdetermination
      4. The Problem of Unrepeatability
    2. Counterfactual Theories
      1. The Problems of Common Cause, Enormousness and Unrepeatability
      2. The Problem of Causal Redundancy
      3. A New Problem: Causal Transitivity
    3. Interventionism
    4. Probabilistic Theories
  5. Ontological Stances
    1. Nominalism
    2. Realism
    3. Dispositionalism
    4. Process Theories
  6. Kantian Approaches
    1. Kant Himself
    2. Agency Views
  7. Skepticism
    1. Russellian Republicanism
    2. Pluralism and Thickism
  8. References and Further Reading

1. What Is Causation and Why Do People Care?

Causation is a live topic across a number of disciplines, due to factors other than its philosophical interest. The second half of the twentieth century saw an increase in the availability of information about the social world, the growth of statistics and the disciplines it enables (such as economics and epidemiology), and the growth of computing power. This led, at first, to the prospect of much-improved policy and individual choice through analysis of all this data, and especially in the early twenty-first century, to the advent of potentially useful artificial intelligence that might be able to achieve another step-change in the same direction. But in the background of all of this lurks the specter of causation. Using information to inform goal-directed action often seems to require more than mere extrapolation or projection. It often seems to require that we understand something of the causal nature of the situation. This has seemed painfully obvious to some, but not to others. Increasing quantities of information and abilities to process it force us to decide whether causation is part of this march of progress or an obstacle on the road.

So much for why people care about causation. What is this thing that we care about so much?

To paraphrase the great physicist Richard Feynman, it is safe to say that nobody understands causation. But unlike quantum physics, causation is not a useful calculating device yielding astoundingly accurate predictions, and those who wish to use causal reasoning for any actual purpose do not have the luxury of following Feynman’s injunction to “shut up and calculate”. The philosophers cannot be pushed into a room and left to debate causation; the door cannot be closed on conceptual debate.

The remainder of this section offers a summary of the main elements of disagreement. The next section presents a “family tree” of different historical and still common views on the topic, which may help to make some sense of the state of the debate.

Some philosophers have asked what causation is, that is, they have asked an ontological question. Some of these have answered that it is something over and above (or at least of a different kind from) its instances: that there is a “necessitation relation” that is a universal rather than a particular thing, and in which cause-effect pairs participate, or of which they partake, or something similar “in virtue” of which they instantiate causation (Armstrong, 1983). These are realists about causation (noting that others discussed in this paragraph are also realists in a more general sense, but not about universals). Others, perhaps a majority, believe that causation is something that supervenes upon (or is ultimately nothing over and above) its instances (Lewis, 1983; Mackie, 1974). These are nominalists. Yet others believe that it is something somewhat different from either option: a disposition, or a bundle of dispositions, which are taken to be fundamental (Mumford & Anjum, 2011). These are dispositionalists.

Second, philosophers have sought a semantic analysis of causation, trying to work out what “cause” and cognates mean, in some deeper sense of “meaning” than a dictionary entry can satisfy. (It is worth bearing in mind, however, that the ontological and semantic projects are often pursued together, and cannot always be separated.) Some nominalists believe it is a form of regularity holding between distinct existences (Mackie, 1974). These are regularity theorists. Others, counterfactual theorists, believe it is a special kind of counterfactual dependence between distinct existences (Lewis, 1973a), and others hold that causes raise the probability of their effects in a special way (Eells, 1991; Suppes, 1970). Among counterfactual theorists are various subsets, notably interventionists (for example, Woodward, 2003) and contrastivists (for example, Schaffer, 2007). There is also an overlapping subset of thinkers with a non-philosophical motivation, and sometimes background, who develop technical frameworks for the purpose of performing causal inference and, in doing so, define causation, thus straying into the territory of offering semantic analysis (Hernán & Robins, 2020; Pearl, 2009; Rubin, 1974). Out of kilter with the historical motivation of those approaching counterfactual theorizing from a philosophical angle, some of those coming from practical angles appear not to be nominalists (Pearl & Mackenzie, 2018). Yet others, who may or may not be nominalists, hold that causation is a pre-scientific or “folk science” notion which, like “energy”, should be mapped onto a property identified by our current best science, even if that means deviating from the pre-scientific notion (Dowe, 2000).

Third, there are those who take a Kantian approach. While this is an answer to ontological questions about causation, it is reasonably treated in a separate category, different from the ontological approach mentioned first above in this section, because the question Kant tried to answer is better summarized not as “What sort of thing is causation?” but “Is causation a thing at all?” Kant himself thought that causation is a constitutive condition of experience (Kant, 1781), thus not a part of the world, but a part of us—a way we experience the world, without which experience would be altogether impossible. Late twentieth-century thinkers suggested that causation is not a necessary precondition of all experience but, more modestly, a dispositional property of us to react in certain ways—a secondary property, like color—arising from the fact that we are agents (Menzies & Price, 1993).

The fourth approach to causation is, in a broad sense, skeptical. Thus some have taken the view that it is a redundant notion, one that ought to be dispensed with in favor of modern scientific theory (Russell, 1918). Such thinkers do not have a standard name but might reasonably be called republicans, following a famous line of Bertrand Russell’s (see the first subsection of section 7.). Some (pluralists) believe that there is no single concept of causation but a plurality of related concepts which we lump together under the word “causation” for some reason other than that there is such a thing as causation (Cartwright, 2007; Stapleton, 2008). Yet another view, which might be called thickism and which may or may not be a form of pluralism, holds that causal concepts are “thick”, as some have suggested for ethical concepts (Anscombe, 1958; although Anscombe did not use this term herself). That is, the fundamental referents of causal judgements are not causes, but kicks, pushes, and so forth, out of which there is no causal component to be abstracted, extracted, or meaningfully studied (Anscombe, 1969; Cartwright, 1983).

Cutting across all these positions is a question as to what the causal relata are, if indeed causation is a relation at all. Some say they are events (Lewis, 1973a, 1986); others, aspects (Paul, 2004); still others, facts (Mellor, 1995); among other ideas.

Disagreement about fundamentals is great news if you are a philosopher, because it gives you plenty to work on. It is a field of engagement that has not settled into trench warfare between a few big guns and their troops. It is indicative of a really fruitful research area, one with live problems, fast-paced developments, and connections with real life—that specter that lurks in the background of philosophy seminar rooms and lecture halls, just as causation lurks in the background of more practical engagements.

However, confusion about fundamentals is not great news if you are trying to make the best sense of the data you have collected, looking for guidance on how to convince a judge that your client is or is not liable, trying to make a decision about whether to ban a certain food additive or wondering how your investment will respond to the realization of a certain geopolitical risk. It is certainly not helpful if one is trying to decide what will be the most effective public health measure to slow the spread of an epidemic.

2. Hume’s Challenge

David Hume posed the questions that all the ideas discussed in the remainder of this article attempt to answer. He had various motivations, but a fair abridgement might be as follows.

Start with the obvious fact that we frequently have beliefs about what will happen, or is happening elsewhere right now, or has happened in the past, or, more grandly, what happens in general. One of Hume’s examples is that the sun will rise tomorrow. An example he gives of a general belief is that bread, in general, is nourishing. How do we arrive at these beliefs?

Hume argues that such beliefs derive from experience. We believe the sun rises because we have experienced it rising on all previous mornings. We believe bread is nourishing because it has always been nourishing when we have encountered it in our experience.

However, Hume argues that this is an inadequate justification on its own for the kind of inference in question. There is no contradiction in supposing that the sun will simply not rise tomorrow. This would not be logically incompatible with previous experience. Previous experience does not render it impossible. On the contrary, we can easily imagine such a situation, perhaps use it as the premise for a story, and so forth. Similar remarks apply to the nourishing effects of bread, and indeed to all our beliefs that cannot be justified logically (or mathematically, if that is different) from some indisputable principles.

In arguing thus, Hume might be understood as reacting to the rationalist component of the emerging scientific worldview, that component that emphasized the ability of the human mind to reach out and understand. Descartes believed that through the exercise of reason we could obtain knowledge of the world of experience. Newton believed that the world of experience was indeed governed by some kind of mathematical necessity or numerical pattern, which our reason could uncover, and thus felt able to draw universal conclusions from a little, local data. Hume rejected the confidence characteristic of both Descartes and Newton. Given the central role that this confidence about the power of the human mind played in the founding of modern science, Hume, and empiricists more generally, might be seen as offering not a question about common sense inferences, but a foundational critique of one of the central impulses of the entire scientific enterprise—perhaps not how twentieth and twenty-first-century philosophers in the Anglo-American tradition would like to see their ancestry and inspiration.

Hume’s argument was simple and compelling and instantiated what appears to be a reasonably novel argumentative pattern or move. He took a metaphysical question and turned it into an epistemological one. Thus he started with “What is necessary connection?” and moved on to “How do we know about necessary connection?”

The answer to the latter question, he claimed, is that we do not know about it at all, because the only kind of necessity we can make sense of is that of logical and mathematical necessity. We know about the necessity of logic and mathematics through examining the relevant “ideas”, or concepts, and seeing that certain combinations necessitate others. The contrary would be contradictory, and we can test for this by trying to imagine it. Gandalf is a wizard, and all wizards have staffs; we cannot conceive of these claims being true and yet Gandalf being staff-less. Once we have the ideas indicated in those claims, Gandalf’s staff ownership status is settled.

Experience, however, offers no necessity. Things happen, while we do not perceive their being “made” to happen. Hume’s argument to establish this is the flip side of his argument in favor of our knowledge of a priori truths. He challenges us to imagine causes happening without their usual effects: bread not to nourish, billiard balls to go into orbit when we strike them (this example is a somewhat augmented form of Hume’s own one), and so forth. It seems that we can do this easily. So we cannot claim to be able to access necessity in the empirical world in this way. We perceive and experience constant conjunction of cause and effect and we may find it fanciful to imagine stepping from a window and gently floating to the ground, but we can do it, and sometimes do so, both deliberately and involuntarily (who has not dreamed they can fly?). But Hume agrees with Descartes that we cannot even dream that two and two make five (if we clearly comprehend those notions in our dream—of course one can have a fuzzy dream in which one accepts the claim that two and two make five, without having the ideas of two, plus, equals and five in clear focus).

Hume’s skepticism about our knowledge of causation leads him to skepticism about the nature of causation: the metaphysical question is given an epistemological treatment, and then the answer returned to the metaphysical question is epistemologically motivated. His conclusion is that, for all we can tell, there is no necessary connection, there is only a series of constant conjunctions, usually called regularities. This does not mean that there is no causal necessity, only that there is no reason to believe that there is. For the Enlightenment project of basing knowledge on reason rather than faith, this is devastating.

The constraint of metaphysical speculation by epistemological considerations remains a central theme of twenty-first century philosophy, even if it has somewhat loosened its hold over time. But Hume took his critique a step further, with profound significance for this whole philosophical tradition. He asked what we even mean by “cause”, and specifically, by that component of cause he calls “necessary connection”. (He identifies two others: temporal order and spatiotemporal contiguity. These are also topics of philosophical and indeed physical debate, but are less prominent in early twenty-first century philosophy, and thus are not discussed in this article.) He argues that we cannot even articulate what it would be for an event in the world we experience to make another happen.

The argument reuses familiar material. We have a decent grasp of logical necessity: it consists in the incoherence of denying the claim in question, which (in his view) we can easily spot. But that is not the necessary connection we seek. A question then remains open: what other kind of necessity could there be? If it does not involve the impossibility of what is necessitated, then in what sense is it necessitated? This is not a rhetorical question; it is a genuine request for explanation. Supplying one is, at best, difficult; at worst, it is impossible. Some have tried (several attempts are discussed throughout the remainder of the article), but most have taken the view that it is impossible. Hume’s own explanation is that necessary connection is nothing more than a feeling, the expectation created in us by endless experience of same cause followed by same effect. Granted, this is a meaning for “necessary connection”; but it is one that robs “necessary” of anything resembling necessity.

The move from “What is X?” to “What does our concept of X mean?” has driven philosophers even harder than the idea that metaphysical speculation must be epistemologically constrained—partly because philosophical knowledge was thought for a long time to be constrained to knowledge of meanings; but that is another story (see Ch 10 of Broadbent, 2016).

This is the background to all subsequent work on causation as rejuvenated by the Anglo-American tradition, and also to the practical questions that arise. The ideas that we cannot directly perceive causation, and that we cannot reason logically from cause to effect, have repeatedly given rise to obstacles in science, law, policy, history, sports, business, politics—more or less any “world-oriented” activity you can think of. The next section summarizes the ways that people have understood this challenge: the most important questions they think it raises and their answers to these questions.

3. A Family Tree of Causal Theories

Here is a diagram indicating one possible way of understanding the relationship between different historically significant and still influential theories of, and approaches to, and even understandings of the philosophical problems posed by causation—and indeed some approaches that do not think the problems are philosophical at all.


Figure 1. A “family tree” of theories of causation

At the top level are approaches to causation corresponding to the kinds of questions one might deem important to ask about it. At the second level are theories that have been offered in response to these questions. Some of these theories have sub-theories which do not really merit their own separate level, and are dealt with in this article as variations on a theme (each receiving treatment in its own subsection).

Some of these theories motivate each other; in particular, nominalism and regularity theories often go hand in hand. Others are relatively independent, while some are outright incompatible. These compatibility relationships may themselves be disputed.

Two points should be noted regarding this family tree. First, an important topic is absent: the nature of the causal relata. This is because any stance about their nature does not constitute a position about causation on its own; it cuts across this family tree and features importantly in some theories but not in others. While some philosophers have argued that it is very important (Mellor, 1995, 2004; Paul, 2004; Schaffer, 2007), and featured it centrally in their theories of causation (second level on the tree), it does not feature centrally in any approach to causation (top level on the tree), except insofar as everyone agrees that the causal relata, whatever they are, must be distinct to avoid mixing up causal and constitutive facts. The topic is skipped in this article because, while it is interesting, it is somewhat orthogonal.

The second point to note about this family tree is that others are possible. There are many ways one might understand twenty-first-century work on causation, and thus there are other “family trees” implicit in other works, including other introductions to the topic. One might even think that no such family tree is useful. The one presented above is a tool only, one that the reader might find useful, but it should ultimately be treated as itself a topic for debate, dispute, amendment, or rejection.

4. Semantic Analyses

Semantic analyses of causation seek to give the meaning of causal assertions. They typically take “c causes e” to be the exemplary case, where “c” and “e” may be one of a number of things: facts, events, aspects, and so forth. (Here, lower case letters c and e are used to denote some particular cause and effect respectively. Upper case letters C and E refer to classes and yield general causal claims, as in “smoking causes lung cancer”.) Whatever they are, they are universally agreed to be distinct, since otherwise we would wrongly confuse constitutive with causal relations. My T-shirt’s having yellow bananas might end up as a cause of its having yellow shapes on it, for example, which is widely considered to be unacceptable—because it is quite different from my T-shirt’s yellow bananas causing the waitress bringing me my coffee to stare.

The three main positions are regularity theories, counterfactual theories, and probabilistic theories.

a. Regularity Theories

The regularity theory implies that causes and effects are not usually one-off pairs, but recurrent. Not only is the coffee I just drank causing me to perk up: drinking coffee often has this effect. The regularity view claims that two facts suffice to explain causation: that causes are followed by their effects, and that cause-effect pairs happen a lot. Coincidental pairings, on the other hand, do not typically recur. I scratched my nose while drinking the coffee, and this scratching was followed by my perking up. But nose-scratching is not generally followed by perking up, whereas coffee-drinking is. Coffee-drinking and perking up are part of a regularity; in Hume’s phrase, they are constantly conjoined. The same cannot be said of nose-scratching and perking up.
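As a rough schematic (ours, not Hume’s or Mill’s own formulation), the naive regularity analysis of a singular causal claim can be put as follows, where c and e are particular events of types C and E, and “≺” marks temporal precedence:

\[
c \text{ causes } e \;\iff\; c \prec e \;\wedge\; \forall x\,\big(C(x) \rightarrow \exists y\,(E(y) \wedge x \prec y)\big)
\]

The right-hand conjunct is just the constant conjunction requirement: every C-event is followed by some E-event.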

Obviously, the tool needs sharpening. Most of the Cs that we encounter are not always followed by Es, and most of the Es that we encounter are not always (that is, not only) caused by Cs. The assassin shoots (c) the president, who dies (e). But assassins often miss. Moreover, presidents often die of things other than being shot.

David Hume is sometimes presented as offering a regularity theory of causation (usually on the basis of Pt 5 of Hume, 1748), but this is crude at best and downright false at worst (Garrett, 2015). More plausibly, he offered regularities as the most we can hope for in ontology of causation, that is, as the basis of any account of what there might be “in the objects” that most closely corresponds to the various causal notions we have. But his approach to semantics was revisionary; he took “cause” to express a feeling that the experience of regularity produces in us. Knowing whether such regularities continue in the objects beyond our experience requires that we know of some sort of necessary connection sustaining the regularity. And the closest thing to necessary connection that we know about is regularity. We are in a justificatory circle.

It was John Stuart Mill who took Hume’s regularity ontology and turned it into a regularity theory of causation (Mill, 1882). The first thing he did was to address the obvious point that causes and effects are not constantly conjoined in either direction. He confined the direction of constancy to the cause-effect direction, so that causes are always followed by their effects, but effects need not be preceded by the same causes. He then expanded the definition of “cause” to include the enormous totality that suffices for the effect. So, if e is the president’s death, then to say that c caused e is not to say that Es are always preceded by Cs, but rather that Cs are always followed by Es. Moreover, when we speak of the president’s being shot as the cause, we are being casual and strictly inaccurate. Strictly speaking, c is not the cause; the cause is c*, the entirety of things that were in place, including the shot, such that this entirety is sufficient for the president’s death. There is no mysterious necessitation invoked, because “sufficient” here just means “is always followed by”. When the wind is as it was, and the president is where he was, and the assassin aims so, and the gun fires thus, and so on and so forth, the president always dies, in all situations of this kind.

Mill thought that exceptionless regularities could be achieved in this way. In fact, he believed that the “law of causality”, being the exceptionless regularity between causes (properly defined) and effects, was the only true law (Mill, 1882). All the laws of science, he believed, had exceptions: objects falling in air do not fall as Newton’s laws of motion say they should, for example (this example is not Mill’s own). But objects released just so, at just such a temperature and pressure, with just such a mass, shape and surface texture, always fall in this way. Thus, according to Mill, the law of causality was the fundamental scientific law.

This theory faces a number of objections, even setting aside the lofty claims about the status of the “law of causality”. The subsubsections below discuss four of them.

i. The Problem of Implausibly Enormous Cases

To be truly sufficient for an effect, a cause must be enormous. It must include everything that, were it different on another occasion, would yield an overall condition followed by a different effect. It is questionable whether “cause” is reasonably understood as referring to such an enormous totality.

Enormousness poses problems for more than just the analysis of the common meaning of “cause”. It also makes it unclear how we can arrive at and use knowledge of causes. These are such gigantic things that they are bound to be practically unknowable to us. What licenses our merry inference from a shot by an ace assassin who has never yet missed to the imminent death of the president is not the fact that the assassin has never yet missed, since this constancy is incidental; the causal regularity holds between huge preceding conditions. In the previous cases where the assassin shot, these may well not have been at all the same.

It is not clear that such objections are compelling, however. Mill’s account concerns the nature of causation, not our knowledge of it, much less our casual inferences, which might well depend on highly contingent and local regularities; these, in turn, might be underwritten by truly causal regularities without instantiating them. Mill himself provides a lengthy discussion of the use of causal language to pick out one part of the whole cause. As for getting the details right, Mill’s central idea seems to admit of other implementations, and an advocate would want to try these.

There was a literature in the early-to-middle twentieth century trying, in effect, to mend Mill’s account so as to get the blend of necessity and sufficiency just right for correctly characterizing the semantics of “cause”, against a background assumption that Millian regularity was the ultimate ontological truth about causation. This literature took its final form in Jonathan Mackie’s INUS analysis (Mackie, 1974).

Mackie offered more than one account of causation. His INUS analysis was an account of causation “in the objects”, that is, an account in the Humean spirit of offering the closest possible objective characterization of what we appear to mean by causal judgements, without necessarily supposing that causal judgements are ultimately meaningful or that they ultimately refer to anything objective.

Mackie’s view was that a cause is an insufficient but necessary part of an unnecessary but sufficient condition for the effect. Bear in mind that “necessary” and “sufficient” are to be understood materially, non-modally, as expressing regularities: “x is necessary for y” means “y is always accompanied (or, in the causal case, preceded) by x”, and “x is sufficient for y” means “x is always accompanied (or, in the causal case, followed) by y”. If we set aside temporal order, necessity and sufficiency are thus inter-definable; for x to be sufficient for y is for y to be necessary for x, and vice versa.
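Put schematically (a compact rendering of the material reading just described, not Mackie’s own notation):

\[
\mathrm{Suff}(x,y) \;\iff\; \text{every } x\text{-situation is followed by } y; \qquad \mathrm{Nec}(x,y) \;\iff\; \text{every } y\text{-situation is preceded by } x
\]

Setting temporal order aside, the inter-definability claim then falls out immediately: \(\mathrm{Suff}(x,y) \iff \mathrm{Nec}(y,x)\).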

Considering our assassin, how does his shot count as a cause, according to the INUS account?

Take the I of INUS first. The assassin’s shot was clearly Insufficient for the president’s death. The president might suddenly have dipped his head to bestow a medal on a citizen (Forsyth, 1971). All sorts of things can and do intervene on such occasions. Shots of this nature are not universally followed by deaths. c is Insufficient for e.

Second, take the N. The shot is clearly Necessary in some sense for the death. In that situation, without the shot, there would have been no death. In strict regularity-talk, such situations are not followed by deaths in the absence of a shot. At the same time, we can hardly say that shots are required for presidents to die; most presidents find other ways to discharge this mortal duty. Mackie explains this limited necessity by saying not that c is Necessary for e, but that c is a Necessary part of a larger condition that preceded e.

Moving to the U, this larger condition is Unnecessary for the effect. There are plenty of presidential deaths caused by things other than shots, as just discussed; this was the reason we saw for not saying that the shot is necessary for the death. c is an Insufficient but Necessary part of an Unnecessary condition for e.

Finally, the S. The condition of which c is a part is unnecessary (so far as the occurrence of e is concerned), but it is Sufficient: e happens, and it is no coincidence that it does. In strict regularity talk, every such condition is followed by an E. There is no way for an assassin to shoot just so, in just those conditions, which include the non-ducking of the president, his lack of a bulletproof vest, and so forth, and for the president not to die. Thus c is an Insufficient but Necessary part of an Unnecessary but Sufficient condition for e. To state it explicitly:

c is a cause of e if and only if c is a necessary but insufficient part of an unnecessary but sufficient condition for e.
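Mackie’s picture is often displayed as a disjunctive normal form; what follows is a schematic reconstruction rather than a quotation. Suppose the full regularity governing effects of type E is:

\[
E \text{ occurs} \;\iff\; (c \wedge X) \vee Y
\]

where X is the remainder of the sufficient condition to which c belongs (the wind, the aim, the absence of a bulletproof vest, and so on) and Y is the disjunction of all the other sufficient conditions (poisonings, heart attacks, and so on). Then c is Insufficient (X is also needed), a Necessary part of its own disjunct (X without c does not suffice), and that disjunct, c ∧ X, is Unnecessary (Y provides other routes to E) but Sufficient.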

In essence, Mackie borrows Mill’s “whole cause” idea, but drops the implausible idea that “cause” strictly refers to the “whole cause”. Instead, he makes “cause” refer to a part of the whole cause, one that satisfies the special conditions.

As well as addressing the problem of enormousness, which is fundamentally a plausibility objection, Mackie intends his INUS account to address the further and probably more pressing objections which follow.

ii. The Problem of the Common Cause

An immediate problem for any regularity account of causation is that, just as effects have many causes, causes also have many effects, and these effects may accompany each other very regularly. Recall Mill’s clarification that effects need not be constantly preceded by the same causes, and that “constant conjunction” was in this sense directional: same causes are followed by same effects, but not vice versa. This is strongly intuitive—as the saying goes, there is more than one way to skin a cat. Essentially, Mill tells us that we do not have to worry that effects are not always preceded by the same causes.

However, we are still left in a predicament, even with this unidirectional constant conjunction of same-cause-same-effect. When combined with the fact that a single cause always has multiple effects, we seem to land up with the result that constant conjunctions will also obtain between these effects. Cs are always followed by E1s, and Cs are always followed by E2s. So, whenever there is a C, we have an E1 and an E2, meaning that whenever we have an E1, we have an E2, and vice versa.
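Schematically, the predicament runs as follows (our compression of the reasoning above):

\[
\forall x\,\big(C(x) \rightarrow E_1 \text{ follows}\big) \;\wedge\; \forall x\,\big(C(x) \rightarrow E_2 \text{ follows}\big)
\]

so wherever a C-event occurs, an E1-event and an E2-event co-occur. If, in addition, E1-events in fact arise only on the heels of C-events, then every E1 is accompanied by an E2: a constant conjunction between two effects, neither of which causes the other.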

How does a regularity theory get out of this without dropping the fundamental analytical tool it uses to distinguish cause from coincidence, namely the unfailing succession of same effect (or rather, same effects) on same cause?

Here is an example of the sort of problem for naïve regularity theories that Mackie’s account is supposed to solve. My alarm sounds, and I get out of bed. Shortly afterwards, our young baby starts to scream. This happens daily: the alarm wakes me up, and I get out of bed; but it also wakes the baby up. I know that it is not my getting out of bed that causes the baby to scream. How? Because I get out of bed in the night at various other times, and the baby does not wake up on those occasions; because my climbing out of bed is too quiet for a baby in another room to hear; and for other such reasons. Also, even when I sleep through the alarm (or try to), the baby wakes up. But what if the connections were as invariable as each other—there were no (or equally many) exceptions?

Consider this classic example. The air pressure drops, and my barometer’s needle indicates that there will be a storm. There is a storm. My barometer’s needle dropping obviously does not cause the storm. But, as a reliable storm-predictor, it is followed by a storm regularly—that is the whole point of barometers.

Mackie’s INUS theory supplies the following answer. The barometer’s falling is not an INUS condition for the storm’s occurrence, because situations that are exactly similar except for the absence of a barometer can and do occur. The falling of the barometer may be a part of a sufficient condition for the storm to occur, but it is not a necessary part of that condition. Storms happen even when no barometer is there to predict them. (Likewise, the storm is not an INUS condition for the barometer falling, in case that is a worry despite the temporal order, because barometers can be induced to fall in vacuum chambers.)

Thus the intuition I have in the alarm/baby case is the correct one; the regularity between alarm and baby waking is persistent regardless of my getting out of bed, and that between my getting out of bed and the baby waking fails in otherwise similar occasions where there is no alarm.

However, this all depends on a weakening of the initial idea behind the regularity theory, since it amounts to accepting that there are many causal claims without a direct underlying regularity: claims that are true not in virtue of match-strikes being followed by flames, but for a more complicated reason, albeit one that makes use of the notion of regularity. Hume’s idea that we observe like causes followed by like effects suffers a blow, and with it the epistemological motivation of the regularity theory, as well as its theoretical elegance. It is to this extent a concession on the part of the regularity theory. There are other cases where we do want to say that c causes e even though Cs are not always followed by Es.

In fact, such is the majority of cases. Striking the match causes it to light even though many match-strikes fail: they produce no spark, break the match, and so forth. There are similar scenarios in which the match is struck but there is no flame; yet the apparent conclusion that the match-strike does not cause the flame cannot be accepted. Perhaps we must insist that the scenarios differ because the match is not struck exactly so; but now we are not analyzing the meaning of “striking the match caused it to light”, since we are substituting an unknown and complicated event for “striking the match”, for the sake of insisting that causes are always followed by their effects. That is a failing of the analytical tool.

Common cause situations thus present prima facie difficulties for the regularity account. Mackie’s account may solve the problem; nonetheless, if there were an account of causation that did not face the problem in the first place, or that dealt with the problem with less cost to the guiding idea of the regularity approach and with less complication, it would be even more attractive. This is one of the primary advantages claimed by the two major alternatives, counterfactual and probabilistic accounts, which are discussed in their two appropriate subsections below.

iii. The Problem of Overdetermination

As noted in the subsubsection on the problem of the common cause, many effects can be caused in more than one way. A president may be assassinated with a bullet or a poison. The regularity theory can deal with this easily by confining the relevant kind of regularity to one direction. In Mackie’s account, causes are not sufficient for their effects, which may occur in other ways. But the problem has another form. If an effect may occur in more than one way, what is to stop more than one of these ways from being present at the same time? Assassin 1 shoots the president, but Assassin 2’s on-target bullet would have done the job if Assassin 1 had missed. c causes e, but c’ would have caused e otherwise.

Such situations are referred to by various names. This article uses the term redundancy as a catch-all for any situation like this, in which a cause is “redundant” in the sense that the effect would have occurred without the actual cause. (Strictly, all that is required is that the effect might still have occurred, because the negation of “would not” is “might” (Lewis, 1973b).) Within redundancy, we can distinguish symmetric from asymmetric overdetermination. Symmetric overdetermination occurs when two causes appear absolutely on a par. Suppose two assassins shoot at just the same time, and both bullets enter the president’s heart at just the same time. Either would have sufficed, but in the event, both were present. Neither is “more causal”. The example is not contrived. Such situations are quite common. You and I both shout “Look out!” to the pedestrian about to step in front of a car, and both our shouts are loud enough to cause the pedestrian to jump back. And so forth.

In asymmetric overdetermination, one of the events is naturally regarded as the cause, while the other is not, but both are sufficient in the circumstances for the effect. One is a back-up, which would have caused the effect had the actual cause not done so. For example, suppose that Assassin 2 had fired a little later than Assassin 1, and that the president was already dead by the time Assassin 2’s bullet arrived. Assassin 2’s shot did not kill the president, but had Assassin 1 not shot (or had he not shot accurately enough), Assassin 2’s shot would still have killed the president. Such cases are more commonly referred to as preemption, which is the terminology used in this article since it is more descriptive: the first cause preempts the second one. Again, preemption examples need not be contrived or far-fetched. Suppose I shout “Look out!” a moment after you, but still soon enough for the pedestrian to step back. Your shout caused the pedestrian to step back, but had you not shouted, my shout would have caused the pedestrian to step back. There is nothing outlandish about this; such things happen all the time.

The difficulty here is that there are two INUS conditions where there is only one cause. Assassin 1’s shot is a necessary part of a sufficient condition for the effect. But so is Assassin 2’s shot. However, only Assassin 1’s shot is the true cause.

In the symmetric overdetermination case, one may take the view that they are both equally causes of the effect in question. However, there is still the preemption case, where Assassin 1 did the killing and not Assassin 2. (If you doubt this, imagine they are a pair of competitive twins, counting their kills, and thus racing to be first to the president in this case; Assassin 1 would definitely insist on chalking this one up as a win).

Causal redundancy has remained a thorn in the side of all mainstream analyses of causation, including the counterfactual account (see the appropriate subsection). What makes it so troubling is that we use this feature of causation all the time. Just as we exploit the fact that causes have multiple effects when we are devising measuring instruments, we exploit the fact that we can bring a desired effect about in more than one way every time we set up a failsafe mechanism, a Plan B, a second line of defense, and so forth. Causal redundancy is no mere philosopher’s riddle: it is a useful part of our pragmatic reasoning. Accounting for the fact that we use “cause” in situations where there is also a redundant would-be cause thus seems central to explicating “cause” at all.

iv. The Problem of Unrepeatability

This is less discussed than the problems of the common cause and overdetermination, but it is a serious problem for any regularity account. The problem was elegantly formulated by Bertrand Russell, who pointed out that, once a cause is specified so fully that its effect is inevitable, it is at best implausible and perhaps (physically) impossible that the whole cause occur more than once (Russell, 1918). The fundamental idea of the regularity approach is that cause-effect pairs instantiate regularities in a way that coincidences do not. This objection tells against this fundamental idea. It is not clear what the regularity theorist can reply. She might weaken the idea of regularity to admit of exceptions, but then the door is open to coincidences, since my nose-scratching just before the president’s death might be absent on another such occasion, and yet this might no longer count against its candidacy for cause. At any rate, the problem is a real one, casting doubt on the entire project of analyzing causation in terms of regularity.

We might respond by substituting a weaker notion than true sufficiency: something like “normally followed by”. Nose-scratchings are not normally followed by presidents’ deaths. However, this is not a great solution for regularity theories, because (a) the weaker notion of sufficiency is a departure from the sort of clarity that regularity theorists would otherwise celebrate, and (b) a similar battery of objections will apply: we can find events that, by coincidence, are normally followed by others, merely by chance. Indeed, if enough things happen, so that there are enough events, we can be very confident of finding at least some such patterns of events.

b. Counterfactual Theories

Mackie developed a counterfactual theory of the concept of causation, alongside his theory of causation in the objects as regularity. However, at almost exactly the same time, a philosopher at the other end of his career (near the start) developed a theory sharing deep convictions about the fundamental nature of regularities, the priority of singular causal judgements, and the use of counterfactuals to supply their semantics, yet setting the study of causation on an entirely new path. This approach has dominated the philosophical landscape for nearly half a century, not only as a prominent theory of causation, but as an outstanding piece of philosophical work; it served as an exemplar for analytic metaphysicians, and as a central part of the 1970s story of the emboldening of analytic metaphysics, following years in exile while positivism reigned.

David Lewis’s counterfactual theory of causation (Lewis, 1973a) starts with the observation that, commonly, if the cause had not happened, the effect would not have happened. To build a theory from this observation, Lewis had three major tasks. First, he had to explain what “would” means in this context; he had to provide a semantics for counterfactuals. Second, he had to deal with cases where counterfactuals appear to be true without causation being present, so that counterfactual dependence appears not to be sufficient for causation (since if it were, a lot of non-causes would be counted as causes). Third, he had to deal with cases where it appears that, if the cause had not happened, the effect would still have happened anyway: cases of causal redundancy, where counterfactual dependence appears not to be necessary for causation.

For a considerable period of time, the consensus was that Lewis had succeeded with the first two tasks but failed the third. In the early years of the twenty-first century, however, the second task—establishing that counterfactuals are sufficient for causation—also received critical scrutiny.

Lewis’s theory of causation does not state that the effect counterfactually depends on the cause, but rather that c causes e if and only if there is a causal chain running from c to e whose links consist in relations of counterfactual dependence. The use of chains is explained by the need to respond to the problem of preemption, as explained in the subsubsection covering the problem of causal redundancy. Counterfactual dependence is thus not a necessary condition for causation. However, it is a sufficient condition, since whenever we do find counterfactual dependence (of the “right sort”), we find causation. On his view, counterfactual dependence is thus sufficient but not necessary for causation; what is necessary is a chain of counterfactual dependence, but not necessarily the overarching dependence of the effect on the cause.

The best way to understand Lewis’s theory is through his responses to problems (as he himself sets it out). This is the approach taken in the remainder of this subsection.

i. The Problems of Common Cause, Enormousness and Unrepeatability

Lewis takes his theory to be able to deal easily with the problem of the common cause, which he parcels with another problem he calls the problem of effects. This is the problem that causes might be thought to counterfactually depend on their effects as well as the other way around. Not so, says Lewis, because counterfactual dependence is almost always forward-tracking (Lewis, 1973a, 1973b, 1979). The cases where it is not are easily identifiable, and these occurrences of counterfactual dependence are not apt for analyzing causation, just as counterfactuals representing constitutive relations (such as “If I were not typing, I would not be typing fast”) are not apt.

Lewis’s argument for the ban on backtracking is as follows. Suppose a spark causes a fire. We can imagine a situation where, with a small antecedent change, the fire does not occur. This change may involve breaking a law of nature (Lewis calls such changes “miracles”) but after that, the world may roll on exactly as it would under our laws (Lewis, 1979). This world is therefore very similar to ours, differing in one minor respect.

Now consider what we mean when we start a sentence with “If the fire had not occurred…”. In saying this, we do not mean that the spark would not have occurred either. For otherwise, we would also have to suppose that the wire was never exposed, and thus that the careless slicing of a workman’s knife did not occur, and therefore that the workman was more conscientious, perhaps because his upbringing was different, and that of his parents before him, and…? Lewis says: that cannot be. When we assert a counterfactual, we do not mean anything like that at all. Rather, we mean that the spark still occurred, along with most other earlier events; but for some reason or other, the fire did not.

Why this is so is a matter of considerable debate, and much further work by Lewis himself. For these purposes, however, all that is needed is the idea that, by the time when the fire occurs, the spark is part of history, and there will be some other way to stop the fire—some other small “miracle”—that occurs later, and thus preserves a larger degree of historical match with the actual world, rendering it more similar.

The problem of the common cause is then solved by way of a simple parallel. It might appear that there is counterfactual dependence between the effects of a common cause: between barometer falling and storm, for example. Not so. If the barometer had not fallen, the air pressure, which fell earlier, would still have fallen; and the storm would have occurred anyway. If the barometer had not fallen, that would be because some tiny little “miracle” had occurred shortly beforehand (even Lewis’s account requires at least this tiny bit of backtracking, and he is open about that). This would lead to its not falling when it should. In a nutshell, if the barometer had not fallen, it would have been broken.

Put that way, the position does not sound so attractive; on the contrary, it sounds somewhat artificial. Indeed, this argument, and Lewis’s theory of causation as a whole, depend heavily on a semantics for counterfactuals according to which the closest world at which the antecedent is true determines the truth of the counterfactual. If the consequent is true at that world, the counterfactual is true; otherwise, not. (Where the antecedent is false, we have vacuous truth.) This semantics is complex and subject to many criticisms, but it is also an enormous intellectual achievement, partly because a theory of causation drops out of it virtually for free, or so it appears when the package is assembled. There is no space here to discuss the details of Lewis’s theory of counterfactuals (for critical discussions see in particular: Bennett, 2001, 2003; Elga, 2000; Hiddleston, 2005), but if we accept that theory, then his solution to the problem of effects follows easily.
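The core truth condition can nonetheless be stated compactly. Writing “A □→ C” for “if A were the case, C would be the case” (the arrow in “~d → ~e” below is this counterfactual conditional), Lewis’s semantics says, roughly:

\[
A \;\Box\!\!\rightarrow\; C \text{ is true at } w \;\iff\; \text{there is no } A\text{-world, or some } (A \wedge C)\text{-world is closer to } w \text{ than any } (A \wedge \neg C)\text{-world}
\]

This is the standard formulation from Lewis (1973b); everything then turns on the similarity ordering of worlds that underwrites “closer”.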

Lewis deals even more easily with the problems of enormousness and unrepeatability that trouble regularity theories. The problem of enormousness is that, to ensure a truly exceptionless regularity, we must include a very large portion of the universe indeed in the cause (Mill’s doctrine of the “whole cause”). According to Mill, strictly speaking, this is what “cause” means. But according to common usage, it most certainly is not what “cause” means: when I say that the glass of juice quenched my thirst, I am not talking about Jupiter, the Andromeda galaxy, and all the other heavenly bodies exerting forces on the glass, the balance of which was part of the story of the glass’s rising to my lips. I am talking about a glass of juice.

The counterfactual theory deals with this easily. If I had not drunk the juice, my thirst would not have been quenched. This is what it means to say that drinking the juice caused my thirst to be quenched; which is what I mean when I say that it quenched my thirst. There is no enormousness. There are many other causes, because there are many other things that, had they not been so, would have resulted in my thirst not being quenched. But, Lewis says, a multiplicity of causes is no problem; we may have all sorts of pragmatic reasons for singling some out rather than others, but these do not have implications for the underlying concept of cause, nor indeed for the underlying causal facts.

The problem of unrepeatability was that, once causes are inflated to the enormous scale of the whole cause, it becomes incredible that such things recur at all, let alone regularly. Again, the counterfactual theory faces no problem here: ordinary events like the drinking of juice can easily recur.

While later subsubsections discuss the problems that have proved less tractable for counterfactual theories, we should first note that, even if we set aside criticisms of Lewis’s theory of counterfactuals, his solution to the problem of the common cause is far less plausible on its own terms than Lewis and his commentators appear to have appreciated. It is at least reasonable to suggest that we use barometers precisely because they track the truth of what they predict (Lipton, 2000). It does not seem wild to think that if the barometer had not fallen, the storm would not, after all, have been impending. Lewis’s theory implies that in the nearest worlds where the barometer does not fall, my picnic plans would still have been rained out. If I believed that, I would immediately seek a better barometer.

Empirical evidence suggests that there is a strong tendency for this kind of reasoning in situations where causes and their multiple effects are suitably tightly connected (Rips, 2010). Consider a mechanic wondering why the car will not start. He tests the lights which also do not work. So he infers that it is probably the battery. It is. But in Lewis’s closest world where the lights do work, the battery is still flat: an outrageous suggestion for both the mechanic and any reasonable similarity-based semantics of counterfactuals (for another instance of this objection see Hausman, 1998). Or, if not, then he must accept that the lights’ not lighting causes the car not to start (or vice versa). Philosophers are not usually very practical and sometimes this shows up; perhaps causation is a particularly high-risk area in this regard, given its practical utility.

ii. The Problem of Causal Redundancy

If Assassin 1 had not shot (or had missed), then the president still would (or might) have died, because Assassin 2 also shot. Recall that two important kinds of redundancy can be distinguished (as discussed in the subsubsection on the problem of overdetermination). One is symmetric overdetermination, where the two bullets enter the heart at the same time. Lewis says that in this case our causal intuitions are pretty hazy (Lewis, 1986). That seems right; imagine a firing squad: what would we say about the status of Soldier 1’s bullet, Soldier 2’s bullet, Soldier 3’s, … when they are all causally sufficient but none of them causally necessary? We would probably want to say that it was the whole firing squad that was the cause of the convict’s death. And so we should say, Lewis suggests, in those residual overdetermination cases that cannot be dealt with in other ways. Assassin 1 and Assassin 2 are together the cause. The causal event is the conjunction of these two events. Had that event not occurred, the effect would not have occurred. Lewis runs into some trouble with the point that the negation of a conjunction is achieved by negating just one of its conjuncts, and thus Assassin 1’s not shooting is enough to render the conjunctive event absent, even if Assassin 2 had still shot and the president would still have died. Lewis says that we have to remove the whole event when we are assessing the relevant counterfactuals.

This starts to look less than elegant; it lacks the conviction and sense of insight that characterize Lewis’s bolder propositions. However, our causal intuitions are so unclear that we should take the attitude that spoils go to the victor (meaning, the account that has solved all the cases where our intuitions are clear). Even if this solution to symmetric overdetermination is imperfect, which Lewis does not admit, the unclarity of our intuitions would mean that there is no basis to contest the account that is victorious in other areas.

Preemption is the other central kind of causal redundancy, and it has proved a persistent problem for counterfactual approaches to causation. It cannot be set aside as a “funny case” in the way of symmetric overdetermination, because we do have clear ideas about the relevant causal facts, but they do not play nicely with counterfactual analysis. Assassins 1 and 2 may be having a competition as to who can chalk up more “kills”, in which case they will be deeply committed to the truth of the claim that the preempting bullet really did cause the death, despite the subsequent thudding of the loser’s bullet into the presidential corpse. A second later or a day later—it would not matter from their perspective.

Lewis’s attempted solution to the problem of preemption seeks, once again, to apply features of his semantics for counterfactuals. The two features applied are non-transitivity and, once again, non-backtracking.

Counterfactuals are unlike indicative conditionals in not being transitive (Lewis, 1973b, 1973c). For indicatives, the pattern “if A then B; if B then C; therefore, if A then C” is valid. But not so for counterfactuals. If Bill had not gone to Cambridge (B), he would have gone to Oxford (C); and if Bill had been a chimpanzee (A), he would not have gone to Cambridge (B). If counterfactuals were transitive, it could be concluded that, if Bill had been a chimpanzee (A), he would have gone to Oxford (C). Notwithstanding the prima facie appeal of transitivity, this conclusion is absurd, and the moral usually drawn is that transitivity fails for counterfactuals.
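In schematic form, the invalid pattern is:

\[
B \;\Box\!\!\rightarrow\; C, \quad A \;\Box\!\!\rightarrow\; B \;\;\not\models\;\; A \;\Box\!\!\rightarrow\; C
\]

On the closest-world semantics this is unsurprising: the closest B-worlds and the closest A-worlds may lie in quite different regions of the similarity ordering, as the chimpanzee example illustrates.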

Lewis thus suggests that causation consists in a chain of counterfactual dependence, rather than a single counterfactual. Suppose we have a cause c and an effect e, connected by a chain of intermediate events d1, … dn. Lewis says: it can be false that if c had not occurred then e would not have occurred, yet true that c causes e, provided that there are events d1, … dn such that if c had not occurred then d1 would not have occurred, and… if dn (henceforth dn is simply called d for readability) had not occurred, then e would not have occurred.
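Put formally (a compact restatement, roughly following Lewis, 1973a, using O(x) for “x occurs”, where c, d1, …, dn, and e are all events that actually occur):

\[
c \text{ causes } e \;\iff\; \exists\, d_1, \dots, d_n: \;\; \neg O(c) \;\Box\!\!\rightarrow\; \neg O(d_1), \;\; \neg O(d_1) \;\Box\!\!\rightarrow\; \neg O(d_2), \;\; \dots, \;\; \neg O(d_n) \;\Box\!\!\rightarrow\; \neg O(e)
\]

The chain is allowed to be empty, in which case e simply depends counterfactually on c. In Lewis’s terms, each link is a causal dependence, and causation is the ancestral (the chain-closure) of causal dependence.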

This is one step of the solution, because it provides for the effect to fail to counterfactually depend upon Assassin 1’s shot, yet Assassin 1’s shot to still be the cause. Provides for, but does not establish. The obvious remaining task is to establish that there is a chain of true counterfactuals from Assassin 1’s shot to the president’s death—and, if there is, that there is not also a chain from Assassin 2’s shot.

This is where the second deployment of a resource from Lewis’s semantics for counterfactuals comes into play (and this is sometimes omitted from explanations of how Lewis’s solution to preemption is supposed to work). His idea is that, by the time of the final event d in the actual causal chain, the would-be backup chain has already been terminated, thanks to something in the actual causal chain. Its corresponding event d* has already failed to happen, so to speak: its time has passed. So “~d □→ ~e” is true, because d* would still not have occurred in the absence of d: ~d-worlds where d* occurs are more distant than ~d-worlds where, as in actuality, d* does not occur.

This solution may work for some cases; these have become known as early preemption cases. But it does not work for others, which have become known as late preemption. Consider the moment when Assassin 1’s bullet strikes the president, and suppose that this is the last event, d, in the causal chain from Assassin 1’s shot c to the president’s death e. Then ask what would have happened if this event had not happened—suppose that by a small miracle the bullet deviated at the last moment. At that moment, Assassin 2’s bullet was speeding on its lethal path towards the president. On Lewis’s view, after the small miracle by which Assassin 1’s bullet fails to strike (after ~d), the world evolves in accordance with the actual laws of nature. So Assassin 2’s bullet strikes the president a moment later, killing him. The death e still occurs, the final link of counterfactual dependence fails, and there is no chain running from Assassin 1’s shot to the death after all.

Various solutions have been tried. We might specify the president’s death very precisely, as the death that occurred just then, a moment earlier than the death that would have occurred had Assassin 2’s bullet struck; and the angle of the bullet would have been a bit different; and so forth. In short: that death would not have occurred but for Assassin 1, even if some other, similar death, involving the same person and a similar cause, would have occurred in its place. But Lewis himself provides a compelling response, which is simply that this is not at all what phrases like “the president died” or “the death of the president” refer to when we use them in a causal statement. Events may be more or less tightly specified, and a distinction can be drawn between the actual and counterfactual deaths when they are tightly specified. But that is not the tightness of specification we actually use in this causal judgement, or in many others.

A related idea is to accept that the event of the president’s death is the same in the actual and counterfactual cases, but to appeal to small differences in the actual effect that would have happened if the actual cause had been a bit different. Thus, in very close worlds where Assassin 1 shot just a moment earlier or later (but still soon enough to beat Assassin 2), or where a fly in the bullet’s path caused just a minuscule deviation, or similar worlds, the death would have been just minutely different. It still counts as the same death-event, but with slightly different properties. Lewis calls this counterfactual co-variance of event properties influence, and he suggests that a chain of influence connects cause and effect, but not preempted cause and effect (Lewis, 2004a).

However, there even seem to be cases where influence fails, notably the trumping cases pressed in particular by Jonathan Schaffer (2004). Merlin casts a spell to turn the prince into a frog at the stroke of midnight. Morgana casts the same spell, but at a later point in the day. It is the way of magic, suppose, that the first spell cast is the one that operates; had Morgana cast a spell to turn the prince into a toad instead, the prince would nevertheless have turned into a frog, because Merlin’s earlier spell takes priority. Yet she in fact specified a frog. If Merlin had not cast his spell, the prince would still have turned into a frog—and there would have been no difference at all in the effect. There is no chain of influence.

We do not have to appeal to magic for such instances. I push a button to call an elevator, which duly illuminates, but even so, an impatient or unobservant person arriving directly after me pushes it again. The elevator arrives. It does so in just the same way and in just the same time as if I had not pushed the button, or had pushed it just a tiny moment earlier or later, more or less forcefully, and so forth. In today’s world, where magic is rarely observed, the electrical mediation of cause and effect is a fruitful hunting ground for cases of trumping.

There is a large literature on preemption, because the generally accepted conclusion is that, despite Lewis’s extraordinary ingenuity, the counterfactual analysis of causation cannot be completed. Many philosophers are still attracted to a counterfactual approach: indeed it is an active area of research outside philosophy and in interdisciplinary work, offering as it does a framework for technical development and thus for operationalization in the business of inferring causes. But for analyzing causation—for providing a semantic analysis, for saying what “causation” means—there is general acceptance that some further resource is needed. Counterfactuals are clearly related to causation in a tight way, but the nature of that connection remains frustratingly elusive.

iii. A New Problem: Causal Transitivity

Considerably more could be said about the counterfactual analysis of causation; it dominated philosophical attention for decades, drawing more attention than any other approach after superseding the regularity theories in the 1970s. Since discussions of preemption dried up, attention has shifted to the claim, previously regarded as less controversial, that counterfactual dependence is sufficient for causation. One such problem is briefly introduced here: transitivity.

In Lewis’s account, and more broadly, causation is often supposed to be transitive, even though counterfactual dependence is not. This is central to Lewis’s response to the problem of preemption. It also seems to tie in with the “non-discriminatory” notion of cause, according to which my grandmother’s birth is among the causes, strictly speaking, of my writing these words, even if we rarely mention it.

To say that a relation R is transitive is to say that if R(x,y) and R(y,z) then R(x,z). There seem to be cases showing that causation is not like this after all. Hiker sees a boulder bounding down the slope towards him, ducks, and survives. Had the boulder not bounded, he would not have ducked, and had he not ducked, he would have died. There is a chain of counterfactual dependence, and indeed a chain of causation. But there is not an overarching causal relation. The bounding boulder did not cause Hiker’s survival.

Cases of this kind, known as double prevention, have provoked various responses, not all of which involve an attempt to “fix” the Lewisian approach. Ned Hall suggests that there are two concepts of causation, which conflict in cases like this (Hall, 2004). Alex Broadbent suggests that permitting backtracking counterfactuals in limited contexts allows one to introduce, as a necessary condition on causation, the dependence of cause on effect, a condition that cases of this kind fail (Broadbent, 2012). But the significance of such cases remains unclear.

c. Interventionism

There is a very wide range of other approaches to the analysis of causation, given the apparent dead ends that the big ideas of regularity and counterfactual dependence have reached. Some develop the idea of counterfactual dependence, but shift from conceptual analysis to something less purely conceptual, more closely related to causal reasoning in everyday and scientific contexts, and perhaps more focused on investigating and understanding causation than on producing a neat and complete theory. Interventionism is the best known of these approaches.

Interventionism starts with the idea that causation is fundamentally connected to agency: to the fact that we are agents who make decisions and do things in order to bring about the goals we have decided upon. We intervene in the world in order to make things happen. James Woodward sets out to remove the anthropocentric component of this observation, to devise a characterization of interventions in, broadly speaking, objective terms, and to use this as the basis for an account of how causal reasoning works—meaning, how it manages to track how the world works, and thus enables us to make things happen (Woodward, 2003, 2006).

Woodward’s interests are thus focused on causal explanation in particular: what causal explanations amount to, what information they carry, what they mean. The notion of explanation he arrives at is analyzed and unpacked in detail. The target of analysis shifts from “c causes e”, not merely to “c explains e” (which was the target of much previous work in the philosophy of explanation), but to a full paragraph explaining why and how, for example, the temperature in a container increases when the volume is reduced, in terms of the ideal gas law and the kinetic theory of heat.

Interventionism offers a different approach to thinking about causation, and perhaps the most difficult thing for someone approaching it from the perspective of the Western philosophical canon is to work out what exactly it achieves, or aims to achieve. It does not tell us precisely what causation itself is. It may help us understand causation; but if it does, the upshot is something short of a traditional analysis: either a series of interesting observations, akin to those of J. L. Austin and the ordinary language philosophers, or an operationalizable concept of causation, one that might be converted into a fully automatic causal reasoning “module” to be implemented in a robot. The latter appears to be the goal of some in the non-philosophical world, such as Judea Pearl. Such efforts are ambitious and interesting, potentially illuminating the nature of causal inference, even if this potential is yet to be fully realized, and even if their significance remains an open question so long as implementation remains hard to conceive.

Perhaps what interventionist frameworks offer is a language for talking about causation more precisely. So it is with Pearl, who is also a kind of interventionist, holding that causal facts can be formally represented in diagrams called Directed Acyclic Graphs, displaying counterfactual dependencies between variables (Pearl, 2009; Pearl & Mackenzie, 2018). These counterfactual dependencies are assessed against what would happen if there were an intervention, a “surgical”, hypothetical one, to alter the value of only a (or some) specified variable(s). Formulating causal hypotheses in this way is meant to offer mathematical tools for analyzing empirical data, and such tools have indeed been developed by some, notably in epidemiology. In epidemiology, the Potential Outcomes Approach, which is a form of interventionism and a relative of Woodward’s philosophical account, attracts a devoted following. The primary insistence of its followers is on the precise formulation of causal hypotheses using the language of interventions (Hernán, 2005, 2016; Hernán & Taubman, 2008), which is a little ironic, given that a basis for Woodward’s philosophical interventionism was the idea of moving away from the task of strictly defining causation. The Potential Outcomes Approach constitutes a topic of intense debate in epidemiology (Blakely, 2016; Broadbent, 2019; Broadbent, Vandenbroucke, & Pearce, 2016; Krieger & Davey Smith, 2016; Vandenbroucke, Broadbent, & Pearce, 2016; VanderWeele, 2016), and its track record of actual discoveries remains limited; its main successes have been in re-analyzing old data which was wrongly interpreted at the time, but where the mistake is either already known or no longer matters.
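The difference between conditioning on a variable and intervening on it can be made vivid with a minimal simulation. The following sketch is illustrative only: the variable names and probabilities are invented for the purpose, and nothing beyond the Python standard library is assumed. It encodes a common-cause structure in which falling air pressure produces both a falling barometer and a storm; observing the falling barometer raises the probability of a storm, while surgically forcing the barometer down does not.

import random

random.seed(0)

def sample(intervene_barometer=None):
    # Exogenous common cause: does the air pressure fall?
    pressure_falls = random.random() < 0.3
    # The barometer normally tracks the pressure (with a little noise)...
    barometer_falls = pressure_falls if random.random() < 0.95 else not pressure_falls
    # ...unless a "surgical" intervention overrides its usual mechanism.
    if intervene_barometer is not None:
        barometer_falls = intervene_barometer
    # The storm depends on the air pressure only, never on the barometer.
    storm = pressure_falls and random.random() < 0.8
    return barometer_falls, storm

def p_storm(observe=None, intervene=None, n=100_000):
    # Estimate the probability of a storm, either conditioning on an
    # observed barometer state or forcing the barometer by intervention.
    hits = total = 0
    for _ in range(n):
        barometer, storm = sample(intervene_barometer=intervene)
        if observe is None or barometer == observe:
            total += 1
            hits += storm
    return hits / total

print(p_storm(observe=True))    # about 0.71: a falling barometer is evidence of a storm
print(p_storm(intervene=True))  # about 0.24: forcing the barometer down changes nothing

The surgical intervention corresponds to Pearl’s do-operator: it severs the barometer from its usual cause while leaving the rest of the mechanism intact, which is why the two probabilities diverge.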

If this sounds confusing, that is because it is: this is a very vibrant area of research. Those interested in interventionism are strongly advised not to confine themselves to the philosophical literature but to read at least a little of Judea Pearl’s (albeit voluminous) corpus, and to engage with the epidemiological debate on the Potential Outcomes Approach. The field has yet to receive its most concise and conceptually organized formulation, and work on that is ongoing; but the initial disorganization of a field of study is a sign of its ongoing development—exactly the kind of field in which someone looking to make a mark, or at least a contribution, should take an interest. Once the battle lines are drawn up, and the trenches are dug, the purpose of the entire war is called into question.

d. Probabilistic Theories

Probabilistic theories (for example: Eells, 1991; Salmon, 1993; Suppes, 1970) start with the idea that causes raise the probability of their effects. Striking a match may not always be followed by its lighting, but certainly makes it more likely; whereas coincidental antecedents, such as my scratching my nose, do not.
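As a first pass, and merely for orientation (this simple formulation is not any particular author’s final analysis), the core condition can be written:

\[ c \text{ causes } e \text{ only if } P(e \mid c) > P(e \mid \neg c) \]

The refinements discussed below are attempts to say what must be added to, or changed in, this bare condition.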

Probabilistic theories originate in part as an attempt to soften the excesses of regularity theories, given the absence of observable exceptionless regularities. More importantly, however, they are motivated by the observation that the world itself may be fundamentally indeterministic, if quantum physics withstands the test of time. A probabilistic theory could cope with a deterministic world as well as an indeterministic one, but a regularity theory could not. Moreover, given the shift in odds towards an indeterministic universe, the fights about regularity start to look scholastic, concerning the finer details of a superstructure whose foundations, never before critically examined, have crumbled upon exposure to the fresh air of empirical science.

Probabilistic approaches may be combined with other accounts, such as agency approaches (Price, 1991). Alternatively, probability may be taken as the primary analytical tool, an approach which has given rise to a literature of its own.

The first move of a probabilistic theory is to deal with the problem that effects raise the probability of other effects of a shared cause. To do so, the notion of screening off is introduced (Suppes, 1970). A cause has many effects, and conditionalizing on the cause alters their probabilities even if we hold the other effects fixed; but not so if we conditionalize on an effect. The probability of the storm occurring, given that the air pressure falls, is higher than the probability given that it does not fall, even if we hold fixed the behavior of the barometer. But if we hold fixed the behavior of the air pressure (falling, or holding steady at, say, 1 atmosphere, as in actuality) while conditionalizing on the barometer, we see no difference in the probability of the storm between the case where the barometer falls and the case where it does not.

To unpack this a bit, consider all the occasions on which air pressure has fallen, all those on which barometers have fallen, and all those on which storms have occurred (with barometers present). The problem could then be stated like this. When air pressure falls, storm occurrences are very much more common than when it does not. Moreover, storm occurrences are very much more common in cases where barometers have fallen than in cases where they (though present) have not. Thus it appears that both air pressure and barometers cause storms. But do they truly do so? Or is one of them a case of spurious causation?

The screening-off solution says you should proceed as follows. First, consider how things look when you hold the barometer status fixed. In cases where the barometer does not fall, but air pressure does, storm occurrences are more frequent than in cases where neither the barometer falls nor does air pressure. Likewise in cases where barometers do fall. Now hold fixed air pressure status, considering first those cases where air pressure does not fall, but barometers do—storms are not more common there. Among cases where air pressure does fall, storms are not more common in cases where barometers do fall than in cases where they do not.

Thus, air pressure screens off the barometer falling from the storm. Once you settle on the behavior of the air pressure, and look only at cases where the air pressure behaves in a certain way, the behavior of the barometer is irrelevant to how commonly you find storms. On the other hand, if you settle on a certain barometer behavior, the status of the air pressure remains relevant to how commonly you encounter storms.
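With A for the air pressure falling, B for the barometer falling, and S for the storm, the asymmetry just described can be stated as a pair of conditions (a standard formulation of the idea, not a quotation from Suppes):

\[ P(S \mid A \wedge B) = P(S \mid A \wedge \neg B) \qquad \text{(A screens off B from S)} \]
\[ P(S \mid B \wedge A) > P(S \mid B \wedge \neg A) \qquad \text{(B does not screen off A from S)} \]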

This asymmetry determines the direction of causation. Effects raise the probability of their causes, and indeed of other effects—that is why we can perform causal inference, and can infer the impending storm from the falling barometer. But causes “screen off” their effects from each other, while effects do not: the probability of the storm stops tracking the behavior of the barometer as soon as we fix the air pressure, which screens the storm from the barometer; whereas the probability of the storm continues to track the air pressure even when we fix the barometer (and likewise for the barometer when we fix the storm).

One major source of doubt about probabilistic theories is simply that probability and causation are different things (Gillies, 2000; Hesslow, 1976; Hitchcock, 2010). Causes may indeed raise probabilities of effects, but that is because causes make things happen, not because making things happen and raising their probabilities are the same thing. This general objection may be motivated by various counterexamples, of which perhaps the most important are chance-lowering causes.

Chance-lowering causes reduce the probability of their effects, but nonetheless cause them (Dowe & Noordhof, 2004; Hitchcock, 2004). Taking birth control pills reduces the probability of pregnancy. But it is not always a cause of non-pregnancy. Suppose that, as it happens, reproductive cycles are the cause. Or suppose that there is an illness causing the lack of pregnancy. Or suppose a man takes the pills. In such cases, provided the probability of pregnancy is not already zero, the pill may reduce the probability of pregnancy (albeit slightly), while the cause may be something else. In another well-worn example, a golfer slices a ball which veers off the course, strikes a tree, and bounces in for a hole in one. Slicing the ball lowered the probability of a hole in one but nonetheless caused it. Many attempts to deal with chance-lowering causes have been made, but none has secured general acceptance.
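Put in terms of the simple probability-raising condition displayed earlier, the golfer case exhibits exactly the reversed inequality:

\[ P(e \mid c) < P(e \mid \neg c), \quad \text{yet } c \text{ caused } e \]

where c is the slice and e is the hole in one.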

5. Ontological Stances

Ontological questions concern the nature of causation, meaning, in a phrase that is perhaps equally obscure, the kind of thing it is. Typically, ontological views of causation seek not only to explain causation’s ontological status for its own sake, but to incorporate causation into a favored ontological framework.

There is a methodological risk in starting with, for example, “I’m a realist…” and then looking for a way to make sense of causation from this perspective. The risk is similar to that of a scientist who begins committed to a hypothesis and looks for a way to confirm it. This approach can be useful, leading to ingenuity in the face of discouraging evidence, and has led to some major scientific breakthroughs (such as Newtonian mechanics and germ theory, to take two quite different examples). It does not entail confirmation bias; indeed, the breakthrough cases are characterized by an obsession with the evidence that does not seem to fit, and by dissatisfaction with a weight of extant confirming evidence that might have convinced a lesser investigator. (Darwin’s sleepless nights about the peacock’s tail amount to an example; the male peacock’s tail is a cumbersome impediment to survival, and Darwin had no rest until he found an explanation in terms of a mechanism differing from straightforward natural selection, namely, sexual selection.) However, in less brilliant hands, setting out to show how your theory can explain the object of investigation carries an obvious risk of confirmation bias; indeed, sometimes it turns the activity into something that does not deserve to be called an investigation at all. Moreover, it can make for frustrating discussions.

One question about “the nature of causation” is whether causation is something that exists over and above particular things that are causally related, in any sense at all. Nominalism says no, realism says yes, and dispositionalism seeks to explain causation by realism about dispositions, which are things that nominalists would not countenance, but that are different from universals (or at least from the necessitation relation that realists endorse). Process theories offer something different again, seeking to identify a basis for causation in our current best science, thus remaining agnostic (within certain bounds) on larger metaphysical matters, and merely denying the need for causal theory to engage metaphysical resources (as do causal realism and dispositionalism) or to commit to a daunting reductive project (as does nominalism).

a. Nominalism

Nominalists believe that there is nothing (or very little) other than what Lewis calls “distinct existences” (Lewis, 1983, 1986). According to nominalism, causation is obviously not a particular thing because it recurs. So it is not a thing at all, existing over and above its particular instances.

The motivation for nominalism is the same as the motivation for regularity theories, that is, David Hume’s skeptical attack on necessary connection. The nominalist project is to show that sense can be made of causation, and knowledge of it obtained, without this notion. Ultimately, the goal is to show that (or at least to show to what extent) the knowledge that depends on causal knowledge is warranted.

Nominalism thus depends fundamentally on the success of the semantic projects discussed in the previous section. Attacks on those projects amount to attacks on, or challenges for, nominalism. They are not rehearsed here. The remainder of this section considers alternatives to nominalism.

b. Realism

Realists believe that there are real things, usually called universals, that exist in abstraction from particulars. Nominalists deny this. The debate is one of the most ancient in philosophy and this article is not the place to introduce it. Here, the topic is realism and nominalism about causation.

Realists believe that there is something often called the necessitation relation which holds between causes and effects, but not between non-causal pairs. Nominalists think that there is no such thing, but that causation is just some sort of pattern among causes and effects, for instance, that causes are always followed by their effects, distinguishing them from mere coincidences (see the subsection on regularity theories).

Before continuing, a note on the various meanings of “realism” is necessary. It is important not to confuse realism about causation (and, similarly, about laws of nature) with metaphysical realism. To be realist about something is to assert its mind-independent existence. In the case of universals, the debate is about whether they exist aside from particulars; the emphasis is on existence. In debates about metaphysical realism, the emphasis is on mind-independence. The latter is contrasted with relativist positions such as epistemic relativism, according to which there are no facts independent of a knower (Bloor, 1991, 2008), or Quine’s ontological relativity, according to which reference is relative to a frame of reference, best understood as being or arising from a conceptual framework (Quine, 1969).

Nominalists may or may not be metaphysical anti-realists of one or another kind. In fact, unlike Quine (a nominalist, that is, an anti-realist about universals, and also a metaphysical anti-realist), the most prominent proponents of nominalism about causation (which is a kind of causal anti-realism) are metaphysical realists. For instance, the nominalist David Lewis believes that there is nothing (or relatively little) other than what he calls distinct existences, but he is realist about these existences (Lewis, 1984). In this area of the debate about causation, however, broad metaphysical realism is a generally accepted background assumption. The question is then whether or not causation is to be understood as some pattern of distinct existences, whether actual or counterfactual, or whether on the contrary it is to be understood as a universal: the “necessitation relation”.

The classic statements of realism about causation are by David Armstrong and Michael Tooley (Heathcote & Armstrong, 1991; Tooley, 1987). These also concern laws of nature, which, on their accounts, underlie causal relations. The core of such accounts of laws and causation is the postulation of a kind of necessity that is not logical necessity. In other words, they refuse to accept Hume’s skeptical arguments about the unintelligibility or unknowability of non-logical necessity (which are presented in the subsection on regularity theories). On Armstrong’s view, there is a second-order universal he calls the necessitation relation which relates first-order universals, which are ordinary properties and relations such as being a massive object or having a certain velocity relative to a given frame of reference. If it is a law that sodium burns with a yellow flame, that means that the necessitation relation holds between the universals (or complexes of them) denoted by the predicates “is sodium” and “burns with a yellow flame”. Being sodium and burning necessitate a yellow flame.
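Schematically, and setting aside Armstrong’s complications for probabilistic and defeasible laws, a law on this account is a second-order state of affairs relating first-order universals, and it entails, but is not entailed by, the corresponding regularity:

\[ N(F, G) \ \Rightarrow\ \forall x\,(Fx \rightarrow Gx), \qquad \forall x\,(Fx \rightarrow Gx) \ \not\Rightarrow\ N(F, G) \]

The failure of the converse entailment is what is supposed to distinguish genuine laws from merely accidental regularities.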

Causal relations derive from the laws. The burning sodium causes there to be a yellow flame, because of the necessitation relation that holds between the universals. Where there is sodium, and it burns, there must be a yellow flame. The kind of necessity is not logical, and nor is it strictly exceptionless. But there is a kind of necessity, nonetheless.

How, exactly, are the laws meant to underlie causal relations? Michael Tooley considers the relation between causation and laws, on the realist account of both, in detail (Tooley, 1987). But even if that question can be answered, the most obvious question for realism about universals is what exactly they are (Heathcote & Armstrong, 1991).

For the realist account of causation, saying what universals are is particularly important. That is because the necessitation relation seems somewhat unlike other universals. Second order universals such as, for example, shape, of which particular shapes partake, are reasonably intelligible. I have a grasp on what shape is, even if I struggle to say what it is apart from giving examples of actual shapes. At least, I think I know what “shape” means. But I do not know what “necessitates” means. David Lewis puts the point in the following oft-cited passage:

The mystery is somewhat hidden by Armstrong’s terminology. He uses ‘necessitates’ as a name for the lawmaking universal N; and who would be surprised to hear that if F ‘necessitates’ G and a has F, then a must have G? But I say that N deserves the name of ‘necessitation’ only if, somehow, it really can enter into the requisite necessary connections. It can’t enter into them just by bearing a name, any more than one can have mighty biceps just by being called ‘Armstrong’. (Lewis, 1983, p. 366)

Does realism help with the problems that nominalist semantic theories encounter? One advantage of realism is that it makes semantics easy. Causal statements are made true by the obtaining, or not, of the necessitation relation between cause and effect. This relation holds between the common cause of two effects, but not between the effects; between the preempting, but not the preempted, cause and the effect. Classic problems evaporate; they are an artefact of the need arising from nominalism to analyze causation in terms of distinct events, a project that realists are too wise to entertain.

But that may, in a way, appear as cheating. For it hardly sounds any different from the pre-theoretic statement that causes cause their effects, while effects of a common cause do not cause each other, and that preempted events are not causes.

One way to press this objection is to look at whether realism assists people who face causal problems outside of philosophical debate. When people other than metaphysicians encounter difficulties with causation, they do not typically find themselves assisted by the notion of a relation of necessitation. Lawyers may apply a counterfactual “but for” test: but for the defendant’s wrongful act, would the harm have occurred? In doing so, they are not adducing more empirical evidence, but offering a different way to approach, analyze, or think through the evidence. They do not, however, find it useful to ask whether the defendant’s wrongful act necessitated the harm. In cases where the “but for” test fails, other options have been tried, including asking whether the wrongful act made the harm more probable; and scientific evidence is sometimes adduced to confirm that there is, in general, a possibility that the wrongful act could have caused the harm. But lawyers never ask anything like: did the defendant’s act necessitate the harm? Not only would this seem far too strong for any prosecution in its right mind to introduce; it would also not seem particularly helpful. The judge would almost certainly want help in understanding “necessitate”, which in non-obvious cases is as obscure and philosophical-sounding as “cause”, and then we would be back with the various legal “tests” that have been constructed.

The realist might reply that metaphysics is not noted for its practical utility, and that the underlying metaphysical explanation for regularities and counterfactuals is the existence of a necessitation relation. Fair enough, but it is interesting that offering counterfactuals or probabilities in place of causal terms is thought to elucidate them, and that there is not a further request to elucidate the counterfactuals or probabilities; whereas there would be a further request (presumably) to explicate necessitation. Realists seem to differ not just from nominalists but from everyone else in seeing universals as explaining all these things, while not seeing any need for further explication of universals.

c. Dispositionalism

Dispositionalism is a relatively recently explored view, aiming to take a different tack from both nominalism and realism (Mumford & Anjum, 2011). On this view, dispositions are fundamental constituents of reality (Mumford, 1998). Counterfactuals are to be understood in terms of dispositions (and not the other way round (Bird, 2007)). Causation may also be explained in this way, without dog-legging through counterfactuals, which averts the problems attendant on counterfactual analyses of causation.

To cause an effect is, in essence, to dispose the effect to happen. Effects do not have to happen. But causes dispose them to. This is how their probabilities are raised. This is why, had the cause not occurred, the effect would not have occurred.

The literature on dispositionalism is relatively new, developing mainly in the twenty-first century, with the position receiving a book-length formulation only in the 2010s (see Mumford & Anjum, 2011). Interested readers are invited to consult that work, which offers a much more thorough introduction to the subtleties of this new approach than can be given here.

d. Process Theories

A further approach which has developed an interesting literature but which is not treated in detail in this article is the process approach. Wesley Salmon suggested that causation be identified with some physical quantity or property, which he characterized as the transmission of a “mark” from cause to effect (Salmon, 1998). This idea was critiqued and then developed by Phil Dowe, who suggested that the transmission of energy should be identified as the underlying physical quantity (Dowe, 2000). Dowe’s approach has the merits of freeing itself from the restrictions of conceptual analysis, while at the same time solving some familiar problems. Effects of a common cause transmit no energy to each other. Preempted events transmit no energy to the effects of the preempting causes, which, on the other hand, do so.

The attraction of substituting a scientific concept, or a bundle of concepts, for causation is obvious. Such treatments have proved fruitful for other pre-theoretic notions like “energy”, and they offer to fit causation into a scientific worldview in which, arguably (see the subsection on Russellian Republicanism), it does not otherwise appear.

On the other hand, the account does face objections. Energy is in fact transmitted from Assassin 2’s shot to the president, since light bounces off the bullet and, travelling faster than any speeding bullet, reaches the president. Accounts like Dowe’s must be careful to specify the right physical process in order to remain plausible as accounts of causation, and then to justify the choice of this particular process on some objective, and ultimately scientific, basis. There is also the problem that, in ordinary talk, we often regard absences or lacks as causes. It is my lack of organizational ability that caused me to miss the deadline. Whether absences can cause is a contested topic (Beebee, 2004; Lewis, 2004b; Mellor, 2004), and one reason for this is that they appear to be a problem for this account of causation.

6. Kantian Approaches

a. Kant Himself

Kant responded to Hume by taking further the idea that causation is not part of the objective world (Kant, 1781).

Hume argued that the only thing in the objects was regularity, and that this fell far short of fulfilling any notion of necessary connection. He further argued that our idea of necessary connection was merely a feeling of expectation. But Hume was (arguably) a realist about the world, and about the regularities it contains, even if he doubted our justification for believing in regularities and doubted that causation was anything in the world beyond a feeling we sometimes get.

Kant, however, took a different view of the world itself, of which causation is meant to be a part. His view is transcendental idealism, the view that space and time are ways in which we experience the world, but not features of the world as it is in itself. According to this view, the world in itself exists but is wholly unknowable. It constrains what we experience, but what we experience does not tell us what the world is like in itself, that is, independent of how we experience it.

Within this framework, Kant was an empirical realist. That is to say, given the constraints that the noumenal world imposes on what we experience, there are facts about how the phenomenal world goes. Facts about this world are not simply “up to us”. They are partly determined by the noumenal world. But they are also partly determined by the ways we experience things, and thus we are unable to comprehend those things in themselves, apart from the ways we experience them. A moth bangs into a pane of glass, and cannot simply fly through it; the pane of glass constrains it. But clearly the moth’s perceptual modalities also constrain what kind of thing it takes the pane of glass to be. Otherwise, it would not keep flying into the glass.

Kant argued that causation is not an objective thing, but a feature of our experience. In fact, he argued that causation is essential to any kind of experience. The ordering of events in time only amounts to experience if we can distinguish, within the general flow of events or of sensory experiences, some streams that are somehow connected. We see a ship on a river. We look away, and look back a while later, to see the ship further down the river (the example is discussed in the Second Analogy of Kant’s Critique of Pure Reason). Only if we can see this as the same ship, moved further along the river, can we see this as a ship and a river at all. Otherwise it is just a series of frames, no more comprehensible than a row of impressionist paintings in an art gallery.

Kant used causation as the exemplar of a treatment he extended to shape, number, and various other apparent features of reality which, in his view, are actually fundamental elements of the structure of experience. His argument that causation is a necessary component of all experience is no longer compelling. It seems that very young children have experience, but not much by way of a concept of causation. Some animals may be able to reason causally, but some clearly cannot, or at least cannot to any great extent. It is a separate question whether they have experience, and some seem to. Thus he seems to have over-extended his point. On the other hand, the insight that there is a fundamental connection between causation and some aspect of us and our engagement with the world may have something to it, and this has subsequently attracted considerable attention.

b. Agency Views

On agency views of causation, the fact that we are agents is inextricably tied up with the fact that we have a causal concept, think causally, make causal judgements, and understand the world as riddled with causality. Agents have goals, and seek to bring them about, through exercising what at least to them seems like their free will. They make decisions, and they do something about them.

Agency theories have trouble providing answers to certain key questions, which renders them very unpopular. If a cause is a human action, then what of causes that are not human actions, like the rain causing the dam to flood? If such events are causes by analogy with human actions, troublesome questions arise for agency theories: in what respect are things like rain analogous to human actions? Did someone or something decide to “do” the rain? If not, then in what does the analogy consist?

The most compelling response to these questions lies in the work of Huw Price, beginning with a paper he co-wrote with Peter Menzies (Menzies & Price, 1993). They argue that causation is (or is like) a secondary property, like color. Light comes in various wavelengths, some of which we can perceive. We are able to differentiate among wavelengths to some level of accuracy. This differentiated perception is what we call “color”. We see color, not “as” wavelengths of light (whatever exactly that would be), but as a property of the things off which light bounces or from which it emanates. Color is thus not just a wavelength of light: it is a disposition that we have to react in a certain way to a wavelength of light; alternatively, it is a disposition of light to provoke a certain reaction in us.

Causation, they suggest, is a bit like this. It has some objective basis in the world, but it also depends on us. It is mediated not by our being conscious beings, as in the case of color, but by our being agents. Certain patterns of events in the world, or at least certain features of the world, produce a “cause-effect” response in us. We cannot just choose what causes what. At the same time, this response is not reducible to features of the world alone; our agency is part of the story.

This approach deals with the anthropomorphism objection by retaining the objective basis of causes and effects, while confirming that the interpretation of this objective basis as causal is contributed by us due to the fact that we are agents.

This approach has not been widely taken up in the literature, and there is no well-developed body of objections and responses, beyond the point that the approach remains suggestive and not completely made out. Price has since argued for a perspectivalism about causation, on which entropic or other physical asymmetries account for the asymmetries that we project onto time and counterfactual dependence.

Yet this is a sensible direction of exploration, given our inability to observe causation in objects, and our apparent failure to find an objective substitute. It departs from the kind of realism that is dominant in early twenty-first century philosophy of causation, but perhaps that departure is due.

7. Skepticism

a. Russellian Republicanism

Bertrand Russell famously argued that causation was “a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm” (Russell, 1918). He advanced arguments against the Millian regularity view of causation that was dominant at the time, one of which is the unrepeatability objection discussed above in this article. But his fundamental point is a simple one: our theories of the fundamental nature of reality have no place for the notion of cause.

One response is simply to deny this, and to point out that scientists do use causal language all the time. It is however doubtful that this defense deflects the skeptical blow. Whether or not physicists use the word “cause”, there is nothing like causation in the actual theories, which are expressed by equations. As Judea Pearl points out, mathematical equations are symmetrical (Pearl & Mackenzie, 2018). You can rearrange them to solve for different variables, and they still say the same thing in all their arrangements: they express a functional relationship between variables. Causation, on the other hand, is asymmetric. The value of the causal variable(s) sets the value of the effect variable(s), whereas in a mathematical equation “setting” runs in every direction. If one changes the value of the pressure in a fixed mass of gas, then, according to the ideal gas law, either volume or temperature must change (or both). But there is no way to increase the pressure except through adjusting the volume or temperature. The equations do not tell us that.
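Pearl’s point can be seen in the ideal gas law itself. Each of the following rearrangements says exactly the same thing, and nothing in the mathematics marks any one variable as the one being “set”:

\[ PV = nRT \quad\Longleftrightarrow\quad P = \frac{nRT}{V} \quad\Longleftrightarrow\quad T = \frac{PV}{nR} \]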

A better objection might pick up on this response by saying that this example shows that there are causal facts. If physics does not capture them, then it should.

This response is not particularly plausible at a fundamental level, where the prospect of introducing such an ill-defined notion as cause into the already rather strange world of quantum mechanics is not appealing. But it might be implemented through a reductionist strategy. Huw Price offers something like this, suggesting that certain asymmetries, notably the arrow of time, might depend jointly upon our nature as agents and upon the entropic asymmetry of the universe. Such an approach is compatible with Russell’s central insight but dispenses with his entertaining, if overly enthusiastic, dismissal of the utility of causation. Causation remains useful, despite being non-fundamental; and its usefulness can be explained. This is perhaps the most promising response to Russell’s observation, and one that deserves more attention and development in the literature.

b. Pluralism and Thickism

Pluralists believe that there is no single concept of causation, but a plurality of related concepts which we lump together under the word “causation” (Anscombe, 1971; Cartwright, 1999). This view tends to go hand-in-hand with a refusal to accept a basic premise of Hume’s challenge, which is that we do not observe causation. We do observe causation, say the objectors. We see pushes, kicks, and so forth. Therefore, they ask, in what sense are we not observing causation?

This line of thought is compelling to some but somewhat inscrutable to many, who remain convinced that pushes and kicks look just the same as coincidental sequences like the sun coming out just before the ball enters the goal or the shopping cart moves—until we have learned, from experience, that there is a difference. Thus, most remain convinced that Hume’s challenge needs a fuller answer. Most also agree with Hume that there is something that causes have in common, and that one needs to understand this if one is to distinguish the kicks and pushes of the world from the coincidences.

A related idea, a form of pluralism, one might call thickism. In an ethical context, some have proposed the existence of “thick” ethical concepts characterized by their irreducibility into an evaluative and factual component. (This is a critique of another Humean doctrine, the fact-value distinction.) Thus, generosity is both fundamentally good and fundamentally an act of giving. It is not a subset of acts of giving defined as those which are good; some of these might be rather selfish, but better than nothing; others might be gifts of too small a kind to count as generous; others might be good for other reasons, because they bring comfort rather than because they are generous (bringing a bunch of flowers to a sick person is an act of kindness but not really generosity). Generosity is thick.

The conclusion one might draw from the existence of thick concepts is that there is not (or not necessarily) a single property binding all the thick concepts together, and thus that it is fruitless to try to identify or analyze it. Similar remarks might be applied to causes. Transitive verbs are commonly causal. To push the cart along is not analyzable into, say, to move forward and at the same time to cause the cart to move. One could achieve this by having a companion push the cart when you move forward, and stop when you stop. Pushes (in the transitive sense) are causal, but the causal element cannot be extracted for analysis.

Against this contention is the point that, in a practical context, the extraction of causation seems exactly what is at issue. In the statistics-driven sciences, in law, in policy-decisions, the non-causal facts seem clear, but the causal facts not. The question is exactly whether the non-causal facts are accompanied by causation. There does seem to be an important place in our conceptual framework for a detached concept of cause, because we apply that concept beyond the familiar world of kicks and pushes. As for those familiar causes, the tangling up of a kind of action with a cause hardly shows that there is no distinction between causes and non-causes. If we do not call a push-like action a cause on one occasion (when my friend pushes the trolley according to my movements) while we do on another (when I push the trolley), this could just as easily be taken to show that we need a concept of causation to distinguish pushing from mere moving forward.

8. References and Further Reading

  • Anscombe, G. E. M. (1958). Modern moral philosophy. Philosophy, 33(124), 1–19.
  • Anscombe, G. E. M. (1969). Causality and Extensionality. The Journal of Philosophy, 66(6), 152–159.
  • Anscombe, G. E. M. (1971). Causality and Determination. Cambridge: Cambridge University Press.
  • Armstrong, D. (1983). What Is a Law of Nature? Cambridge: Cambridge University Press.
  • Beebee, H. (2004). Causing and Nothingness. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 291–308). Cambridge, Massachusetts: MIT Press.
  • Bennett, J. (2001). On Forward and Backward Counterfactual Conditionals. In G. Preyer, & F. Siebelt (Eds.), Reality and Humean Supervenience (pp. 177–203). Maryland: Rowman and Littlefield.
  • Bennett, J. (2003). A Philosophical Guide to Conditionals. Oxford: Oxford University Press.
  • Bird, A. (2007). Nature’s Metaphysics. Oxford: Oxford University Press.
  • Blakely, T. (2016). DAGs and the restricted potential outcomes approach are tools, not theories of causation. International Journal of Epidemiology, 45(6), 1835–1837.
  • Bloor, D. (1991). Knowledge and Social Imagery (2nd ed.). Chicago: University of Chicago Press.
  • Bloor, D. (2008). Relativism at 30,000 feet. In M. Mazzotti (ed.), Knowledge as Social Order: Rethinking the Sociology of Barry Barnes (pp. 13–34). Aldershot: Ashgate.
  • Broadbent, A. (2012). Causes of causes. Philosophical Studies, 158(3), 457–476. https://doi.org/10.1007/s11098-010-9683-0
  • Broadbent, A. (2016). Philosophy for graduate students: Core topics from metaphysics and epistemology. In Philosophy for Graduate Students: Core Topics from Metaphysics and Epistemology. https://doi.org/10.4324/9781315680422
  • Broadbent, A. (2019). The C-word, the P-word, and realism in epidemiology. Synthese. https://doi.org/10.1007/s11229-019-02169-x
  • Broadbent, A., Vandenbroucke, J. P., & Pearce, N. (2016). Response: Formalism or pluralism? A reply to commentaries on “causality and causal inference in epidemiology.” International Journal of Epidemiology, 45(6), 1841–1851. https://doi.org/10.1093/ije/dyw298
  • Cartwright, N. (1983). Causal Laws and Effective Strategies. Oxford: Clarendon Press.
  • Cartwright, N. (1999). The Dappled World: A Study of the Boundaries of Science. Cambridge: Cambridge University Press.
  • Cartwright, N. (2007). Hunting Causes and Using Them: Approaches in Philosophy and Economics. New York: Cambridge University Press.
  • Dowe, P. (2000). Physical Causation. Cambridge: Cambridge University Press.
  • Dowe, P., & Noordhof, P. (2004). Cause and Chance: Causation in an Indeterministic World. London: Routledge.
  • Eells, E. (1991). Probabilistic Causality. Cambridge: Cambridge University Press.
  • Elga, A. (2000). Statistical Mechanics and the Asymmetry of Counterfactual Dependence. Philosophy of Science (Proceedings), 68(S3), S313–S324.
  • Forsyth, F. (1971). The Day of the Jackal. London: Hutchinson.
  • Garrett, D. (2015). Hume’s Theory of Causation. In D. C. Ainslie, & A. Butler (Eds.), The Cambridge Companion to Hume’s Treatise (pp. 69–100). https://doi.org/10.1017/CCO9781139016100.006
  • Gillies, D. (2000). Philosophical Theories of Probability. London: Routledge.
  • Hall, N. (2004). Two Concepts of Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 225–276). Cambridge, Massachusetts: MIT Press.
  • Hausman, D. (1998). Causal Asymmetries. Cambridge: Cambridge University Press.
  • Heathcote, A., & Armstrong, D. M. (1991). Causes and Laws. Noûs, 25(1), 63–73. https://doi.org/10.2307/2216093
  • Hernán, M. A. (2005). Invited Commentary: Hypothetical Interventions to Define Causal Effects—Afterthought or Prerequisite? American Journal of Epidemiology, 162(7), 618–620.
  • Hernán, M. A. (2016). Does water kill? A call for less casual causal inferences. Annals of Epidemiology, 26(10), 674–680.
  • Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If. Retrieved from https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
  • Hernán, M. A., & Taubman, S. L. (2008). Does obesity shorten life? The importance of well-defined interventions to answer causal questions. International Journal of Obesity, 32, S8–S14.
  • Hesslow, G. (1976). Two Notes on the Probabilistic Approach to Causality. Philosophy of Science, 43(2), 290–292.
  • Hiddleston, E. (2005). A Causal Theory of Counterfactuals. Noûs, 39(4), 632–657.
  • Hitchcock, C. (2004). Routes, processes and chance-lowering causes. In P. Dowe, & P. Noordhof (Eds.), Cause and Chance (pp. 138–151). London: Routledge.
  • Hitchcock, C. (2010). Probabilistic Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/fall2010/entries/causation-probabilistic/
  • Hume, D. (1748). An Enquiry Concerning Human Understanding (1st ed.). London: A. Millar.
  • Kant, I. (1781). The Critique of Pure Reason (1st ed.).
  • Krieger, N., & Davey Smith, G. (2016). The ‘tale’ wagged by the DAG: broadening the scope of causal inference and explanation for epidemiology. International Journal of Epidemiology, 45(6), 1787–1808. https://doi.org/10.1093/ije/dyw114
  • Lewis, D. (1973a). Causation. Journal of Philosophy, 70(17), 556–567.
  • Lewis, D. (1973b). Counterfactuals. Cambridge, Massachusetts: Harvard University Press.
  • Lewis, D. (1973c). Counterfactuals and Comparative Possibility. Journal of Philosophical Logic, 2(4), 418–446.
  • Lewis, D. (1979). Counterfactual Dependence and Time’s Arrow. Noûs, 13(4), 455–476.
  • Lewis, D. (1983). New Work for a Theory of Universals. Australasian Journal of Philosophy, 61(4), 343–377.
  • Lewis, D. (1984). Putnam’s Paradox. Australasian Journal of Philosophy, 62(3), 221–236.
  • Lewis, D. (1986). Philosophical Papers (vol. II). Oxford: Oxford University Press.
  • Lewis, D. (2004a). Causation as Influence. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 75–106). Cambridge, Massachusetts: MIT Press.
  • Lewis, D. (2004b). Void and Object. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 277–290). Cambridge, Massachusetts: MIT Press.
  • Lipton, P. (2000). Tracking Track Records. Proceedings of the Aristotelian Society ― Supplementary Volume, 74(1), 179–205.
  • Mackie, J. (1974). The Cement of the Universe. Oxford: Oxford University Press.
  • Mellor, D. H. (1995). The Facts of Causation. Abingdon: Routledge.
  • Mellor, D. H. (2004). For Facts As Causes and Effects. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 309–324). Cambridge, Massachusetts: MIT Press.
  • Menzies, P., & Price, H. (1993). Causation as a Secondary Quality. The British Journal for the Philosophy of Science, 44(2), 187–203.
  • Mill, J. S. (1882). A System of Logic, Ratiocinative and Inductive (8th ed.). New York and Bombay: Longmans, Green, and Co.
  • Mumford, S. (1998). Dispositions. Oxford: Oxford University Press.
  • Mumford, S., & Anjum, R. L. (2011). Getting Causes from Powers. Oxford: Oxford University Press.
  • Paul, L. A. (2004). Aspect Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 205–223). Cambridge, Massachusetts: MIT Press.
  • Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). Cambridge: Cambridge University Press.
  • Pearl, J., & Mackenzie, D. (2018). The Book of Why. New York: Basic Books.
  • Price, H. (1991). Agency and Probabilistic Causality. The British Journal for the Philosophy of Science, 42(2), 157–176.
  • Quine, W. V. (1969). Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Rips, L. J. (2010). Two Causal Theories of Counterfactual Conditionals. Cognitive Science, 34(2), 175–221. https://doi.org/10.1111/j.1551-6709.2009.01080.x
  • Rubin, D. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology, 66(5), 688–701.
  • Russell, B. (1918). On the Notion of Cause. London: Allen and Unwin.
  • Salmon, W. C. (1993). Probabilistic Causality. In E. Sosa, & M. Tooley (Eds.), Causation (pp. 137–153). Oxford: Oxford University Press.
  • Salmon, W. C. (1998). Causality and Explanation. Oxford: Oxford University Press.
  • Schaffer, J. (2004). Trumping Preemption. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 59–74). Cambridge, Massachusetts: MIT Press.
  • Schaffer, J. (2007). The Metaphysics of Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/win2007/entries/causation-metaphysics/
  • Stapleton, J. (2008). Choosing What We Mean by “Causation” in the Law. Missouri Law Review, 73(2), 433–480. Retrieved from https://scholarship.law.missouri.edu/mlr/vol73/iss2/6
  • Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland.
  • Tooley, M. (1987). Causation: A Realist Approach. Oxford: Clarendon Press.
  • Vandenbroucke, J. P., Broadbent, A., & Pearce, N. (2016). Causality and causal inference in epidemiology: the need for a pluralistic approach. International Journal of Epidemiology, 45(6), 1776–1786. https://doi.org/10.1093/ije/dyv341
  • VanderWeele, T. J. (2016). Commentary: On Causes, Causal Inference, and Potential Outcomes. International Journal of Epidemiology, 45(6), 1809–1816.
  • Woodward, J. (2003). Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
  • Woodward, J. (2006). Sensitive and Insensitive Causation. The Philosophical Review, 115(1), 1–50.

 

Author Information

Alex Broadbent
Email: abbroadbent@uj.ac.za
University of Johannesburg
Republic of South Africa

Kit Fine (1946—)

Kit Fine is an English philosopher who is among the most important philosophers of the turn of the millennium. He is perhaps most influential for reinvigorating a neo-Aristotelian turn within contemporary analytic philosophy. Fine’s prolific work is characterized by a unique blend of logical acumen, respect for appearances, ingenious creativity, and originality. His vast corpus is filled with numerous significant contributions to metaphysics, philosophy of language, logic, philosophy of mathematics, and the history of philosophy.

Although Fine is well-known for favoring ideas familiar from the neo-Aristotelian tradition (such as dependence, essence, and hylomorphism), his work is most distinctive for its methodology. Fine’s general view is that metaphysics is not best approached through the study of language. Roughly put, Fine’s approach focuses on providing a rigorous account of the apparent phenomena themselves, and not just how we represent them in language or thought, prior to any attempt to discern the reality underlying them. Furthermore, a strong and ecumenical respect for the intelligible options demands patience for the messy details, even when they resist tidying or systematization. All this leads to a steadfastness in refusing to allow epistemic qualms about how we know what we seem to know to interfere with our attempts to clarify just what it is that we seem to know.

This article surveys the wide variety of Fine’s contributions to philosophy, and it conveys what Fine’s distinctive methodology is and how it informs his contributions to philosophy.

Table of Contents

  1. Biography
  2. Fine Philosophy
  3. Metaphysics
    1. Modality
    2. Essence
    3. Ontology
    4. Mereology
    5. Realism
    6. Ground
    7. Tense
  4. Philosophy of Language
    1. Referentialism
    2. Semantic Relationism
    3. Vagueness
    4. Truthmaker Semantics
  5. Logics and Mathematics
    1. Logics
    2. Arbitrary Objects
    3. Philosophy of Mathematics
  6. History
  7. References and Further Reading

1. Biography

Fine was born in England on March 26, 1946. He earned a B.A. in Philosophy, Politics, and Economics at the University of Oxford in 1967. He was then appointed to a position at the University of Warwick. There he was mentored by Arthur Prior. Although Fine was never enrolled in a graduate program, his Ph.D. thesis For Some Proposition and So Many Possible Worlds was examined and accepted by William Kneale and Dana Scott just two years later.

Since then, Fine has held numerous academic appointments, including at: University of Warwick; St John’s College, University of Oxford; University of Edinburgh; University of California, Irvine; University of Michigan, Ann Arbor; and University of California, Los Angeles. Fine joined New York University’s philosophy department in 1997, where he is now Silver Professor and University Professor of Philosophy and Mathematics. He is currently also a Distinguished Research Professor at the University of Birmingham. Fine also held visiting positions at: Stanford University; University of Toronto; University of Arizona; Australian National University; University of Melbourne; Princeton University; Harvard University; New York University at Abu Dhabi; University of Aberdeen; and All Souls College, University of Oxford.

He has served the profession as an editor or an editorial board member of Synthese; The Journal of Symbolic Logic; Notre Dame Journal of Formal Logic; and Philosophers’ Imprint.

Fine’s contributions to philosophy have been recognized by numerous awards, including a Guggenheim Foundation Fellowship, American Council of Learned Societies Fellowship, Fellow of the American Academy of Arts and Sciences, Fellow at the National Center for the Humanities, Corresponding Fellow at the British Academy, an Anneliese Maier Research Award from the Alexander von Humboldt Foundation, and a Leibowitz Award (with Stephen Yablo).

Fine’s corpus is enormous. By mid-2020 he had published over 130 journal articles and 5 books. At least half a dozen articles and 8 monographs are forthcoming. His work is at once of great breadth and depth, spanning many core areas of philosophy and engaging its topics with great erudition and technical sophistication. His trailblazing work is highly original, rarely concerned with wedging into topical or parochial debates but rather with making novel advances to the field in creative and unexpected ways. This article de-emphasizes his technical contributions, and it focuses upon his more distinctive or influential work.

2. Fine Philosophy

When engaging with the work of any prolific philosopher exhibiting great breadth and originality, it is tempting to look for some core “philosophical attractors” that animate, unify, or systematize their work. These attractors may then serve as useful aids to understanding their work and highlighting its most distinctive features.

Perhaps the most familiar form a philosophical attractor might take is that of a doctrine. These “doctrinal attractors” are polarized, pulling in some views while repelling others. Their “magnetic” tendencies are what systematize a thinker’s thought. In the history of modern philosophy, two obvious examples are Spinoza and Leibniz. Their commitment to the principle of sufficient reason, the doctrine that everything has a reason or cause, underwrites vast swaths of their respective philosophies (Spinoza 1677; Leibniz 1714). A good example in the twentieth century is David Lewis. One can scarcely imagine understanding Lewis’s philosophy without placing at its core the doctrines of Humean supervenience and modal realism (Lewis 1986).

Another form a philosophical attractor might take is that of a methodology. These methodological attractors are also polarized, but they exert their force less on views and more on which data to respect and which to discard, which distinctions to draw and which to ignore, how weighty certain considerations should be or not, and the like. Hume is an example in the history of modern philosophy. His commitment to respecting only that which makes an observable difference guides much of his philosophy (Hume 1739). Saul Kripke is an example in the twentieth century. One can scarcely imagine understanding his philosophy without placing at its core a respect for common sense and intuitions about what we should say of actual and counterfactual situations (Kripke 1972).

There is no question that Fine is well-known for his association with certain doctrines or topics. These include: actualism, arbitrary objects, essentialism, ground, hylomorphism, modalism, procedural postulationism, semantic relationism, (formerly) supervaluationism, three-dimensionalism, and truthmaker semantics. But as important as these may be to understanding Fine’s work, they do not serve individually or jointly as doctrinal attractors in the way that, for example, Humean supervenience or modal realism did so vividly for Lewis.

Instead, Fine’s work is better understood in terms of a distinctive “Finean” cluster of methodological attractors. Fine himself has not spelled out the details of the cluster explicitly. But some explicit discussion of it can be found in his early work (1982c: §A2). There are also discussions suggestive of the cluster scattered across many of his later works. But perhaps the strongest impression emerges by osmosis from sustained engagement with a range of his work.

The Finean cluster may be roughly summarized by the following methodological “directives”:

    1. Provide a rigorous account of the appearances first before trying to discern the reality underlying them.
    2. Focus on the phenomenon itself and not just how we represent or express it in language or thought.
    3. Respect what’s at issue by not allowing worries about what we can mean to prevent us from accepting the intelligibility of notions that strike us as intelligible.
    4. Be patient with the messy details even when they resist tidying or systematization.
    5. Don’t allow epistemic worries about how we know what we seem to know to interfere with or distract us from clarifying what it is that we seem to know.

Some of these directives interact or overlap. Even so, separating them helps highlight their different emphases. Bearing them in mind both individually and jointly is crucial to understanding Fine’s distinctive approach to the vast array of topics covered in his work.

Sometimes the influence of the directives is rather explicit. For example, the first directive clearly influences Fine’s views on realism and the nature of metaphysics. Implicit in this directive is a distinction between appearance and reality. Fine suggests that each is the focus of its own branch of metaphysics. Naïve metaphysics studies the appearances whereas foundational metaphysics studies their underlying reality. Because we have not yet achieved rigorous clarification of the appearances, Fine believes it would be premature to investigate the reality underlying them.

Other times, however, the directives exert their influence in more implicit ways. To illustrate, consider the first directive’s emphasis on providing a rigorous account of the appearances. Although Fine’s tremendous technical skill is clear in his work in formal logic, it also suffuses his philosophical work. Claims or ideas are often rigorously formalized in appendices or sometimes in the main text. Even when Fine’s prose is informal on the surface, it is evident that his technical acuity and logical rigor support it from beneath.

The second directive is perhaps most evident in Fine’s focus on the phenomena. Even in our post-positivistic times, some philosophers still lose their nerve when attempting to do metaphysics and, instead, retreat to our language or thought about it. An aversion to this is implicit throughout Fine’s work. Sometimes Fine makes his aversion explicit (2003a: 197):

…in this paper…I have been concerned, not with material things themselves, but with our language for talking about material things. I feel somewhat embarrassed about writing such a strongly oriented linguistic paper in connection with a metaphysical topic, since it is my general view that metaphysics is not best approached through the study of language.

Behind Fine’s remarks is a view that the considerations relevant to language often differ from those relevant to its subject matter. Only confusion can result from this sort of mismatch. So Fine’s apology is perhaps best explained by his unapologetic insistence that our interest is in the phenomena. However esoteric or unruly they may be, we should boldly resist swapping them out for the pale shadows they cast in language or thought.

The third directive is implicit in Fine’s frequent objections to various doctrines for not properly respecting the substantiveness, or even intelligibility, of certain positions. To illustrate, Fine defends his famous counterexamples against modal conceptions of essence by applying the third directive (1994b: 5):

Nor is it critical to the example that the reader actually endorse the particular modal and essentialist claims to which I have made appeal. All that is necessary is that he should recognize the intelligibility of a position which makes such claims.

Even if the claims are incorrect, their intelligibility is still enough to establish that there is a genuine non-modal conception of essence. Considerations like these illustrate Fine’s ecumenical approach. But this ecumenicity does not imply that anything goes, as Fine makes clear elsewhere when discussing fundamentality (2013a: 728):

Of course, we do not want to be able to accommodate any old position on what is and is not fundamental. The position should be coherent and it should perhaps have some plausibility. It is hard to say what else might be involved, but what seems clear is that we should not exclude a position simply on the grounds that it does not conform to our theory…

There appears to be a sort of humility driving Fine’s applications of the third directive. Philosophy aspires to the highest standards of clarity, precision, and rigor. This is why philosophical progress is so hard to achieve, and so modest when it is achieved. Thus, at least at this early stage of inquiry, there is a sort of arrogance in justifying one’s disregard for certain positions by appealing to one’s doctrinal commitments. Perhaps this also explains the scarcity of doctrinal attractors in Fine’s work.

The fourth directive often manifests in Fine’s work as an openness—perhaps even a fondness—for drawing many subtle distinctions. To some extent, this is explained by Fine’s keen eye for detail and his respect for nuance. But a deeper rationale derives from an interaction between the first two directives. For if these subtle distinctions belong to the appearances, then we must ultimately expect a rigorous account of the latter to include the former. This helps explain Fine’s patient and sustained interest in these distinctions, even when they resist analysis, raise difficulties of their own, or are just unpopular.

The fifth directive helps explain what might otherwise seem like a curious gap in Fine’s otherwise broad corpus. With only a few exceptions (2005d; 2018a), Fine has written little directly on epistemology. When Fine’s work indirectly engages epistemology, it is often with ambivalence. And epistemic considerations rarely play any serious argumentative role. For example, one scarcely finds him ever justifying a claim by arguing that it would be easier to know than its competitors. Fine’s distance from epistemic concerns does not stem from any disdain for them. It rather stems from the influence of the other directives. It would be premature to attempt to account for our knowledge of the appearances prior to providing a rigorous account of what they are. As Fine has sometimes quipped in conversation, “Metaphysics first, epistemology last”.

3. Metaphysics

Fine is widely regarded as having played a pivotal role in the recent surge of interest in broadly neo-Aristotelian metaphysics. It is, however, not easy to say just what neo-Aristotelian metaphysics is. One might characterize it as a kind of resistance to the “naturalistic” approaches familiar in much of late 20th century metaphysics. Granted, it is not straightforward how those approaches fit within the Aristotelian tradition. But the complexities of Aristotle’s own approach to metaphysics and the natural world suggest that any such characterization is, at best, clumsy and oversimplified. Another characterization of neo-Aristotelian metaphysics might associate it with certain distinctive topics, including essence, substance, change, priority, hylomorphism, and the like. Granted, these topics do animate typical examples of neo-Aristotelian metaphysics. But it is also clear that these topics are not its exclusive property. Perhaps the best way to characterize neo-Aristotelian metaphysics is to engage with the metaphysics of one of its most influential popularizers and practitioners in contemporary times: Kit Fine.

What is metaphysics? Fine believes it is the confluence of five features (2011b). First, the subject of metaphysics is the nature of reality. But physics, mathematics, aesthetics, epistemology, and many other areas of inquiry are also concerned with the nature of reality. What distinguishes metaphysics from them is its aim, its method, its scope, and its concepts. The aim of metaphysics is to provide a foundation for what there is. The method of metaphysics is characteristically a priori. The scope of metaphysics is as general as can be. And the concepts of metaphysics are transparent in the sense that there is no “gap” between the concept itself and what it is about.

The distinction between appearance and reality plays a prominent role in Fine’s conception of metaphysics (1982c: §A2; 2017b). Given such a distinction, one aim of metaphysics is to characterize how things are in reality. In Aristotelian fashion, this begins with the appearances. We start with how things appear, and the task is then to vindicate the appearances as revelatory of the underlying reality, or else to explain away the appearances in terms of some altogether different underlying reality. Both the revelatory and the reductionist projects presuppose the appearances, and so it is vital to get straight on what they are first. Fine calls this project naïve metaphysics. Only once adequate progress has been made on the naïve metaphysics of a subject will we be in a position to consider how it relates to fundamental reality. Fine calls this second project foundational metaphysics. Much of Fine’s work in metaphysics is best regarded as contributing to the naïve metaphysics of various topics (modality, part/whole, persistence) or to clarifying what conceptual tools (essence, reality, ground) will be needed to relate naïve metaphysics to foundational metaphysics. As Fine puts it (2017b: 108):

In my own view, the deliverances of foundational metaphysics should represent the terminus of philosophical enquiry; and it is only once we have a good handle on the corresponding questions within naïve metaphysics, with how things appear, that we are in any position to form an opinion on their reality.

Fine often suggests doubts about our having made anywhere near enough progress in naïve metaphysics to embark yet on foundational metaphysics. Because Fine suspects it would be premature to pursue foundational metaphysics at this early (for philosophy!) stage of inquiry, one should resist interpreting his work as pronouncing upon the ultimate nature of reality or the like. These sentiments are microcosmic embodiments of the five directives characterizing Fine’s philosophical approach.

a. Modality

Much of Fine’s earliest work focused on technical questions within formal logic, especially modal logic. But in the late 1970s, Fine’s work began increasingly to consider applications of formal methods—especially the tools of modal logic—to the philosophy of modality. This shift produced a variety of influential contributions.

One of Fine’s earliest contributions to modality was to develop an ontological theory of extensional and intensional entities (1977b). The theory assumes a familiar possible worlds account of its intensional entities: properties are sets of world-individual pairs, propositions are sets of worlds, and so on. This approach is often taken to disregard any internal “structure” in the entities for which it accounts. But Fine resourcefully argues that a great deal of “structure” may still be discerned, including existence, being qualitative, being logical, having individual constituents, and being essentially modal. This work, together with Fine’s developments of Prior’s form of actualism (1977a), prefigured the recent debate between necessitists who assert that necessarily everything is necessarily something and contingentists who deny this (Williamson 2013b).

Fine continued exploring the applications of modal logic in the work that followed. The technical development of first-order modal theories is explored in one trio of papers (1978a; 1978b; 1981b). A second trio of papers explores applications of first-order modal theories to the formalization of various metaphysical theories of sets (1981a), propositions (1980), and facts (1982b). The second trio contains a wealth of distinctions and arguments. Some of them, with the benefit of hindsight, prefigure what would later become some of Fine’s more influential ideas.

For one example, the formalizations in 1981a are explicitly intended to capture plausible essentialist views about the identity or nature of sets. It is not difficult to view some of Fine’s remarks in this paper as anticipating his later celebrated set-theoretic counterexamples to the modal theory of essence (1994b).

For another example, 1982b argues against the still-common view that identifies facts with true propositions. The proposition that dogs bark exists regardless of whether dogs bark, whereas the fact that dogs bark exists only if they do.

In discussing these and related topics, Fine also introduced a general argumentative strategy against a variety of controversial metaphysical views. To illustrate, consider a modal variant of the preceding view that identifies possible facts with possibly true propositions. Suppose possible objects are abstracta. If a possible object is thus-and-so, then possibly it is actually thus-and-so. In particular, a possible donkey is possibly an actual donkey. Now, an actual donkey is a concrete object. So, we then have an abstract object—a possible donkey—that is possibly concrete. But no abstract object is possibly concrete. And so not all possible objects are abstracta. This sort of argument can also be used to show that possible facts are not propositions and that possible worlds are not abstract.
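
The structure of this argument can be made explicit. The following schematic rendering is ours, offered only as an illustration of the strategy (the box and diamond stand for metaphysical necessity and possibility):

    \begin{align*}
    &(1)\ \forall x\,(\mathit{PossibleDonkey}(x) \to \mathit{Abstract}(x)) && \text{the view under attack}\\
    &(2)\ \forall x\,(\mathit{PossibleDonkey}(x) \to \Diamond\,\mathit{Concrete}(x)) && \text{a possible donkey is possibly an actual, concrete donkey}\\
    &(3)\ \forall x\,(\mathit{Abstract}(x) \to \neg\Diamond\,\mathit{Concrete}(x)) && \text{no abstract object is possibly concrete}\\
    &(4)\ \text{So nothing is a possible donkey, or (1) fails} && \text{from (1)--(3)}
    \end{align*}

Since possibilist discourse is committed to possible donkeys, it is (1), the claim that possible objects are abstracta, that must go.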

Fine’s work on modality is animated by a commitment to modal actualism (see his introduction to 2005b). This combines two theses. The first, modalism, is that modal notions are intelligible and irreducible to non-modal notions. The second, actualism, is that actuality is prior to mere possibility.

One of modalism’s most infamous detractors was Quine. Fine provides detailed reconstructions of Quine’s arguments against the intelligibility of de re modality and equally detailed criticisms of them (1989c; 1990). Quine’s arguments and Fine’s criticisms involve disentangling delicate issues concerning the modal question of de re modality, the semantic question of singular (or direct) reference, and the metaphysical question of transworld identity. These issues, according to Fine, have often been conflated in the literature (2005e).

One of the main problems facing actualism is to explain how to make sense of discourse about the merely possible, or “possibilist discourse”, given that mere possibilia are ultimately unreal. Fine takes up the challenge of reducing possibilist discourse to actualist discourse in a series of articles (1977a; 1985c; 2002b). A notable theme of Fine’s reductive strategy is a resistance to “proxy reduction”. Roughly, a proxy reduction attempts to reduce items of a target domain by associating them one-by-one with items from a more basic domain. In this case, a proxy reduction of possibilist discourse would reduce a merely possible object by associating it with an actual object. Although it is often assumed that reduction must proceed in this way by “proxy”, Fine argues that it needn’t. Instead, Fine pursues a different approach. The idea is to reduce the claim that a possible object has a feature to the claim that possibly some object (actually) has that feature. Thus, the claim that Wittgenstein’s possible daughter loathed philosophy is reduced to the claim that possibly Wittgenstein’s daughter (actually) loathed philosophy. This is not a proxy reduction because it does not associate Wittgenstein’s possible daughter with any actual object. Criticisms of the approach from Williamson 2013b and others recently prompted Fine to develop a new “suppositional” approach (2016c).

Although modalists often distinguish between various kinds of modality, they have often thought that the varieties can ultimately be understood in terms of a single kind of modality. Fine, however, argues against this sort of “monism” about modality (2002c). Modality is, instead, fundamentally diverse. There are, argues Fine, at least three diverse and irreducible modal domains: the metaphysical, the normative, and the nomological.

In addition to this diversity in the modal domains, Fine also argues that there is diversity within a given modal domain (2005c). This emerges in considering a puzzle of how it is possible that Socrates is a man but does not exist, given that it is necessary that Socrates is a man but possible that Socrates does not exist. Just as there is a distinction between sempiternal truths that hold at each time (for example, ‘Trump lies or not’) and eternal truths that hold regardless of the time (for example, ‘2+2=4’), so too there are worldly necessities that hold at each world (for example, ‘Trump lies or not’) and unworldly or transcendent necessities that hold regardless of the world (for example, ‘2+2=4’). The puzzle can then be resolved by taking ‘Socrates is a man’ to be an unworldly necessity while taking ‘Socrates does not exist’ to be a worldly (contingent) possibility. The distinction between worldly and unworldly necessities provides for three grades of modality. The unextended grade concerns the purely worldly necessities, the extended grade concerns the purely worldly necessities and the purely unworldly necessities, and the superextended grade concerns “hybrids” of the first two grades. Fine argues that the puzzle’s initial appeal depends upon confusing these three grades of modality.

b. Essence

Perhaps one of Fine’s most well-known contributions to metaphysics is to rehabilitate the notion of essence. A notable antecedent was Kripke 1972. Positivism’s antipathy to metaphysics was still exerting much influence on philosophy when Kripke powerfully advocated for the legitimacy of a distinctively metaphysical notion of modality. Kripke used this notion to suggest various essentialist theses. Among them were that a person’s procreative origin was essential to them and that an artifact’s material origin was essential to it. These essentialist theses, however, were usually taken to be theses of metaphysical necessity. The implicit background conception of essence was accordingly modal. On one formulation of it, an item has some feature essentially just in case it is necessary that it has that feature. Thus, Queen Elizabeth’s procreative origin is essential to her just in case it is necessary that she have that origin.

One of Fine’s distinctive contributions to rehabilitating essence was to argue against the modal conception of it (1994b). To do so, Fine introduced what is now a famous example. Consider the singleton set {Socrates} (the set whose sole member is Socrates). It is necessary that, if this set exists, then it has Socrates as a member. And so, by the modal conception, the set essentially has Socrates as a member. But, Fine argues, on plausible assumptions, it is also necessary that Socrates is a member of {Socrates}. And so, by the modal conception, it follows that Socrates is essentially a member of {Socrates}. This, however, is highly implausible: it is no part of what Socrates is that he should be a member of any set whatsoever. Fine raises a battery of similar counterexamples to the modal conception. His diagnosis of where it goes awry is that it is insensitive to the source of necessity. It lies in the nature of the singleton {Socrates}, not Socrates, that it has Socrates as a member. This induces an asymmetry in essentialist claims: {Socrates} essentially contains Socrates, but it is not the case that Socrates is essentially contained by {Socrates}. No modal conception of essence can capture this asymmetry because the two claims are both equally necessary.
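
The counterexample can be put schematically. On a standard formulation of the modal conception (the regimentation below is a common gloss, not a quotation from Fine):

    \begin{align*}
    &\text{Modal conception:}\quad x \text{ is essentially } F \iff \Box(x \text{ exists} \to Fx)\\
    &\Box(\{\mathrm{Socrates}\}\text{ exists} \to \mathrm{Socrates} \in \{\mathrm{Socrates}\}) && \text{true}\\
    &\Box(\mathrm{Socrates}\text{ exists} \to \mathrm{Socrates} \in \{\mathrm{Socrates}\}) && \text{true, on plausible assumptions}
    \end{align*}

Both necessities hold, so the modal conception must count membership in {Socrates} as essential to the set and to Socrates alike; it has no resources to register the asymmetry between the two claims.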

Even if the modal conception of essence fails, it is not as if essence and modality are unconnected. Indeed, Fine provocatively suggests a reversal of the traditional connection. Whereas the modal approach attempted to characterize essence in terms of modality, Fine suggests instead that metaphysical necessities hold in virtue of the essences of things (1994b).

Whether or not this suggestion is correct, separating essence from modality already implies that the study of essence cannot be subsumed under the study of modality. Instead, it would seem essence must be studied as a subject in its own right. Toward this end, Fine discusses a wealth of distinctions involving essence including the distinctions between constitutive and consequential essence, immediate and mediate essence, and more (1994d).

An especially important application of essence is to the notion of ontological dependence. What something is may depend upon what another thing is. In this ontological sense of dependence, a set may depend on its members, or an instance of a feature may depend upon the individual bearing it. Fine has explored this notion of ontological dependence in detail and used it to provide a characterization of substance (1995b). Additionally, he has also developed the formal logic and semantics of essence (1995a; 2000c).

c. Ontology

Ontology is often taken to concern what there is, or what exists. Some, however, have argued that there is a significant difference between being (what there is) and what exists. When being and existence are distinguished, it is often to claim that some things that have being nevertheless do not exist.

A recurring theme in Fine’s work is an openness to consider the being or nature of items regardless of whether they exist (1982b: §1; 1982c: §E1). This is most evident in the case of items that we are convinced do not exist. Like many others, Fine believes that, ultimately, there are no non-existents. But, perhaps unlike many others, Fine also believes that this is no obstacle to exploring their status or their nature (1982c). Fine’s explorations of this are rich in distinctions. The three most prominent are between Platonism and empiricism, literalism and contextualism, and internalism and externalism. The Platonist says non-existents do not depend on us or our activities, whereas the empiricist says they do. The literalist says non-existents literally have the properties they are said to have (for example, Sherlock Holmes literally lives in London), whereas the contextualist says instead that these properties are at most only had in a relevant context (namely, the Holmes stories). The internalist individuates non-existents solely in terms of the properties they have “internally” to the contexts in which they occur, whereas the externalist does not. Fine believes that all eight combinations of views are possible. But he focuses on developing and arguing against the four internalist views. A notable counterexample Fine gives to internalism is a story in which we imagine twins Dum and Dee who are indiscernible internally to the story but are nevertheless distinct. Two follow-up papers developing and defending externalism (Fine’s own favored combination conjoins empiricism, contextualism, and externalism) and comparing it to alternatives were planned but have not yet appeared (although 1984a further discusses related issues in the context of a critical review).

Behind Fine’s openness to considering the being or nature of items regardless of whether they exist is a general conception of ontology (2009). At least since Quine 1948, the dominant view has been that ontology’s central question, “What exists?”, should be understood as the question “What is there?”, and that this in turn should be understood as a quantificational question. Thus, to ask “Do numbers exist?” is to ask “Is there an x such that x is a number?”. Fine argues against this approach. One difficulty is that it seems to mischaracterize the logical form of ontological claims. Suppose we wish to answer “Yes, numbers exist”. It does not seem adequate to answer merely that some number, say 13, exists. But that is all that is required for the quantificational answer to be correct. Instead, it seems our answer must be that all the numbers exist. This answer has the form “For every x, if x is a number, then x exists”. If ‘x exists’ is understood in the Quinean way in terms of a quantifier (namely: x exists =df. ∃y(x = y)), then it expresses a triviality that fails to capture the intended significance of the ontological question. Fine suggests that the intended significance can be restored by appealing to the notion of reality. The ontological, as opposed to quantificational, question “Do numbers exist?” asks whether it is true that “For every x, if x is a number, then there is some feature that, in reality, x has”. This question is not answered by basic mathematical facts, but instead by whether numbers are part of the facts constituting reality.
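
The three readings of “Numbers exist” distinguished in this paragraph can be juxtaposed as follows (the regimentation is ours):

    \begin{align*}
    &\text{Quantificational:} && \exists x\,\mathit{Number}(x)\\
    &\text{Universal, with Quinean existence:} && \forall x\,(\mathit{Number}(x) \to \exists y\,(x = y)) && \text{trivially true}\\
    &\text{Ontological, with Finean reality:} && \forall x\,(\mathit{Number}(x) \to \exists F\,(\text{in reality, } Fx)) && \text{substantive}
    \end{align*}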

Many ontologies are “constructional”. Some of their objects are accepted for being constructs of other accepted objects (perhaps with some objects as “given”: accepted but not on the basis of anything else). For example, we may accept subatomic particles into our ontology because they “construct” atoms, and we may also accept forests into our ontology because they are “constructed by” trees. Fine pursues an abstract study of constructional ontologies (1994e). The theory Fine develops can distinguish between actual and possible ontologies, as well as between absolute and relativist ontologies.

Relations have long puzzled philosophers. An especially difficult class of relations is those that appear to be non-symmetric. Unrequited love provides an example: although Scarlett loves Rhett, Rhett does not love Scarlett. It may seem that the relation loves is “biased” in that its first relatum is the lover and the second relatum the beloved. But it seems we must also recognize a converse is loved by relation “biased” in that its first relatum is the beloved and the second relatum the lover. Now, when Scarlett loves Rhett, is this because Scarlett and Rhett in this order stand in the loves relation, or because Rhett and Scarlett in that order stand in the is loved by relation? It seems we must accept at least one of these, but accepting either alone is arbitrary and accepting both together is profligate. Fine develops a solution in terms of unbiased or “neutral” relations (2000b).

d. Mereology

Fine has made a variety of important contributions to abstract mereology (the theory of part and whole) as well as to its applications to various sorts of objects. Sometimes the term ‘mereology’ is used for a specific theory of mereology, namely classical extensional mereology. But an important theme in Fine’s work on mereology is to argue that this theory, and indeed much other thinking on mereology, is unduly narrow. Instead, Fine believes there is a highly general mereological framework that may accommodate a plurality of notions of part-whole (2010c). Different notions of part-whole correspond to different operations that may compose wholes from their parts. The notion of fusion from classical extensional mereology is but one of these compositional operations (and not a uniquely interesting one, he thinks). But there are other compositional operations that may apply even to abstract objects outside space and time. For example, the set-builder operation may be regarded as building a whole (the set) from its parts (its members). (Unlike Lewis 1991’s similar suggestion, Fine does not take the set-builder operation to be the fusion operation.) Fine contends that the general mereological framework for the plurality can be developed in abstraction from any of these particular applications of it.

Much of Fine’s work on mereology, however, has concerned its application to the objects of ordinary life and, in particular, to material things. Many have wanted to regard a material thing as identical with its matter. Perhaps the main objection to this view is the sheer wealth of counterexamples. A statue may be well-made although its matter is not. Fine has defended counterexamples like these at length (2003a). Even if a material thing and its matter are not identical, it may still seem as if they can occupy the same place at the same time. After all, the statue is now where its matter is. And some, including Locke (1689), have claimed that it is impossible for any two things (at least of the same sort) to occupy the same place at the same time. But Fine presents counterexamples even to this Lockean thesis (2000a). One can imagine, for instance, two letters being written on two sides of the same sheet of paper (or even written using the same words but which have dual meanings). The two letters then coincide but are distinct.

Even if material things are not identical to their matter, it may still be maintained that they are somehow aggregates of their matter. An aggregate of objects exists at a place or at a time exactly whenever or wherever some of those objects do too. If a quantity of gold, for example, is an aggregate of its left and right parts, then the quantity will exist whenever its left or right parts exist and wherever its left or right parts exist. But, Fine argues, if the left part is destroyed, the quantity will cease to exist although the aggregate will not. In general, then, ordinary material things are not aggregates but are instead compounds (1994a).
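
The contrast in existence conditions can be stated compactly. The clause for aggregates restates the text above; the clause for compounds is our extrapolation from the gold example (writing Σ(a,b) for the aggregate of a and b, and κ(a,b) for their compound):

    \begin{align*}
    &\Sigma(a,b) \text{ exists at } t \iff a \text{ exists at } t \ \text{or} \ b \text{ exists at } t\\
    &\kappa(a,b) \text{ exists at } t \iff a \text{ exists at } t \ \text{and} \ b \text{ exists at } t
    \end{align*}

Destroying the left part of the gold thus destroys the compound quantity, while the aggregate survives in the right part. (Analogous clauses apply to places.)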

These considerations extend to how material things persist through time. A common view is that they persist by having (material) temporal parts. This view takes the existence of objects in time to be essentially like their extension in space: aggregative. Objects persist through time in much the same way as events unfold. But Fine argues, partly on the basis of mereological considerations, that this delivers highly implausible results, and suggests that instead we must recognize that the existence of objects in time is fundamentally different from their extension in space (1994a; 2006a).

The lesson Fine draws from the preceding considerations is that a material thing is neither identical with, nor a mere aggregate of, its matter. Instead, Fine believes that the correct mereology of material things will be a version of hylomorphism: a material thing will be a compound of matter and form (2008a). Fine’s first applications of hylomorphism to acts, objects, and events provide an early glimpse of its broad scope (1982a). But the full breadth of its scope only emerged with Fine’s development of a general hylomorphic theory (1999). Its key notion is that of an embodiment. An embodiment may be either timeless (rigid) or temporal (variable). A rigid embodiment r = a,b,c,…/R is the object resulting from the objects a,b,c,… being in the relation R. A rigid embodiment is a hylomorphic compound that exists timelessly just in case its “matter” (the objects a,b,c,…) is in the requisite “form” (the relation R). So, for example, the statue (r) is identical with the hylomorphic compound of its clay parts (a,b,c,…) in the form of a statue (R). By contrast, a variable embodiment corresponds to a principle uniting its manifestations across times. Thus, a variable embodiment v = /V/ is given by a function V from times to things (which may themselves be rigid embodiments). So, for example, the statue over time (v) is the variable embodiment whose manifestation at each time is its state at that time.
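
A toy computational model may help fix the notation. The sketch below is ours, not Fine’s formalism: the class names and types are illustrative only.

    from dataclasses import dataclass
    from typing import Any, Callable, Tuple

    @dataclass(frozen=True)
    class RigidEmbodiment:
        """a,b,c,.../R: the objects a, b, c, ... in the relation R."""
        matter: Tuple[Any, ...]      # the objects a, b, c, ...
        form: Callable[..., bool]    # the relation R

        def exists(self) -> bool:
            # The compound exists just in case its matter is in its form.
            return self.form(*self.matter)

    @dataclass(frozen=True)
    class VariableEmbodiment:
        """/V/: a principle V taking each time to a manifestation."""
        principle: Callable[[float], Any]

        def manifestation_at(self, t: float) -> Any:
            return self.principle(t)

On this picture, a statue over time would be a VariableEmbodiment whose value at each time is the RigidEmbodiment of its then-current parts in statue-form.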

e. Realism

Fine has made influential contributions to debates about realism (2001). In general, the realist claims that some domain (for example, the mental or the moral) is real, whereas the antirealist claims that it is unreal. Although debates between realists and antirealists are common throughout philosophy, a precise and general characterization of their debate has been elusive. Fine argues against a variety of approaches familiar from the literature before settling on a distinctively metaphysical approach. What makes it distinctively metaphysical is its essential appeal to a metaphysical (as opposed to epistemic, conceptual, or semantic) notion of reality as well as to relatedly metaphysical notions of factuality and ground.

We may illustrate Fine’s approach by example. Set aside the moral error-theorist who believes that there are no moral facts whatsoever. Suppose, instead, that there are moral facts. One of them might be, we may suppose, that pointless torture is morally wrong. Moral realists and antirealists alike may agree that this fact is moral for containing some moral constituents (such as the property moral wrongness). And, unlike the error-theorist, they may agree that this fact obtains. What they dispute, however, is the fact’s status as real or unreal. Antirealism may come in either of two forms. The antirealist reductionist may, for example, accept the moral fact but insist that it is grounded in non-moral, naturalist facts that do not contain any moral constituents. The moral fact is unreal because it is grounded in non-moral facts. And the antirealist nonfactualist may, for example, accept the moral fact but insist that it is “nonfactual” in the sense that it does not represent reality but is rather a sort of “projection” of our attitudes, expressions, activities, or practices. The moral fact is unreal because it is neither real nor grounded in what is real. By contrast, the realist position consists in taking the moral fact as neither reducible nor nonfactual. The dispute between the realist, the antirealist reductionist, and the antirealist nonfactualist therefore turns on considerations of what grounds the moral facts. And, in general, debates over realism are, in effect, debates over what grounds what and therefore may be settled by determining what grounds what.

The framework Fine devised for debates over realism has proven rich in its implications. For one illustration, the metaphysical notion of reality figures prominently in other parts of Fine’s philosophy. Fine believes that the notion of reality plays a prominent role in ontological questions. And Fine uses the notion of reality to characterize the debate in the philosophy of time over the reality or unreality of tense. But the notion of ground provides an even more vivid illustration. In addition to ground’s central role in realist debates, it has itself become a topic of intense interest of its own.

f. Ground

Ground, as Fine conceives of it, is a determinatively explanatory notion. To say that Aristotle’s being rational and his being animal grounds his being a rational animal is to say that Aristotle is a rational animal because of, or in virtue of, his being rational and his being animal. Not only do questions of ground enjoy a prominent place in realist debates, but also within philosophy as a whole. Are moral facts grounded in naturalist facts? Are mental facts grounded in physical facts? Are facts of personal identity grounded in facts of psychological continuity? These and other questions of ground are among the biggest and most venerable questions in philosophy.

It is therefore a curiosity of recent times that ground has become a “hot topic” with a rapidly-expanding literature (Raven 2020). This is perhaps partly explained by the anti-metaphysical sentiments that swept over 20th century analytic philosophy. Although philosophers did not entirely turn their backs on questions of ground, the anti-metaphysical sentiments created a climate in which many felt the need to reinterpret them as questions of another sort (such as conceptual analysis, supervenience, or truthmaking). Fine, however, played a highly influential role in changing this climate. This is partly because Fine’s work not only discussed ground in its application to other topics (such as realism), but also treated ground as a topic worthy of study in its own right (see Raven 2019 for further discussion). Fine provided a detailed exploration of ground, introducing many now familiar distinctions of ground and its connections to related topics, such as essence (2012c). Additionally, Fine has developed the so-called “pure logic” of ground (2012d). He also problematized ground by discovering some puzzles involving ground and its relation to classical logic (2010b).

Although Fine had recognized certain similarities between essence and ground, he was initially inclined to separate them (2012c: 80):

The two concepts [essence and ground] work together in holding up the edifice of metaphysics; and it is only by keeping them separate that we can properly appreciate what each is on its own and what they are capable of doing together.

But not long after, Fine changed his view (2015b: 297):

I had previously referred to essence and ground as the pillars upon which the edifice of metaphysics rests…, but we can now see more clearly how the two notions complement one another in providing support for the very same structure.

The unification appeals to a conception of constitutively necessary and sufficient conditions on arbitrary objects (1985d). For example, for true belief to be essential to knowledge is for it to be a constitutively necessary condition on an arbitrary person’s knowing something that they truly believe it. And, for another example, for a set’s having no members to ground its being identical with the null set is for it to be a constitutively sufficient condition on an arbitrary set’s having no members that it is identical with the null set.

This previous example illustrates an identity criterion: a statement of the conditions in virtue of which two items are the same. Many philosophers have been tempted to reject identity criteria for being pointless, trivial, or unintelligible. But Fine argues against such rejections and, instead, defends the intelligibility and, indeed, the potential substantivity of identity criteria by appealing to ground and arbitrary objects (2016b). Roughly, an identity criterion states that, given two arbitrary objects, they are the same when the fact that they are identical is grounded in the fact that they satisfy a specified condition. For example, given two arbitrary sets, they are the same when their identity is grounded in their having the same members.
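
Schematically, with square brackets forming fact-terms and x and y arbitrary sets, the set-theoretic criterion has this shape (the notation is ours):

    \[
    [\forall z\,(z \in x \leftrightarrow z \in y)] \ \text{grounds} \ [x = y]
    \]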

g. Tense

One striking application of Fine’s work on realism and ground is to the philosophy of time. McTaggart 1908 notoriously argued for the unreality of time. Although McTaggart’s argument generated considerable discussion, the general impression has been that whatever challenge it posed to the reality of time can somehow be met. Fine argues that the challenge lurking within McTaggart’s argument is more formidable than usually thought (2005f, of which 2006b is an abridgement). Taking inspiration from McTaggart, Fine formulates his own argument against the reality of tense. The argument relies on four assumptions that each make essential appeal to the notion of reality:

Realism Reality is constituted (at least, in part) by tensed facts.
Neutrality No time is privileged: the tensed facts that constitute reality are not oriented towards one time as opposed to another.
Absolutism The constitution of reality is an absolute matter, not relative to a time or other form of temporal standpoint.
Coherence Reality is not contradictory; it is not constituted by facts with incompatible content.

Reality contains some tensed facts (Realism). Because things change, these will be diverse. Although you are reading, you aren’t always reading. So, one of these tensed facts is that you are reading whereas another of them is that you are not reading. None of these tensed facts are oriented toward any particular time (Neutrality). Nor do they obtain relative to any particular time (Absolutism). So reality is constituted by incompatible facts. But reality cannot be incoherent like that (Coherence). And so the four assumptions conflict. The antirealist reaction is to reject Realism, and so the reality of time. The realist accepts Realism, and so must reject another assumption. The challenge is to explain which. The “standard” realist denies Neutrality by privileging the present time. But Fine argues that there are two overlooked “nonstandard” responses. The relativist denies Absolutism, and so takes the constitution of reality to be irreducibly relative to a time.  The fragmentalist denies Coherence, and so takes reality to divide into incompatible temporal “fragments”. Fine argues that the nonstandard realisms (and, in particular, fragmentalism) are, despite their obscurity, more defensible than standard realism.
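
The conflict can be compressed into a short derivation, writing R[p] for “reality is constituted (in part) by the fact that p”; the notation is ours:

    \begin{align*}
    &(1)\ R[\text{you are reading}] && \text{Realism, given change}\\
    &(2)\ R[\text{you are not reading}] && \text{Realism, given change}\\
    &(3)\ \text{(1) and (2) hold simpliciter, not at or relative to times} && \text{Neutrality, Absolutism}\\
    &(4)\ \text{Reality is constituted by incompatible facts} && \text{from (1)--(3)}\\
    &(5)\ \text{Contradiction} && \text{from (4) and Coherence}
    \end{align*}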

Fine relates these considerations to the vexing case of first-personal realism. Standard realism about first-personal facts implausibly privileges a first-personal perspective. Overlooking nonstandard realisms, one may then draw the antirealist conclusion that there are no first-personal facts. But Fine’s apparatus reveals two nonstandard realist options: relativism and fragmentalism. According to Fine, these options (and, in particular, fragmentalism) are especially intuitive in the first-personal case. Indeed, Fine suggests that the question of the reality of tense might have more in common with the question of the reality of the first-personal, despite its more familiar association with the question of the reality of the modal.

4. Philosophy of Language

Fine has made four main contributions to the philosophy of language. The first two are in support of the referentialist tradition. One is to bolster arguments against the competing Fregean tradition. The other is to develop a novel version of referentialism, semantic relationism, that is superior to its referentialist competitors. The third contribution is to the nature of vagueness. And the fourth contribution is the development of an original approach to semantics, truthmaker semantics.

a. Referentialism

The referentialist tradition takes certain terms, especially names, to refer without the mediation of any Fregean sense or other descriptive information. Fine has made two main contributions in support of referentialism.

Fine’s first contribution to referentialism is to bolster arguments against Fregeanism. This includes a variety of supporting arguments scattered throughout his book Semantic Relationism (2007b). Perhaps the most notable of these is a thought experiment against the existence of the senses the Fregean posits (2007b: 36). The scenario involves a person in a universe that is perfectly symmetrically arranged around her center of vision. Her visual field therefore perfectly duplicates whatever is visible on the left to the right, and on the right to the left. When she is approached by two identical twins, she may name each ‘Bruce’. It seems she may refer by name to each. The Fregean can agree only if there is a pair of senses, one for the left ‘Bruce’ and the other for the right ‘Bruce’. But given the symmetry of the scenario, it seems there is no possible basis for thinking that the pair exists.

b. Semantic Relationism

Fine’s second contribution to referentialism is to introduce and develop what he argues is its most viable form: semantic relationism. The view is developed in his book Semantic Relationism, which expands on his John Locke Lectures delivered at the University of Oxford in 2003 (2007b).

Semantic relationism is representational in that it aims to account for the meanings of expressions in terms of what they represent (objects, properties, states of affairs, and so on). But it differs significantly from other representational approaches. These have typically (and implicitly) assumed that the meaning of an expression is intrinsic to it and so one is never required to consider any other expressions in accounting for the meaning of a given expression. Semantic relationism denies this. Instead, the meaning of (at least some) expressions at least partly consists in their “coordinative” relations to other meaningful expressions. This is different from typical kinds of semantic holism which usually characterize an expression’s meaning in non-representational terms and, instead, in terms of its inferential role.

One of the main benefits of semantic relationism is that it provides solutions to a variety of vexing puzzles, including the antinomy of the variable (2003b), Frege’s puzzle (Frege 1892), and Kripke’s puzzle about belief (Kripke 2011). To illustrate, Frege observed that an identity statement, like ‘Cicero is Cicero’, could be uninformative whereas another, like ‘Cicero is Tully’, could be informative despite the names ‘Cicero’ and ‘Tully’ being coreferential. Frege’s own solution was to bifurcate semantics into a level of sense and a level of reference. This enabled him to claim that the names ‘Cicero’ and ‘Tully’ differ in sense but not in reference. But powerful arguments from Kripke 1972 and others convinced many that the semantics of names only involve reference, not sense. How could one reconcile this referentialism about the semantics of names with Frege’s observation? Semantic relationism offers a novel answer. The pair ‘Cicero’, ‘Cicero’ in ‘Cicero is Cicero’ is coordinated: it is a semantic requirement that they co-refer. By contrast, the pair ‘Cicero’, ‘Tully’ in ‘Cicero is Tully’ is uncoordinated: it is not a semantic requirement that they co-refer. This difference in coordination among the pairs of expressions explains the difference in their informativeness. But it is only by considering the pairs in relation to one another that this difference can even be recognized. The notion of semantic requirement involves a distinctive kind of semantic modality that Fine argues should play a significant role in semantic theorizing (2010a).

c. Vagueness

Fine provided what is widely considered to be the locus classicus for the so-called supervaluationist approach to vagueness (1975d). On this approach, vagueness is a kind of deficiency in meaning. What makes the deficiency specific to vagueness is that it gives rise to “borderline cases”. For example, the vague predicate ‘is bald’ admits of borderline cases. These are cases in which the predicate’s meaning does not settle whether it applies or does not apply to, say, a man with a receding hairline and thinning hair. Borderline cases pose an initial problem for classical logic. For if the predicate ‘is bald’ neither truly applies nor falsely applies in such cases, how could it be true to say ‘That man is bald or is not bald’? Supervaluationism answers by considering the admissible ways in which a vague predicate can be completed or made more precise. The sentence ‘That man is bald’ is “super-true” if true under every such “precisification”, “super-false” if false under every “precisification”, and neither otherwise. It can then be argued that ‘That man is bald or is not bald’ will be super-true because it will be true under every precisification, despite neither disjunct being super-true. This in turn helps supervaluationism provide a response to the Sorites Paradox.
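
The supervaluationist scheme is easily prototyped. In the sketch below, each precisification is modeled as a classical valuation of sentences; the encoding and all names are ours, for illustration only:

    from typing import Callable, Iterable

    Precisification = Callable[[str], bool]

    def supervaluate(sentence: str,
                     precisifications: Iterable[Precisification]) -> str:
        """Classify a sentence as super-true, super-false, or neither."""
        values = [p(sentence) for p in precisifications]
        if all(values):
            return "super-true"      # true under every precisification
        if not any(values):
            return "super-false"     # false under every precisification
        return "neither"             # true under some, false under others

    # Two admissible precisifications that disagree on a borderline case
    # but, being classical, each make the excluded-middle disjunction true:
    p1: Precisification = lambda s: {"bald": True,  "bald or not bald": True}[s]
    p2: Precisification = lambda s: {"bald": False, "bald or not bald": True}[s]

    print(supervaluate("bald", [p1, p2]))              # neither
    print(supervaluate("bald or not bald", [p1, p2]))  # super-true

The disjunction comes out super-true even though neither disjunct does, which is just the behavior described above.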

In more recent work, Fine has given up on supervaluationism and instead developed an alternative approach. Fine’s reasons for rejecting supervaluationism are not specific to it but rather derive from a more far-reaching argument. Fine presents an apparent proof of the impossibility of vagueness (2008b). The challenge is to explain where the proof goes awry, since there is no question that vagueness is possible. But, Fine argues, standard accounts of vagueness, including especially supervaluationism, cannot satisfactorily meet this challenge. So, an alternative account is needed.

Fine develops such an alternative account that relies on a distinction between global and local vagueness (2015a). Global vagueness is vagueness over a range of cases, such as a series of indiscernible but distinct color tiles arranged incrementally from orange to red. Local vagueness is vagueness in a single case, such as in a single vermilion tile midway between the orange and red tiles. Given the distinction, there is a strong temptation to reduce global vagueness to local vagueness. But Fine argues against this. His own “globalist” approach, he argues, not only is able to meet the challenge of explaining the possibility of vagueness, but also why it does not succumb to the Sorites Paradox.

d. Truthmaker Semantics

In a series of articles, Fine develops a novel semantic approach he calls truthmaker semantics. The approach is in some ways like the more familiar possible-worlds semantics and, especially, situation semantics. But truthmaker semantics diverges from both. The contrast with possible-worlds semantics is especially vivid. On the latter approach, the truth-value of a sentence is evaluated with respect to a possible world in its entirety, no matter how irrelevant parts of that world might be to making the sentence true. Thus, ‘Fido barks’ will be true with respect to an entire possible world just in case it is a world in which Fido barks. Such a world includes much that is irrelevant to Fido’s barking, including sea turtle migration, weather patterns in sub-Saharan Africa, and distant galaxies. Truthmaker semantics departs in two ways from this. First, and like situation semantics, it replaces worlds with states which may, to a first approximation, be regarded as parts of worlds. So, for example, it is not the entire world—sea turtle migration, sub-Saharan weather, and distant galaxies included—that verifies or makes ‘Fido barks’ true, but rather instead just the state of Fido’s barking. What’s more, this state, unlike the entire world itself, does not verify any truths about sea turtles, sub-Saharan weather, and distant galaxies. Second, and unlike situation semantics, it is required that a state verifying a sentence must be wholly or exactly relevant to its truth. So, for example, the state that Fido barks and it’s raining in Djibouti will not verify ‘Fido barks’ because it includes an irrelevant part about Djibouti’s weather.
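
A toy model can make exact verification vivid. Below, states are modeled as sets of atomic facts and fusion as set union; the recursive clauses (verifiers of a conjunction are fusions of verifiers of its conjuncts, verifiers of a disjunction are the verifiers of either disjunct) are the standard ones from Fine’s truthmaker semantics, though the encoding itself is ours:

    from itertools import product

    # A state is a frozenset of atomic facts; the fusion of two states
    # is their union.
    def fuse(s: frozenset, t: frozenset) -> frozenset:
        return s | t

    def verifiers(formula, atom_verifiers):
        """Exact verifiers of a formula built from ('atom', name),
        ('and', A, B), and ('or', A, B)."""
        tag = formula[0]
        if tag == 'atom':
            return atom_verifiers[formula[1]]
        if tag == 'and':
            va = verifiers(formula[1], atom_verifiers)
            vb = verifiers(formula[2], atom_verifiers)
            return {fuse(s, t) for s, t in product(va, vb)}
        if tag == 'or':
            return (verifiers(formula[1], atom_verifiers)
                    | verifiers(formula[2], atom_verifiers))
        raise ValueError(f"unknown connective: {tag}")

    # The state of Fido's barking exactly verifies 'Fido barks'; fusing in
    # Djibouti's weather yields a verifier only of the conjunction.
    atom_verifiers = {
        'fido-barks':    {frozenset({'barking(fido)'})},
        'rain-djibouti': {frozenset({'rain(djibouti)'})},
    }
    conj = ('and', ('atom', 'fido-barks'), ('atom', 'rain-djibouti'))
    print(verifiers(conj, atom_verifiers))
    # one verifier: the fusion of both atomic states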

The general framework of truthmaker semantics is developed over the course of numerous articles (but see 2017c for an overview). An important feature of it is its abstractness. The semantics is specified in terms of a space of states, or a state space. The state space is assumed to have some mereological structure. But the assumptions are minimal and, in particular, no assumptions are made about the nature of the states themselves. This makes the framework highly abstract. This in turn grants the framework enormous flexibility in its potential range of applications. Indeed, Fine believes the main benefits of the general framework emerge from its wealth of applications to a wide variety of topics. These include: analytic entailment (2016a), counterfactuals (2012a; 2012b), ground (2020a), intuitionistic logic (2014b), semantic content (2017a; 2017b), the is-ought gap (2018b), verisimilitude (2019d; 2020b), impossible worlds (2019c), deontic and imperative statements (2014a; 2019a; 2019b), and more. This is not the place for a comprehensive survey of these applications. Still, one may get a sense of them by considering three applications in more detail.

First, consider counterfactuals. The standard semantics for counterfactuals derives from Stalnaker 1968 and Lewis 1973. According to Lewis’ version of it, the counterfactual ‘If A then it would be that C’ is true just in case no possible world in which A but not C is true is closer to actuality than any in which both A and C are true. Fine’s opposition to this semantics is evident from his critical notice (1975a) of Lewis’s book. There Fine introduced the so-called “future similarity objection”. It takes the form of a counterexample showing that small changes can make for great dissimilarities. Fine’s celebrated case was the counterfactual ‘If Nixon had pressed the button, then there would have been a nuclear holocaust’. Although it seems true, the standard semantics struggles to validate it. The great dissimilarities of a world where Nixon pressed the button causing nuclear holocaust ensure that it is further from actuality than a world where Nixon pressed the button without nuclear holocaust. Fine’s critical notice also contained the seeds of ideas that later emerged in his work on truthmaker semantics. There he also objects that the standard semantics is committed to unsound implications because it permits the substitution of tautologically equivalent statements. This objection was prescient, anticipating a similar difficulty that Fine later developed in greater detail against the standard semantics (2012a; 2012b). Fine argues that the difficulty can be avoided by providing a truthmaker semantics for counterfactuals. Roughly, ‘If A then it would be that C’ is true just in case any possible outcome of a state verifying A also contains a state verifying C.
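The objection can be made concrete with a toy evaluation. In the following Python sketch, the worlds, their facts, and the similarity ranking are all stipulated for the example (they are not Lewis’s actual similarity metric): a ranking that prizes overall match of particular fact places the holocaust world furthest away, and the intuitively true counterfactual then comes out false.

    # Toy Lewis-style evaluation of 'If A, it would be that C':
    # C must hold at every closest A-world.

    worlds = {
        'actual':             {'pressed': False, 'holocaust': False},
        'press_no_holocaust': {'pressed': True,  'holocaust': False},
        'press_holocaust':    {'pressed': True,  'holocaust': True},
    }

    # Lower number = closer to actuality. Judged by overall match of
    # particular fact, the holocaust world ranks as most distant.
    distance = {'actual': 0, 'press_no_holocaust': 1, 'press_holocaust': 2}

    def would(antecedent, consequent):
        a_worlds = [w for w in worlds if worlds[w][antecedent]]
        if not a_worlds:
            return True  # vacuously true when A holds at no world
        closest = min(distance[w] for w in a_worlds)
        return all(worlds[w][consequent]
                   for w in a_worlds if distance[w] == closest)

    # The intuitively true counterfactual comes out false on this ranking:
    print(would('pressed', 'holocaust'))  # False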

Second, consider intuitionistic logic. Realists and antirealists alike tend to agree that certain technical aspects of intuitionistic logic provide a natural home for antirealism. This would be a mistake, however, if intuitionistic logic could be given a realist semantic foundation. Fine shows how truthmaker semantics can be used to provide just such a realist semantics for intuitionistic logic (2014b).

Third, consider the is-ought gap. Hume 1739 famously argued for a gap between ‘is’ and ‘ought’ statements: one cannot validly derive any statement about what ought to be from any statements about what is. Despite the appeal of such a gap, it has not been easy to formulate it clearly. What’s more, standard formulations are vulnerable to superficial but resilient counterexamples (Prior 1960): from the purely descriptive ‘Tea-drinking is common’, for example, one may validly infer the ‘ought’-involving disjunction ‘Either tea-drinking is common or all New Zealanders ought to be shot’. Fine shows how truthmaker semantics can be used to formulate the gap in a way that avoids such superficial counterexamples (2018b).

5. Logics and Mathematics

Fine has made a variety of seminal technical contributions to formal logic as well as to philosophical logic and the philosophy of mathematics. These contributions may be organized into three major groups: formal logic (especially modal logics), arbitrary objects, and the foundations of mathematics (broadly construed so as to include the theory of sets and classes).

a. Logics

Most of Fine’s earliest work focused on technical questions within formal logic, especially on modal logics. A detailed synopsis of Fine’s technical work is beyond the scope of this article. But a very brief summary of it can be given here:

  • various results about modal logics with propositional quantifiers (1970, which presents results from Fine’s Ph.D. dissertation, 1969);
  • a completeness proof for a predicate logic without identity but with primitive numerical quantifiers (1972a);
  • early developments of graded modal logic (1972b);
  • various results about S4 logics (those with reflexive and transitive Kripke frames) and certain extensions of them (1971; 1972c; 1974a; 1974b);
  • the application of normal forms to a general completeness proof for “uniform” modal logics (1975b);
  • a seminal “canonicity theorem” for modal logics (1975c);
  • completeness results for logics containing K4 (those with transitive Kripke frames) (1974c; 1985a);
  • failure of Craig’s interpolation lemma for various quantified modal logics (1979);
  • the underivability of a quantifier permutation principle in certain modal systems without identity (1983b);
  • an exploration into whether truth can be defined without the notion of satisfaction (joint work with McCarthy 1984b);
  • incompleteness results for standard semantics for quantified relevance logic and an alternative semantics for it that is complete (1988; 1989a);
  • the development of stability (or “felicitous”) semantics for the conception of “negation as failure” in logic programming and computer science (1989b); and
  • general results about how properties of “monomodal” logics containing a single modal operator may transfer to a “multimodal” logic joining them (joint work with Schurz 1996).

In addition, Fine also wrote several articles in economic theory (1973a; 1972d), including two with his brother, economist Ben Fine (1974d; 1974e).

b. Arbitrary Objects

We often speak of arbitrary objects—an arbitrary integer, an arbitrary American, and so on. But at least since Berkeley 1710, the notion of an arbitrary object has been thought to be dispensable, if not outright incoherent. In his book Reasoning with Arbitrary Objects, however, Fine argued that the familiar opposition to arbitrary objects is misplaced and that they can, contrary to received wisdom, be given a rigorous theoretical foundation (1985d and its abridgements 1983a; 1985b).

The matter is not a mere intellectual curiosity. For it turns out, according to Fine, that arbitrary objects have various important applications. One salient application is to natural deduction and, especially, the logic of generality (1985d; 1985b). To illustrate, consider how one might explain the rule of universal generalization to students of a first formal logic course. One might say that if one can show that an arbitrary item a satisfies some condition φ, then one may deduce that every item whatsoever satisfies that condition: ∀x φ(x). Standard glosses on the rule ultimately attempt to avoid any appeal to the arbitrary item in favor of some alternative construal. But given Fine’s defense of arbitrary objects, there is no need to avoid appealing to them, and, in fact, it may be argued that they provide a more direct and satisfying account of the rule than alternative accounts do. Fine also explores other applications to mathematical logic, the philosophy of language, and the history of ideas (1985d).
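For readers who want the rule before them, here is one standard schematic statement of universal generalization (a textbook formulation, not Fine’s own notation); the side condition is exactly what talk of an “arbitrary” item is meant to capture:

    \[
    \frac{\varphi(a)}{\forall x\, \varphi(x)}
    \qquad\text{provided } a \text{ occurs free in no undischarged assumption}
    \]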

More recently, Fine has found new applications for arbitrary objects. One is to Cantor’s abstractionist constructions of cardinal numbers and order types. The constructions have faced formidable objections. But, according to Fine, the objections can be overcome by appealing to the theory of arbitrary objects (1998). In a belated companion article, Fine argues that his theory of arbitrary objects combined with the Cantorian approach can be extended to provide a general theory of types or forms, of which structural universals end up being a special case (2017a). Fine also puts arbitrary objects to use in attempting to provide a paradox-free construction of sets or classes that allows for the existence of a universal class and for the Frege-Russell cardinal numbers (2005a), in characterizing identity criteria (2016b), and in providing unified foundations for essence and ground (2015b). Fine is currently preparing a revised version of Reasoning with Arbitrary Objects.

c. Philosophy of Mathematics

Most of Fine’s contributions to the philosophy of mathematics concern various foundational issues. Much recent interest in these issues derives from Frege’s famous attempt to secure the foundations of mathematics by deriving it from logic alone. Frege’s attempt foundered in the early 1900s with the discovery of the set-theoretic paradoxes. Much of Fine’s work in the philosophy of mathematics concerns the prospects for reviving Frege’s project without paradox.

At the heart of Frege’s own attempt was the notion of abstraction. Just as we may abstract the direction of two lines from their being parallel, so too we may abstract the number of two classes from their equinumerosity. Frege’s own use of abstraction ultimately led to paradox. But since then, neo-Fregeans (such as Fine’s colleague Crispin Wright and Bob Hale) have attempted to salvage much of Frege’s project by refining the use of abstraction in various ways. Fine has provided a detailed exploration of a general theory of abstraction as well as its prospects for sustaining neo-Fregean ambitions (2002a).
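For concreteness, the abstraction principles at issue are standardly stated as biconditionals; the first two are the direction principle and Hume’s Principle (the cornerstone of the neo-Fregean program), and the third is Frege’s Basic Law V, whose unrestricted use led to Russell’s paradox. These are the usual modern formulations, not quotations from Fine:

    \[
    \mathrm{dir}(a) = \mathrm{dir}(b) \;\leftrightarrow\; a \parallel b
    \]
    \[
    \#F = \#G \;\leftrightarrow\; F \approx G
    \]
    \[
    \varepsilon F = \varepsilon G \;\leftrightarrow\; \forall x\,(Fx \leftrightarrow Gx)
    \]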

The discovery of the set-theoretic paradoxes generated turmoil within the foundations of mathematics and for associated philosophical programs. Since then, there have been a variety of attempts to provide a paradox-free construction of sets or classes. These attempts usually assume a notion of membership in their construction of the ontology. But Fine reverses the direction and constructs notions of membership in terms of the assumed ontology. This, Fine argues, has various advantages over standard constructions (2005a).

Many have thought that a central lesson of the aforementioned set-theoretic paradoxes is that quantification is inevitably restricted. Were it possible to quantify unrestrictedly over absolutely everything, then paradox would result. Instead, we may indefinitely extend the range of quantification without ever paradoxically quantifying over absolutely everything. So, it seems, quantification is always restricted, albeit indefinitely extendible. A persistent difficulty in sustaining this point of view, however, is the apparent arbitrariness of any restriction. Fine argues that the difficulty can be avoided (2006c). Quantification’s being absolute and its being unrestricted are often conflated. But Fine argues that they are distinct. Distinguishing them allows us to conceive of the possibility of quantification that is unrestricted but not absolute.

A recurring theme in some of the preceding papers is an approach to mathematics that Fine calls procedural postulationism. Traditional versions of postulationism take the existence of mathematical items and the truths about them to derive from certain propositions we postulate. But Fine’s procedural postulationism takes these postulates to be imperatival instead (e.g. “For each item in the domain that is a number, introduce another number that is its successor”). Fine believes this one difference helps postulationism provide a more satisfactory metaphysics, semantics, and epistemology of mathematics. Although procedural postulationism is hinted at in the previous articles, it is discussed in more detail in the context of discussing knowledge of mathematical items (2005d). Fine has indicated that he believes the core ideas of procedural postulationism may extend more generally, and briefly discusses their application to the metaphysics of material things (2007a).
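The contrast between indicative and imperatival postulates can be illustrated with a toy sketch. Everything below—the function name, the set-based picture of the domain—is invented for the illustration and is no part of Fine’s formal machinery:

    def introduce_successor(domain):
        # The imperatival postulate: 'for each number in the domain,
        # introduce a number that is its successor'. Executing it extends
        # the domain rather than asserting anything about a pre-given one.
        return domain | {n + 1 for n in domain}

    domain = {0}
    for _ in range(5):        # repeatedly carry out the instruction
        domain = introduce_successor(domain)
    print(sorted(domain))     # [0, 1, 2, 3, 4, 5]

    # An indicative postulate ('every number has a successor') would
    # instead be a claim about the domain as it stands, true or false of it:
    print(all(n + 1 in domain for n in domain))  # False: never finished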

6. History

It is not hard to find Aristotle’s influence in much of Fine’s work. But in addition to developing various Aristotelian themes, Fine has also directly contributed to more exegetical scholarship on Aristotle’s own work. These contributions have primarily focused on developing an account of Aristotle’s views on substance and what we may still learn from them. This begins with an attempt to formalize Aristotle’s views on matter (1992). Fine later raises a puzzle for Aristotle (and other neo-Aristotelians) concerning how the matter now composing one hylomorphic compound, say Callias, could later come to compose another hylomorphic compound, say Socrates (1994c). According to Aristotle, the world contains elements that may compose mixtures, and these mixtures in turn compose substances. Fine argues against conceptions of mixtures that take them to be at the same level as the elements composing them and, instead, defends a conception on which they are at a higher level (1995d). Finally, Fine argues that the best interpretation of a vexing discussion in Metaphysics Theta.4 is that Aristotle was attempting to introduce a novel conception of modality (2011a).

Additionally, Fine has written on Husserl’s discussions from the Logical Investigations on part and whole and the related topics of dependence, necessity, and unity (1995c). Fine also has work in preparation on Bolzano’s conception of ground.

7. References and Further Reading

  • Berkeley, George. 1710. A Treatise Concerning the Principles of Human Knowledge.
  • Fine, Kit. 1969. For Some Proposition and So Many Possible Worlds. University of Warwick.
  • Fine, Kit. 1970. “Propositional Quantifiers in Modal Logic.” Theoria 36 (3): 336-46.
  • Fine, Kit. 1971. “The Logics Containing S4.3.” Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 17 (1): 371-76.
  • Fine, Kit. 1972a. “For So Many Individuals.” Notre Dame Journal of Formal Logic 13 (4): 569-72.
  • Fine, Kit. 1972b. “In So Many Possible Worlds.” Notre Dame Journal of Formal Logic 13 (4): 516-20.
  • Fine, Kit. 1972c. “Logics Containing S4 without the Finite Model Property.” In Conference in Mathematical Logic–London ’70, edited by W. Hodges. New York: Springer-Verlag.
  • Fine, Kit. 1972d. “Some Necessary and Sufficient Conditions for Representative Decision on Two Alternatives.” Econometrica 40 (6): 1083-90.
  • Fine, Kit. 1973a. “Conditions for the Existence of Cycles under Majority and Non-minority Rules.” Econometrica 41 (5): 889-99.
  • Fine, Kit. 1974a. “An Ascending Chain of S4 Logics.” Theoria 40 (2): 110-16.
  • Fine, Kit. 1974b. “An Incomplete Logic Containing S4.” Theoria 40 (1): 23-29.
  • Fine, Kit. 1974c. “Logics Containing K4 – Part I.” The Journal of Symbolic Logic 39 (1): 31-42.
  • Fine, Kit. 1975a. “Critical Notice: Counterfactuals, by David Lewis.” Mind 84 (335): 451-58. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 1975b. “Normal Forms in Modal Logic.” Notre Dame Journal of Formal Logic 16 (2): 229-34.
  • Fine, Kit. 1975c. “Some Connections Between Elementary and Modal Logic.” In Proceedings of the Third Scandinavian Logic Symposium, edited by S. Kanger. Amsterdam: North-Holland.
  • Fine, Kit. 1975d. “Vagueness, Truth and Logic.” Synthese 30: 265-300.
  • Fine, Kit. 1977a. “Prior on the Construction of Possible Worlds and Instants.” In Worlds, Times and Selves, edited by A. N. Prior and K. Fine. London: Duckworth. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 1977b. “Properties, Propositions and Sets.” Journal of Philosophical Logic 6: 135-91.
  • Fine, Kit. 1978a. “Model Theory for Modal Logic – Part I: The De Re/De Dicto Distinction.” Journal of Philosophical Logic 7 (1): 125-56.
  • Fine, Kit. 1978b. “Model Theory for Modal Logic – Part II: The Elimination of De Re Modality.” Journal of Philosophical Logic 7 (1): 277-306.
  • Fine, Kit. 1979. “Failures of the Interpolation Lemma in Quantified Modal Logic.” The Journal of Symbolic Logic 44 (2): 201-06.
  • Fine, Kit. 1980. “First-order Modal Theories II – Propositions.” Studia Logica 39 (2/3): 159-202.
  • Fine, Kit. 1981a. “First-order Modal Theories I – Sets.” Noûs 15 (2): 177-205.
  • Fine, Kit. 1981b. “Model Theory for Modal Logic – Part III: Existence and Predication.” Journal of Philosophical Logic 10 (3): 293-307.
  • Fine, Kit. 1982a. “Acts, Events and Things.” In Language and Ontology, edited by W. Leinfellner, E. Kraemer and J. Schank. Wien: Hölder-Pichler-Tempsky, as part of the proceedings of the Sixth International Wittgenstein Symposium 23rd to 30th August 1981, Kirchberg/Wechsel (Austria).
  • Fine, Kit. 1982b. “First-order Modal Theories III – Facts.” Synthese 53: 43-122.
  • Fine, Kit. 1982c. “The Problem of Non-existents.” Topoi 1: 97-140.
  • Fine, Kit. 1983a. “A Defence of Arbitrary Objects.” Proceedings of the Aristotelian Society, Supplementary Volume 57: 55-77.
  • Fine, Kit. 1983b. “The Permutation Principle in Quantificational Logic.” Journal of Philosophical Logic 12 (1): 33-37.
  • Fine, Kit. 1984a. “Critical Review of Parsons’ Non-Existent Objects.” Philosophical Studies 45 (1): 95-142.
  • Fine, Kit. 1985a. “Logics Containing K4 – Part II.” The Journal of Symbolic Logic 50 (3): 619-51.
  • Fine, Kit. 1985b. “Natural Deduction and Arbitrary Objects.” Journal of Philosophical Logic 14: 57-107.
  • Fine, Kit. 1985c. “Plantinga on the Reduction of Possibilist Discourse.” In Alvin Plantinga, edited by J. E. Tomberlin and P. van Inwagen. Dordrecht: Reidel. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 1985d. Reasoning with Arbitrary Objects. Oxford: Blackwell.
  • Fine, Kit. 1988. “Semantics for Quantified Relevance Logic.” Journal of Philosophical Logic 17 (1): 27-59.
  • Fine, Kit. 1989a. “Incompleteness for Quantified Relevance Logics.” In Directions in Relevant Logics, edited by R. Sylvan and J. Norman. Dordrecht: Kluwer.
  • Fine, Kit. 1989b. “The Justification of Negation as Failure.” In Proceedings of the Congress on Logic, Methodology and the Philosophy of Science VIII, edited by J. Fenstad, T. Frolov and R. Hilpinen. Amsterdam: Elsevier Science Publishers B. V.
  • Fine, Kit. 1989c. “The Problem of De Re Modality.” In Themes from Kaplan, edited by J. Almog, J. Perry and H. Wettstein. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 1990. “Quine on Quantifying In.” In Proceedings of the Conference on Propositional Attitudes, edited by C. A. Anderson and J. Owens. Stanford: CSLI. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 1992. “Aristotle on Matter.” Mind 101 (401): 35-57.
  • Fine, Kit. 1994a. “Compounds and Aggregates.” Noûs 28 (2): 137-58.
  • Fine, Kit. 1994b. “Essence and Modality.” Philosophical Perspectives 8: 1-16.
  • Fine, Kit. 1994c. “A Puzzle Concerning Matter and Form.” In Unity, Identity, and Explanation in Aristotle’s Metaphysics, edited by T. Scaltsas, D. Charles and M. L. Gill. Oxford: Oxford University Press.
  • Fine, Kit. 1994d. “Senses of Essence.” In Modality, Morality and Belief: Essays in Honor of Ruth Barcan Marcus, edited by W. Sinnott-Armstrong. Cambridge: Cambridge University Press.
  • Fine, Kit. 1994e. “The Study of Ontology.” Noûs 25 (3): 263-94.
  • Fine, Kit. 1995a. “The Logic of Essence.” Journal of Philosophical Logic 24: 241-73.
  • Fine, Kit. 1995b. “Ontological Dependence.” Proceedings of the Aristotelian Society 95: 269-90.
  • Fine, Kit. 1995c. “Part-Whole.” In The Cambridge Companion to Husserl, edited by B. Smith and D. Woodruff. Cambridge: Cambridge University Press.
  • Fine, Kit. 1995d. “The Problem of Mixture.” Pacific Philosophical Quarterly 76 (3-4): 266-369.
  • Fine, Kit. 1998. “Cantorian Abstraction: A Reconstruction and Defense.” The Journal of Philosophy 95 (12): 599-634.
  • Fine, Kit. 1999. “Things and Their Parts.” Midwest Studies in Philosophy 23: 61-74.
  • Fine, Kit. 2000a. “A Counter-example to Locke’s Thesis.” The Monist 83 (3): 357-61.
  • Fine, Kit. 2000b. “Neutral Relations.” The Philosophical Review 109 (1): 1-33.
  • Fine, Kit. 2000c. “Semantics for the Logic of Essence.” Journal of Philosophical Logic 29 (6): 543-84.
  • Fine, Kit. 2001. “The Question of Realism.” Philosophers’ Imprint 1 (2): 1-30.
  • Fine, Kit. 2002a. The Limits of Abstraction. Oxford: Clarendon Press.
  • Fine, Kit. 2002b. “The Problem of Possibilia.” In Handbook of Metaphysics, edited by D. Zimmerman. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 2002c. “The Varieties of Necessity.” In Conceivability and Possibility, edited by T. S. Gendler and J. Hawthorne. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
  • Fine, Kit. 2003a. “The Non-Identity of a Material Thing and Its Matter.” Mind 112 (446): 195-234.
  • Fine, Kit. 2003b. “The Role of Variables.” The Journal of Philosophy 100 (12): 605-31.
  • Fine, Kit. 2005a. “Class and Membership.” The Journal of Philosophy 102 (11): 547-72.
  • Fine, Kit. 2005b. Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
  • Fine, Kit. 2005c. “Necessity and Non-existence.” In Modality and Tense: Philosophical Papers.
  • Fine, Kit. 2005d. “Our Knowledge of Mathematical Objects.” In Oxford Studies in Epistemology, edited by T. S. Gendler and J. Hawthorne. Oxford: Clarendon Press.
  • Fine, Kit. 2005e. “Reference, Essence, and Identity.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
  • Fine, Kit. 2005f. “Tense and Reality.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
  • Fine, Kit. 2006a. “In Defense of Three-Dimensionalism.” The Journal of Philosophy 103 (12): 699-714.
  • Fine, Kit. 2006b. “The Reality of Tense.” Synthese 150 (3): 399-414.
  • Fine, Kit. 2006c. “Relatively Unrestricted Quantification.” In Absolute Generality, edited by A. Rayo and G. Uzquiano. Oxford: Clarendon Press.
  • Fine, Kit. 2007a. “Response to Kathrin Koslicki.” dialectica 61 (1): 161-66.
  • Fine, Kit. 2007b. Semantic Relationism. Oxford: Blackwell Publishing.
  • Fine, Kit. 2008a. “Coincidence and Form.” Proceedings of the Aristotelian Society, Supplementary Volume 82 (1): 101-18.
  • Fine, Kit. 2008b. “The Impossibility of Vagueness.” Philosophical Perspectives 22 (Philosophy of Language): 111-36.
  • Fine, Kit. 2009. “The Question of Ontology.” In Metametaphysics: New Essays on the Foundations of Ontology, edited by D. Chalmers, D. Manley, and R. Wasserman. Oxford: Oxford University Press.
  • Fine, Kit. 2010a. “Semantic Necessity.” In Modality: Metaphysics, Logic, and Epistemology, edited by B. Hale and A. Hoffmann. Oxford: Oxford University Press.
  • Fine, Kit. 2010b. “Some Puzzles of Ground.” Notre Dame Journal of Formal Logic 51 (1): 97-118.
  • Fine, Kit. 2010c. “Towards a Theory of Part.” The Journal of Philosophy 107.
  • Fine, Kit. 2011a. “Aristotle’s Megarian Manoeuvres.” Mind 120 (480): 993-1034.
  • Fine, Kit. 2011b. “What is Metaphysics?” In Contemporary Aristotelian Metaphysics, edited by T. E. Tahko. Cambridge: Cambridge University Press.
  • Fine, Kit. 2012a. “Counterfactuals without Possible Worlds.” The Journal of Philosophy 109 (3): 221-46.
  • Fine, Kit. 2012b. “A Difficulty for the Possible Worlds Analysis of Counterfactuals.” Synthese 189 (1): 29-57.
  • Fine, Kit. 2012c. “Guide to Ground.” In Metaphysical Grounding: Understanding the Structure of Reality, edited by F. Correia and B. Schnieder. Cambridge: Cambridge University Press.
  • Fine, Kit. 2012d. “The Pure Logic of Ground.” The Review of Symbolic Logic 5 (1): 1-25.
  • Fine, Kit. 2013a. “Fundamental Truths and Fundamental Terms.” Philosophy and Phenomenological Research 87 (3): 725-32.
  • Fine, Kit. 2014a. “Permission and Possible Worlds.” dialectica 68 (3): 317-36.
  • Fine, Kit. 2014b. “Truth-Maker Semantics for Intuitionistic Logic.” Journal of Philosophical Logic 43: 549-77.
  • Fine, Kit. 2015a. “The Possibility of Vagueness.” Synthese 194 (10): 3699-725.
  • Fine, Kit. 2015b. “Unified Foundations for Essence and Ground.” Journal of the American Philosophical Association 1 (2): 296-311.
  • Fine, Kit. 2016a. “Angellic Content.” Journal of Philosophical Logic 45 (2): 199-226.
  • Fine, Kit. 2016b. “Identity Criteria and Ground.” Philosophical Studies 173 (1): 1-19.
  • Fine, Kit. 2016c. “Williamson on Fine on Prior on the Reduction of Possibilist Discourse.” Canadian Journal of Philosophy 46 (4-5): 548-70.
  • Fine, Kit. 2017a. “Form.” The Journal of Philosophy CXIV (10): 509-35.
  • Fine, Kit. 2017b. “Naive Metaphysics.” Philosophical Issues 27: 98-113.
  • Fine, Kit. 2017c. “Truthmaker Semantics.” In A Companion to the Philosophy of Language, edited by B. Hale, C. Wright and A. Miller. West Sussex: Wiley-Blackwell.
  • Fine, Kit. 2018a. “Ignorance of Ignorance.” Synthese 195 (9): 4031-45.
  • Fine, Kit. 2019c. “Constructing the Impossible.” To appear in a collection of papers for Dorothy Edgington.
  • Fine, Kit. 2020a. “Semantics.” In The Routledge Handbook of Metaphysical Grounding, edited by M. J. Raven. New York: Routledge.
  • Fine, Kit, and Ben Fine. 1974d. “Social Choice and Individual Rankings I.” Review of Economic Studies 41: 303-22.
  • Fine, Kit, and Ben Fine. 1974e. “Social Choice and Individual Rankings II.” Review of Economic Studies 41: 459-75.
  • Fine, Kit, and Timothy McCarthy. 1984b. “Truth without Satisfaction.” Journal of Philosophical Logic 13 (4): 397-421.
  • Fine, Kit, and Gerhard Schurz. 1996. “Transfer Theorems for Multimodal Logics.” In Logic and Reality: Essays on the Legacy of Arthur Prior, edited by J. Copeland. Oxford: Clarendon.
  • Frege, Gottlob. 1892. “On Sense and Reference.” In Translations from the Philosophical Writings of Gottlob Frege, edited by P. T. Geach and M. Black. Oxford: Blackwell.
  • Hume, David. 1739. A Treatise of Human Nature, edited by L. A. Selby-Bigge and P. H. Nidditch. Oxford: Clarendon Press.
  • Kripke, Saul. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.
  • Kripke, Saul. 2011. “A Puzzle about Belief.” In Philosophical Troubles: Collected Papers, Volume I. Oxford: Oxford University Press.
  • Leibniz, Gottfried Wilhelm. 1714. Monadology.
  • Lewis, David. 1973. Counterfactuals. Oxford: Blackwell Publishers.
  • Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell Publishers.
  • Lewis, David. 1991. Parts of Classes. Oxford: Blackwell.
  • Locke, John. 1689. An Essay Concerning Human Understanding.
  • McTaggart, J. M. E. 1908. “The Unreality of Time.” Mind 17: 457-74.
  • Prior, A. N. 1960. “The Autonomy of Ethics.” Australasian Journal of Philosophy 38 (3): 199-206.
  • Quine, Willard Van Orman. 1948. “On What There is.” Review of Metaphysics 2: 21-38. Reprinted in From a Logical Point of View, 2nd ed., Harvard: Harvard University Press, 1980, 1-19.
  • Raven, Michael J. 2019. “(Re)discovering Ground.” In Cambridge History of Philosophy, 1945 to 2015, edited by K. M. Becker and I. Thomson. Cambridge: Cambridge University Press.
  • Raven, Michael J., ed. 2020. The Routledge Handbook of Metaphysical Grounding. New York: Routledge.
  • Spinoza, Baruch. 1677. Ethics, Demonstrated in Geometrical Order.
  • Stalnaker, Robert. 1968. “A Theory of Conditionals.” In Studies in Logical Theory, edited by N. Rescher. Oxford: Blackwell.
  • Williamson, Timothy. 2013b. Modal Logic as Metaphysics. Oxford: Oxford University Press.

 

Author Information

Mike Raven
Email: mike@mikeraven.net
University of Victoria
Canada


Immanuel Kant: Logic

For Immanuel Kant (1724–1804), formal logic is one of three paradigms for the methodology of science, along with mathematics and modern-age physics. Formal logic owes this role to its stability and relatively finished state, which Kant claims it has possessed since Aristotle. Kant’s key contribution lies in his focus on the formal and systematic character of logic as a “strongly proven” (apodictic) doctrine. He insists that formal logic should abstract from all content of knowledge and deal only with our faculty of understanding (intellect, Verstand) and our forms of thought. Accordingly, Kant considers logic to be short and very general but, on the other hand, apodictically certain. Unlike his contemporaries, Kant proposed excluding from formal logic all topics that do not properly belong to it (for example, psychological, anthropological, and metaphysical problems). At the same time, he distinguished the abstract certainty (that is, certainty “through concepts”) of logic (and philosophy in general) from the constructive evidence of mathematical knowledge. The idea of formal logic as a system led Kant to fundamental questions, including questions about the first principles of formal logic, redefinitions of logical forms with respect to those first principles, and the completeness of formal logic as a system. Through this approach, Kant raised some essential problems that later motivated the rise of modern logic. Kant’s remarks and arguments on a system of formal logic are spread throughout his works (including his lectures on logic). Nonetheless, he never published an integral, self-contained presentation of formal logic as a strongly proven doctrine. A lively dispute has thus developed among scholars about how to reconstruct his formal logic as an apodictic system, in particular concerning his justification of the completeness of his table of judgments.

One of Kant’s main results is his establishment of transcendental logic, a foundational part of philosophical logic that concerns the possibility of the strictly universal and necessary character of our knowledge of objects. Formal logic provides transcendental logic with a basis (“clue”) for establishing its fundamental concepts (categories), which can be obtained by reinterpreting the logical forms of judgment as the forms of intuitively given objects. Similarly, forms of inference provide a “clue” for transcendental ideas, which lead to higher-order and meta-logical perspectives. Transcendental logic is crucial to and forms the largest part of Kant’s foundations of metaphysics, as they are critically investigated and presented in his main work, the Critique of Pure Reason.

This article focuses on Kant’s formal logic in the systematic order of logical forms and outlines Kant’s approach to the foundations of formal logic. The main characteristics of Kant’s transcendental logic are presented, including his system of categories and transcendental ideas. Finally, a short overview is given of the subsequent role of Kant’s logical views.

Table of Contents

  1. Introduction
  2. The Concept of Formal Logic
  3. Concept
  4. Judgment
    1. Quantity and Quality
    2. Relation
    3. Modality
    4. Systematic Overview
  5. Inference
    1. Inference of Understanding (Immediate Consequence)
    2. Inference of Reason (Syllogism)
    3. Inference of the Power of Judgment (Induction and Analogy)
    4. Fallacious Inference
  6. General Methodology
  7. The Foundations of Logic
  8. Transcendental Logic (Philosophical Logic)
    1. A Priori–A Posteriori; Analytic–Synthetic
    2. Categories and the Empirical Domain
    3. Transcendental Ideas
  9. Influences and Heritage
  10. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

Presentations of the history of logic published at the beginning of the 21st century seem to positively re-evaluate Kant’s role, especially with regard to his conceptual work that led to a new development of logic (see, for example, Tiles 2004). Although older histories of logic written from the standpoint of mathematical logic did appreciate Kant’s restitution of the formal side of logic, they ascribed to Kant a relatively unimportant role. They criticized him for his apparent view that logic principally does not exceed its traditional, Aristotelian boundaries (Kneale and Kneale 1991) and for his principled separation of logic and mathematics (Scholz 1959). Nevertheless, during the 20th century, some Kant scholars confirmed and extensively elaborated on his relevance to mathematical logic (for example, Wuchterl 1958, Schulthess 1981). Moreover, it is significant that several founders of modern logic (including Frege, Hilbert, Brouwer, and Gödel) explicitly referred to and built upon aspects of Kant’s philosophy.

According to Kant, formal logic appears to be an already finished science (accomplished by Aristotle), in which essentially no further development is possible (B VIII). In fact, some of Kant’s statements leave the impression that his views of formal logic may have been largely compiled from contemporary logic textbooks (B 96). Nonetheless, Kant mentions that the logic of his contemporaries was not free of insufficiencies (Prolegomena IV:323). He organized the existing material of formal logic in a specific way; he separated the extraneous (for instance, the psychological, anthropological, and metaphysical) material from formal logic proper. What is particularly important for Kant are his redefinitions of logical forms in terms of formal unity and consciousness. These redefinitions are indispensable for his main contributions: his systematic view of formal logic and the application of this view in transcendental logic.

It also became apparent, primarily due to K. Reich’s 1948 monograph, that Kant’s systematic view of formal logic assumed, as an essential component, a justification of the completeness of formal logic with respect to the forms of our thinking. This conforms with Kant’s critique of Aristotle for his unsystematic, “rhapsodical” approach in devising the list of categories, since Kant intended to repair this deficiency by setting up a system of categories specifically on the basis of formal logic.

Finally, the contemporary development of logic, where logic has far exceeded the shape of a standard (“classical”) mathematical logic, has made it technically possible to explore some features of Kant’s logic that have largely escaped the attention of the earlier, “classically” based perception of Kant’s logic.

Although formal logic is the starting point of Kant’s philosophy, there is no separate text in which Kant systematically, in a strictly scientific way, presented formal logic as a doctrine. Essential parts of this doctrine, however, are contained in his published works, especially those on the foundations of metaphysics, in his handwritten lecture notes on logic (with the addition of Jäsche’s compilation), and in the transcripts of Kant’s lectures on logic. These lectures are based primarily on the textbook by G. F. Meier; and, according to the custom of the time, they include a large amount of material that does not strictly pertain to formal logic. Kant’s view was that it was harmful to beginners to receive instruction in a highly abstract form, in contrast to their concrete and intuitive way of thinking (compare II:305‒306). Nevertheless, many places in Kant’s texts and lectures are pertinent to or reflect the systematic aspect of logic. On this ground, it is possible to reconstruct and describe most of the crucial details of Kant’s doctrine of formal logic.

The reason Kant did not write a systematic presentation of formal logic can be attributed to his focus on metaphysics and the possibility of its foundations. Besides, he might have presumed that the systematic doctrine of formal logic could be recognized from the sections and remarks he had included about it in his written work, at least to the extent to which formal logic was necessary to understand his argument on the foundations of metaphysics. Furthermore, Kant thought that once the principles were determined, a formal analysis (as is required in logic) and a complete derivation of a system could be relatively easily accomplished with the additional help of existing textbooks (see B 27‒28, 108‒109, A XXI: “more entertainment than labor”).

We first present Kant’s doctrine of formal logic, that is, his theory of concepts, judgments and inference and his general methodology. Then, we address the question of the foundations of logic and its systematic character. Finally, we outline Kant’s transcendental logic (that is, logical foundations of metaphysics), especially in relation to formal logic, and give a brief overview of his historical influence.

2. The Concept of Formal Logic

What we here term “formal logic” Kant usually calls “general logic” (allgemeine Logik), in accordance with some of his contemporaries and predecessors (Jungius, Leibniz, Knutzen, Baumgarten). Kant only rarely uses the terms “formal logic” (B 170, also mentioned by Jungius) or “formal philosophy” (Groundwork of the Metaphysics of Morals IV:387), and he preferred to define “logic” in this general sense as a science of the “formal rules of thinking,” rather than merely a general doctrine of understanding (Verstand) (XVI refl. 1624; see B IX, 78, 79, 172). Let us note the distinction between Kant’s use of the term “formal philosophy” and its contemporary use (philosophy in which modern formalized methods are applied).

The following are the essential features of Kant’s formal logic (see B 76‒80):

(1) Formal logic is general inasmuch as it disregards the content of our thought and the differences between objects. It deals instead only with the form and general rules of thought, and it can serve only as a canon for judging the correctness of thought. By contrast, a special logic pertains to a special kind of objects and is conjoined with some special science as its organon to extend the content of knowledge.

(2) Formal logic is pure, as it is not concerned with the psychological empirical conditions under which we think and that influence our thought. These psychological conditions are dealt with in applied logic. In general, pure logic does not incorporate any empirical principles, and according to Kant, it is only in this way that it can be established as a science that proves its propositions with strong certainty.

Formal logic should abstract from the distinction of whether the content to which logical forms apply is pure or empirical. Therefore, formal logic is distinguished from transcendental logic, which is a special logic of pure (non-empirical) thinking and which deals with the origin of our cognitions that is independent of given objects. However, transcendental logic is, in a sense, also general, because it deals with the general content of our thought—that is, with the categories that determine all objects.

It is clear that Kant conceives logical forms, as forms of thought, in mentalistic, albeit not psychological, terms. For him, forms of thought are ways of establishing a unity of our consciousness with respect to a given variety of representations. In this context, consciousness comes into play quite abstractly as the most general instance of unity, since ultimately it is we ourselves, in our own consciousness, who unite and link the representations given to us. This abstract (non-empirical) unity is to be distinguished from a mere psychological association of representations, which is dispersed and dependent on changing subjective states, and thus cannot establish unity.

By using a mentalistic approach, Kant stresses the operational character of logic. For him, a logical form is a result of the abstract operations of our faculty of understanding (Verstand), and it is through these operations that a unity of our representations can be established. In connection with this, Kant defines function as “the unity of the action [Handlung] of ordering different representations under a common one” (B 93) and he considers logical forms to be based on functions. We see in more detail below how Kant applies his concept of function to logical forms. Further historical development and modifications of Kant’s notion of function can be traced in Frege’s notion of “concept” and Russell’s “propositional functions.”

3. Concept

According to Kant, the unity that a concept establishes from a variety of representations is a unity in a common mark (nota communis) of objects. The form of a concept as a common mark is universality, and its subject matter is objects. Three types of operations of understanding bring about a concept: comparison, reflection, and abstraction.

(1) Through comparison, as a preparatory operation, we become conscious of the identity and difference of objects, and come to an identical mark that is contained in representations of many things. This is a common mark of these things, which is a “partial concept” contained in their representations; other marks may also be contained in these representations, making the things different from one another.

(2) Through reflection, which is essential for concept formation, we become conscious of a common mark as belonging to and holding of many objects. This is a “ground of cognition” (Erkenntnisgrund) of objects, which universally holds of them. Universality (“universal validity”) is the form through which we conceive many objects in one and the same consciousness.

(3) Through abstraction, we leave out (“abstract from”) the differences between objects and retain only their common mark in our consciousness.

Kant characterizes the sort of unity that is established by a concept in the following, foundational way. Each concept, as a common mark that is found in many representations, has an analytic unity (identity) of consciousness “on itself.” At the same time, the concept is presupposed to belong to these possibly composite representations, where it is combined (synthesized) with the other component marks. That is, each concept presupposes a synthetic unity of consciousness (B 134 footnote).

On the ground of this functional theory of concepts, Kant explains the distinction between the content (intension) and the extension (sphere) of a concept. This distinction stems from the so-called Port-Royal logic (by A. Arnauld and P. Nicole) of the 17th century and has since become standard in so-called traditional logic (that is, in logic before or independent of its transformation starting with Boole and Frege’s methodology of formalization). According to Kant, concept A has a content in the sense that A is a “partial concept” contained in the representation of an object; concept A has extension (sphere) in the sense that A universally holds of many objects that are contained under A (Jäsche Logic §7 IX:95, XVI refl. 2902, Reich 1948 p. 38). The content of A can be complex, that is, it can contain many marks in itself. The content and extension of a concept A stand in an inversely proportional relationship: the more concept A contains under itself, the less A contains in itself, and vice versa.

A traditional doctrine (mainly originating from Aristotle) of the relationship between concepts can also be built on the basis of Kant’s theory of concepts. A concept B can be contained under A if A is contained in B, that is, as Kant says, if A is a mark (a “ground of cognition”) of B. In this case, Kant calls A a higher concept with respect to B, and B a lower concept with respect to A. Kant also says that A is a “mark of a mark” of B (a distant mark). Obviously, A is not meant as a second-order mark but rather as a mark of the same order as B. Also, A is a genus of B, while B is a species of A. Through abstraction, we ascend to higher and higher concepts; through determination, we descend to lower and lower concepts. The relationship between higher and lower concepts is subordination, and the relationship between lower concepts among themselves without mutual subordination is coordination. According to Kant, there is no lowest species, because we can always add a new mark to a given concept and thus make it more specific. Finally, with respect to extension, a higher concept is wider, and a lower concept is narrower. Concepts with the same extension are called reciprocal.
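The reciprocity of content and extension lends itself to a small computational illustration. In the sketch below (a toy, with invented objects and marks), a concept is identified with its content, a set of marks, and its extension is computed as the objects bearing every one of those marks; enriching the content visibly shrinks the extension.

    # Toy model: each object is assigned its set of marks.
    objects = {
        'Rex':      {'animal', 'dog'},
        'Felix':    {'animal', 'cat'},
        'Socrates': {'animal', 'rational'},
    }

    def extension(content):
        # Objects contained under the concept: those whose marks include
        # everything the concept contains in itself.
        return {o for o, marks in objects.items() if content <= marks}

    animal = {'animal'}
    dog = {'animal', 'dog'}     # greater content: one more mark

    print(extension(animal))    # {'Rex', 'Felix', 'Socrates'}
    print(extension(dog))       # {'Rex'} -- more in itself, less under itself

    # Subordination: 'dog' is lower than 'animal' because the higher
    # concept is contained in (is a mark of) the lower one:
    print(animal <= dog)        # True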

4. Judgment

Judgment is for Kant the way to bring given representations to the objective unity of self-consciousness (see B 141, XVI refl. 3045). Because of this unifying of a manifold (of representations) in one consciousness, Kant conceives judgment as a rule (Prolegomena §23 IV:305, see Jäsche Logic §60 IX:121). For example, the objective unity is the meaning of the copula “are” in the judgment “All bodies are heavy”; what is meant is not our subjective feeling of heaviness, but rather the objective state of affairs that bodies are heavy (see B 142), which is representable by a thinking agent (“I”) irrespective of the agent’s changeable psychological states.

As Kant points out, there is no other logical use of concepts except in judgments (B 93), where a concept, as a predicate, is related to objects by means of another representation, a subject. No concept is related to objects directly (like intuition). In a judgment, a concept becomes an assertion (predicate) that is related to objects under some condition (subject) by means of which objects are represented. A logical unity of representations is thus established in the following way: many objects that are represented by means of some condition A are subsumed under some general assertion B, under which other conditions A′, A″, . . . may also be subsumed. The unity of a judgment is objective, since it is conditioned by a representation (a subject concept or a judgment) that is objective or related to objects. The objective unity in a judgment is generalized by Kant so as to hold not merely between concepts (subject and predicate), but also between judgments themselves (as parts of a hypothetical or a disjunctive judgment).

According to Kant, the aspects and types of the unity of representations in a judgment can be exhaustively and systematically described and brought under the four main “titles”: quantity, quality, relation, and modality. This is a famous division of judgments that became standard in traditional logic after Kant.

a. Quantity and Quality

The assertion of a judgment can be related to its condition of objectivity without any exception or with a possible exception. In the first case, the judgment is universal (for example, “All A are B”), and in the second case, it is particular (for example, “Some A are B”).

With respect to a given condition of objectivity, an assertion is combined or not combined with it. In the first case, the judgment is affirmative (for example, “Some A are B”), while in the second case, it is negative (for example, “Some A are not B”).

If taken together, quantity and quality yield the four traditionally known (Aristotelian) types of judgment: universal affirmative (“All A are B,” AaB), universal negative (“No A is B,” AeB), particular affirmative (“Some A are B,” AiB), and particular negative (“Some A are not B,” AoB).

b. Relation

In a judgment, an assertion is brought under some condition of objective validity. There are three possible relations of the condition of objective validity to the assertion—subject–predicate, antecedent–consequent, and whole–members—each one represented by an appropriate exponent (“copula” in a wider sense).

(1) In a categorical judgment, a concept (B) as a predicate is brought under the condition of another concept (A) that is a subject that represents objects. Predicate B is an assertion that obtains its objective validity by means of the subject A as the condition:

x, which is contained under A, is also under B (XVI refl. 3096, Jäsche Logic §29 IX:108, symbols modified).

The relation of a categorical judgment is represented by the copula “is.” A categorical judgment stands under the principle of contradiction, which is formulated by Kant in the following way:

 No predicate contradictory of a thing can belong to it (B 190).

Hence, there is no violation of the principle of contradiction in stating “A is B and non-B” so long as neither B nor non-B contradicts A. Note, however, that “and” is not a logical operator for Kant, since it can be relativized by time: “A is B” and “A is non-B” can both be true, but at different moments in time (B 192). (Thus, Kant’s logic of categorical judgments can be considered “paraconsistent,” in the sense that p and not-p, while not violating the law of contradiction, do not entail an arbitrary judgment.)

(2) In a hypothetical judgment, some judgment (say, categorical), q, is an assertion that obtains its objective validity under the condition of another judgment, p: q is called a consequent, p its antecedent (ground), while their relation is what Kant calls (in accordance with other logics of the time) consequence. The exponent of the hypothetical judgment is “if . . . then . . .,” but it need not correspond to the main operator of a judgment in the sense of the syntax in modern logic. This means that a hypothetical judgment is not simply a conditional, since, for instance, it should also include universally quantified propositions like “If the soul is not composite, then it is not perishable,” which could be naturally formalized as ∀x ((Sx ∧ ¬Cx) → ¬Px) (compare Dohna-Wundlacken Logic XX-II:763; see examples in LV-I:203, LV-II:472). Let us note that “If something is a human, then it is mortal” is for Kant a hypothetical judgment, in distinction to the categorical judgment “All humans are mortal” (Vienna Logic XX-II:934, Hechsel Logic LV-II:31).

A hypothetical judgment stands under the principle of sufficient reason:

Each assertion has its reason.

Not having a reason contradicts the concept of assertion. By this principle (to be distinguished from Leibniz’s ontological principle of the same name), q and not-q are excluded as consequents of the same antecedent: they cannot be grounded on one and the same reason. As can be seen, only now do we come to a version of the Aristotelian principle of contradiction, according to which no predicate can “simultaneously” belong and not belong to the same subject. On the other hand, we have no guarantee that there will always be an antecedent sufficient to decide between some p and not-p as its possible consequents. (In this sense, it could be said that Kant’s logic of assertions is “paracomplete.”)

(3) In a disjunctive judgment, the component judgments are parts of some whole (the disjunctive judgment itself) as their condition of objective validity. That is, the objectively valid assertion is one of the mutually exclusive but complementary parts of the whole, for example:

x, which is contained under A, is contained either under B or C, etc. (XVI refl. 3096, Jäsche Logic §29 IX:108).

The exponent of the disjunctive relation is “either . . . or . . .” in the exclusive sense, and, again, it should not be identified with the main operator in the modern sense. To see this, let us take Kant’s example of a disjunctive judgment, “A learned man is learned either historically or rationally,” which would, in a modern formalization, give a universally quantified sentence ∀x (Lx → (Hx ∨ Rx)) (Jäsche Logic §29 IX:107).

In a disjunctive judgment, under the condition of an objective whole, some of its parts hold with the exclusion of the rest of the parts. A disjunctive judgment stands under the principle of excluded middle between p and not-p, since it is a contradiction to assert (or to deny) both p and not-p.

Remark. With respect to relation, a judgment is gradually made more and more determinate: from allowing mutually contradictory predicates, to excluding such contradictions on some ground but allowing undecidedness among them, to positing exactly one of the contradictory predicates by excluding the others. Through the three relations in a judgment, we step by step upgrade the conditions of a judgment, improve its unity, and strengthen logical laws, starting from paraconsistency and paracompleteness to finally come to a sort of classical logic.

In general, we can see that relation is what the objective unity of consciousness in a judgment basically consists in: it is a unifying function that (in three ways) relates a manifold of given representations to some condition of their objectivity. Since judgment is generally defined as a manner of bringing our representations to the objective unity of consciousness, the relation of a judgment makes the essential aspect of a judgment.

c. Modality

 This is one of the most distinctive parts of Kant’s logic, revealing its purely intensional character. One and the same judgment structure (quantity, quality, and relation of a judgment) can be thought by means of varying and increasing strength as possible, true, and necessary. Correspondingly, Kant distinguishes

(1) problematic,

(2) assertoric, and

(3) apodictic

judgments (assertoric judgment is called “proposition,” Satz). For example, the antecedent p of a hypothetical judgment is thought merely as problematic (“if p”); secondly, p can also occur outside a hypothetical judgment as, for some reason, an already accepted judgment, that is, as assertoric; finally, p can occur as necessarily accepted on the ground of logical laws, thus apodictic.

These modes of judgment pertain just to how a judgment is thought, that is, to the way the judgment is accepted by understanding (Verstand). Kant says that (1) problematic modality is a “free choice,” an “arbitrary assumption,” of a judgment; (2) assertoric modality (in a proposition) is the acceptance of a judgment as true (logical actuality); while (3) apodictic modality consists in the “inseparable” connection with understanding (see B 101).

There is no special exponent (or operator) of modality; modality is just the “value,” the “energy,” with which the existing exponent of a relation in a judgment is thought. Modality is thus distinguished in an essential sense from quantity, quality, and relation, which, by contrast, constitute the logical content of a judgment (see B 99‒100; XVI refl. 3084).

Despite a very specific nature of modality, it is in a significant way—through logical laws—correlated with the relation of a judgment:

(1) logical possibility of a problematic judgment is judged with respect to the principle of contradiction—no judgment that violates this principle is logically possible;

(2) logical actuality (truth) of an assertoric judgment is judged with respect to the grounding of the judgment on some sufficient reason;

(3) logical necessity of an apodictic judgment is judged with respect to the decidability of the judgment on the ground of the principle of excluded middle

(see Kant’s letter to Reinhold from May 19, 1789 XI:45; Reich 1948 pp. 73‒76).

The interconnection of relation and modality is additionally emphasized by the fact that Kant sometimes united these two aspects under the title of queity (quaeitas) (XVI refl. 3084, Reich 1948 pp. 60‒61).

d. Systematic Overview

 


Kant gives an overview of his formal logical doctrine of judgments by means of his table of judgments which, in its formal-logical version summarizing the moments discussed above, runs as follows:

  • Quantity: universal, particular
  • Quality: affirmative, negative
  • Relation: categorical, hypothetical, disjunctive
  • Modality: problematic, assertoric, apodictic

In his transcendental logic, Kant adds singular and infinite judgments as special judgment types. In formal logic (as was usual in logic textbooks of Kant’s time), they are subsumed under universal and affirmative judgments, respectively (see B 96‒97). A characteristic departure from the custom of 17th- and 18th-century logic textbooks is Kant’s (generalized) aspect of relation, which is not reducible to the subject–predicate relation, and directly comprises categorical, hypothetical, and disjunctive judgments—bypassing, for example, subdivision into simple and compound judgments. Another divergence from the custom of the time is Kant’s understanding of modality as independent of explicit modal expressions (“necessarily,” “contingently,” “possibly,” “impossibly”). Instead, Kant understands modality as an intrinsic moment of each judgment (for example, the antecedent and the consequent of a hypothetical judgment are as such problematic, and the consequence between them is assertoric), in distinction to the customary division into “pure” and “modal” propositions. The result of this was a more austere system of judgments that is reduced to strictly formal criteria in Kant’s sense and avoids the admixture of psychological, metaphysical, or anthropological aspects (B VIII).

Kant’s table of judgments has a systematic value within his formal logic. The fact that Kant uses the tabular method to give an overview of the doctrine of judgments shows, according to his methodological view on the tabular method (Section 6), that he is only summarizing a systematic whole of knowledge. Formal logic, as a system, is a “demonstrated doctrine” (Section 6), where everything “must be certain completely a priori” (B 78, compare many other places like B IX; A 14; Prolegomena IV:306; Groundwork of the Metaphysics of Morals IV:387; XVI refl. 1579 p. 21, 1587, 1620 p. 41, 1627, 1628; Preisschrift XX:271). Kant’s text supports the view that his formal logic should include a systematic, a priori justification of his table of judgments, despite dispute among scholars about how this justification can be reconstructed (see Section 7).

5. Inference

In an inference, a judgment is represented as “unfailingly” (that is, a priori, necessarily) connected with (and “derived” from) another judgment that is its ground (see B 360).

Kant distinguishes two ways we can derive a judgment (conclusion) from its ground:

(a) by the formal analysis of a given judgment (ground, premise), without the aid of any additional judgment—such an inference, which is traditionally known as immediate consequence, Kant calls an inference of understanding (Verstandesschluß, B 360);

(b) by the subsumption under some already accepted judgment (major premise) with the aid of some mediate judgment (additional, minor premise)—this is an inference of reason (Vernunftschluß), that is, a syllogism (B 360, compare, for example, XVI refl. 3195, 3196, 3198, 3201).

Kant distinguishes between “understanding” (Verstand) and “reason” (Vernunft) in the following way: understanding is the faculty of the unity of representations (“appearances”) by means of rules, while “reason” is the faculty of the unity of rules by means of principles (see B 359, 356, 361). Obviously, inference of understanding essentially remains at the unity already established by means of a given judgment (rule), whereas inference of reason starts from a higher unity (principle) under which many judgments can be subsumed.

Additionally, we can infer a conclusion by means of a presumption on the ground of already accepted judgments. This inference Kant names inference of the power of judgment (Schluß der Urteilskraft), but he does not consider it to belong to formal logic in a proper sense, since its conclusion, because of possible exceptions, does not follow with necessity.

a. Inference of Understanding (Immediate Consequence)

This part of Kant’s logical theory includes a variant of the traditional (Aristotelian) doctrine of immediate consequence, but as grounded in Kant’s previously presented theory of judgment. According to Kant, in an inference of understanding, we merely analyze a given judgment with respect to its logical form. Thus, Kant divides inference of understanding in accordance with his division of judgments:

(a) with respect to the quantity of a judgment, an inference is possible by subalternation: from a universal judgment to its corresponding particular judgment of the same quality (AaB / AiB, AeB / AoB);

(b) with respect to the quality of a judgment, an inference is possible according to the square of opposition (which usually includes subalternation): of the contradictories (AaB and AoB, AeB and AiB), one is true and the other false; of the contraries (AaB and AeB), at least one is false; of the subcontraries (AiB and AoB), at least one is true;

(c) with respect to the relation of a judgment, there is an inference by conversion (simple or changed): if B is (not) predicated of A, then A is (not) predicated of B (AaB / BiA, AeB / BeA, AiB / BiA);

(d) with respect to modality, an inference is possible by contraposition (for example AaB / non-BeA); Kant assigns contraposition to modality because the contraposition changes the logical actuality of the premise (proposition) to the necessity of the conclusion; that is, granted the premise, the conclusion expresses the exclusion (opposite) of self-contradiction (XVI refl. 3170, Hechsel Logic LV-II:448): granted AaB, non-B contradicts A (also, granted AeB or AoB, universal exclusion of non-B contradicts A, that is, non-BiA follows).

These inferences are valid on the ground of Kant’s assumption of the non-contradictory subject concept. Otherwise, if the subject concept is self-contradictory (nothing can be thought by it), then both contradictories would be false. For example, “A square circle is round” and “A square circle is not round” are both false due to the principle of contradiction (Prolegomena §52b IV:341, B 821: “both what one asserts affirmatively and what one asserts negatively of the object [of an impossible concept] are incorrect”; see B 819, 820‒821).
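The role of the non-contradictory subject concept can be checked with a small computational sketch. The following Python fragment is our illustration, not Kant’s: the extensional reading of the four categorical forms, the finite universe, and all names in it are assumptions made for the example.

    # A minimal sketch (not from Kant's text): the four categorical forms
    # read extensionally, used to check the immediate inferences above.
    from itertools import combinations

    U = frozenset(range(4))  # a small universe of individuals

    def subsets(s):
        """All subsets of s, as frozensets."""
        return [frozenset(c) for r in range(len(s) + 1)
                for c in combinations(sorted(s), r)]

    def a(A, B): return A <= B        # AaB: all A are B
    def e(A, B): return not (A & B)   # AeB: no A is B
    def i(A, B): return bool(A & B)   # AiB: some A is B
    def o(A, B): return not A <= B    # AoB: some A is not B

    def valid(premise, conclusion, nonempty_subject):
        """Does premise(A, B) entail conclusion(A, B) for all extensions,
        optionally requiring a non-empty (non-contradictory) subject A?"""
        return all(not premise(A, B) or conclusion(A, B)
                   for A in subsets(U) if A or not nonempty_subject
                   for B in subsets(U))

    # Subalternation AaB / AiB presupposes a non-contradictory subject:
    print(valid(a, i, nonempty_subject=False))  # False
    print(valid(a, i, nonempty_subject=True))   # True
    # Contrariety (AaB and AeB are never both true) likewise:
    both = lambda A, B: a(A, B) and e(A, B)
    print(valid(both, lambda A, B: False, nonempty_subject=False))  # False
    print(valid(both, lambda A, B: False, nonempty_subject=True))   # True
    # Simple conversion AeB / BeA holds on this reading unconditionally:
    print(valid(e, lambda A, B: e(B, A), nonempty_subject=False))   # True

On this extensional reading, subalternation and contrariety fail exactly when the subject concept is empty, while, for example, simple conversion survives; this mirrors the qualification Kant makes for self-contradictory subject concepts.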

b. Inference of Reason (Syllogism)

Kant considers inference of reason within a variant of the traditional theory of syllogisms, which includes categorical syllogism (substantially reduced to the first syllogistic figure), hypothetical syllogism, and disjunctive syllogism, all shaped and modified in accordance with his theory of judgments and his conception of logic in general.

Each syllogism starts from a judgment that has the role of the major premise. In Kant’s view, the major premise is a general rule under the condition of which (for example, of its subject concept) a minor premise is subsumed. Accordingly, the condition of the minor premise itself (for example, its subject concept) is subsumed in the conclusion under the assertion of the major premise (for example, its predicate) (B 359‒361, B 386‒387). The major premise becomes in a syllogism a (comparative) principle from which other judgments can be derived as conclusions (see B 357, 358). Since there are three species of judgments with respect to relation, Kant distinguishes three species of syllogisms according to the relation of the major premise (B 361, XVI refl. 3199):

(a) Categorical syllogism. Kant starts from a standard doctrine of first syllogistic figure, where the major concept (predicate of the major premise) is put in relation to the minor concept (subject of the minor premise) by means of the middle concept (the subject of the major and the predicate of the minor premise): MaP, SaM / SaP; MeP, SaM / SeP; MaP, SiM / SiP; MeP, SiM / SoP. Kant insists that only the first figure of the categorical syllogism is an inference of reason, whereas in other figures there is a hidden immediate inference (sometimes reductio ad absurdum is needed) by means of which a syllogism can be transformed into the first figure (B 142 footnote, XVI refl. 3256; see The False Subtlety of the Four Syllogistic Figures in II).

(b) Hypothetical syllogism. The major premise is a hypothetical judgment, in which the antecedent and the consequent are problematic. Subsumption is accomplished by means of the change of the modality of the antecedent (or of the negation of the consequent) to an assertoric judgment (minor premise), from where in the conclusion the assertoric modality of the consequent (or of the negation of the antecedent) follows. The inference from the affirmation of the antecedent to the affirmation of the consequent is modus ponens, and the inference from the negation of the consequent to the negation of the antecedent is modus tollens of the hypothetical syllogism.

(c) Disjunctive syllogism. The major premise is a disjunctive judgment, where the disjuncts are problematic. Subsumption is carried out by the change of the problematic modality of some disjuncts (or their negations) to assertoric modality, from where in the conclusion the assertoric modality of the negation of other disjuncts (the assertoric modality of other disjuncts) follows. The inference from the affirmation of one part of the disjunction to the negation of the other part is modus ponendo tollens, and the inference from the negation of one part of the disjunction to the affirmation of the other part is modus tollendo ponens of the disjunctive syllogism.

In hypothetical and disjunctive syllogisms, there is no middle term (concept). As explained, the subsumption under the rule of the major premise is carried out just by means of the change of the modality of one part (or of its negation) of the major premise (see XVI refl. 3199).
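In modern notation, which is a paraphrase for overview and not Kant’s own symbolism, the four moods just described can be summarized as follows, with “A → B” for the hypothetical judgment, “A ∨ B” for the disjunctive judgment (read exclusively, since Kant’s disjuncts mutually exclude each other), and the minor premise in each case changing a problematic part of the major premise to assertoric modality:

\[
\begin{array}{llll}
\textit{modus ponens:} & A \rightarrow B,\ A & / & B \\
\textit{modus tollens:} & A \rightarrow B,\ \neg B & / & \neg A \\
\textit{modus ponendo tollens:} & A \lor B,\ A & / & \neg B \\
\textit{modus tollendo ponens:} & A \lor B,\ \neg A & / & B
\end{array}
\]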

In Kant’s texts, we can find short indications on how a theory of polysyllogisms should be built (for example, B 364, B 387‒389). Inference can be continued on the side of conditions by means of a prosyllogism, whose conclusion is a premise of a given syllogism (an ascending series of syllogisms), as well as on the side of what is conditioned by means of an episyllogism, whose premise is the conclusion of a given syllogism (a descending series of syllogisms). In order to derive, by syllogisms, a given judgment (conclusion), the ascending totality of its conditions should be assumed (either with some first unconditioned condition or as an unlimited but unconditioned series of all conditions) (B 364). In distinction, a descending series from a given conclusion could be only a potential one, since the acceptance of the conclusion, as given, is already granted by the ascending totality of conditions (B 388‒389). By requiring a given, completed ascending series of syllogisms, we advance towards the highest, unconditioned principles (see B 358). In this way, the logical unity of our representations increases towards a maximum: our reason aims at bringing the greatest manifold of representations under the smallest number of principles and to the highest unity (B 361).

c. Inference of the Power of Judgment (Induction and Analogy)

The inference of the power of judgment is only a presumption (“empirical inference”), and its conclusion a preliminary judgment. On the ground of the accordance in many special cases that stand under some common condition, we presume some general rule that holds under this common condition. Kant distinguishes two species of such an inference: induction and analogy. Roughly,

(a) by induction, we conclude from A in many things of some genus B, to A in all things of genus B: from a part of the extension of B to the whole extension of B;

(b) by analogy, we conclude from many properties that a thing x has in common with a thing y, to the possession by x of all properties of y that have their ground in C as a genus of x and y (C is called tertium comparationis): from a part of a concept C to the whole concept C

(see XVI refl. 3282‒3285).
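Schematically, again in a modern paraphrase rather than Kant’s notation, with \(b_1, \dots, b_n\) the observed cases:

\[
\textit{induction:}\quad A(b_1), \dots, A(b_n),\ \ b_1, \dots, b_n \text{ are } B\ \ \Longrightarrow\ \ \text{all } B \text{ are } A
\]

\[
\textit{analogy:}\quad x \text{ and } y \text{ agree in } P_1, \dots, P_n \text{ grounded in the genus } C\ \ \Longrightarrow\ \ x \text{ has the remaining } C\text{-grounded properties of } y
\]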

What justifies such reasoning is the principle of our power of judgment, which requires that many cases of accordance should have some common ground (by means of belonging to the extension of the same concept or by having the marks of the same concept). However, since we do not derive this common ground with logical necessity, no objective unity is established, but only presumed, as a result of our subjective way of reflecting.

d. Fallacious Inference

For Kant, fallacious inferences should be explained by illusion (Schein, B 353): an inference may seem to be correct if judged on the ground of its appearance (species, Pölitz Logic XXIV-II:595, Warsaw Logic LV-II:649), although the real form of this proposed inference may be incorrect (just an “imitation” of a correct form, B 353, 354). Through such illusions, logic illegitimately becomes an organon to extend our knowledge outside the limits of the canon of logical forms. Kant calls dialectic the part of logic that deals with the discovery and solutions of logical illusions in fallacious inferences (for example, B 390, 354), in distinction to mere analytic of the forms of thought. Formal logic gives only negative criteria of truth (truth has to be in accordance with logical laws and forms), but cannot give any general material criterion of truth, because material truth depends on the specific knowledge about objects (B 83‒84). Formal logic, which is in itself a doctrine, becomes in its dialectical part the critique of fallacies and of logical illusion. In his logic lectures and texts, Kant addresses some traditionally well-known fallacies (for example, sophisma figurae dictionis, a dicto secundum quid ad dictum simpliciter, sophisma heterozeteseos, ignoratio elenchi, Liar). Below, in connection with Kant’s transcendental logic, we mention some of his own characteristic, systematically important examples of fallacies.

6. General Methodology

Since, according to Kant, formal logic abstracts from the differences of objects and hence cannot focus on the concrete content of a particular science, it can only give a short and very general outline of the form of a science, as the most comprehensive logical form. This outline is a mere general doctrine on the formal features of a method and on the systematic way of thinking. On the other hand, many interesting distinctions can be found in Kant’s reflections on general methodology that cast light on Kant’s approach to logic, philosophy, and mathematics.

Building on his concept of the faculty of reason, Kant defines method in general as the unity of a whole of knowledge according to principles (or as “a procedure in accordance with principles,” B 883). By means of a method, knowledge obtains the form of a system and is transformed into a science. Non-methodical thinking (without any order), which Kant calls “tumultuar,” serves, in combination with a method, the variety of knowledge (whereas method itself serves its unity). In a wider sense, Kant speaks of a fragmentary (rhapsodical) method, which consists only in a subjective and psychological connection of thinking (it does not establish a system, but only an aggregate of knowledge; not a science, but merely ordinary knowledge).

In further detail, Kant’s general methodology includes the doctrine of definition, division, and proof—mainly a redefined, traditionally known material, with Kant’s own systematic form.

Let us first say that for Kant a concept is clear if we are conscious of its difference from other concepts. Also, a concept is distinct if its marks are clearly known. Now, definition is, according to Kant, a clear, distinct, complete, and precise (“reduced to a minimal number of marks”) presentation of a concept. Since all these requirements for a definition can be strictly fulfilled only in mathematics, Kant distinguishes various forms of clarification that only partially fulfill the above-mentioned requirements, such as exposition, which is clear and distinct but need not be precise and complete (see XVI refl. 2921, 2925, 2951; B 755‒758). Division is the representation of a manifold under some concept and as interrelated, by means of mutual opposition, within the whole sphere of the concept (see XVI refl. 3025).

Proof provides certainty to a judgment by making distinct the connection of the judgment with its grounds (see XVI refl. 2719). Proofs can be distinguished with respect to the grade of certainty they provide. (1) A proof can be apodictic (strong), in a twofold way: as a demonstration (proof by means of the construction in an intuition, in concreto, as in mathematics) or as a discursive proof (by means of concepts, in abstracto, as in philosophy). In addition, a strong proof can be direct (ostensive), by means of the derivation of a judgment from its ground, or indirect (apagogical), by means of showing the untenability of a consequent of the judgment’s contradictory. In his philosophy, Kant focuses on the examples where indirect proofs are not applicable due to the possibility of dialectical illusion (contraries and subcontraries that only subjectively and deceptively appear to be contradictories, which is impossible in mathematics, B 819‒821). (2) Sometimes the grounds of proof give only incomplete certainty, for instance, empirical certainty (as in induction and analogy), probability, possibility (hypothesis), or merely apparent certainty (fallacious proof) (see Critique of Judgment §90 V:463).

Furthermore, Kant distinguishes the syllogistic and tabular methods. The syllogistic method derives knowledge by means of syllogisms. An already established systematic whole of knowledge is presented in its whole articulation (branching) by the tabular method (as is the case, for example, with Kant’s tables of judgments and categories; see, for example, Pölitz Logic XXIV-II:599, Dohna-Wundlacken Logic XXIV-II:80, Hechsel Logic LV-II:494). In addition, the division of the syllogistic method into the synthetic (progressive) and analytic (regressive) is important. The former proceeds from the principles to what is derived, from elements (the simple) to the composed, from reasons to what follows from them, whereas the latter proceeds the other way around, from what is given to its reasons, elements, and principles. (For the application of these two syllogistic methods in metaphysics, see, for instance, B 395 footnote.)

Finally, Kant comes to the following three general methodological principles (B 685‒688):

(1) the principle of “homogeneity of the manifold under higher genera”;

(2) the principle of specification, that is, of the “variety of the homogeneous under lower species”;

(3) the principle of continuity of the transition to higher genera and to lower species.

These principles correspond to the three interests of the faculty of reason: the interests of unity, manifold, and affinity. Again, all three principles are just three sides of one and the same, most general, principle of the systematic (thoroughgoing) unity of our knowledge (B 694).

The end result of the application of methodology in our knowledge is a “demonstrated doctrine,” which derives knowledge by means of apodictic proofs. It is accompanied by a corresponding discipline, which, by means of critique, prevents and corrects logical illusion and fallacies.

7. The Foundations of Logic

As stated by Kant, formal logic itself should be founded and built according to strict criteria, as a demonstrated doctrine. It should be a “strongly proven,” “exhaustively presented” system (B IX), with the “a priori insight” into the formal rules of thinking “through mere analysis of the actions of reason into their moments” (B 170). Since in formal logic “the understanding [Verstand] has to do with nothing further than itself and its own form” (B IX), formal logic should be grounded in the condition of the possibility of the understanding in the formal sense, and this condition is technically (operationally) defined by Kant as the unity of pure (original) self-consciousness (apperception) (B 131, compare XVI refl. 1579 p. 21: logical rules should be “proven from the reason [Vernunft]”). This unity is the fundamental, qualitative unity of the act of thinking (“I think”) as opposed to a given manifold (variety) of representations. The operational “one-many” opposition, as well as the further analysis of its general features and structure, should be appropriate as a foundational starting point from which a system of logic could be strongly derived. The basic step of the analysis of this fundamental unity is Kant’s distinction between the analytic and synthetic unity of self-consciousness (see, for example, B §§15‒19): at first, the act of thinking (“I think”) appears simply to accompany all our representations. It is the identity of my consciousness in all my representations, termed by Kant the analytic unity of self-consciousness. But this identity of consciousness would not be possible for me (as a thinking subject) if I did not conjoin (synthesize) one representation with another and were not conscious of this synthesis. Thus, the analytic unity of self-consciousness is possible only under the condition of the synthetic unity of self-consciousness (B 133). Kant further shows that the synthetic unity is objective, because it devises a concept of an object with respect to which we synthesize representations into a unity. This unity is necessary and universally valid, that is, independent of any changeable, psychological state.

In Kant’s words: “the synthetic unity of apperception is the highest point to which one must affix all use of the understanding, even the whole logic and, after it, transcendental philosophy; indeed this faculty is the understanding itself” (B 134 footnote; see A 117 footnote and Opus postumum XXII:77). (For a formalization of Kant’s theory of apperception according to the first edition of the Critique of Pure Reason, see Achourioti and van Lambalgen 2011.)

Kant himself did not write a systematic presentation of formal logic, and the form and interpretation of Kant’s intended logical system are disputed among Kant scholars. Nevertheless, it is evident that each logical form is conceived by Kant as a type of unity of given representations, that this unity is an act of thinking and consciousness, and that each logical form is therefore essentially related to the “original” unity of self-consciousness. Some scholars, starting from the concept of the original unity of self-consciousness—that is, from the concept of understanding (as confronted with a given “manifold” of our representations)—proposed various lines of a reconstruction of Kant’s assumed completeness proof of his logical forms (or supplied such a proof on their own), in particular, of his table of judgments (see a classical work by Reich 1948, and, for example, Wolff 1995, Achourioti and van Lambalgen 2011, Kovač 2014). There are authors who offer arguments that the number and the species of the functions of our understanding are for Kant primitive facts, and can be at most indicated (Indizienbeweis) on the ground of the “functional unity” of a judgment (Brandt 1991; see a justification of Kant’s table of judgments in Krüger 1968).

8. Transcendental Logic (Philosophical Logic)

Besides formal logic, Kant considers a branch of philosophical logic that deals with the foundations of ontology and the rest of metaphysics and shows how objects are constituted in our knowledge by means of logical categorization. This branch of logic Kant names “transcendental logic.”

a. A Priori–A Posteriori; Analytic–Synthetic

Kant’s transcendental logic is based on two important distinctions, which exerted great influence in the ensuing history of logic and philosophy: the distinction between a priori and a posteriori knowledge, and the distinction between synthetic and analytic judgments (see B 1‒3).

Knowledge is a priori if it is possible independently of any experience. For instance, “Every change has its cause.” As the example shows, knowledge can be a priori but about an empirical concept, like “change,” since, given a change, we know independently of any experience that it should have a cause. A priori knowledge is pure if it has no empirical content, like, for example, mathematical propositions.

Knowledge is a posteriori (empirical) if it is possible only by means of experience. An example is “All bodies are heavy,” since we cannot know without experience (just from the concept of body) whether a body is heavy.

Kant gives two certain, mutually inseparable marks of a priori knowledge: (1) it is necessary and derived (if at all) only from necessary judgments; (2) it is strictly universal, with no exceptions possible. In distinction, a posteriori knowledge (1) permits that the state of affairs that is thought of can also be otherwise, and (2) it can possess at most assumed and comparative universality, with respect to the already perceived cases (as in induction) (B 3‒4).

Analytic and synthetic judgments are distinguished with respect to their content: a judgment is analytic if it adds nothing to the content of the knowledge given by the condition of the judgment; otherwise, it is synthetic.

That is, analytic judgments are merely explicative with respect to the content given by the condition of the judgment, while synthetic judgments are expansive with respect to the given content

(see Prolegomena §2a IV:266, B 10‒11). Kant exemplifies this distinction on affirmative categorical judgments: such a judgment is analytic if its predicate does not contain anything that is not contained in the subject of the judgment; otherwise, the judgment is synthetic: its predicate adds to the content of the subject what is not already contained in it. An example of analytic judgments is “All bodies are extended” (“extended” is contained in the concept “body”); an example of synthetic judgments is the empirical judgment “All bodies are heavy” (“heavy” is not contained in the concept “body”).

We note that Kant’s formal logic should contain only analytic judgments, although its laws and principles refer to and hold for all judgments (analytic and synthetic) in general (see Reich 1948 pp. 14‒15, 17). Conversely, analytic knowledge is based on formal logic, affirming (negating) only what should be affirmed (negated) on pain of contradiction. Let us remark that for Frege, unlike for Kant, this notion of analytic knowledge holds also for arithmetic.

b. Categories and the Empirical Domain

 The objective of Kant’s transcendental logic is pure forms of thinking in so far as they a priori refer to objects (B 80‒82). That is, necessary and strictly universal ways should be shown for how our understanding determines objects, independently of, and prior to, all experience. In Kant’s technical language, this means that transcendental logic should contain synthetic judgments a priori.

According to Kant’s restriction on transcendental logic, objects can be given to us only in a sensible intuition. These objects can be conceived as making Kant’s only legitimate, empirical domain of theoretical knowledge. Hence, the task is to discover which pure forms of our thought (categories, “pure concepts of understanding”), and in which way, determine the empirically given objects. Kant obtains categories from his table of logical forms of judgment (“metaphysical deduction of categories,” B §10, see §§20, 26) because these forms, besides giving unity to a judgment, are also what unite a sensibly given manifold into a concept of an object. Technically expressed, a form of a judgment is a “function of unity” that can serve to synthesize a manifold of an intuition. The manifold is synthesized into a unity that is a concept of an object given in the intuition. To “deduce” categories, Kant introduces some small emendations into his table of the logical functions in judgments. These emendations are needed because what is focused on in transcendental logic is not merely the form of thought, but also the a priori content of thought. Thus, Kant extends the division of “moments” under the titles of quantity and quality of judgments by adding singular and infinite judgments, respectively (for instance, “Plato is a philosopher”; “The soul is non-mortal”). He also changes the term “particular judgment” for “plurative,” since the intended content is not an exception from totality (which is the logical form of a particular judgment), but plurality independently from totality. With respect to the content, Kant reverses the order under the title of quantity (Prolegomena §20 footnote IV:302).

In correspondence with the 12 forms of judgments, Kant obtains 12 categories (Prolegomena §21 IV:303):

  1. Quantity: unity, plurality, totality
  2. Quality: reality, negation, limitation
  3. Relation: substance, cause, community
  4. Modality: possibility, existence, necessity

Sometimes, the order of the categories of quality is also changed: reality, limitation, full negation (Prolegomena §39 IV:325). In the Critique of Pure Reason, the table is more explicative. Under “Relation,” Kant lists:

(a) inherence and subsistence (substantia et accidens);

(b) causality and dependence (cause and effect);

(c) community (interaction between agent and patient).

Under “Modality,” he adds negative categories of impossibility, non-existence, and contingency (B 106). (For a possible reconstruction of a deduction of categories from the synthetic unity of self-consciousness as the first principle, see Schulting 2019.)

Kant further shows that all objects of a sensible intuition in general (be it in space and time or not) presuppose a synthetic unity (in self-consciousness) of a manifold according to categories. On the ground of this premise, he also shows that all objects of our experience, too, stand under categories. Briefly, in the proof of this result, Kant shows, first, that each of our empirical intuitions presupposes a synthetic unity according to which space and time are determined in this intuition.  We then abstract from the space-time form of our empirical intuition, isolate just the synthetic unity, and, by subsumption under the first premise (on intuitions in general), conclude that this synthetic unity is based on the categories, which are applicable to our space-time intuition (“transcendental deduction of categories,” B §§20, 21, 26, B 168‒169).

In addition, transcendental logic comprises a theory of judgments a priori and of their principles. These principles determine how categories, which are pure concepts, are applied to objects given in our intuition and make our knowledge of objects possible. For Kant, there is no way to come to a theoretical knowledge of objects other than by means of experience, which includes, as its formal side, categories as well as space and time. Accordingly, there are a priori judgments about how categories can have objective validity in application to what can be given in our space-time intuition. As Kant puts it: the conditions (including categories) of the possibility of experience are at the same time the conditions of the possibility of the objects of experience, and thus have objective validity (B 197).

Kant systematically elaborates the principles of the pure faculty of understanding in consonance with his table of judgments. According to these principles, different moments that constitute our experience (1. intuition; 2. sensation; 3. perception of permanence, change, and simultaneity; 4. formal and material conditions in general) are subsumed under corresponding categories (1. extensive magnitude, 2. intensive magnitude, 3. categories of relation, 4. modal categories).

Kant emphasizes that concepts themselves cannot be conceived as objects (noumena) in the same (empirical) domain of objects (appearances, phaenomena) to which they as concepts apply. That is, in modern terms, we can speak of noumena only within a second-order regimentation of domains, with the lower (empirical) domain as ontologically preferred.

c. Transcendental Ideas

There are further concepts to which we are led, not by our faculty of understanding and the forms of judgment, but by our faculty of reason and its forms of inference. In distinction to categories, which are applicable to the domain of our experience, the concepts of the faculty of reason do not have their corresponding objects given in our intuition; their correspondents can only be purported objects “in themselves” (Dinge an sich), which transcend all our experience. A concept of the “unconditioned” (“absolute,” referring to the totality of conditions) for a given “conditioned” thing or state is termed by Kant a transcendental idea. Transcendental ideas, although going beyond our experience, have a regulative role to direct and lead our empirical thought towards the paradigm of the unconditioned synthetic unity of knowledge. According to the three species of inference of reason (categorical, hypothetical, and disjunctive), there are three classes of transcendental ideas (B 391, 438‒443):

(1) the unconditioned unity of the subject (the idea of the “thinking self”) that is not a predicate of any further subject;

(2) the unconditioned unity of the series of conditions of appearance (the idea of “world”), which further divides into four ideas in correspondence with the four classes of categories:

(a) the unconditioned completeness of the composition of the whole of appearances,

(b) the unconditioned completeness of the division of a given whole in appearance,

(c) the unconditioned completeness of the causal series of an appearance,

(d) the unconditioned completeness of the dependence of appearances regarding their existence;

(3) the unconditioned unity of the ground of all objects of thinking, in accordance with the principle of complete determination of an object regarding each possible predicate (the idea of “being of all beings”).

These transcendental ideas are in a natural way connected with a dialectic of our faculty of reason, where reason aims towards the knowledge of empirically unverifiable objects (B 397‒398).

(1) Through transcendental paralogisms, we come to think of the formal subject of our thought as a substance.

(2) Through the antinomies of pure reason, the following opposites (seeming contradictions) remain undecided:

(a) the world has a beginning – the world is infinite;

(b) each composed thing consists of simple parts – there is nothing simple in things (they are infinitely divisible);

(c) there is a causality of freedom – everything happens according to the laws of nature;

(d) there is an absolutely necessary being – everything is contingent.

(3) The ideal of pure reason leads us to found the principle of complete determination on the idea of the most perfect being. In addition, Kant assumes here that “existence” is not a real predicate—that is, it does not contribute to the determination of a thing.

Kant insists on separating and excluding (1) the formal logical subject (“I think”) of all our thought from the empirical objects (substances) about which the subject can think; (2) the domain of experience from the members of this domain; and (3) the totality of concepts applicable to the domain from these concepts themselves. Thus, Kant’s transcendental dialectic includes and deals with logical problems connected with the possible disregarding of what we could today call type-theoretical distinctions and the distinction between a theory and its metatheory.

Let us add a methodological remark about the relationship between mathematical and transcendental logical knowledge. The rigor of mathematical evidence (intuitive certainty, B 762) is based, according to Kant, on the possibility of constructing mathematical concepts in intuition. This construction can be ostensive (geometric) or “symbolic” (“characteristic”, B 745, 762, as in arithmetic and algebra). However, as Kant points out, this is not available for transcendental logic, where knowledge should also be apodictic and a priori, but confined to the abstract, conceptual “exposition” (without a construction in intuition, albeit with an application of concepts to intuition). For this reason, definitions and demonstrations in the strictest sense are possible in mathematics, but not in transcendental logic (B 758‒759, 762‒763).

9. Influences and Heritage

Although Kant’s logic, if taken literally, is in form and content largely traditional as well as significantly dependent on the science of his time, it offered new essential and foundational perspectives that are deeply (and often unknowingly) built into modern logic.

Kant required a formal, though not mathematical, rigor in logic, purifying it of psychological and anthropological admixtures. This rigor was required in two ways: (a) in the sense of functionally defined logical forms, and (b) in the sense of a systematic, scientific form of logic. Kant’s transcendental logic is characterized by the strict distinction of formal logical and metaphysical aspects of concepts, as well as by defined standards of the justification of concepts and of their application in an empirical model of knowledge. Nevertheless, Kant strictly separated mathematical and philosophical rigor. It is in the aspect of the possibilities of the “symbolic construction” of concepts that modern logic has made great advances in comparison to Kant’s logic.

Let us give some examples of Kant’s influence on the posterior development of logic and philosophy.

Kant’s table of judgments influenced a large part of traditional or reformed traditional logic deep into the 20th century. Besides, although Frege criticized Kant’s table of judgments as contentual and grammatical, Kant’s distinction between modality and the logical content of the judgment can be traced in Frege’s distinction between “judging” and the content of a judgment. Kant’s restriction of the importance of categorical judgments, with an emphasis also on the logical relation between judgments, announced the future development of truth-functional propositional logic. Kant’s criterion of sensible intuition for the givenness of objects inspired Hilbert’s finitistic formalism, with “concrete signs” and their shapes as the immediately intuitively given of his metamathematics. Kant’s foundational theory of the unity of apperception (in application to time) inspired the emergence of intuitionism (Brouwer). Kant’s undecidability of geometry by analytic means, properly corrected and reinterpreted, anticipates Gödel’s incompleteness results.

Kant’s distinctions of the analytic and the synthetic, and of the a priori and the a posteriori, had a deep impact on philosophical and mathematical logic, and have delineated an important part of philosophical discussions after Kant. Frege especially praised Kant’s analytic-synthetic distinction, despite his departure from Kant, for whom arithmetic, like geometry, was synthetic. The analytic-synthetic distinction was a crucial subject of discussion and revision, for example, in Carnap’s, Gödel’s, Quine’s, and Kripke’s philosophies of logic, language, and knowledge.

Kant’s duality of the conceptual system and empirical model, with differentiated logical (and ontological) orders of concepts and their (intended) corresponding objects, already leads into the area of solving logical antinomies and of incompleteness (see Tiles 2004). With his conception of successively upgrading logical laws (from the law of contradiction, to the law of sufficient reason, to the law of excluded middle), Kant implicitly offered a general picture of possible logics that exceeds classical logic—as far as it was possible with the tools available to him. His logical foundations of philosophy can still inspire modern logical-philosophical investigations.

10. References and Further Reading

a. Primary Sources

  • Kant, Immanuel. 1910–. Kant’s gesammelte Schriften. Königlich Preussische Akademie der Wissenschaften (ed.). Berlin: Reimer, Berlin and Leipzig: de Gruyter. Also Kants Werke I–IX, Berlin: de Gruyter, 1968 (Anmerkungen, 2 vols., Berlin: de Gruyter, 1977).
  • Cited by volume number (I, II, etc.); Kritik der reinen Vernunft, 1st ed. = A, 2nd ed. = B.
  • Kant, Immanuel. 1998. Critique of Pure Reason. Cambridge, UK: Cambridge University Press. Transl. and ed. by Paul Guyer and Allen W. Wood.
  • Kant, Immanuel. 1992. Lectures on Logic. Cambridge, UK: Cambridge University Press. Transl. and ed. by J. Michael Young.
  • Kant, Immanuel. 1998. Logik-Vorlesungen: Unveröffentlichte Nachschriften I‒II. Hamburg: Meiner. Ed. by T. Pinder.
  • Cited as LV.
  • Kant, Immanuel. 2004. Prolegomena to Any Future Metaphysics. Cambridge, UK: Cambridge University Press. Transl. and ed. by Gary Hatfield.

b. Secondary Sources

  • Achourioti, Theodora and van Lambalgen, Michiel. 2011. “A Formalization of Kant’s Transcendental Logic.” The Review of Symbolic Logic. 4: 254–289.
  • Béziau, Jean-Yves. 2008. “What is ‘Formal Logic’?” in Proceedings of the XXII Congress of Philosophy, Myung-Hyung-Lee (ed.), Seoul: Korean Philosophical Association, 13: 9–22.
  • Brandt, Reinhard. 1991. Die Urteilstafel. Kritik der reinen Vernunft A 67‒76; B 92‒101. Hamburg: Meiner.
  • Capozzi, Mirella and Roncaglia, Gino. 2009. “Logic and Philosophy of Logic from Humanism to Kant” in Leila Haaparanta (ed.), The Development of Modern Logic. New York: Oxford University Press, pp. 78–158.
  • Conrad, Elfriede. 1994. Kants Vorlesungen als neuer Schlüssel zur Architektonik der Kritik der reinen Vernunft. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
  • Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge (Ma), London: Harvard University Press.
  • Kneale, William and Kneale, Martha. 1991. The Development of Logic. Oxford: Oxford University Press. First published 1962.
  • Kovač, Srećko. 2008. “In What Sense is Kantian Principle of Contradiction Non-classical”. Logic and Logical Philosophy. 17: 251–274.
  • Kovač, Srećko. 2014. “Forms of Judgment as a Link between Mind and the Concepts of Substance and Cause” in Substantiality and Causality, Mirosław Szatkowski and Marek Rosiak (eds.), Boston, Berlin, Munich: de Gruyter, pp. 51–66.
  • Krüger, Lorenz. 1968. “Wollte Kant die Vollständigkeit seiner Urteilstafel beweisen?” Kant-Studien. 59: 333–356.
  • Lapointe, Sandra (ed.). 2019. Logic from Kant to Russell: Laying the Foundations for Analytic Philosophy. New York, London: Routledge.
  • Longuenesse, Beatrice. 1998. Kant and the Capacity to Judge: Sensibility and Discursivity in the Transcendental Analytic of the Critique of Pure Reason. Princeton: Princeton University Press. Transl. by Charles T. Wolfe.
  • Loparić, Željko. 1990. “The Logical Structure of the First Antinomy.” Kant-Studien. 81: 280–303.
  • Lu-Adler, Huaping. 2018. Kant and the Science of Logic: A Historical and Philosophical Reconstruction. New York: Oxford University Press.
  • MacFarlane, John. 2002. “Frege, Kant, and the Logic in Logicism.” The Philosophical Review. 111: 25–65.
  • Mosser, Kurt. 2008. Necessity and Possibility: The Logical Strategy of Kant’s Critique of Pure Reason. Washington, DC: Catholic University of America Press.
  • Newton, Alexandra. 2019. “Kant’s Logic of Judgment” in The Act and Object of Judgment, Brian Ball and Christoph Schuringa (eds.), New York, London: Routledge, pp. 66–90.
  • Reich, Klaus. 1948. Die Vollständigkeit der kantischen Urteilstafel. 2nd ed. Berlin: Schoetz. (1st ed. 1932).
  • English: The Completeness of Kant’s Table of Judgments, transl. by Jane Kneller and Michael Losonsky, Stanford University Press, 1992.
  • Scholz, Heinrich. 1959. Abriß der Geschichte der Logik. Freiburg, München: Alber. (1st ed. 1931).
  • Schulthess, Peter. 1981. Relation und Funktion: Eine systematische und entwicklungsgeschichtliche Untersuchung zur theoretischen Philosophie Kants. Berlin, New York: de Gruyter.
  • Schulting, Dennis. 2019. Kant’s Deduction from Apperception: An Essay on the Transcendental Deduction of Categories. 2nd revised ed. Berlin, Boston: de Gruyter.
  • Stuhlmann-Laeisz, Rainer. 1976. Kants Logik: Eine Interpretation auf der Grundlage von Vorlesungen, veröffentlichten Werken und Nachlaß. Berlin, New York: de Gruyter.
  • Tiles, Mary. 2004. “Kant: From General to Transcendental Logic” in Handbook of the History of Logic, vol. 3, Dov M. Gabbay and John Woods (eds.). Amsterdam etc: Elsevier, pp. 85–130.
  • Tolley, Christian. 2012. “The Generality of Kant’s Transcendental Logic.” Journal of the History of Philosophy. 50: 417‒446.
  • Tonelli, Giorgio. 1966. “Die Voraussetzungen zur Kantischen Urteilstafel in der Logik des 18. Jahrhunderts” in Kritik und Metaphysik, Friedrich Kaulbach and Joachim Ritter (eds). Berlin: de Gruyter, pp. 134–158.
  • Tonelli, Giorgio. 1994. Kant’s Critique of Pure Reason within the Tradition of Modern Logic: A Commentary on its History. Hildesheim, Zürich, New York: Olms.
  • Wolff, Michael. 1995. Die Vollständigkeit der kantischen Urteilstafel: Mit einem Essay über Frege’s Begriffsschrift. Frankfurt a. M.: Klostermann.
  • Wuchterl, Kurt. 1958. Die Theorie der formalen Logik bei Kant und in der Logistik. Inaugural-Dissertation, Ruprecht-Karl-Universität zu Heidelberg.

 

Author Information

Srećko Kovač
Email: skovac@ifzg.hr
Institute of Philosophy, Zagreb
Croatia

Bernardino Telesio (1509—1588)

Dubbed “the first of the new philosophers” by Francis Bacon in 1613, Bernardino Telesio was one of the most eminent thinkers of Renaissance Italy, along with figures such as Pico, Pomponazzi, Cardano, Patrizi, Bruno, Doni, and Campanella.

The young Telesio spent the early decades of his life under the guidance of his uncle Antonio (1482-1534), a fine humanist who was determined to go beyond the strict disciplinary division between literary and philosophical texts. Before the printing of the first edition of his principal work, De natura iuxta propria principia (On Nature According to its Own Principles) (Rome, 1565), Telesio assimilated the basics of ancient scientific thought (both Greek and Latin), as well as those of Plato’s and Aristotle’s Scholastic commentators. In the second half of the 16th century, he began to be recognized as an adversary of Aristotle’s thought, insofar as he upheld a conception of man and nature that attempted to replace the principles of Aristotle’s natural philosophy. His starting point was the definition of a new role for the notion of sense perception in animal cognition. Using the Stoic notion of spiritus (translating the Greek word pneuma), he criticized Aristotle’s hylomorphism. As a fiery substance and an internal principle of motion, spiritus is the principle of sensitivity: by way of heat, it pervades the entire cosmos, so that all beings are capable of sensation. In addition to grounding Telesio’s epistemology, then, the notion of spiritus lies at the core of his natural philosophy. During the time span extending from 1565 to 1588, he overturned the traditional conception of the relationship between sensus and intellectus, as championed by the Scholastic followers of Aristotle. Telesio denied that the human brain possesses a faculty able to grasp the forms or essentiae of natural beings from simple passive sensible data of experience. On the contrary, sense perception has an active role: it is the first form of understanding the natural world. It is by the “way of senses” that mental representations of natural things are selected and shaped. This process happens in strict cooperation with the corporeal principle of self-organization of the material soul. In human beings as well as in animals, the brain is the main source of this principle, which governs the cognitive process without the support of a superior immaterial agent. This active form of “sentience” constitutes the primary causal connection between the brain and the external world. Founded on a reassessment of the categories of sense perception, Telesio’s philosophy of mind led to an empiricist approach to the study of natural phenomena.

Table of Contents

  1. Life and Times
  2. Psychology and Theory of Knowledge
  3. Cosmology
  4. Influence and Legacy
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Life and Times

Bernardino Telesio was born in Cosenza (Northern Calabria) to Giovanni Battista, a noble man of letters, and Vincenza Garofalo, the daughter of a lawyer. Bernardino was the first-born of eight sons, and as a child was sent to his uncle Antonio (1482-1534) to be educated. In 1517, they went to the Duchy of Milan, where the young Telesio became acquainted with the most illustrious pupils of his uncle. He also met some eminent men of letters, like Matteo Bandello (1485-1561), who in his Novelle (1554) would recall Antonio’s knack for entertaining the members of the intellectual circles led by such gentlewomen as Camilla Scarampa Guidoboni (ca.1454-ca.1518) and Ippolita Sforza Bentivoglio (1481-ca.1520).

In 1523 Bernardino and Antonio moved to Rome, entering the intellectual milieu of the papal court and of the Vatican library, which was animated by philosophers and humanists such as Paolo Giovio (1483-1552), Marco Girolamo Vida (1485-1566), Marcello Cervini (1501-1555), Coriolano Martirano (1503-1558), and Giovanni Antonio Pantusa (1501-1562). Bernardino left the Studium Urbis in 1527, soon after the “sack of Rome”. Then he moved to Padua, where his uncle had been appointed professor of Latin by the municipality of Venice (October 17th, 1527).

During his early education, Bernardino was deeply influenced by his uncle. Antonio was a fine humanist, whose works circulated widely across Europe. For example, Antonio’s De coloribus libellus (Venice, 1528) achieved great fame. Following the first Venetian edition, at least ten editions of the work were released in Paris by scholar-printers such as Chrestien Wechel, Jacob Gazel, and Robert Estienne (Stephanus); a further five appeared in Basel. In particular, the Basel reprints were released by such renowned humanists as Hieronymus Froben and Johannes Herbst Oporinus. Thus, the young Bernardino could benefit from the mastery of some of the finest Italian connoisseurs of ancient Greek and Latin literature, soon becoming himself an expert reader of classic authors such as Virgil, Cicero, Seneca, Pliny, and Lucretius.

It is important to note that the materialist and empiricist approach Telesio displayed in his early works did not appear out of nowhere; its main source was an open-minded reading of the texts written by the early commentators of Aristotle’s works, such as Alexander of Aphrodisias, recently revisited by a new generation of scholars such as Pietro Pomponazzi. At the University of Padua, the young Telesio could learn the new critical approach to Aristotle’s works. During the time spent in Padua and Venice, he did not gain the title of magister medicinae et artium, yet he started to develop a serious interest in mathematics, medicine, and natural philosophy.

At the end of the Venetian period (1527-1529), Telesio came back to Calabria. After some time spent in Naples (probably from 1532 or 1533 up to the spring of 1534, when his uncle passed away), Bernardino moved to Rome (1534-1535), living in the papal environment of Paolo III Farnese. Then, between 1536 and 1538 he spent a fruitful period of study at the Benedictine monastery of Seminara in the South of Calabria. There he began to develop his arsenal of anti-Aristotelian arguments, partly taken from Presocratic, Hippocratic, Epicurean and Stoic ideas. From there he went back to Rome, meeting some illustrious members of the papal court. Benedetto Varchi, Annibal Caro, Niccolò Ardinghelli, Ippolito Capilupi, Alessandro Farnese, Gasparo Contarini, Niccolò Gaddi, Giovanni Della Casa, and the Orsini brothers soon became acquainted with the philosopher of Cosenza. The significant number of letters written by these figures in the 1540s allows us to follow Bernardino’s movements between Rome, Naples, and Padua (Sergio 2014; Simonetta 2015). By the early 1540s, Telesio was already renowned as an anti-Aristotelian philosopher.

It was during that time that Telesio started to study Vesalius’s program of reform of the ancient ars medendi, including both Galen’s legacy and the Corpus Hippocraticum. Between 1541 and 1542, he spent some time in Padua, during which he met the anatomist and physician Matteo Realdo Colombo (1516-1559). Telesio’s interest in the nova ars medendi, and, more specifically, in the physiology of sense perception, would be attested in a work “Contra Galenum” entitled Quod animal universum ab unica Animae substantia gubernatur, written in the 1560s, and posthumously edited and published by Antonio Persio (1542-1612) in Varii de naturalibus rebus libelli (Telesio 1590, [139-227]).

In the late 1540s he probably lived in the Neapolitan household of Alfonso Carafa (d. 1581), III Duke of Nocera, and in the early 1550s he came back to Cosenza. There, in 1553, he married Diana Sersale (d. 1561), a noblewoman belonging to the municipality of Cosenza. He soon became a leading figure in the city, laying the foundations for the future creation of the “Accademia Cosentina” (Lupi 2011). In Cosenza, Telesio had such distinguished pupils as Sertorio Quattromani (1541-1603) and Iacopo di Gaeta (fl. 1550-1600); the philosopher and physician Agostino Doni (fl. 1545-1583); the orientalist Giovanni Battista Vecchietti (1552-1619); the future mayor of Cosenza, Giulio Cavalcanti (1591-1600); and Telesio’s first biographer, Giovan Paolo d’Aquino (d. 1612). In 1554 Telesio was elected mayor of Cosenza. Throughout the 1550s, he worked to improve the initial versions of his works, and, soon after the death of his wife (1561), he probably spent a second period of study at the Benedictine abbey of Santa Maria del Corazzo.

In the early 1560s Telesio became more familiar with the academic environment of Naples, where the works of Vesalius, Colombo, Cardano, Eustachius, Cesalpino, Fracastoro, and Jean Fernel featured prominently in the study of natural philosophy and medicine. There Telesio probably read the works of Giovanni Argenterio (1513-1572), professor of medicine in Naples from 1555 to 1560, one of the major contributors to the diffusion of new medical ideas in Southern Italy. Like Girolamo Fracastoro, Argenterio criticized the Galenic theories of contagion and diseases, contributing to the slow downfall of Galen’s authority. He also probably read the work of Giovanni Filippo Ingrassia (1510-1580), a Sicilian physician who received his scientific education at Padua—studying with Vesalius, Colombo, Eustachius, and Fracastoro—and who was also critical of Galen.

In 1563 Telesio went to Brescia, paying a visit to the Aristotelian Vincenzo Maggi (1498-1564). On that occasion, he submitted to Maggi the manuscript of the first edition of De natura iuxta propria principia. In 1565, Telesio’s masterpiece was published in Rome by the papal printer Antonio Blado. In the same period, he completed the draft of the most important of his medical writings—the aforementioned Quod animal universum ab unica animae substantia gubernatur. In the next year, the Neapolitan printer Matteo Cancer released a short treatise, Ad Felicem Moimonam iris, on the phenomenon of the rainbow (Telesio 1566 and 2011). These latter two works were an early testimony to the wide range of Telesio’s philosophical interests, as much as to the originality of his method in the quest for the causes of natural phenomena. In the same year as the publication of De natura, one of Telesio’s brothers, Tommaso (1523-1569), accepted the position of Archbishop of Cosenza, a title initially offered to Bernardino by Pius IV.

Toward the end of the 1560s and the beginning of the 1570s, Telesio’s philosophical reputation was becoming more and more widespread. In 1567 the humanist Giano Pelusio wrote a short poem, Ad Bernardinum Thylesium Philosophum, where the philosopher of Cosenza was compared to Pythagoras; during the next few years, Telesio was assisted by his disciple Antonio Persio in the publication of the second edition of De rerum natura (1570). In the same year three pamphlets were printed: De colorum generatione, De his quae in aere fiunt et de terraemotibus, and De mari liber unicus (Telesio 1981 and 2013). They were printed in Naples, where Telesio lived under the patronage of Alfonso and Ferrante Carafa (d. 1593).

Telesio’s fame grew in the 1570s: in 1572, Francesco Patrizi wrote an insightful review of Telesio’s De rerum natura (Objectiones), to which Telesio replied in a letter, Solutiones Tylesij; meanwhile, Antonio Persio wrote a reply entitled Apologia pro Telesio adversus Franciscum Patritium. Patrizi’s letter offered Telesio an occasion to point out some arguments of his cosmology and psychology: a) the universality of sensus rather than of the soul (anima, spiritus); b) the fiery and physical nature of the heavens, wherein the Sun is considered the source of motion as well as of the life of celestial bodies; c) the eternity of “celestial spheres” replacing the Platonic idea of creatio ex nihilo; d) the primacy of sense perception over the intellect in the cognitive process of animal understanding. Telesio’s understanding of nature championed the notion of universal sensibility (pansensism) over that of universal animation of things (panpsychism). What governs nature itself are just internal natural principles: there is no need for a divine intelligence in order to explain its inner processes and the variety of natural phenomena.

In the same years in which the correspondence between Telesio and Patrizi was published, the Florentine Francesco Martelli translated Telesio’s De rerum natura (Delle cose naturali libri due) and the treatises De mari and De his, quae in aere fiunt. Around the same period, the orientalist Giovan Battista Vecchietti made a brief journey to Pisa, defending Telesio’s doctrines against the Aristotelians of the Studium. The temper of that young Telesian caught the attention of the Duke of Tuscany, Cosimo de’ Medici. Between 1575 and 1576 Antonio Persio published three works: Liber Novarum positionum (1575), Disputationes Libri novarum positionum, and Trattato dell’ingegno dell’huomo (1576). By 1577, Patrizi had completed L’amorosa filosofia, a dialogue wherein the philosopher of Cherso mentions his acquaintance with Bernardino Telesio. In the late 1570s Telesio came back to Naples, and, at that time, the humanist Bonifacio Vannozzi, the rector of Pisa university, wrote a letter to Telesio, defining him as “our Socrates” (Artese 1998, 191).

Living in Naples in the first half of the 1580s, Telesio immersed himself in the production of the third edition of De rerum natura, printed in 1586. He dedicated the work to Ferrante Carafa, IV Duke of Nocera. In that edition, Telesio unpacked in nine books the earlier and later topics of his thought, from cosmology to psychology and moral philosophy. Meanwhile, his thought came to be renowned in England. During his Grand Tour of Italy (1581-1582), the mathematician Henry Savile bought a copy of the second edition of De rerum natura. Within a few decades, Telesio’s works had spread through the cultural circles of early Jacobean England. James I Stuart and Francis Bacon owned copies of Telesio’s works, as did churchmen and royal physicians like John Rainolds ([Reynolds], 1549-1607; a translator of King James’s Bible) and Sir William Paddy (1554-1634, a fellow of the Royal College of Physicians of London), and aristocrats like Sir Henry Percy (1564-1632), IX Earl of Northumberland. Though with different views and motivations, they all read Telesio’s writings. Moreover, his thought attracted the attention of the most eminent men of the “Northumberland circle”: Sir Walter Raleigh (1552-1618), Walter Warner (ca.1557-1643), Thomas Harriot (1560-1621), and Nicholas Hill.

In 1587, a Neapolitan lawyer, Giacomo Antonio Marta (1559-1628), wrote a pamphlet against Telesio, titled Pugnaculum Aristotelis contra principia Bernardini Telesii (1587). A few years later, the young Campanella, in his Philosophia sensibus demonstrata (Naples, 1591)—the most remarkable manifesto of Telesian philosophy—launched a fierce attack on Marta’s book. In the first pages of his work, Campanella summarized Telesio’s epistemology, pointing out the need to clarify the first grounds of the new method before commencing the inquiry into the main issues of natural philosophy. By means of Telesio, Campanella contributed—in his own way—to the development of the early modern debate about scientific method (Firpo 1949, 182-183). The empiricist approach adopted by Telesio and Campanella did not yet have the complexity and articulation of Galileo’s method, composed of “sense experiences” (sensate esperienze) and “certain demonstrations” (certe dimostrazioni); nonetheless, a number of early 17th century Italian writers did not hesitate to label the Calabrian thinkers as just as dangerous as the novatores of the Galenic circle.

On July 23rd, 1587, Telesio returned to Cosenza and wrote his will, most likely because of ill health. He died a year later, in October 1588. Among the participants in the burial ceremony were Sertorio Quattromani, Giovan Paolo d’Aquino, the members of the “Accademia Cosentina” (Iacopo di Gaeta, Vincenzo Bombini, Giulio Cavalcanti and others), and the young Tommaso Campanella, at that time a friar of the Ordo Praedicatorum hosted in the convent of San Domenico in Cosenza. For the occasion, Campanella composed some verses dedicated to Bernardino (Al Telesio Cosentino, in Campanella 1622, n° 68).

2. Psychology and Theory of Knowledge

Telesio’s natural philosophy is based on a new methodological approach to the study of nature. This is exactly what he points out in the first pages of his De natura (1565)—a work rightly characterized by some scholars as “Telesio’s masterpiece”. Such an approach does not rest solely on his alleged modern “empiricism”. The main elements of Telesio’s “modernity” lie in his novel approach to psychology, animal physiology, and the theory of science. On the one hand, Telesio offers a number of arguments for the similarity of animals and humans: for example, both animals and humans are able to perceive their own passions through the senses. On the other hand, the spiritus of humans is “purer” and “more abundant” than that of animals (Telesio 1586, VIII.14-15). Therefore, humans are better equipped than animals in the art of reasoning.

Telesio’s principal aim was to inquire into the causes of natural phenomena without viewing those phenomena through the lenses of Platonic and Aristotelian metaphysics. As he states in the incipit of book I of De rerum natura (1570), “the construction of the world and the nature of the bodies contained in it should not be inspected by reason, as the Ancients did, but must be perceived by sense, and grasped from things themselves”. He neither belittled nor underestimated the role of reason. Nonetheless, he prioritized the direct evidence that comes from the senses. The beginning of his natural philosophy lies in the experience deriving from sense perception, sensus being a cognitive power closer to natural things than reason itself. As Aristotle himself asserted, “there is nothing in the intellect that is not first in sense perception”. The first moves of Telesio’s thought were to develop this principle of classical empiricism in a new and more coherent way.

The perception of a physical object establishes a causal relation to the external world, and the first task of a scientist is to investigate the nature of that relation. In opposition to Aristotle, Telesio affirms that the ability of a sentient creature to reach knowledge of natural things is not “actualized” by the “form” of the perceived thing. He does not believe that all acts of sense perception simply mirror the natural beings of the external world. Rather, he thinks that knowledge of the world depends on the sensible data perceived by the sentient creature. That kind of affection (perceptio passionis) is the very starting point for reaching knowledge of the world, as sense perception is the basic and most important property of all animals, while the act of understanding is nothing more than reckoning and recalling similarities and differences between previous sensations. In that perspective, Telesio abandoned the traditional doctrine of species, denying that natural things are the result of the combination of matter and form. According to him, the Peripatetic answer to the problem of human knowledge left unsolved the relation between causes and effects in the cognitive process. Once more, it is the concept of spiritus that lies at the core of Telesio’s psychology. As an imperceptible, thin, fiery body, it constitutes our sensible soul (Telesio 1586, VII.4, and V.3); as the anima sentiens of human bodies, it is present mainly in the nervous system, in order to guarantee the unity of perception; consequently, it is the bearer of sensibility and movement (Telesio 1586, V.5, V.10, and V.12). In other words, Telesio provides a theory of mind in which the spirit produces actual internal representations in response to external stimuli—which are considered passions—and to internal stimuli, which are the affections or motions of the spirit itself. Mental representations are thus simple reconstructions of the world. Telesio held that the material soul grasps natural beings by means of a corporeal, physical interaction with them. Consequently, scientific knowledge is not the result of a hierarchical process, nor does it consist in the gradual abstraction of similitudes or species (Telesio 1586, VIII.15-28). In some way, Telesio’s psychology anticipated the approach of the 17th century empiricist critics of Descartes’s doctrine of the cogito: in order to know natural things, humans do not need an intellectual self-consciousness over and above the sensible data coming from sense perception. Further, the data of sense perception can be refined only by sense perception itself, supported—when necessary—by the corporeal principle of spiritus.

Reasoning is nothing but the outcome of the self-organization of the “material soul” (spiritus) in cooperation with the “ways” or “means” of sense perception and the principal functions (memory, imagination) of the brain, activated by the same principle of the material soul. In order to pursue their natural aim, that is, self-preservation (conservatio sui), both humans and animals are ruled by the opposed sensations of pleasure and pain, with the key function of the spirit at the core of bodily functions (Telesio 1586, VII.3; IX.2). In his early writings, Telesio did not directly challenge the theory of intelligible species of the Scholastic tradition; however, his opposition to that theory is evident in the basic principles of his psychology. They may be unfolded in five points:

(a) intellectual cognition is based on a perceived similitude, which does not consist in a mental representation of an external object resulting from the encounter between the active intellect and passive sensation; rather, sense perception is an active operation of the spiritus, the material soul (Telesio 1586, V.34-47);

(b) the sensible data resulting from a perceptive experience has a cognitive role (as Campanella and Hobbes later explained, sensus is already a kind of iudicium, while understanding and imagination are nothing but “decaying sense”);

(c) the material agents involved in the cognitive process, from the “ways of sense” to the spiritus (“anima sentiens”), are merely corporeal (Telesio 1586, V.3-5, 10-12);

(d) the spirit is able to perceive because it can be subject to sensible, bodily alterations;

(e) since spiritus is the bearer of motion, a human soul moves by virtue of its own nature; what is at stake is indeed the concept of motion, in some way close to Lucretius’s atomism, even though Telesio himself shows no intention of claiming such a linkage. At the same time, Telesio’s theory excludes any mechanical approach to the physiology of sense perception: the motion of bodies has to be explained through the physics of contact, and yet his theory of motion is still far from the kind of explanation that such modern authors as Gassendi, Descartes, and Hobbes later tried to provide.

Telesio’s naturalistic program, then, took sensation as a material process involving only material agents: (corporeal and) sensible objects, the “ways of sense” and the spiritus. Stating that an animal is ruled by one substance residing in its brain, Telesio abandoned Aristotle’s psychology and his threefold partition of the soul (intellective, sensitive, vegetative), as well as Galen’s partition of “pneumata” (animal, vital, natural) and his theory of “temperaments” (Telesio 1570, II.15). According to the Galenic system, the pneuma as a “transmitter substance” had a tripartite structure: a) the spiritus naturalis (pneuma physikon) or vegetative spirit, having its seat in the liver, and responsible for digestion, metabolism, and the production of blood and semen; b) the spiritus vitalis, localized in the heart, active in all kinds of affections and motions; c) the spiritus animalis (pneuma psychikon), situated in the brain, and responsible for the control and organization of the activity of the soul and of the intellect. Now, in the new system, both psychology and physiology, psyche and physis, were unified in one organic theory. Furthermore, the conception of the spiritus as a principle generated from the semen and diffused through the entire nervous system echoed some lines of Lucretius’s On the Nature of Things. Finally, by locating the seat of the spirit in the brain, Telesio rejected Aristotle’s biological cardiocentrism (Telesio 1586, V.27).

In the 1586 edition of De rerum natura, Telesio introduced the notion of a divine soul (a deo immissa) to go along with the “material soul” (e semine educta) of his earlier thought (Telesio 1586, V.2-3). The idea of a divine soul capable of surviving the natural dissolution of the body is a conceptual device Telesio used with a twofold purpose. On the one hand, Telesio could not deny that the concept of the soul was a theological matter: Sacred Scripture teaches that humans have a divine origin, infused by the Creator himself. Therefore, it would be unjust if God did not give humans the prospect of an afterlife, as a recompense for the virtue and vice experienced during the “mundane” lifetime (in that passage, it is evident that the source of Telesio’s argument is Book XIV of Marsilio Ficino’s Theologia Platonica de immortalitate animarum). On the other hand, Telesio remained faithful to the methodology of his early works: in 1586 he simply pointed out the existence of a strict separation between the specific subjects of the philosopher’s and the theologian’s work. As a forerunner of the modern scientist, Telesio thought that the role of the philosopher was solely to inquire into the secrets of nature “according to its own principles”.

Telesio goes on to reject Aristotle’s definition of the soul as forma corporis, that is, as the form and entelechy of an organic body (Aristotle, De anima II.1). According to Leen Spruit (1995), what matters here is the main topic of the formal mediation of sensible reality in intellectual knowledge. As is well known, Aristotle regarded the mind as capable of grasping forms detached from matter (materia sensibilis). Aristotle’s medieval commentators grounded that theory in the mediating role of representational forms called “intelligible species”.

According to Telesio, on the other hand, scientific knowledge of the world must necessarily be mediated through sensible knowledge, which has an active role, whereas according to Aristotle the “materials” of sense perception play a passive role, from which the intellect grasps the form of each substance or natural being. Here lies another echo of Lucretius’s On the Nature of Things, where (in book III, ll. 359-369) he vigorously criticizes those philosophers who consider the senses as passive “gates” used by the soul.

As stated in chapter V.2 of De rerum natura (1586), the spirit is what allows animals to perceive the external world, so it moves sometimes with the whole body, sometimes with single parts thereof. Probably inspired by the Aristotelian tradition of such authors as Alexander of Aphrodisias (on Aristotle’s Meteor. IV.8, for example), Telesio claimed that the “homeomerous” parts of the body (skin, flesh, tissues, blood, bones, and so forth) are the same for animals and humans: they differ in their appetites and needs, not in their calculations (logoi), and, importantly, they all have the same kind of sense perceptions. Thus animal and human souls differ in degree, not in kind or quality.

Analogously, whereas Aristotle (in Meteor. book IV) asserted that all sensitive parts of the body must be homogeneous and a direct composition of the four elements, in Telesio’s view the variety of dispositions and functions of the different parts of the body had to be explained in the same way as that of the majority of natural bodies. In other words, the “homeomerous” mixtures cannot be considered the “ultimate” parts of the “anhomeomerous” bodies (organs such as the eye, the heart, the liver, the lungs, and so forth). Even though the spiritus is mostly present in the brain and in the nervous system, it is also spread throughout the entire body and, like the brain, is responsible for the motions, changes, and combinations of the different parts of the body. By way of the sensus, the dynamics of attraction and repulsion provide for the constant balance of the living body.

3. Cosmology

Telesio eschewed metaphysical speculation; in his view, the most important task of the natural philosopher is to attend to the observable phenomena of the natural world, looking for the causes of “sensible beings” (Telesio 1586, II.3). Thus it was in the spirit of the natural philosopher that he theorized that all natural things are the result of two active and mutually antagonistic forces, “heat” and “cold”, acting upon matter, and thereby making possible the creation of inanimate and animate beings. Heat is responsible for the phenomena of elasticity, warmth, dryness, combustion, and lightness, as well as for the rarefaction of matter and the motion and velocity of bodies; cold is responsible for the slowness of bodies in motion, and for their condensation, freezing, and hardness. All other natural phenomena (such as humidity or fermentation) are the results of combinations of different degrees of heat and cold. The interaction of heat and cold affects the nature of matter itself, a notion that Telesio intentionally left unclear. Taken per se, the concept of matter cannot be directly sensed, and its existence can only be postulated, just like the notion of spiritus.

In this way Telesio rejected Aristotle’s view, according to which the two pairs of opposed qualities (cold/heat and dry/humid), acting on matter, gave rise to the four primary elements of natural beings (earth, air, fire, and water). According to Telesio, matter, as a physical, corporeal subjectum, has a merely passive role. What matters for Telesio are the modifications of the subjectum, that is, the results of the interactions between heat and cold (Telesio 1586, II.2).

On Telesio’s view, all things act according to their own nature, starting from the primary forces of cold and heat, by means of the ability to perceive each other. In order to sustain themselves, these primary forces, and all beings which arise through their antagonistic interaction, must be able to perceive each other as opposite forces. In other words, they have to sense what is convenient and what is inconvenient or damaging for their own survival. Living bodies do not constitute a specific realm, separated from inanimate beings: they are all determined by solar heat and terrestrial matter. Again, it is important to note that sensation is not only a property of animate beings. Telesio’s philosophy can thus be described as a kind of pansensism: all beings, both animate and inanimate, are said to have the power of sensation. In fact, in the third edition (1586) of De rerum natura, the motion of celestial bodies is explained by means of the principle of “self-preservation” (conservatio sui), in other words, the need of those bodies to sustain their very life.

At the heart of Telesio’s cosmology, then, is the idea that nature is ruled by its own—internal, not external—principles. Consequently, the natural world does not need to be taken care of by any kind of divine intelligence. Heat and cold share the same “desire” to preserve themselves. The celestial spheres are made of matter, heat, and cold (Telesio 1586, I.11-12, 8-9). Telesio rejected the Ptolemaic system as unnatural, probably because of the growing suspicion—in 16th century cosmology—that it provided a mathematical device to “save the appearances,” leaving unexplained the question of the actual natural causes of the planetary motions, as well as of other celestial phenomena. Beginning with the first edition of De rerum natura, Telesio’s objective was to replace Aristotle’s geocentrism with one of his own. At the cosmological level, the interplay between heat and cold involves the position of the Sun and of the Earth, these being the seats and sources of heat and cold, respectively. Because of its heat, the Sun is propelled into perpetual motion, whereas the Earth is immobile because of its coldness and its great weight. Consequently, the cosmic balance and harmony of the heavens depend on the struggle and equilibrium between the Sun and the Earth. Unlike Aristotle, Telesio upheld the fiery nature of the heavens. That moved the philosopher of Cosenza to deny the Aristotelian principle of a first mover of the universe. Planetary motions are not the outcome of the patterns of motion between the several regions of the celestial spheres; rather, they are the consequence of a geocentric system ruled by thermal forces, wherein the ancient notions of densum and rarum still hold.

Thus, Telesio chose heat and cold as the principal agents for knowledge of the world because, together with prime matter (moles), they immediately affect bodies and their functions. As said before, the two primary bodies, the Sun and the Earth, are the subjects of Telesio’s argument: the former is the seat of heat, the latter that of cold (Telesio 1565, I.1-4). That statement literally expelled the idea of a creatio ex nihilo. Electing the Sun and the Earth as the celestial seats of heat and cold, Telesio defined the boundaries of the universe as the edges of the corporeal world (extrema corpora universi). Life itself depends on the right balance of heat and cold, which are ultimately called “forces of acting natures”, agentium naturarum vires (Telesio 1586, VII.9). The later Telesio, in fact, was firmly convinced that the world depended on the inner uniformity of nature and on its intrinsic virtue or “wisdom”.

Furthermore, against Aristotle, Telesio denied the theory of the locus as the limit of a body, taking up the atomistic theory of space as an empty place filled by bodies. By means of the two forces of heat and cold, and by affirming the idea of a space filled by matter, he abolished the Aristotelian theory of a cosmos divided into a sublunary world, in which generation and corruption take place, and a superlunary world with timeless regular movements. Moreover, he developed a critique of the Peripatetic theory of natural locus, pointing out that the Aristotelians did not adequately explain why the motion of heavy bodies becomes uniformly accelerated.

4. Influence and Legacy

With the publication of his early works (1565, 1566, 1570), Telesio established himself as a key figure in the intellectual milieu of late 16th and early 17th century Italy. Some of his theses were read, commented on, and debated by a number of Italian philosophers, physicians, and devotees of science, such as Francesco Patrizi, Antonio Persio, Agostino Doni, Giordano Bruno, Giambattista Vecchietti, Latino Tancredi, Tommaso Campanella, Andrea Chiocco, Giulio Cortese, Francesco de’ Vieri, Alessandro Tassoni, and Marco Aurelio Severino. In the early 17th century his writings circulated around Europe, and were read by Francis Bacon, Marin Mersenne, René Descartes, Pierre Gassendi, Jean-Cécile Frey, Charles Sorel, Walter Warner, Thomas Hobbes, and others.

One of the first authors to openly criticize the philosophy of the “Telesians” was Francesco de’ Vieri (1524-1591), lecturer in Aristotelian philosophy at the University of Pisa. In 1573 he published in Florence a work in the vernacular, the Trattato delle metheore, in three books. The same work, augmented with a huge fourth book of 200 pages and reissued in 1582, contains a survey of the principal topics of the fourth book of Aristotle’s Meteorologica; there, with the purpose of displaying his Platonic reading of Aristotle’s philosophy, he took occasion to attack the “Telesians”, with the aim of persuading them with “their own arguments” (p. 227). His critique of Telesio and of the Telesians is particularly significant because he offers a reassessment of the Aristotelian notion of sensus through the key reading of the Platonic concept of pneuma (a word belonging to the Stoic and pre-Aristotelian lexicon). As said above, Telesio translates the Greek word pneuma into the Latin expression spiritus or anima sentiens. Some pages after the aforementioned quotation, Verino states that God created souls as eternal beings (ab aeterno), because a soul is not grasped from the alteration of matter (p. 247). This is a clear reference to the Telesian idea of a spiritus grasped from a material seed (spiritus ex semine educta). In a manuscript kept at the National Library of Florence (Magl. XII.11, f. 23), the same author attacked Telesio and his followers, who erroneously attribute to the sensus “all judgments about the natural things”. It is important to recall that when Francesco de’ Vieri published the 1582 edition of his book, Telesio’s philosophy had already reached, in Tuscany and across Italy, the apex of its fame.

In 1587, a year before Telesio’s death, the Spanish philosopher Oliva Sabuco de Nantes y Barrera published a book, Nueva filosofía de la naturaleza del hombre, in which she elaborated a psychophysiology of the human body deeply influenced by Telesio’s doctrines (Bidwell-Steiner 2012). Then, in 1588 Francesco Muti published a work entitled Disceptationum libri V contra calumnias Theodori Angelutii in maximum philosophum Franciscum Patritium, in quibus pene universa Aristotelis philosophia in examen adducitur, in which he defended Telesio, taking into consideration the quarrel that had taken place at Ferrara during 1584 and 1585 between Patrizi and Angelucci (Sergio 2013, 71-72, 74). In 1589, Sertorio Quattromani, the re-founder of the “Accademia Cosentina,” composed a summary of Telesio’s thought called La filosofia di Bernardino Telesio ristretta in brevità e scritta in lingua Toscana (Naples, 1591).

In the last decade of the 16th century, an important role was played by Antonio Persio (1542-1612). Among Telesio’s disciples, Persio was the one who prepared the Venetian edition of the Varii de naturalibus rebus libelli (Apud Felicem Valgrisium, 1590). That edition included both the booklets already published in 1570 (De his, quae in aere fiunt et de terrae motibus; De colorum generatione; De mari) and a number of writings Telesio had left unpublished (De cometis et lacteo circulo; De iride; Quod animal universum ab unica animae substantia gubernatur; De usu respirationis; De coloribus; De saporibus; De somno). Some years later, one of Telesio’s former disciples, Giovan Paolo d’Aquino, published the Oratione in morte di Berardino [sic] Telesio Philosopho Eccellentissimo agli Academici Cosentini (Cosenza, per Leonardo Angrisano, 1596), the first biography of the philosopher of Cosenza.

As noted above, in 1591 Campanella wrote a vigorous defense of Telesian philosophy against Giacomo Antonio Marta’s Pugnaculum Aristotelis. In this work, Campanella took occasion to unfold and reassess the principles of Telesio’s naturalism, somehow anticipating (in his Praefatio) the basic essentials of Galileo’s methodology (above all, the alliance between the “sensate esperienze” and the “certe dimostrazioni”). Another early modern thinker to note, Alessandro Tassoni, devoted a number of pages of his works to Telesio’s meteorology (Trabucco 2019).

In the first decades of the 17th century, in Italy, Telesio’s ideas entered a wider scientific context, a constellation populated by a number of scientists interested in the so-called “mathematization of the world”, such as Galileo and the network of his disciples and correspondents. However, the new mathematical trend of natural philosophy did not eclipse Telesio’s merits and the scientific value of his work. Authors such as Latino Tancredi, Colantonio Stigliola, Marco Aurelio Severino, and Tommaso Cornelio continued to spread his thought. Especially in Southern Italy, Telesio’s name became the distinctive mark of a philosophical tradition dating back to the greatest authors of the ancient, pre-Aristotelian period, such as Pythagoras, Empedocles, Philolaus, Alcmaeon, Timaeus of Locri, and so forth.

Meanwhile, in England, Francis Bacon devoted some pages of his writings to Telesio: first in his Advancement of Learning (1605), then in De principiis atque originibus, secundum fabulas Cupidinis et Coeli (1613), and finally in his Sylva sylvarum. Bacon’s reading of Telesio’s philosophy mainly focused on the portrayal of Telesio as the restorer of Parmenides’s philosophy, freezing the Calabrian thinker in the role of an innovator who took inspiration from Eleatic monism for the setting of his materialistic world-view (Rees 1977, De Mas 1989, Bondì 2001, Garber 2016). At the same time, Bacon himself expressed some concerns about the limits of Telesio’s theory of matter. According to Lord Verulam, Telesio’s concept of matter remains unexplained as regards its specific function in the processes of generation and transformation of natural beings. However, Bacon admired such authors as Telesio, Cardano, and Della Porta with respect to the notion of spiritus, the power of imagination, and the sympathy between animate and inanimate objects (Gouk 1984). In that way, it is fair to say that Bacon contributed to the construction of the mythical conception of Telesio as a freethinker deeply indebted to the pre-Socratic tradition, which is not to say that the myth is altogether misleading (see Giglioni 2010: 70).

Back on the European continent, some 17th century traces of Telesio’s legacy can be found in such authors as Marin Mersenne (Quaestiones celeberrimae in Genesim, Paris, 1623); Gabriel Naudé (Apologie pour tous les grands personnages qui ont esté faussement soupçonnez de magie, Paris, 1625; Advis pour dresser une bibliotheque, Paris, 1627); Jean-Cécile Frey (Cribrum philosophorum qui Aristotelem superiore et hac aetate oppugnarunt, in Opuscula varia nusquam edita, Paris, 1646); Charles Sorel (Le sommaire des opinions les plus estranges des novateurs modernes en la philosophie comme de Telesius, de Patritius, de Cardan, de Ramus, de Campanelle, de Descartes, et autres, in De la perfection de l’homme, où les vrays biens sont considérez et spécialement ceux de l’âme, Paris, 1655; reprinted in La science universelle, vol. 4, 1668); Guy Holland (The grand prerogative of human nature, namely, the soul’s natural or native immortality, and freedom from corruption, London, 1653); and Pierre Gassendi (Syntagma philosophicum, in Opera Omnia, vol. I, Paris, 1658).

Another testimony of the role of Telesio’s legacy in 17th-century Naples is contained in Tommaso Cornelio’s Progymnasmata physica (Venetiis, 1663): compare the Progymn. II, De initiis rerum naturalium; the Epistola de Platonica Circompulsione; and the Epistola M. Aurelij Severini nomine conscripta (repr. Venetiis, 1683, pp. 41-42, 140, 144, 146, and 190-191).

In the French context, Pierre Gassendi was one of the most important authors to give attention to the Cosentine thinker. In his writings, such novatores as Telesio and Campanella are mentioned in regard to the theories of space and time, the theory of sensory qualities including heat and cold (Syntagma, in Opera, I, 245b), and Gassendi’s tripartite conception of the void—that is to say, the inane separatum, the idea of an infinite void expanding beyond the atmosphere; the inane disseminatum, the interparticle void between the basic corpuscles of bodies; and the inane coacervatum, the void “cobbled” together by experimental means (Opera I, 185a-187a, 192a-196a, 196b-203a). On the subject of the vacuum coacervatum, Gassendi held that there is no way to explain how bodies may divide and separate at the level of basic particles without the supposition of that kind of void; here, Gassendi evidently found insufficient Telesio’s explanation according to which heat and cold are the active principles of matter (for further details, see Fisher 2005, and Henry 1979).

Finally, a specific debt towards Telesio is also identifiable in Thomas Hobbes’s works. In the first chapter of Leviathan (1651), Hobbes openly rejected the doctrine of species, and in subsequent chapters he asserted a cohesive relationship between sense, imagination, and reasoning, consistent with the Telesian approach (a first trace of that influence dates back to the Elements of Law, Natural and Politic, written in 1640). What is more, the notion of “self-preservation” (conservatio sui) was reassessed in Hobbes’s anthropology. Telesio’s influence became more explicit in 1655 in Hobbes’s De corpore, sect. IV, chap. XXV. In the fifth article of that chapter, Hobbes unfolds the basic properties of sensation and cognition in the simplest structures of organized matter in motion. In the same place he provides a suggestion which allows us to place his materialism close to the Renaissance pansensism advanced by Telesio and Campanella. After explaining in general terms his physiology of sensation and animal locomotion, he stated:

I know there have been philosophers, and those learned men, who have maintained that all bodies are endued with sense. Nor do I see how they can be refuted, if the nature of sense be placed in reaction only. And, though by reaction of bodies inanimate a phantasm might be made, it would nevertheless cease, as soon as ever the object were removed. For unless those bodies had organs, as living creatures have, fit for the retaining of such motion as is made in them, their sense would be such, as that they should never remember the same (Hobbes 1656, XXV.5, p. 226. On the subject, see Schuhmann 1988: 109-133; Sergio 2007: 298-315).

5. References and Further Reading

a. Primary Sources

  • Telesio, Bernardino, 1565, De natura iuxta propria principia liber primus, et secundus (Romae, Antonium Bladum, 1565) – Ad Felicem Moimonam iris (Rome, Mattheus Cancer, 1566), ed. by R. Bondì, Rome, Carocci, 2011.
  • Telesio, Bernardino, 1570, De rerum natura iuxta propria principia, liber primus, et secundus, denuo editi – Opuscula (Neapoli, Josephum Cacchium, 1570), ed. by R. Bondì, Rome, Carocci, 2013.
  • Telesio, Bernardino, 1572, Delle cose naturali libri due – Opuscoli – Polemiche telesiane (Biblioteca Nazionale Centrale, Florence, Ms. Pal. 844, cc. 12r-204r; Cod. Magl. XII B 39), ed. by A. L. Puliafito, Rome, Carocci, 2013.
  • Telesio, Bernardino, 1586, De rerum natura iuxta propria principia, libri IX (Naples, Horatius Salvianus, 1586), ed. by G. Giglioni, Rome, Carocci, 2013.
  • Telesio, Bernardino, 1590, Varii de naturalibus rebus libelli ab Antonio Persio editi (Venice, F. Valgrisius, 1590), ed. by Miguel A. Granada, Rome, Carocci, 2012.
  • Telesio, Bernardino, 1981, Varii de naturalibus rebus libelli, ed. by L. De Franco, Florence, La Nuova Italia.

b. Secondary Sources

  • d’Aquino, Giovan Paolo, 1596, Oratione in morte di Berardino Telesio, philosopho eccellentissimo, Cosenza, Leonardo Angrisano.
  • Artese, Luciano, 1991, “Il rapporto Parmenide-Telesio dal Persio al Maranta,” Giornale Critico della Filosofia Italiana, 70: 15-34.
  • Artese, Luciano, 1994, “Bernardino Telesio e la cultura napoletana,” Studi Filosofici, 17: 91-110.
  • Artese, Luciano, 1998, “Documenti inediti e testimonianze su Francesco Patrizi e la Toscana,” Bruniana & Campanelliana, 4: 167-191.
  • Bacon, Francis, 1613, De principiis atque originibus, secundum fabulas Cupidinis et Coeli, in The Works of Francis Bacon, ed. by R. L. Ellis, J. Spedding, D. D. Heath, London, Longmans, 1858, vol. 5: 289-346.
  • Barbero, Giliola, Paolini, Adriana, 2017, Le edizioni antiche di Bernardino Telesio: censimento e storia, Paris, Les Belles Lettres.
  • Bianchi, Lorenzo, 1992, “Des novateurs modernes en philosophie: Telesio tra eruditi e libertini nella Francia del Seicento,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples, Guida: 373-416.
  • Bidwell-Steiner, Marlen, 2012, “Metabolism of the Soul. The Psychology of Bernardino Telesio in Oliva Sabuco’s Nueva filosofía de la naturaleza del hombre (1587),” in Blood, Sweat and Tears. The Changing Concepts of Physiology from Antiquity into Early Modern Europe, ed. by M. Horstmanshoff, H. King, C. Zittel, Leiden, Brill: 662-684.
  • Boenke, Michaela, 2005, “Psychologie im System des naturphilosophischen Monismus: Bernardino Telesio,” in Körper, Spiritus, Geist: Psychologie vor Descartes, München, Paderborn: 120-142.
  • Boenke, Michaela, 2013, “Bernardino Telesio,” in Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/entries/telesio/).
  • Bondì, Roberto, 2018a, Il primo dei moderni. Filosofia e scienza in Bernardino Telesio, Rome, Edizioni di Storia e Letteratura.
  • Bondì, Roberto, 2018b, “Dangerous Ideas: Telesio, Campanella and Galileo,” in Copernicus Banned. The Entangled Matter of the anti-Copernican Decree of 1616, ed. by N. Fabbri and F. Favino, Florence, Olschki, 1-27.
  • Campanella, Tommaso, 1622, Al Telesio Cosentino, in Scelta d’alcune poesie filosofiche (1622), n° 68 (available on-line in Archivio Tommaso Campanella, http://www.iliesi.cnr.it/ATC/testi.php?tp=1&iop=Scelta&pg=123).
  • De Franco, Luigi, 1995, Introduzione a Bernardino Telesio, Soveria Mannelli, Rubbettino.
  • De Frede, Carlo, 2001, Docenti di filosofia e medicina nella università di Napoli dal secolo XV al XVI, Naples, Lit. Editrice A. De Frede.
  • De Lucca, Jean-Paul, 2012, “Giano Pelusio: ammiratore di Telesio e poeta dell’«età aurea»,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore: 115-132.
  • De Miranda, Girolamo, 1993, “Una lettera inedita di Telesio al cardinale Flavio Orsini,” Giornale Critico della Filosofia Italiana 72: 361-375.
  • Ebbersmeyer, Sabrina, 2013, “Do Humans Feel Differently? Telesio on the Affective Nature of Men and Animals,” in The Animal Soul and the Human Mind. Renaissance Debates, ed. by C. Muratori, Pisa-Rome, Fabrizio Serra Editore, 97-111.
  • Ebbersmeyer, Sabrina, 2016, “Telesio’s Vitalistic Conception of the Passions,” in Sense, Affect and Self-Preservation in Bernardino Telesio (1509-1588), ed. by G. Giglioni and J. Kraye, Dordrecht, Springer.
  • Ebbersmeyer, Sabrina, 2018, “Renaissance Theories of the Passion. Embodied Minds,” in Philosophy of Mind in the late Middle Ages and Renaissance. The History of Philosophy of Mind, vol. 3, ed. by S. Schmid, London, Routledge, 185-206.
  • Firpo, Luigi, 1951, “Filosofia italiana e Controriforma. iv. La proibizione di Telesio,” Rivista di Filosofia, 42/1: 30-47 (see also 41, 1950: 150-173 and 390-401).
  • Fisher, Saul, 2005, Pierre Gassendi’s Philosophy and Science. Atomism for Empiricists, Leiden, Brill.
  • Fragale, Luca Irwin, 2016, “Bernardino Telesio in due inediti programmi giovanili,” in Microstoria e Araldica di Calabria Citeriore e di Cosenza. Da fonti documentarie inedite, Milan, The Writer, 11-32.
  • Garber, Daniel, 2016, “Telesio Among the Novatores: Telesio’s Reception in the Seventeenth Century,” in Early Modern Philosophers and the Renaissance Legacy, ed. by C. Muratori and G. Paganini, Dordrecht, Kluwer, 119-133.
  • Gaukroger, Stephen, 2001, Francis Bacon and the Transformation of Early-Modern Philosophy, Cambridge, Cambridge University Press.
  • Giglioni, Guido, 2010, “The First of the Moderns or the Last of the Ancients? Bernardino Telesio on Nature and Sentience,” Bruniana & Campanelliana 16: 69-87.
  • Gómez López, Susana, 2013, “Telesio y el debate sobre la naturaleza de la luz en el Renacimiento italiano,” in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 194-235.
  • Granada, Miguel Ángel, 2013, Telesio y las novedades celestes: la teoría telesiana de los cometas, and Telesio y la Vía Láctea, in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 116-149 and 150-193.
  • Hatfield, Gary, 1992, “Descartes’ physiology and its relation to his psychology,” in The Cambridge Companion to Descartes, ed. by J. Cottingham, Cambridge, Cambridge University Press, 335-370.
  • Henry, John, 1979, “Francesco Patrizi da Cherso’s Concept of Space and Its Later Influence,” Annals of Science, 36: 549-575.
  • Hirai, Hiro, 2012, “Il calore cosmico di Telesio fra il De generatione animalium di Aristotele e il De carnibus di Ippocrate,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 71-83.
  • Iovine, Maria Fiammetta, 1998, “Henry Savile lettore di Bernardino Telesio. L’esemplare 537.C.6 del De rerum natura 1570,” Nouvelles de la République des Lettres 17: 51-84.
  • Lattis, James M., 1994, Between Copernicus and Galileo: Christoph Clavius and the Collapse of Ptolemaic Cosmology, Chicago, University of Chicago Press.
  • Leijenhorst, Cees, 2010, “Bernardino Telesio (1509-1588): New fundamental principles of nature,” in Philosophers of the Renaissance, ed. by P. R. Blum, Washington, The Catholic University of America Press, 168-180.
  • Lerner, Michel-Pierre, 1986, “Aristote “oublieux de lui-même” selon Telesio,” Les Études philosophiques, 3: 371-389.
  • Lerner, Michel-Pierre, 1992, “Le ‘parménidisme’ de Telesio: Origine et limites d’une hypothèse,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples, Guida, 79-105.
  • Lupi, Walter F., 2011, Alle origini della Accademia Telesiana, Cosenza, Brenner.
  • Mandressi, Rafael, 2009, “Preuve, expérience et témoignage dans les «sciences du corps»,” Communications 84: 103-118.
  • Margolin, Jean-Claude, 1990, “Bacon, lecteur critique d’Aristote et de Telesio,” in Convegno internazionale di studi su Bernardino Telesio, Cosenza, Accademia Cosentina, 135-166.
  • Mulsow, Martin, 1998, Frühneuzeitliche Selbsterhaltung. Telesio und die Naturphilosophie der Renaissance, Tübingen, Max Niemeyer Verlag.
  • Mulsow, Martin, 2002, “Reaktionärer Hermetismus vor 1600? Zum Kontext der venezianischen Debatte über die Datierung von Hermes Trismegistos,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance. Dokumentation und Analyse der Debatte um die Datierung der hermetischen Schriften von Genebrard bis Casaubon (1567-1614), ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 161-185.
  • Ottaviani, Alessandro, 2010, “Da Antonio Telesio a Marco Aurelio Severino: fra storia naturale e antiquaria,” Bruniana & Campanelliana, 16/1: 139-148.
  • Ottaviani, Alessandro, 2012, “Telesio, Bernardino,” in Il Contributo italiano alla storia del Pensiero – Filosofia (2012) (http://www.treccani.it/enciclopedia/bernardino-telesio_(Il-Contributo-italiano-alla-storia-del-Pensiero:-Filosofia)/).
  • Plastina, Sandra, 2012, “Bernardino Telesio nell’Inghilterra del Seicento,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 133-143.
  • Pousseur, Jean-Marie, 1990, “Bacon, a Critic of Telesio,” in Francis Bacon’s Legacy of Texts: ‘The Art of Discovery Grows with Discovery’, ed. by W. Sessions, New York, AMS Press, 105-117.
  • Puliafito, Anna Laura, 2013, Introduzione a Telesio 1572, xxxiii-xlv.
  • Purnell, Fredrick, Jr., 2002, “A Contribution to Renaissance Anti-Hermeticism: The Angelucci-Persio Exchange,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance, ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 127-160.
  • Rees, Graham, 1977, “Matter Theory: A Unifying Factor in Bacon’s Natural Philosophy?,” Ambix 24: 110-125.
  • Schuhmann, Karl, 1988, “Hobbes and Telesio,” Hobbes Studies 1: 109-133.
  • Schuhmann, Karl, 2004, “Telesio’s Concept of Matter,” in Selected Papers on Renaissance Philosophy and on Thomas Hobbes, ed. by P. Steenbakkers and C. Leijenhorst, Dordrecht, Kluwer, 99-116.
  • Sciaccaluga, Nicoletta, 1997, “Movimento e materia in Bacone: uno sviluppo telesiano,” Annali della Scuola normale superiore di Pisa, classe di lettere e filosofia, Ser. 4, 2, 329-355.
  • Sergio, Emilio, 2007, “Campanella e Galileo in un «English Play» del circolo di Newcastle: «Wit’s Triumvirate, or the Philosopher» (1633-1635),” Giornale Critico della Filosofia Italiana, 86, 2, 298-315.
  • Sergio, Emilio, 2010, “Telesio e il suo tempo. Alcune considerazioni preliminari,” Bruniana & Campanelliana, 16, 1, 111-124.
  • Sergio, Emilio, 2013, Bernardino Telesio: una biografia, Naples, Guida.
  • Sergio, Emilio, 2014, “Bernardino Telesio (1509-1588),” in Galleria dell’Accademia Cosentina – Archivio dei filosofi del Rinascimento, vol. I, ed. by E. Sergio, Rome, CNR-ILIESI, 155-218.
  • Simonetta, Marcello, 2015, “Due lettere inedite del giovane Bernardino Telesio,” Bruniana & Campanelliana, 21, 2, 429-435.
  • Siraisi, Nancy G., 2011, “Giovanni Argenterio and Medical Innovation,” in Medicine and the Italian Universities, 1250-1600, Leiden, Brill, 329-355.
  • Spruit, Leen, 1995, “Bernardino Telesio,” in Species intelligibilis. From Perception to Knowledge, Leiden, Brill, vol. 2, 198-203.
  • Spruit, Leen, 1997, “Telesio’s reform of the philosophy of mind,” Bruniana & Campanelliana, 3: 123-143.
  • Spruit, Leen, 1998, Telesio’s Psychology and the Northumberland Circle, Durham Thomas Harriot Seminar, Occasional paper, Durham University, History of Education Project, 1-36.
  • Spruit, Leen, 2018, “Bernardino Telesio on Spirit, Sense, and Imagination,” in Image, Imagination, and Cognition. Medieval and Early Modern Theory and Practice, ed. by C. Lüthy, C. Swan, P. Bakker, C. Zittel, Brill, Leiden, 94-116.
  • Trabucco, Oreste, 2019, “Telesian Controversies on the Winds and Meteorology,” in Bernardino Telesio and the Natural Sciences in the Renaissance, ed. by P. D. Omodeo, Leiden, Brill.
  • Tutrone, Fabio, 2014, “The body of the soul. Lucretian echoes in the Renaissance theories on the psychic substance and its organic repartition,” Gesnerus, 71, 2, 204-236.


Author Information

Emilio Sergio
Email: es.disu@gmail.com
University of Calabria
Italy

Philosophy of Peace

Peace is notoriously difficult to define, and this poses a special challenge for articulating any comprehensive philosophy of peace. Any discussion of what might constitute a comprehensive philosophy of peace invariably overlaps with wider questions of the meaning and purpose of human existence. The definitional problem is, paradoxically, a key to understanding what is involved in articulating a philosophy of peace. In general terms, one may differentiate negative peace, that is, the relative absence of violence and war, from positive peace, that is, the presence of justice and harmonious relations. One may also refer to integrative peace, which sees peace as encompassing both social and personal dimensions.

Section 1 examines potential foundations for a philosophy of peace through what some of the world’s major religious traditions, broadly defined, have to say about peace. The logic for this is that throughout most of human history, people have viewed themselves and reality through the lens of religion. Sections 2 through 5 take an historical-philosophical approach, examining what key philosophers and thinkers have said about peace, or what might be ascertained as possible foundations for a philosophy of peace from their work. Section 6 examines some contemporary sources for a philosophy of peace.

Sections 7 through 15 are more exploratory in nature. Section 7 examines a philosophy of peace education, and the overlap between this and a philosophy of peace. Sections 8 through 15 examine a range of critical issues in thinking about and articulating a philosophy of peace, including the paradoxes and contradictions which emerge in doing so. Section 16 concludes with how engaging in the practice of philosophy may itself be a key to understanding a philosophy of peace, and indeed a key to establishing peace itself.

Table of Contents

  1. Religious Sources for a Philosophy of Peace
  2. Classical Sources for a Philosophy of Peace
  3. Medieval Sources for a Philosophy of Peace
  4. Renaissance Sources for a Philosophy of Peace
  5. Modern Sources for a Philosophy of Peace
  6. Contemporary Sources for a Philosophy of Peace
  7. The Philosophy of Peace Education
  8. The Notion of a Culture of Peace
  9. The Right to Peace
  10. The Problem of Absolute Peace
  11. Peace and the Nature of Truth
  12. Peace as Eros
  13. Peace, Empire and the State
  14. An Existentialist Philosophy of Peace
  15. Decolonizing Peace
  16. Concluding Comments: Philosophy and Peace
  17. References and Further Reading

1. Religious Sources for a Philosophy of Peace

It is logical that we should examine the theory of peace as set down in the teachings of some of the world’s major religious traditions, given that, for most of human history, people have viewed themselves and the world through the lens of religion. Indeed, the notion of religion as such may be viewed as a modern invention, in that throughout most of human history individuals have seen the spiritual dimension as integrated with the physical world. In discussing religion and peace, there is an obvious problem of the divergence between precept and practice, in that many of those professing religion have often been warlike and violent. Some writers, such as James Aho and René Girard, go further, and see religion at the heart of violence, through the devaluation of the present and through the notion of sacrifice. For the moment, however, we are interested in the teachings of the major world religions concerning peace.

In examining world religious traditions and peace, it is appropriate to begin with Indigenous spirituality. There are a number of ways in which such spirituality may provide grounds for a philosophy of peace, such as the notion of connectedness with the environment, the emphasis on a caring and sharing society, gratitude for creation, and the importance of peace within the individual. This is not to deny that Indigenous societies, as with all societies, may be extremely violent at times. Nor is it to deny that elements of Indigenous spirituality may be identifiable within other major world religious traditions. Yet many peace theorists look to Indigenous societies and Indigenous spirituality as a reference point for understanding peace.

Judaism enjoys prominence not merely as a world religion in its own right, and arguably the most ancient monotheistic religion in the world, but also as a predecessor faith for Christianity and Islam. Much of the contribution of Judaism towards theorizing on peace comes from the idea of an absolute deity, and the consequential need for radical ethical commitment. Within the Tanakh (Hebrew Scriptures), the Torah (Law) describes peace as an ultimate goal and a divine gift, although at times brutal warfare is authorized; the Nevi’im (Prophetic Literature) develops the notion of a future messianic era of peace, when there will be no more war, war-making or suffering; and the Ketuvim (Wisdom Literature) incorporates notions of inner peace into Judaism, such as the idea that a person can experience peace in the midst of adversity, and the notion that peace comes through experience and reflection.

Hinduism is a group of religious traditions geographically centered on the Indian sub-continent, which rely upon the sacred texts known as the Vedas, the Upanishads, and the Bhagavad Gita. There are a number of aspects of Hinduism which intersect with peace theory. Karma is a view of moral causality incorporated into Hinduism, wherein good deeds are rewarded and bad deeds punished, whether in this lifetime or the next. Karma thus presents a strong motivation for moral conduct, that is, for acting in accordance with the dharma, or moral code of the universe. A further element within Hinduism relevant to peace theory is the notion of the family of humankind; accordingly, there is a strong element of tolerance within Hinduism, in that the religion tolerates and indeed envelopes a range of seemingly conflicting beliefs. Hinduism also regards ahimsa, strictly speaking the ethic of doing no harm to others, and by extension compassion for all living things, as a virtue, and this virtue became central to the Gandhian philosophy of nonviolence.

Buddhism is a set of religious traditions geographically centered in Eastern and Central Asia, and based upon the teachings of Siddharta Gautama Buddha, although the absence of any specific deity has led some to question whether Buddhism ought to be considered a religion. The significance of Buddhism for peace lies in its elevation of ahimsa, that is, doing no harm to others, as a central ethical virtue for human conduct. It can be argued that the Buddhist ideal of the avoidance of desire is also an important peaceful attribute, given that desire of all descriptions is often cited as a cause of war and conflict, as well as a cause of the accumulation of wealth, which itself arguably runs counter to the creation of a genuinely peaceful and harmonious society.

Christianity is a set of monotheistic religious traditions, arising out of Judaism, and centered on the life and teachings of Jesus of Nazareth. The relationship of Christianity to a philosophy of peace is complex. Christianity has often emerged as a proselytizing and militaristic religion, and thus one often linked with violence. Yet there is also a countervailing undercurrent of peace within Christianity, linked to the teachings of its founder and to the fact that he exemplified nonviolence in his own life and death. Forgiveness and reconciliation are also dominant themes in Christian teaching. Some Christian theologians have begun to reclaim the nonviolent element of Christianity, emphasizing the nonviolence in the teaching and life of Jesus.

Islam constitutes a further set of monotheistic religious traditions arising out of Judaism, stressing submission to the will of the creator, Allah, in accordance with the teachings of the Prophet Muhammed, as recorded in the sacred text of the Holy Qur’an. As with Christianity, the relationship of Islam to a philosophy of peace is complex, given that Islam also has a history of sometimes violent proselytization. Yet the word Islam itself is cognate with the Arabic word for peace, and Islamic teaching in the Qur’an extols forgiveness, reconciliation, and non-compulsion in matters of faith. Moreover, one of the Five Pillars of Islam, Zakat, is an important marker of social justice, emphasizing giving to the poor.

There is an established scholarly tradition that interprets communism, the theory and system of social organization based upon the writings of Karl Marx and Friedrich Engels, as a form of nontheistic religion. Communist theory promises a peaceful future through the elimination of inequality and the emergence of an ideal classless society, with a just distribution of resources, no class warfare, and no international wars, given that war in communist theory is often viewed as the result of capitalist imperialism. Communism envisages an end to what Engels described as social murder, that is, premature deaths within a social class due to exposure to preventable yet lethal conditions.

Yet scholars such as Rudolph Rummel have suggested that communist societies have been the most violent and genocidal in human history. Idealism can be lethal. Others point to examples of peaceful communist societies. Importantly, scholars such as Noam Chomsky argue that, far from reflecting the ideals of Marx and Engels, communist societies of the twentieth century, in practice, betrayed those original ideals. Irrespective of this, the example of mass violence in communist societies suggests that a proper theory of peace must encompass not merely a goal or aspiration, but a way of life.

It is useful to enquire what commonalities we might discern among religious traditions regarding peace, and it seems fair to say that peace is usually viewed as the ultimate goal of human existence. For some religions, this is phrased in eschatological notions such as heaven or paradise, and in other religions this is phrased in terms of an ecstatic state of being. Even in communism, there is an eschatological element, through the creation of a future classless society. There is also an ethical commonality among the traditions, in that peaceful existence and actions are set forth as an ethical norm, notwithstanding that there are exceptions to this.

It is in defining and understanding the exceptions that there is a degree of complexity. There is also a common conflict between universalism and particularism within religious traditions, with particularistic emphases, such as in the notion of the Chosen People, arguably embodying the potential for exclusion and violence.

2. Classical Sources for a Philosophy of Peace

The writings of Plato (428/7-348/7 B.C.E.) would not normally be thought of as presenting a source for a philosophy of peace. Yet there are aspects of Plato’s work, based upon the teaching of Socrates, which may constitute such a source. Within his major work, the Politeia (Republic), Plato focuses on what makes for justice, an important element in any broad concept of peace. Plato, in effect, presents a peace plan based upon his city-state. This ideal society is essentially static, involving three distinct classes, yet it provides for at least an internally peaceful polis or state. Plato also develops a theory of forms or ideals, and it is not too difficult to see peace as one of those forms or ideals, such that, in contributing to the polis or state, we contribute to the development of that form or ideal. In his work the Nomoi (Laws), Plato enunciates the view that the establishment of peace and friendship constitutes the highest duty of both the citizen and the legislator, and in the Symposium, Plato articulates the idea that it is love which brings peace among individuals.

The writings of Aristotle (384-322 B.C.E.) similarly do not present an obvious reference point for a philosophy of peace. Yet there may be such a reference point in his development of virtue ethics, notably in the Ethica Nicomachea (Nicomachean Ethics). Virtue ethics may legitimately be linked to a philosophy or ethics of peace. The mean of each of the virtues described by Aristotle may be viewed as a quality conducive to peace. In particular, the mean of the virtue of andreia, usually translated as courage or fortitude, may be seen as similar to the notion of assertiveness, a quality which many writers see as important within nonviolence. Aristotle also identifies justice as a virtue, and many peace theorists emphasize the inter-relationship between peace and justice. Further, some writers have specifically identified peace or peacefulness as a virtue in itself. Interestingly, Aristotle sees the telos or goal of life as eudaimonia, or human flourishing, a concept similar to the ideals set forth in writing on a culture of peace.

3. Medieval Sources for a Philosophy of Peace

Saint Augustine of Hippo (354-430 C.E.) was both a bishop and theologian, and he is widely recognized for his capable integration of classical philosophy into Christian thought. His thought is often categorized as late Roman or early medieval. One element of Augustinian thought relevant to a philosophy of peace is his adaptation of the neo-Platonic notion of privation, that evil can be seen as the absence of good. It is an idea which resonates with notions of positive and negative peace, in that negative peace can be seen as the absence of positive peace. The notion of privation also suggests that peace ought to be seen as a specific good, and that war is the absence or privation of that good.

The best-known contribution of Augustine to a philosophy of peace, however, is his major work De civitate Dei (The City of God). Within this, Augustine contrasts the temporal human city, which is marked by violent conflict, with the eternal divine city, which is marked by peace. As with many religious writers, the ideal is peace. Augustine is also noteworthy for articulating the notion of just war, wherein Christians may be morally obliged to take up arms to protect the innocent from slaughter. However, this concession is by way of a lament for Augustine, a mark that Christians are living in a temporal and fallen world. That concession contrasts with the way that others have used just war theory, and in particular the work of Augustine, to justify and glorify war.

Saint Thomas Aquinas (ca.1225-1274) is perhaps best known for his attempt to synthesize faith with reason, for his popularization of Aristotelian thought, and for his focus on virtues. The significant contribution of Aquinas to a philosophy of peace is his major work Summa Theologica (Summary of Theology), and in particular the discussion on ethics and virtues in Part 2 of the work. At Question 29 of Part 2, Aquinas examines the nature of peace, and whether peace itself may be considered a virtue. Aquinas concludes that peace is not itself a virtue, but rather a work of charity (love). An important qualification, however, is that peace is also described as being, indirectly, a work of justice. We see here the inter-relationship of peace and justice, something taken up by contemporary peace theorists. Aquinas also refined the just war theory, including articulating the requirements of proper authority, just purpose, and just intent when resorting to war.

4. Renaissance Sources for a Philosophy of Peace

The Renaissance was a period of revival of learning in Europe, and it is often identified as a period of transition from the medieval to the modern. The Renaissance is also known for the growth of humanism, that is, an era involving the rediscovery of classical literature, an outlook focusing on human needs and on rational means to solve social problems, and a belief that humankind can shape its own destiny. One central human problem for humanists, and indeed for many thinkers, was and is the phenomenon of war, and Renaissance humanists refused to see war as inevitable and unchangeable. This in itself is an important contribution to a philosophy of peace. Renaissance humanism was not necessarily anti-religious, and indeed most of the humanist writers from this time worked from specifically religious assumptions. It can be argued that in the 21st century we are still part of this humanist project, an important part of which is to solve the problems of war and social injustice.

Erasmus of Rotterdam (ca.1466-1536), otherwise known as Desiderius Erasmus, is perhaps the foremost humanist writer of the Renaissance, and arguably also one of the foremost philosophers of peace. In numerous works, Erasmus advocated compromise and arbitration as alternatives to war. The connection between humanism and peace is perhaps best discernible in Erasmus’ 1524 work De libero arbitrio diatribe sive collatio (The Freedom of the Will), where Erasmus points out that if all that we do is predetermined, there is no motivation for improvement. The principle can apply to social dimensions as well. If everything is predetermined, then there is little point in attempting to work for peace. If we say that war and social injustice are inevitable, then there is little motivation to change. Further, saying that war and social injustice are inevitable serves as a self-fulfilling statement, since individuals will then tend not to do anything to challenge war and social injustice.

De libero arbitrio is also useful for pondering a philosophy of peace in that the work presents an example of the idea that peace is a means or method, and not merely a goal. Although Erasmus wrote the work in debate with Martin Luther, Erasmus avoids polemics, is reluctant to make assertions, strives for moderation, and is anxious to recognize the limitations of his argument. He points out in the Epilogue that parties to disputes will often exaggerate their own arguments, and it is from the conflict of exaggerated views that violent conflict arises. This statement was prophetic, given the religious wars which engulfed Europe following the Protestant Reformation.

However, the best-known peace tract from Erasmus is perhaps the adagium Dulce bellum inexpertis (War is Sweet to Those Who Have Not Experienced It). Erasmus is quoting from the Greek poet Pindar, and in this adagium he is, in effect, presenting a cultural view of war, namely that war is at least superficially attractive. The implication, although Erasmus does not develop this, is that there is an element to peace which lacks the emotive appeal of war. This is an insight which explains much of the complex relationship between war and peace. Later writers would explore this idea to advocate for a vision of peace which would embrace some of the moral challenges associated with war.

Sir Thomas More (1478-1535) was another leading humanist writer of the Renaissance, and a friend and correspondent of Erasmus. In his 1516 book De optimo rei publicae statu deque nova insula Utopia (On the Best Government and on the New Island Utopia), More outlines an ideal society based upon reason and equality. In Book One of Utopia, More articulates his concerns about both internal and external violence. Within Europe, and England in particular, there is senseless capital punishment, for instance in circumstances where individuals are only stealing to find something to eat and thus keep themselves alive. Further, there is a world-wide epidemic of war between monarchs, which debases the countries monarchs seek to lead. Book Two of Utopia provides the solution, with a description of an agrarian egalitarian society: there is no private property; the young are educated into pacifism; war itself is resorted to only for defensive reasons or to liberate the oppressed from tyranny; psychological warfare is preferred to battle; and there are no massacres and no destruction of cities. This utopian society suggested by More reflects a broad theory of peace. One of the interesting ramifications of More’s vision is whether such a peaceful society, and indeed peace, is ever attainable. The common meaning of the word “utopian” connotes a state of affairs which is not attainable, yet it seems unlikely More would have written his work if he, in common with other humanists of his era and since, did not have at least some belief that the principles he was putting forth were in some way attainable.

5. Modern Sources for a Philosophy of Peace

Thomas Hobbes (1588-1679) was both a writer and a political philosopher, whose writing was motivated by an overarching concern with how to avoid civil war, and the carnage and suffering resulting from it. He had observed this first-hand in England, and he famously articulated a statist view of peace as a contrast to the anarchy and violence of nature. In his two most noted works, De Cive (The Citizen) and Leviathan, Hobbes articulates a view that human nature is essentially self-interested, and thus the natural state of humankind is one of chaos. Hobbes also sees the essence of war as not merely the action of fighting, but a disposition to fight, and this exists only because there is a dearth of an overarching law-enforcing authority. The only way to introduce a measure of peace is therefore through submission of citizens to a sovereign, or, in more contemporary terminology, the state. Thus, a Hobbesian worldview is often taken to be pessimistic: it holds that the natural condition of humankind is one of violence, and that this violence inevitably predominates where there is no humanizing and civilizing impact of the state. Hobbes raises the important question of whether an overarching external authority is necessary for lasting peace to exist. If we accept that such an external authority is necessary for peace, then arguably we have the capacity to invent mechanisms to set such an external authority in place.

Baruch or Benedictus de Spinoza (1632-1677) was a Dutch philosopher, of Jewish background, who wrote extensively on a range of philosophical topics. His relevance for a philosophy of peace in general may be found in his advocacy of tolerance in matters of religious doctrine. It is notable also that in his Tractatus Politicus (Political Treatise), written 1675-6 and published after his death, Spinoza asserts: “For peace is not mere absence of war but is a virtue that springs from force of character”. This is a definition of peace that anticipates later expositions, especially those that see peace as a virtue, but also twentieth-century peace theory that differentiates positive from negative peace.

John Locke (1632-1704) is arguably one of the most influential contributors to modern philosophy. Like other philosophers of the time, Locke is important for advancing the notion of tolerance, most clearly in his 1689 Letter Concerning Toleration. The background to this was the destructive religious wars of the time, and Locke argues that this violence can be avoided through religious tolerance. Within the work of Locke one can also discern elements of the idea of the right to peace. Around 1680, Locke composed his Two Treatises of Government, and, in the second of these at Chapter 2, Locke argues that each individual has a right not to be harmed by another person, that is, a right to life, and that it is the role of political authority to protect this right. The right to life and the right not to be harmed arguably anticipate the later notion of the right to peace.

Jean-Jacques Rousseau (1712-1778) was a Genevan philosopher, and was both a leader and critic of the European Enlightenment. The idea of the noble savage, who lives at peace with his/her fellows and with nature, can be found in many ancient philosophers, although the noble savage is most often associated with the work of Rousseau. In his 1750 Discours sur les sciences et les arts (Discourse on the Sciences and the Arts), Rousseau posited that human morality had been corrupted by culture; in his 1755 Discours sur l’origine et les fondements de l’inégalité parmi les hommes (Discourse on the Origin and Foundations of Inequality Among Men), he posits that social and economic developments, especially private property, had corrupted humanity; in his 1762 work Du contrat social (The Social Contract), he posits that authority ultimately rests with the people and not the monarch; and in his Les Confessions (Confessions), completed around 1770, Rousseau extols the peace which comes from being at one with nature. Rousseau anticipates common themes in much peace theory, and especially the counter-cultural and alternative peace movements of the 1960s and 1970s, namely that peace involves a conscious rejection of a corrupting and violent society, a return to a more naturalistic and peaceful existence, and a respect for and affinity with nature. In short, Rousseau suggests that the way to peace is through a more peaceful society, rather than through systems of peace.

Immanuel Kant (1724-1804) is often seen as the modern philosopher who, in his universal ethics and cosmopolitan outlook, has provided what many argue is the most extensive basis for a philosophy of peace. The starting point for the ethics of Kant is duty, and, in particular, the duty to act so that what one does is consistent with what can reasonably be willed as universal, what Kant called the categorical imperative. Kant introduced this notion in his 1785 work Grundlegung zur Metaphysik der Sitten (Foundation of the Metaphysics of Morals), and developed it in his 1788 Kritik der praktischen Vernunft (Critique of Practical Reason). It has been argued by many, including Kant himself, that we have a duty to peace and a duty to act in a peaceful manner, in that we can only universalize ethics if we consider others, and this at the very least implies a commitment to peace.

A second important Kantian notion is that of das Reich der Zwecke, often translated as the realm or kingdom of ends. In Grundlegung zur Metaphysik der Sitten, Kant suggests an ethical system wherein persons are ends-in-themselves, and each person is a moral legislator. The notion has important implications for peace: if each person is obliged to regard others as ends-in-themselves, then each person is obliged not to engage in violence towards others, and thus has a responsibility to act in a peaceful manner. If all persons acted in this way, it would also mean that the phenomenon of war, wherein moral responsibility is surrendered to the state, would become impossible.

Finally, Kant’s 1795 essay Zum ewigen Frieden (On Perpetual Peace) is the work most often cited in discussing Kant and peace, and this work puts forward what some call the Kantian peace theory. Significantly, in this work Kant suggests more explicitly than elsewhere that there is a moral obligation to peace. For instance, Kant argues in the Second Definitive Article of the work that we have an “immediate duty” to peace. Accordingly, there is also a duty for nation-states to co-operate for peace, and indeed Kant suggests a range of ways that this can be achieved, including republicanism and a league of nations.  Importantly, Kant also suggests that the public dimension of actions, which can be understood as transparency, is important for international peace.

The work of Georg Wilhelm Friedrich Hegel (1770-1831) is contentious from the perspective of a philosophy of peace, as he holds what might be called a statist view of morality. Hegel sees human history as a struggle of opposites, from which new entities arise.  Hegel sees the state, and by this he means the nation-state, as the highest evolution of human society.  Critics, such as John Dewey and Karl Popper, have seen in Hegel a philosophical rationalization of the authoritarian and even totalitarian state. Yet the reliance on the state as an object of stability and peace does not necessarily mean acceptance of bellicose national policies.  Further, just as human organization is evolving, one could equally argue that evolution towards a supra-national state with the object of world peace may also be consistent with the organic philosophy of Hegel. It is possible to view Hegel as a source for a philosophy of peace.

6. Contemporary Sources for a Philosophy of Peace

William James (1842-1910) was a noted American pragmatist philosopher, and his 1906 essay ‘The Moral Equivalent of War’, originally an oration, was produced at a time when many who had experienced the destruction and loss of life of the American Civil War were still alive. James provides an interesting potential source for a pragmatist philosophy of peace. James argues that it is natural that humans should pursue war, as the exigencies of war provide a unique moral challenge and a unique motivating force for human endeavor. By implication, there is little value in moralizing about war, and moralizing about the need for peace. Rather, what is needed is a challenge which will be seen as an equivalent or counterpoint to war – in other words a moral equivalent of war.  The approach of James is consistent with the notion of positive peace, in that peace is seen to be something which embodies, or should embody, cultural challenges.

Mohandas Karamchand Gandhi (1869-1948) is widely regarded as the leading philosopher of nonviolence and intrapersonal peace. Through his life and teaching, Gandhi continually emphasized the importance of nonviolence, based upon the inner commitment of the individual to truth. Thus, Gandhi describes the struggle for nonviolence as truth-force, or satyagraha. Peace is not so much an entity or commodity to be obtained, nor even a set of actions or state of affairs, but a way of life. In Gandhism, peaceful means become united with and indistinguishable from peaceful ends, and thus the call for peace by peaceful means. The thought of Gandhi has been influential in the development of the intrapersonal notion of peace, that peace consists not so much in a set of conditions between those in power as in the inner state of the person. Gandhi is also noteworthy in that he linked nonviolence with economic self-reliance.

The philosopher Martin Buber (1878-1965) is well known for emphasizing the importance of authentic dialogue, which comes about when individuals recognize others as persons rather than entities.  In his influential 1923 book Ich und Du (I and Thou), Buber suggests that we only exist in relationship, and those relationships are necessarily of two types: personal relationships involving trust and reciprocity, which Buber characterized as Ich-Du, or I-Thou relationships; and instrumental relationships, involving things, which Buber characterized as Ich-Es, or I-It relationships.  The book was commenced during the carnage of World War One, and it is not too difficult to see the book as a philosophical reflection on the true nature of peace, in that peace involves dialogue with the other, with war constituting the absence of such dialogue.

There are commonalities between the philosophy of Buber and the ethics of care. Both indicate that we need to see the other as an individual and as a person, that is, we need to see the face of the other. If we recognize the other as human, and engage with them in dialogue, then we are less likely to engage in violence against others, and are more likely to seek social justice for others. It is also noteworthy that Buber emphasized that all authentic life involves encounter. Thus, if we are not engaging in dialogue with others, then we ourselves do not have peace, at least not in the positive and full construction of the concept.

Martin Luther King Jr. (1929-1968) is perhaps best known as a civil rights campaigner, although he also wrote and spoke extensively on peace and nonviolence. These ideals were also exemplified in his life. One could argue that King did not develop any new philosophy as such, but rather expressed ideas of peace and nonviolence in a uniquely powerful way. Some of the key themes articulated by King were the importance of loving one’s enemies, the duty of nonconformity, universal altruism, inner transformation, the power of assertiveness, the interrelatedness of all reality, the counterproductive nature of hate, the insanity of war, the moral urgency of the now, the necessity of nonviolence in seeking peace, the importance of a holistic approach to social change, and the notion of evil, especially as evidenced in racism, extreme materialism and militarism.

Gene Sharp (1928-2018) was also an important theorist of nonviolence and nonviolent action, and his work has been widely used by nonviolent activists. Central to his thought are his insights into the power of the state, notably that this power is contingent upon compliance by the subjects of a state. This compliance works through state institutions and through culture. From this, Sharp developed a program of nonviolent action, which works through subverting state power. Critics of Sharp argue that he was in effect a supporter of an American-led world order, especially as his program of nonviolent struggle was generally applied to countries not complying with US geostrategic priorities or with corporate interests.

Johan Galtung (1930 -) is widely recognized as the leading contemporary theorist on peace, and he is often described as the founder of contemporary peace theory. Galtung has approached the challenge of categorizing peace through describing violence, and specifically through differentiating direct violence from indirect or structural violence. From this distinction, Galtung has developed an integrated typology of peace, comprising: direct peace, where persons or groups are engaged in no or minimal direct violence against another person or group; structural peace, involving just and equitable relationships in and between societies; and cultural peace, where there is a shared commitment to mutual support and encouragement. More recently, a further dimension has been developed, namely, environmental peace, that is, the state of being in harmony with the environment.

The notions of positive and negative peace derive largely from the work of Galtung. Direct peace may be seen as similar to negative peace, in that this involves the absence of direct violence. Structural and cultural peace are similar notions to positive peace, in that these notions invite reflection on wider ideas of what we look for in a peaceful society and in peaceful interactions between individuals and groups. Similarly, an integrated notion of peace, involving personal and social dimensions of peace, derives substantially from Galtung, in that Galtung sees the notions of peace and war as involving more than an absence of violence between nation-states, which is what we often think of when we speak of a time of peace or a time of war.

The value of the various Galtungian paradigms is that these encourage thinking about the complex nature of peace and violence. Yet a problem with the Galtungian approach is that it can be argued to be too all-encompassing, and thus too diffuse. Peace researcher Kenneth Boulding summed up this problem by suggesting, famously, that the notion of structural violence, as developed by Galtung, is, in effect, anything that Galtung did not like. By implication, Galtung’s notion of peace can likewise be argued to be too general and too diffuse. Interestingly, Galtung has suggested that defining peace is a never-ending task, and indeed articulating a philosophy of peace might similarly be regarded as a never-ending exercise.

7. The Philosophy of Peace Education

In investigating a philosophy of peace, it is useful to examine writing on what might reasonably constitute a philosophy of peace education.  The reason is that when defining peace education, we are in effect defining peace, as the encouragement and attainment of peace is the ultimate goal of peace education. Just as peace is increasingly seen as a human right, so too peace education may be thought of as a human right.  Thus any philosophy of peace education is very closely linked with what might be seen as a philosophy of peace. For convenience, we can divide approaches to a philosophy of peace education into the deontological and non-deontological.

James Calleja has argued that the philosophical basis for peace education may be found in deontological ethics, that is, we have a duty to peace and a duty to teach peace. Calleja relies strongly on the work of Immanuel Kant in developing this argument, and, in particular, on the Kantian notion of the categorical imperative, and on the consequent categorical imperative of peace. The first formulation of the categorical imperative from Kant is that one should act in accordance with a maxim that one can will to be universal, that is, one should wish for others what one wishes for oneself. In effect, this can be seen as a philosophical basis for nonviolence and for universal justice, in that as we would wish for security and justice for ourselves, so too we ought to desire this for others.

James Page has developed an alternative philosophical approach to peace education, identifying virtue ethics, consequentialist ethics, conservative political ethics, aesthetic ethics and care ethics as potential bases for peace education. Equally, however, each of the above may also be argued to provide an ethical and philosophical basis for a general theory of peace. For instance, peace may be seen as a settled disposition on the part of the individual, that is, a virtue; peace may be seen as the avoidance of the destruction of war and social inequality; peace may be seen as the presence of just and stable social structures, that is, a social phenomenon; peace may be seen as love for the world and the future, that is, an aesthetic disposition; and peace may be seen as caring for individuals, that is, moral action.

8. The Notion of a Culture of Peace

The realization that peace is more than the absence of conflict lies at the heart of the emergence of the notion of a culture of peace, a notion which has been gaining greater attention within peace research in the late twentieth and early twenty-first centuries.  The notion was implicit within the UNESCO mandate, with the acknowledgment that since wars begin in human minds, it follows that the defense against war needs to be established in the minds of individuals. An extensive expression of this notion was set forth in the United Nations General Assembly resolution 53/243, the Declaration and Programme of Action on a Culture of Peace, adopted unanimously on 13 September 1999, which describes a culture of peace as a set of values, attitudes, traditions and modes of behavior and ways of life.  Article 1 of the document indicates that these are based upon a respect for life, ending of violence and promotion and practice of nonviolence through education, dialogue and cooperation.

Any attempt at a philosophy of a culture of peace is complex.  One of the challenges is that conflict is a necessary part of human experience and an important element in the emergence of culture.  Even if we differentiate violent conflict from mere social conflict, this does not solve the problem entirely, as human culture has still been very much dependent upon the phenomenon of war.  A more thorough solution is to admit that war and violence are indeed important factors in human experience and in the formation of human culture, and, rather than denying this, to attempt to seek and foster alternatives to war as a crucial motivating cultural factor for human endeavor, such as William James suggested in his famous essay on a moral equivalent of war.

9. The Right to Peace

Another emerging theme in peace theory has been the notion of peace as a human right. There is some logic to the notion of peace as a human right. The emergence of the modern human rights movement arose very much out of the chaos of global war and the emerging consensus that the recognition of human rights was the best way to establish and maintain peace.  The right to peace may arguably be found in Article 3 of the Universal Declaration of Human Rights, which posits the right to life, sometimes called the supreme right. The right to peace arguably flows from the right to life. This right to peace has been further codified with United Nations General Assembly resolution 33/73, the Declaration on the Preparation of Societies for Life in Peace, adopted on 15 December 1978;  with the United Nations General Assembly resolution 39/11, the Declaration of the Right of the Peoples of the World to Peace, adopted on 12 November 1984; and most recently with the United Nations General Assembly resolution 71/189, the Declaration on the Right to Peace, adopted on 19 December 2016.

In a lecture to the International Institute of Human Rights in 1979, Karel Vasak famously suggested categorizing human rights in terms of the motto of the French Revolution, namely, “liberté, égalité, fraternité.” Following this analysis, first generation rights are concerned with freedoms, second generation rights are concerned with equality, and third generation rights are concerned with solidarity. The right to peace is often characterized as a solidarity or third generation right. Yet one can take a wider interpretation of peace, for instance, that peace implies the right to development and the enjoyment of individual human rights. In this light, peace can be seen as an overarching human right. There does indeed seem to have been such an evolution in thinking about the human right to peace, in that it is gradually being interpreted to include other rights, such as the right to development.

In examining the philosophical foundations for a human right to peace it is useful to examine some of the philosophical bases for human rights generally, namely, interest theory, will theory, and pragmatic theory. Interest theory suggests that the function of human rights is to promote and protect fundamental human interests, and securing these interests is what justifies human rights. What are fundamental human interests? Security is generally identified as being a basic human interest. For instance, John Finnis refers to “life and its capacity for development” as a fundamental human interest, writing that “A first basic value, corresponding to the drive for self-preservation, is the value of life” (1980, p. 86). The best chance for self-preservation is that there be a norm of non-harm, which is an important element within a culture of peace. The right to peace therefore serves that basic need for life, both in the sense of protection from violence but also in serving the interests of a good life.

Will theory focuses on the capacity of individuals for freedom of action and the related notion of personal autonomy.  For instance, those such as Herbert Hart have argued that all rights stem from the equal right of all individuals to be free. Any right to personal freedom, however, contains an inherent limitation, in that one cannot logically exercise one’s own freedom to impinge upon another person’s freedom.  This is captured in the adage that my right to swing my fist ends at another person’s nose.  Why is that adage correct?  One answer is that within the notion of will theory there is an implicit endorsement of a right to peace, that is, not to harm or do damage to others.

The pragmatic theory of human rights posits that such rights simply constitute a practical way that we can arrive at a peaceful society. For instance, John Rawls suggests that the law of peoples, as opposed to the law of states, is a set of ideals and principles by which peoples from different backgrounds can agree on how their actions towards each other should be governed and judged, and through which peoples can establish the conditions of peace. This is not to deny those critics who point out that human rights can function as a rationale for the powerful to engage in collective violence, and that there can be a tension between human rights and national sovereignty. Thus, paradoxically, national sovereignty can sometimes serve to promote and provide peace, and human rights can sometimes be used to underwrite violence.

The importance of the human right to peace is perhaps best summed up by William Peterfi, who has described peace as a corollary to all human rights, such that “without the human right to peace no other human right can be securely guaranteed to any individual in any country no matter the ideological system under which the individual may live” (1979, p.23).  The notion of the human right to peace also changes the nature of discourse about peace, from something to which individuals and groups might aspire, to something which individuals and groups can reasonably demand.  The notion of the human right to peace also changes the nature of the responsibility of those in positions of power, from a vague aspiration that those in power need to provide for peace, to the expectation and duty that those in power will provide peace.

10. The Problem of Absolute Peace

Given the challenges of defining peace, the philosophical problem of peace may be phrased in terms of a question: is there any such thing as absolute peace? Or ought we be satisfied with an imperfect peace?  For instance, can there ever be a complete elimination of all forms of armed conflict, or at least the elimination of reliance on armed force as the ultimate means of enforcement of will? Similarly, one may ask: is there any such thing as absolute co-operation and harmony between individuals and groups, an absolute sense of well-being within individuals, and an absolute oneness with the external environment?

The philosophical solution to this problem may be to point out that there is always an open-ended dimension to peace, that is, if we take a broad interpretation of peace, we will always be moving towards such a goal. Some might articulate this as the eschatological dimension of peace, suggesting that the contradictions which are raised in any discussion on peace can only be resolved, ultimately, at the end of time. It is relevant to note, however, that peace theorists have pointed out that if we assert that a certain outcome, such as peace, is not attainable, our actions will serve to make this a self-fulfilling prophecy.  In other words, if we assert that peace, relative or absolute, is not attainable, then there will be a reduced expectation of this, and a reduced commitment to making this happen.

11. Peace and the Nature of Truth

It is worthwhile looking at the relationship of the theory of peace to the theory of truth. The relationship can be seen to operate at a number of levels. For instance, Mohandas Gandhi described his theory of nonviolence as satyagraha, often translated as truth force.  Similarly, Gandhi entitled his autobiography ‘The Story of My Experiments with Truth’.  Gandhi saw nonviolence, or ahimsa, as the noblest expression of truth, or sat, and argued there is no way to find truth except through nonviolence. For Gandhi, peace was not merely an ideal, rather it was based on what he saw as the truth of the innate nonviolence of individuals, which the institutions of war and imperialism distorted. Further, peace involves authenticity, a notion related to truth, in that the person involved in advocating peace ought to themselves be peaceful. We thus arrive at the Gandhian dictum that there is no way to peace as such, rather peace is the way, that is, peace is an internal life-style commitment on the part of the individual.

Conversely, war arguably operates as a form of untruth. This was summed up succinctly by Erasmus, in his dictum that war is sweet to those who have not experienced it. In 1985, Elaine Scarry wrote that the mythology of war obscures what war is actually about, namely, the body in pain. Similarly, Morgan Scott Peck has written about a lack of truthfulness, especially in war, as being the essence of evil. Typically, those advocating war will concede that the recourse to war is not a good option, but suggest that there is no other option, or that war is the least bad option. The empirical history of nonviolence suggests that this is not the case, and that there are almost always alternatives to violence.

If peace is about establishing societies with harmonious and cooperative relationships, then a key component in establishing such societies is arguably knowledge about ourselves, or accepting the truth about ourselves.  Without this, it is unlikely that we will be able to establish peaceful societies, as we will not have resolved the inclinations to violence within ourselves. The notion of what constitutes the true self, or the truth about one’s self, is a complex one.  Carl Gustav Jung usefully wrote about the shadow or the normally unrecognized side of one’s character.  The extent to which the shadow side of our personality can result in participation in and support for violence can be shocking to us.  This is not to say that human nature is irretrievably attracted to violence or cruelty. For instance, the Seville Statement on Violence, sponsored by UNESCO, argues that war is a human invention.  Yet there is a strong argument that peace involves recognition of the potential within one’s self for violence.  Put another way, peace involves peace with one’s self.

12. Peace as Eros

In the work of Sigmund Freud, and especially in his 1930 work Das Unbehagen in der Kultur (Civilization and its Discontents), Eros is the life instinct, which includes sexual instincts and the will to live and survive. The nominal opposite of the life instinct is the death instinct, the will towards death, which later theorists described as Thanatos. Freud developed his theory of competing drives in his therapeutic dealings with soldiers from World War One, many of whom were suffering from psychological trauma as a result of their war experiences. It is not too difficult to see Eros as a synonym for peace, in that peace involves all that Eros represents. The psychoanalyst and peace activist Erich Fromm developed this theme further, writing of biophilia, the love of life, from which all peace comes, and necrophilia, the love of death and destruction, which is the basis of war.

Even if we acknowledge a link between the death instinct and war, the relationship between the life instinct and the death instinct is not simple. Freud wrote of the basic desire for death seemingly competing with the desire for life. Yet the two instincts may also be viewed as complementary. It is because we are all aware, at least subconsciously, of our impending mortality, that we are driven to risk death, especially in the enterprise we call war. Many writers have explored this complexity. For instance, the psychiatrist Elisabeth Kübler-Ross writes: “Is war perhaps nothing else but a need to face death, to conquer and master it, to come out of it alive—a peculiar form of denial of our own mortality?” (2014, p.13).

If we think of Eros as peace, then a logical extension is to think of human sexuality and the expression of human sexuality as one embodiment of peace. The post-Freudians Herbert Marcuse and Wilhelm Reich both developed this theme, arguing that the origins of war and unjust social organization rested in repressed sexual desire, and that conversely peace implies sexual freedom. This idea was neatly summed up in the 1960s radical slogan, “Make love not war”. An important qualification to the peace-as-sexuality theory is that it concerns only consensual sexual relationships. Many writers have identified rape and other exploitative sexual relationships as important components of war and social injustice.

13. Peace, Empire and the State

In considering a philosophy of peace, the phenomenon of empire presents a paradox for peace theory. The establishment of an empire may be seen as establishing a form of peace. It is thus common to refer to the Pax Romana, the form of peace established by virtue of the Roman Empire, and to the Pax Britannica, Pax Sovietica, and Pax Americana of later periods of empire. Within empires, it can be argued, there is no war, at least not in the conventional sense. Critics of imperialism, however, point out that violence is moved to the periphery of the empire, that there is the problem of inter-imperial rivalry, and that empires frequently engage in the violent suppression of minorities within their borders.

Similarly, the phenomenon of the state presents a paradox for peace theory. The establishment of a stable state generally means that citizens can live and work free from violence, and ideally, at least in democratic states, within a framework of social justice. Yet, as the sociologist Max Weber famously pointed out, it is in the very nature of the state that it claims a monopoly over the legitimate use of violence. The legitimate use of violence finds its ultimate expression in the phenomenon of war. Thus, anarcho-pacifists argue that if one wants to eliminate war, then one needs to eliminate the state, at least in its current nation-state form.

14. An Existentialist Philosophy of Peace

Existentialism may be defined in philosophical terms as the view that truth cannot be objectified, but rather can only be experienced. This is not to deny the objective reality of an entity, but rather to say that the limitations of language are such that this reality cannot be objectified. We can apply this to a philosophical analysis of peace, and suggest that ultimately peace cannot be objectified, but rather can only be experienced. Thus, attempts to specify what peace is are likely to be problematic. Rather, we can represent peace by way of illustration, to say that peace involves a set of behaviors and attitudes, and we can represent peace by way of negation, to say that peace is not deliberate violence to other persons. Or we can say, in true existentialist fashion, that we can only know peace through encounter or relationship.

Another way of articulating the idea of existentialist peace is by referring to the metaphysics of peace. The existentialist theologian John Macquarrie writes: “By a metaphysical concept, I mean one the boundaries of which cannot be precisely determined, not because we lack information but because the concept itself turns out to have such depth and inexhaustibility that the more we explore it, the more we see that something further remains to be explored” (1973, p.63), and further: “If peace … is fundamentally wholeness, and if metaphysics seeks to maximize our perception of wholeness and inter-relatedness, then peace and metaphysics may be more closely linked than is sometimes supposed; while, conversely, the fragmented understanding of life may well be connected with the actual fracturing of life itself, a fracturing which is the opposite of peace. But the true metaphysical dimensions of peace emerge because even to seek a wholeness for human life drives us to ask questions which take us to the very boundaries of understanding. What is finally of value? What is real and what is illusory? What conditions would one need to postulate as making possible the realization of true peace?” (1973, p.64).

15. Decolonizing Peace

Postcolonial theory posits, in general terms, that not only has global colonial history determined the shape of the world as we know it today, but the power relationships implicit in colonialism have determined contemporary thinking. Thus, the powerless tend to be marginalized in contemporary thinking. Some writers, such as Victoria Fontan, have suggested there is a need to decolonize peace theory, including taking into account the everyday experience of ordinary people, transcending liberal peace theory which tends to assume the legitimacy of power, and transcending the view that the Global North needs to come to the rescue of the Global South. Thus the discourse on peace, so it is argued, needs to be less Eurocentric. The argument is that the narrative of peace needs to change.

Postcolonial peace theory intersects with much feminist peace theory, represented by writers such as Elise Boulding, Cynthia Enloe, Nel Noddings, and Betty Reardon. The suggestion is often made by such theorists that a feminine or maternal perspective is uniquely personal, caring and peace-oriented. The corollary to this is that a male perspective tends to be less personal, less caring, and more war-centric. Feminist peace theorists have also pointed out that war and militarism work on patriarchal assumptions, such as that women need protecting and that it is the duty of men to protect women, and that there is no alternative to the current system of security through power and domination. The argument is also made that war and patriarchy are part of the same system.

Postcolonial and feminist peace theory are highly contested. For instance, it can be argued that, as current philosophical discourse has evolved from European origins, articulating peace in terms of concepts articulated by European authors is merely a matter of utilizing this global language. Similarly, one can argue that, since it is a historical reality that most influential philosophers have hitherto been male, the existing narrative will naturally tend to have more male sources and male voices. One can arguably apply a quota system to some areas, such as contemporary politics, but it is more difficult to argue that a quota system ought to be applied to narrative and to discourse. Critics also allege that postcolonial peace theory tends to avoid universalist statements on human rights, an omission which matters, given the key role of human rights in peace, and given the emerging human right to peace itself.

16. Concluding Comments: Philosophy and Peace

One interesting way to address the issue of a philosophy of peace is to think of war as representing the absence of philosophy, in that war is prosecuted on the assumption that one person or group itself possesses truth, and that the views of that individual or group ought to be imposed, if necessary, by violent force.  War may also be seen as the absence of philosophy in that war represents an absence of the love of wisdom.  This is not to deny there are philosophies and philosophers who justify war and injustice.  Ultimately, however, these philosophies are not sustainable, as war is an institution which involves destruction of both the self and societies.  Similarly, social injustice is not sustainable, as within social injustice we find the seeds of war and destruction.

Conversely, it can be argued that philosophy itself represents the presence of peace, in that philosophy generally does not or should not involve assumptions that one person or group by itself uniquely possesses truth, but rather the way to truth is through a process of questioning, sometimes called dialectic. Therefore, philosophy by its essence is or should be a tolerant enterprise, and it is also an enterprise which involves or should involve debate and discussion. Philosophy thus presents a template for a peaceful society, wherein differing viewpoints are considered and explored, and which, through the love of wisdom, encourages thinking and exploring about positive and life-enhancing futures. This means that engaging in philosophy may well be a useful start to a peaceful future.

17. References and Further Reading

  • Aho, J. (1981) Religious Mythology and the Art of War.  Westport: Greenwood.
  • Aquinas (1964-1981) Summa Theologiae: Latin Text and English Translation. (T. Gilby and others, Eds.) Cambridge: Blackfriars, and New York: McGraw-Hill.
  • Aristotle (1984) The Complete Works of Aristotle. The Revised Oxford Translation. (J. Barnes, Ed.) Princeton: Princeton University Press.
  • Aron, R. (1966) Peace and War: A Theory of International Relations. (R. Howard and A.B. Fox, Transl.) London: Weidenfeld and Nicolson.
  • Augustine (1972) Concerning the City of God against the Pagans. (H. Bettenson, Transl.) Harmondsworth: Penguin.
  • Boulding, E. (1988) Building a Global Civic Culture: Education for an Interdependent World. San Francisco: Jossey-Bass.
  • Boulding, E. (2000) Cultures of Peace: The Hidden Side of History.  Syracuse: Syracuse University Press.
  • Boulding, K. (1977) Twelve friendly quarrels with Johan Galtung. Journal of Peace Research. 14(1): 75-86.
  • Buber, M. (1984) I and Thou. (R. Gregor-Smith, Transl.) New York: Scribner.
  • Calleja, J.J. (1991) A Kantian Epistemology of Education and Peace: An Evaluation of Concepts and Values. PhD Thesis. Bradford: Department of Peace Studies, University of Bradford.
  • Chomsky, N. (2002) Understanding Power: The Indispensable Chomsky. (P.R. Mitchell and J.Schoeffel, Eds.). New York: The New Press.
  • Ehrenreich, B. (1999) Men Hate War, Too. Foreign Affairs 78 (1): 118–22.
  • Enloe, C. (2007) Globalization and Militarism: Feminists Make the Link. Lanham: Rowman and Littlefield.
  • Erasmus, D. (1974) Collected Works of Erasmus. Toronto: University of Toronto Press.
  • Finnis, J.  (1980) Natural Law and Natural Rights. Oxford: Clarendon Press; New York: Oxford University Press.
  • Fontan, V.C. (2012) Decolonizing Peace. Lake Oswego: Dignity Press.
  • Galtung, J. (2010) Peace, Negative and Positive. In: N.J. Young (Ed.). The Oxford Encyclopedia of Peace. (pp. 352-356). Oxford and New York: Oxford University Press.
  • Galtung, J. (1996) Peace by Peaceful Means. London: SAGE Publications.
  • Gandhi, M.K. (1966) An Autobiography: The Story of my Experiments with Truth. London: Jonathan Cape.
  • Girard, R. (1977) Violence and the Sacred. (P. Gregory, Transl.) Baltimore: Johns Hopkins University Press.
  • Hobbes, T. (1998) On the Citizen (R.Tuck and M.Silverthorne, Eds.) Cambridge: Cambridge University Press.
  • Hobbes, T. (1994) Leviathan (E. Curley, Ed.) Indianapolis: Hackett.
  • Kant, I. (1992-) The Cambridge Edition of the Works of Immanuel Kant. (P. Guyer and A. Wood, Eds.) Cambridge: Cambridge University Press.
  • King, M.L. (1963) Strength to Love. Glasgow: Collins.
  • Kübler-Ross, E. (2014) On Death and Dying. New York: Scribner.
  • Locke, J. (1988) Two Treatises of Government. (P. Laslett, Ed.) Cambridge: Cambridge University Press.
  • Locke, J. (2010) A Letter Concerning Toleration and Other Writings. (M. Goldie, Ed.) Indianapolis: Liberty Fund.
  • Macquarrie, J. (1973) The Concept of Peace. London: SCM.
  • More, T. (1999) Utopia. (D. Wootton, Ed.) Cambridge: Hackett Publishing.
  • Noddings, N. (1984) Caring: A Feminine Approach to Ethics and Moral Education. Berkeley: University of California Press.
  • Page, J.S. (2008) Peace Education: Exploring Ethical and Philosophical Foundations. Charlotte: Information Age Publishing.
  • Page, J.S. (2010) Peace Education. In: E. Baker, B. McGaw, and P. Peterson (Eds.) International Encyclopedia of Education. (Volume 1, pp. 850–854). Oxford: Elsevier.
  • Page, J.S. (2014) Peace Education. In: D. Phillips (Ed.) Encyclopedia of Educational Theory and Philosophy. (Volume 2, pp. 596-598). Thousand Oaks: Sage Publications.
  • Peck, M.S. (1983) People of the Lie. New York: Simon and Schuster.
  • Peterfi, W. (1979) The Missing Human Right: The Right to Peace. Peace Research, 11(1): 19-25.
  • Plato (1997) Plato: Complete Works. (J. Cooper and D. Hutchinson, Eds.) Indianapolis: Hackett.
  • Rawls, J. (1999) The Law of Peoples. Cambridge: Harvard University Press.
  • Reardon, B. (1993) Women and Peace: Feminist Visions of Global Security.  Albany: State University of New York Press.
  • Roche, D. (2003) The Human Right to Peace. Toronto: Novalis.
  • Rousseau, J. (1990-2010) Collected Writings. (R. Masters and C. Kelly, Eds.) 13 volumes. Hanover: University Press of New England.
  • Rummel, R. (1994) Death by Government. New Brunswick: Transaction Press.
  • Scarry, E. (1985) The Body in Pain. New York and London: Oxford University Press.
  • Spinoza, B. (2002) Baruch Spinoza: The Complete Works. (M.L. Morgan, Ed., S. Shirley, Transl.) Indianapolis: Hackett.
  • Watson, P.S. and Rupp, E.G. (Eds.) (1969) Luther and Erasmus: Free Will and Salvation. London: SCM Press.


Author Information

James Page
Email: jpage8@une.edu.au
University of New England
Australia

David Lewis (1941–2001)

David Lewis was an American philosopher and one of the last generalists, in the sense that he was one of the last philosophers who contributed to the great majority of sub-fields of the discipline. He made central contributions in metaphysics, the philosophy of language, and the philosophy of mind. He also made important contributions in probabilistic and practical reasoning, epistemology, the philosophy of mathematics, logic, the philosophy of religion, and ethics, including metaethics and applied ethics. He published four monographs and over one hundred articles.

Lewis’s contributions in metaphysics include foundational work in the metaphysics of modality, in particular his peculiar view, concrete modal realism. He also developed influential views about properties, dispositions, time, persistence, and causation. In the philosophy of language, he made important contributions to our understanding of conditionals—counterfactuals in particular. He also developed an influential account of what it is for a group of individuals to use a language, based on his similarly influential account of what it is for a group of individuals to adopt a convention. In the philosophy of mind, Lewis gave an important defense of the mind-brain identity theory, and also developed an account of mental content based on his metaphysics of properties and modality.

This article discusses in detail only Lewis’s best-known and most influential views and arguments in metaphysics, the philosophy of language, and the philosophy of mind. His views on metaphysics are discussed first, but his views on language and mind are no less influential. The focus is on representative examples of his most important views and arguments concerning particular issues. The article begins with a few short remarks about his biography, and it ends with a discussion of some of his other philosophical contributions.

Table of Contents

  1. Life
  2. Modality
  3. Properties
  4. Time and Persistence
  5. Humean Supervenience
  6. Causation
  7. Counterfactuals
  8. Convention
  9. Mind
  10. Other Work and Legacy
  11. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Further Reading

1. Life

David Kellogg Lewis was born in 1941 in Oberlin, Ohio. He did his undergraduate studies at Swarthmore College in Pennsylvania. He studied abroad for a year in Oxford, where he was tutored by Iris Murdoch, and where he had the opportunity to attend lectures by J. L. Austin. These experiences inspired him to major in philosophy when he returned to Swarthmore. He did his Ph.D. at Harvard, studying under W. V. O. Quine, who supervised his dissertation, which was the basis of his first book, Convention (1969). There he met his wife Stephanie, with whom he ultimately co-authored three papers. He worked at UCLA from 1966 to 1970, moving from there to Princeton, where he remained until his death in 2001. He spent a lot of time visiting and working in Australia from 1971 onward. As a result, his work was deeply influenced by a number of Australian philosophers, and, in turn, his work has made an indelible mark on analytic philosophy in Australia.

2. Modality

If you are looking for what Lewis had to say about modality, you most likely want to learn about his well-known but rather idiosyncratic view, concrete modal realism. The study of modality is the study of the meanings of expressions like ‘necessarily’ and ‘possibly’. One can assert that Socrates was a blacksmith, which is, of course, false. But one can also assert something weaker, that, possibly, Socrates was a blacksmith (that is, Socrates could have been a blacksmith). Or one can assert something stronger, that, necessarily, he was a blacksmith (that is, he could not have failed to be a blacksmith). There are different senses of the words ‘necessarily’ and ‘possibly’. One is related to what someone knows. Perhaps you are unsure whether Socrates was a philosopher. You might say that Socrates could have been a philosopher, meaning that, for all you know, Socrates was a philosopher (that is, nothing you know contradicts it). Or perhaps you are certain that he was a philosopher, in which case you might simply say that Socrates was a philosopher. Or you might say something stronger—that Socrates must have been a philosopher (that is, what you know contradicts his not having been a philosopher). This sort of modality is epistemic modality. The sort of modality Lewis was most concerned with in his development of concrete modal realism is alethic modality, and concerns how things might have been, or how things must be, regardless of what anyone thinks or knows about it.

One of the central questions in the study of (alethic) modality is what ‘necessarily’ and ‘possibly’ mean. Most discussions of modality are framed in terms of modal logic, which is a formal language that extends propositional or first-order logic by adding the modal operators ‘necessarily’ and ‘possibly’, abbreviated by ‘□’ (the box) and ‘◇’ (the diamond). One approach to the question of what the modal operators mean is simply not to answer it, and to take them as primitive, that is, to take their meanings to be unanalyzable. But, the reader might think, this is not all that satisfying an approach to take. And Lewis would agree. One of the first things he does in his seminal work on concrete modal realism, On the Plurality of Worlds (1986b), hereafter ‘Plurality’, is to argue that the modal operators should not be taken to be primitive, but instead should be given some sort of analysis in non-modal terms. In the mid-20th century, logicians developed semantics for a variety of systems of modal logic. These semantics provide truth conditions for the box and diamond in terms of mathematical objects which came to be called ‘possible worlds’, since they were naturally interpretable as ways that the world could have been. Trump won the 2016 U.S. presidential election. But it could have been otherwise. He could have lost. Imagine that Trump lost the 2016 election, and that as few other facts as possible are different in order for that to have happened. What you are imagining is a possible world. The basic idea behind any possible-worlds-based analysis of the modal operators is rather simple. One can state the conditions in which sentences involving the modal operators are true in terms of possible worlds, by quantifying over worlds with quantifiers that behave exactly like the universal and existential quantifiers of standard first-order logic, as follows:

\Box p =_{df} for every possible world w, p is true at w.

\Diamond p =_{df} for some possible world w, p is true at w.

So a statement is necessarily true if it is true at every possible world, and possibly true if it is true at at least one possible world. It is actually true if it is true at the actual world (that is, the possible world which we inhabit).
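
For illustration, consider a toy model (an invented example, not one from Lewis’s text) containing just three worlds, w1, w2, and w3, where w1 is the actual world. Suppose p is true at w1 and w2 but false at w3, and q is true at all three. Then, by the definitions above, \Diamond p is true (p is true at some world), \Box p is false (p is false at w3), and \Box q is true (q is true at every world). And p is actually true, since it is true at the actual world w1.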

Lewis was not the first to interpret the objects quantified over in these analyses, and at which propositions are true (and false), as possible worlds. Thus he was not the first to admit possible worlds into his ontology. What sets him apart from many of those who came before was how he conceived of possible worlds. Typically, worlds were thought of as abstract objects, for example, as maximal consistent sets of sentences of some interpreted language (1986b: 142 ff.). A maximal set of sentences is one that contains, for every sentence p, either p or its negation. A consistent set of sentences is one which does not imply a contradiction. So, {grass is green, grass is not green} is not consistent. Nor is {grass is green, if grass is green then snow is white, snow is not white}. However, {grass is green, snow is white} is consistent, though not maximal. For Lewis, a possible world is not some abstract object like a set of sentences. Instead, it is something akin to our own world—a continuum of spacetime filled with objects of various sorts, like the ones we ourselves are surrounded by—galaxies, stars, mountains, people, chairs, atoms, and so forth. Possible worlds, for Lewis, are concrete, just like this world in which we find ourselves. Strictly speaking, modal realism is just the view that possible worlds exist (whether one thinks they are abstract or concrete). Concrete modal realism is the view that they exist and are concrete objects. It is this latter, more controversial thesis that Lewis is famous for defending.

Lewis’s argument for concrete modal realism has two main parts. The first part consists in arguing for the ‘realist’ part of concrete modal realism, thereby providing reasons against the alternative of taking the modal operators as primitive. His argument for this consists in showing what possible worlds are good for: he highlights four things that can be done, or can more easily be done, if possible worlds are available. The first concerns certain modal locutions of natural language (English) that do not appear to be translatable into sentences with just the box and diamond. One such locution involves modal comparisons. The example Lewis gives is: “a red thing could resemble an orange thing more closely than a red thing could resemble a blue thing” (1986b: 13). Lewis’s analysis involves quantification over possible individuals:

For some x and y (x is red and y is orange and for all u and v (if u is red and v is blue, then x resembles y more than u resembles v)). (1986b: 13)

But, he points out, one would not be able to translate the original sentence with just boxes and diamonds, since “formulas [of modal logic] get evaluated relative to a world, which leaves no room for cross-world comparisons” (1986b: 13). A realist about modality like Lewis, according to whom possible worlds, including the things in them, are as real as our own world and the things in it, is able to make these cross-world comparisons, and thus do justice to modal locutions of natural language that the modal primitivist cannot. He points out that this problem extends past natural language and into philosophical quasi-technical language. The basic idea behind supervenience, the philosophical workhorse of Lewis’s day, used to formulate various theses about dependence, is that the Fs supervene on the Gs if and only if there could be no difference in the Fs without a difference in the Gs. But, he notes (1986b: 14 ff.), attempts to capture this basic notion strictly in terms of the modal operators have failed, either resulting in something too weak or too strong.
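
To illustrate the kind of purely modal-operator definition at issue, here is one standard attempt from the literature (a ‘strong supervenience’ formulation of the sort associated with Jaegwon Kim, not Lewis’s own wording), where F ranges over the supervening properties and G over the base properties:

\Box \forall x \forall F [Fx \rightarrow \exists G (Gx \land \Box \forall y (Gy \rightarrow Fy))]

Whether a formulation like this is too weak or too strong for a given dependence thesis is precisely the sort of difficulty Lewis is pointing to.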

The other jobs that Lewis thinks possible worlds can do are briefly outlined as follows. The second job is that talk of possible worlds allows us to make sense of the idea that some possibilities are closer to actuality than others (for example, Hillary Clinton’s having won the 2016 election is a closer possibility to actuality than her being in command of a colonial expedition to the Andromeda galaxy). Such comparisons are useful in making sense of counterfactual claims, that is, claims of the form ‘if it were the case that p then it would be the case that q’. Discussion of Lewis’s account of counterfactuals, and the role possible worlds play in it, occurs in section 7. The third job Lewis thinks that possible worlds can do is that they provide us with the resources to formulate what he takes to be the best theory of mental content, that is, the best theory about what our thoughts are about. He thinks such a theory will construe such contents as sets of possibilities, that is, as sets of possible worlds or possible individuals. The fourth job is that Lewis thinks that sets of possible individuals can play the role of properties, a discussion of which occurs in detail in the next section (section 3). One who takes the modal operators as primitive will not be able to accomplish these things—at least not as easily. This is already clear in the case of jobs three and four; a primitivist about modality will simply not have the worlds and individuals hanging around which they can collect up into sets to act as properties or the contents of our thoughts. While some may balk at some of the consequences of modal realism (such as that there exist infinitely many talking donkeys in other possible worlds, in virtue of it being possible that infinitely many talking donkeys exist), Lewis thinks that these theoretical benefits nonetheless provide reason to prefer modal realism to the primitivist alternative.

The second part of Lewis’s argument for concrete modal realism consists in arguing for the ‘concrete’ component of the view, and comprises a number of arguments against various forms of modal realism which regard possible worlds as abstract entities of one sort or another—what he calls ‘ersatz realism’. Often, Lewis’s strategy is to argue that concrete modal realism does a better job solving certain problems as compared to these ersatzist alternatives. These arguments can be found in chapter three of Plurality. Just one example, conveniently connected to issues already discussed, is Lewis’s first argument against what he calls ‘linguistic ersatzism’, the view, already introduced, that possible worlds are maximal consistent sets of sentences. Lewis’s complaint is that linguistic ersatzism is committed to a primitive conception of modality—something which Lewis has already argued against, and something to which his own view is not similarly committed. Lewis provides two reasons to think linguistic ersatzism is committed to primitive modality, of which only the first is discussed here. The notion of consistency, in part in terms of which the linguistic ersatzist characterizes possible worlds, appears to be a modal notion: “a set of sentences is consistent iff those sentences, as interpreted, could all be true together” (1986b: 151 ital. orig.). Since Lewis’s own view is not committed to primitive modality, he is able to give a complete analysis of modality in terms of his particular brand of possible worlds, while the linguistic ersatzist is not.

Lewis’s view about modality is distinctive not only in that he takes possible worlds to be concrete. It is also distinctive in the way it analyzes possibility and necessity claims about individuals. Consider possibility claims. One might think that, for something to possibly be some way, there is a possible world at which that very thing is that way. So, for example, one might think that, for it to be true that Hubert Humphrey could have won the 1968 United States presidential election, there is a possible world at which Humphrey—the very same person who lost the 1968 election in the actual world—won the 1968 election. This is a very natural way to think about the analysis of possibility claims. The thesis that objects exist in more than one possible world is known as ‘transworld identity’. When worlds are taken to be concrete, transworld identity amounts to the claim that worlds share constituents, and, for this reason, Lewis calls it ‘(concrete) modal realism with overlap’. It is typically understood as the idea that a thing in this world which could have been qualitatively different from how it actually is itself inhabits another possible world as well, in which it is qualitatively different. Instead of taking this approach, Lewis elects to reject any overlap among possible worlds, and to analyze possibility and necessity claims about individuals in terms of counterparts. In particular:

\Box Fa =_{df} for every possible world w and every counterpart a′ of a at w, Fa′ is true at w.

\Diamond Fa =_{df} for some possible world w and some counterpart a′ of a at w, Fa′ is true at w.

Lewis’s analysis of modality in terms of counterparts is known as ‘counterpart theory’. His complete view about modality, then, is what could be called ‘concrete modal realism with counterpart theory’.
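
To see how the analysis runs in a simple case (a toy illustration, not Lewis’s own example): suppose that Humphrey, who lost the 1968 election at the actual world, has a counterpart h′ at some world w1, and that h′ won the election at w1. Then, by the second definition above, ‘Humphrey could have won the 1968 election’ comes out true: not because Humphrey himself exists at w1 (on Lewis’s view, he does not), but because a counterpart of his exists there and won.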

Lewis discusses counterpart theory in Plurality, Ch. 4, ‘Counterpart Theory and Quantified Modal Logic’ (1968), and ‘Counterparts of Persons and Their Bodies’ (1971). When do x and y stand in the counterpart relation? Lewis thinks an object’s counterparts will track intrinsic similarity to some extent. But the notions come apart. This is mainly because the counterpart relation is context-sensitive. This is connected to a factor that Lewis thinks constitutes an advantage of counterpart theory over concrete modal realism with overlap, namely, that it can help us make sense of variability in our judgments about what properties are essential to an object (1986b: 252–53). Consider a statue of a human being made of clay standing in a grotto. Many are inclined to say that it is essential to the statue that it has the shape it has. Were it another shape (for example, the shape of a horse), it would be a different statue. The lump of clay, however, would have been the same object even if it were shaped differently than it is. One solution to this problem is to say that there are actually two objects in the grotto: the statue, with a certain set of essential properties, and the lump of clay, with a different set. But Lewis took it to be a cost to be saddled with the possibility of multiple objects that occupy exactly the same spatial region. Lewis’s solution was to note that there can be a single object in the grotto but, when we are describing it as a statue (context 1), we are particularly interested in a certain set of the object’s properties, while, when we are thinking of it as a lump of clay (context 2), we are interested in a different set. In context 1, a lump of clay sourced from exactly the same place as the lump in our world will not count as a counterpart of the object in the grotto if it has a different shape. But it will count as a counterpart of the object in the grotto in context 2. This allows Lewis to explain why, in context 1 but not context 2, we are inclined to say that the object has its shape essentially. In every possible world in which the object, described as a statue, has a counterpart, that counterpart will have the same shape that the object actually has.

Lewis’s key arguments against concrete modal realism with overlap appear in chapter four of Plurality. Chief among them is an argument based on what Lewis calls ‘the problem of accidental intrinsics’. If possible worlds share parts (like Humphrey), it is not clear, given modal realism with overlap, how Humphrey could have different intrinsic properties at each world. He presumably does so, since, for at least some of the intrinsic properties he actually has, he could have lacked them, and for at least some of those he actually lacks, he could have had them. Lewis’s example concerns Humphrey’s shape. He actually has five fingers on his left hand. But he could have had six. It will not do, Lewis thinks, for the proponent of overlap to relativize Humphrey’s property instantiation to worlds, saying, for example, that he has five fingers on his left hand relative to the actual world, but that the world relative to which he has six fingers on his left hand is a distinct world. This might work for a tower having different cross-sectional shapes on different levels, Lewis says, for example, being square on the third floor but circular on the fourth. But, he points out, it is only a part of the tower that has the shape at each level. According to modal realism with overlap, the whole of Humphrey exists at each world at which Humphrey exists. Similarly, the relativization strategy might work when Humphrey is honest according to one media source and dishonest according to another. The sources represent Humphrey in different ways. This might work for the ersatzist, whose ersatz individuals merely represent actual objects (as would, for example, a collection of predicates which are sufficient to represent Humphrey and no one else). According to the concrete modal realist, however, possible individuals are individuals, not representations of individuals. Finally, the relativization strategy might work with extrinsic relations like being a father of. A man might be father of Ed and son of Fred, that is, he might be a father relative to Ed but not relative to Fred. But Humphrey’s five-fingeredness concerns his shape, and, as Lewis points out, “If we know what shape is, we know that it is a property, not a relation” (1986b: 204).

Counterpart theory is not without its detractors. Saul Kripke (1980: 45, fn. 13), for example, complains that, on Lewis’s view, possibility claims about an individual are not actually about that individual him-, her-, or itself, but, rather, about one of his, her, or its counterparts. When one says, for example, ‘Humphrey could have won the 1968 election’, the complaint goes, one is not saying something about the Humphrey we are acquainted with—that is, one is not strictly saying something about that very individual who, in our actual world, lost the 1968 election. Instead, one is saying something about an individual that exists in some other possible world, who is similar to our actual Humphrey in certain relevant respects and to sufficient degrees, who won the 1968 election in that world. Lewis is unimpressed with this objection (see, for example, Plurality: 196). He thinks that ‘Humphrey could have won the 1968 election’ is about our Humphrey—the Humphrey in the actual world. Granted, the analysis of this claim involves invoking a distinct entity—one of Humphrey’s counterparts. But it is the actual Humphrey who has the modal property of possibly winning. His counterpart, in contrast, has the non-modal property of winning (simpliciter).

3. Properties

Lewis was a realist about properties. That is, he thought that properties exist. Properties can be intuitively understood as ways that things can be. Beyond that very general conception, disagreement arises. One major point of disagreement is about whether properties are repeatable—whether distinct things that can truly be described as similar in some respect literally share something in common. This sort of property is usually termed a ‘universal’. Those who endorse this view are realists about universals. According to realists, greenness, for example, is a sui generis entity, distinct from any particular green thing, that is had, or instantiated, by each green thing. Realists typically seek to explain the similarity among similar things (such as green things) by appealing to the fact that each instantiates the same universal (so each green thing instantiates greenness). Those who deny the claim that properties are repeatable are nominalists about universals. (This form of nominalism is stricter than that most commonly at issue in the philosophy of mathematics, which denies the existence of all abstract entities, including sets.) Nominalists about universals come in many flavors. David Armstrong (1978a) provides a relatively comprehensive taxonomy of them. Of particular relevance to Lewis’s views on the matter are class nominalists, who identify properties with the sets of the individuals that can be truly described as having them. On such a view, the property of greenness, for example, is identified with the set of green things SG, that is, as that set which contains frogs, grass, the Statue of Liberty, and so forth. To instantiate the property of greenness is just, according to the class nominalist, to belong to the set SG.

Lewis is officially a nominalist. He elected to identify properties with sets, and thus his view was a form of class nominalism. (Lewis had perfectly analogous views about relations.) As such, Lewis’s view faces challenges similar to those class nominalists face. Chief among them is the problem of coextensive properties, which is the concern that class nominalism must identify any properties which have the same extension (that is, apply to the same individuals), whether those properties are intuitively the same or not. The set of those organisms which have hearts, for example, is, as it happens, the same as the set of those which have kidneys. As such, the class nominalist is forced to identify the property of being a cordate with that of being a renate. This seems wrong, however. The former property seems to concern one sort of organ, the latter a completely different sort of organ. These properties seem to be distinct.

Lewis’s solution to this problem is made possible by his views on modality. Lewis identifies each property not with the set of individuals in the actual world to which it can be truly ascribed. Rather, he identifies it with the set of individuals in all possible worlds to which it can be truly ascribed. Due to his views about modality, such individuals exist, and are thus available to be members of sets. The result is a class nominalism that is immune to the aforementioned problem. While it is actually true that every cordate is a renate and vice versa, this is an accident—the result of a long and complex series of events in the evolutionary history of life on Earth. But this history could have unfolded differently. Thus there are possible worlds, according to Lewis, which contain organisms which have hearts but which filter toxins in a different way. And there are worlds which contain organisms which have kidneys but deliver oxygen to cells in a different way. The existence of organisms of either sort ensures that the set of cordates is distinct from the set of renates, and so ensures that these properties are distinct. Of course, one might raise the concern that Lewis’s view has a perfectly analogous problem with properties whose extensions are identical in every possible world, as that of being a triangular polygon and being a trilateral (three-sided) polygon presumably are. For more on this issue, see section 2 of Sophie Allen’s article ‘Properties.’
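
Put in set-theoretic terms (a toy illustration with invented individuals): suppose that at the actual world the cordates and the renates are exactly the organisms a, b, and c, but that some possible world contains an organism d with a heart and no kidneys. Then the property of being a cordate is the set {a, b, c, d, …}, while the property of being a renate is a set that lacks d. Differing in at least one member, the sets, and hence the properties, are distinct.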

So far, Lewis looks to be nothing more than a class nominalist, if a relatively sophisticated one, owing to the tricks he can draw from his concrete modal realist bag. But he recognizes that universals do important philosophical work. He enumerates the jobs that universals can do in ‘New Work for a Theory of Universals’ (1983a). To take just one example, Lewis admits that universals can serve to distinguish laws of nature from mere accidental regularities. Armstrong (1978b and 1983) employs universals in this way in his theory of lawhood. According to Armstrong, what ensures, for example, that:

 (G1)     All uranium spheres are less than one mile in diameter

is a law of nature, while:

(G2)     All gold spheres are less than one mile in diameter

is not, is that (G1) is made true not just by the contingent fact that there are no uranium spheres that are one mile in diameter or larger. It is made true by a certain fact, holding at certain worlds, about the universals being a uranium sphere and being less than one mile in diameter. These universals jointly instantiate a second-order universal (second-order because it relates universals rather than particulars), which relates these two universals in such a way that it guarantees, at any world at which these universals stand in this relationship, that there will never be a uranium sphere with a diameter of one mile or more (since the relationship between the universals will ensure that any such sphere will explode). There is no such fact concerning the universals being a gold sphere and being less than one mile in diameter. What makes (G2) true is a fact that has nothing to do with these universals. Instead, it has to do only with certain historical contingencies about our world that suffice to explain why, in fact, no gold spheres one mile in diameter or larger ever naturally developed or were artificially constructed. With just his properties, Lewis does not have the resources to explain this difference. Lewis’s properties are abundant. Any old collection of things counts as a property. Thus Lewis would have no basis on which to say that the property of being a uranium sphere is related to the property of being less than one mile in diameter in any way that is more (or less) significant than the relation between being a gold sphere and being less than one mile in diameter. He can say that the set-theoretic intersection of each pair is empty, that is, the properties do not share any members (remember, for Lewis, properties are sets). But the similarity of being a uranium sphere and being a gold sphere in this respect would provide him with no basis on which to say that the first figures into a law of nature while the second does not.

Lewis rejects Armstrong’s approach to lawhood (along with his commitment to the existence of universals), and instead characterizes a law as a statement of a regularity that belongs to a suitable deductive system, which (i) is true, (ii) is closed under strict implication (that is, whatever is necessarily implied by any set of statements in the system is also in the system), and (iii) is balanced with respect to simplicity and empirical informativeness. In particular, the system must be as simple as it can be without being informationally too impoverished to do justice to the empirical facts about the world, but, to the extent that it does not sacrifice a sufficient degree of simplicity, it must be as informative as it can be. Nonetheless, Lewis recognizes a problem with his view, and, while he does not need to endorse universals to solve it, he requires something more than his ontology of properties. The problem is that there is a way for a deductive system to meet Lewis’s criteria (i)–(iii) that is clearly undesirable. Suppose we have discovered the best system S for describing the actual world. The way scientists have currently formulated it is rather complicated. But some wiseacre comes up with the idea of introducing a new predicate F into our language and stipulating that F is satisfied by all and only those things at the worlds at which S is true. But suppose further that this wiseacre refuses to provide an analysis of F. S can then be axiomatized with the single axiom ‘\forall x Fx’. This theory is very simple, and it is, in a sense, as informationally enriched as it can be, since it perfectly selects the worlds at which S is true. Nonetheless, the theory is useless to a curious inhabitant of the world. It tells them nothing about what their world is like.

The first step of Lewis’s solution to this problem is to adopt some primitive distinctions among properties. There are those that are perfectly natural, those which are natural to some degree (though not perfectly natural), and those which are unnatural. Lewis (1983a: 346 ff.) imagines that the perfectly natural properties (for example, being made of uranium) will be those properties that would correspond to universals in Armstrong’s sparse metaphysics. Less natural (but still comparatively natural) properties would correspond to families of suitably related universals (for example, being metallic). The spectrum would continue until wholly unnatural, gerrymandered properties are reached (for example, being either the Eiffel Tower or a part of the moon). Lewis notes that admitting universals into one’s ontology can provide the basis for a distinction between more and less natural properties, in the way just gestured at in the comparison with Armstrong’s metaphysics. But he notes that the distinction can be taken to be a primitive one between properties (classes) instead. This is Lewis’s preference; it allows him to avoid realism about universals and thus remain a nominalist. Lewis then solves the problem of the true but useless theory ‘\forall x Fx’ by imposing a further criterion: the most suitable deductive system, which sets the laws apart from the non-laws, is one whose axioms are stated in a way that refers only to perfectly natural properties.

4. Time and Persistence

Lewis’s most well-known writings about time have to do with the persistence of objects. Lewis was a four-dimensionalist. That is, he believed that there exist four-dimensional objects, extended not just in space, but in time as well. Four-dimensionalism is to be contrasted with three-dimensionalism, according to which the only objects which exist are extended in space only (if they are extended at all, that is, so as not to rule out the existence of non-extended points of space). Lewis’s commitment to four-dimensionalism was a result of his endorsement of two theses: (1) unrestricted composition, and (2) eternalism. Unrestricted composition is the thesis that any objects compose some object. So not only do my head, torso, arms, and legs compose an object (me), my head and the near side of the moon compose an object as well. Eternalism is a view about the ontology of time, according to which past, present, and future times, objects, and events are equally real. Eternalism is to be contrasted with presentism, the view that only the present time and present objects and events are real, and with the growing block theory, the view that past and present times, objects, and events are real, but future ones are not. Committing oneself to unrestricted composition and eternalism requires one to countenance four-dimensional objects. Not only do any presently existing objects compose an object, past ones do too. And, crucially, objects which exist at different times compose objects as well, such as the object that is composed of George Washington’s first wig and the sandwich someone just made for lunch.

As strange a view as four-dimensionalism might seem, Lewis has good reasons for adopting it. These reasons concern issues connected to the persistence of objects through time. Lewis is a perdurantist, and as such believes that for an object to persist through an interval of time is for it to perdure, that is, to have proper parts, one of which is wholly present at each moment of that interval. Perdurantism is to be contrasted with endurantism, according to which an object’s persistence through an interval of time amounts to the whole object being wholly present at each moment of that interval. Perdurantism, obviously, requires the truth of four-dimensionalism, at least assuming that some objects do in fact persist through time. This is because any such object must have parts which exist at different times. According to perdurantism (at least Lewis’s version—Theodore Sider develops another version of it in 1996 and 2001), the objects that we refer to with our names and definite descriptions are actually four-dimensional worm-like objects. We are acquainted with them by being acquainted with some of their parts at various times. So, for example, the Taj Mahal is a spacetime worm that extends back to about 1653. I am acquainted with it only insofar as I am acquainted with one of its parts, which extends through time for about two hours, which I toured on November 28, 2015. Even human beings, according to Lewis, are actually spacetime worms. They are not themselves shaped like those objects depicted in anatomy textbooks. Instead, those diagrams depict certain parts of human beings that exist at instants of time.

Lewis’s perdurantism might seem like an odd view, but, he thinks, it solves an important problem. Its competitor endurantism faces a difficulty which Lewis calls the ‘problem of temporary intrinsics’ (1986b: 202–04 and 2002), which is analogous to the problem of accidental intrinsics which faces concrete modal realism with overlap (see the discussion in section 2). Everyone agrees that objects change over time. A person may previously have been standing and currently be sitting. The endurantist must say that the very same object has both the property of standing and the property of sitting. This looks, at least at first glance, to be a contradiction. Endurantists typically say that the contradiction is only apparent, and they explain it away in various ways. But Lewis does not think any of those strategies succeed. One strategy endurantists use is to say that what we thought were properties, instantiated by a single object, are actually relations, instantiated by an object and a time. There is no contradiction involved in one’s both standing and sitting, since one is standing in relation to one (past) time and sitting in relation to another (the present time). But Lewis thinks that if an intrinsic property like shape (that is, a property having only to do with an object, and nothing to do with how it is related to other objects) is anything, it is not a relation (see the Lewis quotation at the end of section 2). Another strategy endurantists use to explain away the apparent contradiction resulting from temporary intrinsics is to adopt presentism. Since only the present is real, the person has the property of sitting. They do not have the property of standing. (They did have the property of standing when that moment was present. But it is present no longer, and thus is not real.) But, Lewis thinks, presentism comes at a high cost. The presentist must reject the idea that a person has a past and (typically) a future as well, since, according to presentism, neither the past nor future exists. Lewis points out that perdurantism solves the problem nicely. There is something that has the property of standing—a part of the person that is wholly present at a certain moment in the past. And there is something that has the property of sitting—a part of the person that is wholly present at the present moment. But there is no contradiction, since these are distinct parts of the person. Lewis’s perdurantist solution appeals to the same consideration which allows us to say that there is no contradiction in my left hand currently being fist-shaped and my right hand currently being open-palmed. They are different parts of me, and so are distinct objects. There is no contradiction in distinct objects having incompatible properties.

5. Humean Supervenience

Lewis believes that everything in the actual world is material. He also defends a thesis he calls ‘Humean supervenience’. Humean supervenience is the thesis that, in Lewis’s words, “all there is to the world is a vast mosaic of local matters of particular fact, just one little thing and another” (1986c: ix). Hume was known for rejecting the idea that there were hidden connections behind conjoined phenomena which necessitate their conjunction. He was not against there being regularities in the world. His objection was to these regularities being explained by necessary connections (such as Armstrong’s second-order states of affairs relating universals—see section 3). Lewis is sympathetic to this view, and also likes the idea that macroscopic phenomena are reducible to certain basic microscopic phenomena. These microscopic phenomena Lewis takes to be just the geometrical arrangement of the world’s spacetime points, and the instantiation of certain perfectly natural properties at each of those points. Lewis takes this to mean that fundamental entities are point-sized, or, perhaps, that the fundamental entities are the spacetime points themselves.
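
Put in the possible-worlds idiom of section 2, the thesis can be stated roughly as follows (a reconstruction, not a quotation from Lewis): for any worlds w and w′ of the same kind as ours, if w and w′ agree on the spatiotemporal arrangement of their points and on which perfectly natural properties are instantiated at each point, then w and w′ agree on everything. There is no difference of any sort between such worlds without a difference somewhere in the mosaic.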

Lewis is willing to admit that other possible worlds sufficiently different from our own might be different in this last respect. In particular, he thinks that it might take more than just the point-wise distribution of instantiations of perfectly natural properties to determine all of the phenomena in the world. Now the scientifically informed reader might object that our current physical theories show that this is not true even at our world. Some of our most promising physical theories, for example, posit spatially extended fields as being among the fundamental constituents of reality, rather than point-like entities. As Daniel Nolan (2005: 29 ff.) and Brian Weatherson (2016: sec. 5) point out, Lewis is concerned more with illustrating the defensibility of this latter thesis than with its truth. It could be regarded as an idealization or simplification, suitable for philosophical purposes, in terms of which Lewis formulates his thesis of Humean supervenience. If it turns out that the fundamental furniture of the world actually consists of spatially extended entities, rather than point-like entities, Lewis will be content to backpedal a bit, and formulate Humean supervenience in a way that is consistent with that, such as, for example, claiming that what is true at a given world is determined by the geometrical arrangement of its spacetime points and by which perfectly natural properties are instantiated at the spacetime regions occupied by the fundamental entities. But, as Lewis suggests in ‘Humean Supervenience Debugged’ (1994a: 474), he expects that, even once we have settled on the nature of the physical world, we will find that the profusion of phenomena at our world can be explained by a comparatively sparse base of simple entities instantiating comparatively basic properties and perhaps also standing in comparatively basic relations.

6. Causation

Lewis is known for his counterfactual analysis of causation. Lewis made significant contributions to the semantics of counterfactuals, which will be discussed in the next section. The following is perhaps the most straightforward way to provide an analysis of causation in terms of counterfactuals, though, as we will see, it is importantly different from Lewis’s account:

x causes y iff x and y occur, and if x had not occurred, then y would not have occurred.

Counterfactual analyses of causation are to be contrasted with productive accounts, according to which x causes y iff x produces some change in properties in y, where the notion of production is typically taken to be primitive. Both sorts of analysis face their own characteristic set of problems. This article discusses only the most well-known problem for the above counterfactual account, the problem of causal preemption (or causal redundancy), since it will help the reader understand why Lewis develops his own counterfactual analysis of causation in the way that he does. Suppose that Alice and Bob are throwing rocks at bottles and Alice throws her rock at one of the bottles and hits it, shattering it. Intuitively, Alice’s throw caused the bottle to shatter. But suppose also that Bob was ready to throw his rock at the same bottle just in case Alice did not throw, and, moreover, he has perfect aim. Thus Bob’s rock would have struck the bottle, causing it to shatter, had Alice not thrown. Due to this fact, the right-hand side of the above counterfactual analysis of causation is not satisfied in this case. It is not the case that, had Alice not thrown, the bottle would not have shattered. This is because, given the way the case was set up, Bob’s throw would have ensured that the bottle would shatter. Yet, intuitively, Alice’s throw caused the bottle to shatter. Something seems to be wrong with the above counterfactual analysis of causation.

In order to avoid this problem, in ‘Causation’ (1973a), Lewis distinguishes between causation and causal dependence. The above analysis is actually the analysis Lewis provides of causal dependence. He defines causation in terms of chains of causal dependence (where a chain might, but typically will not, have only two nodes). So, for example, if y causally depends on x, and z causally depends on y, then x causes z, even though z might still have occurred had x not occurred. Lewis thinks there is independent motivation for this move, as he thinks there are often cases in which it is natural to say that x causes z even when z does not counterfactually depend on x. Lewis explains this by positing a chain of causal dependence. In general, counterfactual dependence is not transitive. The light would not have come on if I had not flicked the switch. I would not have flicked the switch if I had been out running errands. But the light may well have come on just then even if I had been out running errands. Another member of my family might have walked into the room and flicked the switch. Lewis deals with cases of causal preemption, like the one involving Alice and Bob, by pointing out that, in such cases, there will nonetheless be a chain of counterfactual (and thus causal) dependence which we can invoke to secure the truth of the causal claims we think are true. Lewis grants that it is not the case that, if Alice had not thrown her rock, then the bottle would not have shattered (since Bob would have thrown his). But, he thinks, this establishes only that the bottle’s shattering does not causally depend on Alice’s throw. Since causes need only be linked by chains of causal dependence to their effects, Lewis can still say that Alice’s throw caused the bottle to shatter. He would note first that:

(CF1) the bottle would not have shattered if Alice’s rock had not been speeding toward it.

This is true because, by the time the rock was speeding toward the bottle, Bob had seen that Alice had thrown her rock, and so had refrained from throwing his own rock. Lewis would note second that:

(CF2) Alice’s rock would not have been speeding toward the bottle if Alice had not thrown it.

This sets up a chain of causal dependence between Alice’s throw and the bottle’s shattering, which is enough, on Lewis’s account, to secure the desired conclusion that Alice’s throw caused the bottle to shatter.
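
The structure of this definition can be made vivid computationally. The following is a minimal sketch, not anything from Lewis’s texts: the event names are invented for the Alice and Bob case, the one-step causal dependences are simply stipulated rather than derived from counterfactuals, and causation is computed as the ancestral (chain) of dependence.

# A minimal sketch of causation as chains of causal dependence.
# The events and dependences below are stipulated by hand; nothing
# here evaluates counterfactuals against a model of the world.

# event -> the events on which it directly causally depends
depends_on = {
    "shattering": {"rock_in_flight"},   # (CF1)
    "rock_in_flight": {"alice_throw"},  # (CF2)
    # Nothing depends on "bob_standing_by": his throw was preempted.
}

def causes(c: str, e: str) -> bool:
    """True iff a chain of causal dependence runs from c to e."""
    frontier = list(depends_on.get(e, ()))
    seen = set(frontier)
    while frontier:
        d = frontier.pop()
        if d == c:
            return True
        for d2 in depends_on.get(d, ()):
            if d2 not in seen:
                seen.add(d2)
                frontier.append(d2)
    return False

print(causes("alice_throw", "shattering"))      # True, via (CF2) then (CF1)
print(causes("bob_standing_by", "shattering"))  # False: no dependence chain

Even though the shattering does not causally depend on Alice’s throw directly, causes returns True for it, because dependence links compose into a chain.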

Lewis’s counterfactual account of causation, as just explicated, still has a problem with preemption. This is the problem of late preemption, in which one causal process is preempted by the effect rather than by an event earlier in the process. So, for example, rather than Bob’s throw being preempted by Alice’s throwing her rock, suppose Bob threw his rock a split second after Alice threw hers, and that his rock did not hit the bottle only because the bottle had shattered a split second before Bob’s rock reached the bottle’s former position. In this case (adapted from Hall 2004), (CF1) would be false, and so Lewis would be unable to set up a chain of counterfactual dependence on which he could base a determination that Alice’s throw caused the bottle to shatter. This problem led Lewis to revise his view significantly in ‘Causation as Influence’ (2000a and 2004), wherein he analyzes causation in terms of the notion of influence. Lewis characterizes influence as follows:

C influences E iff there is a substantial range C1, C2,… of different not-too-distant alterations of C (including the actual alteration of C) and there is a range E1, E2,… of alterations of E, at least some of which differ, such that if C1 had occurred, E1 would have occurred, and if C2 had occurred, E2 would have occurred, and so on. Thus we have a pattern of counterfactual dependence of whether, when, and how on whether, when, and how. (2000a: 190 and 2004: 91)

The precise circumstances in which an event occurs, including the exact time at which it occurs, and the manner in which it occurs, are relevant to whether one event influences another. On this characterization, Alice’s throw influenced the bottle’s shattering, since it made a difference, for example, to the exact manner in which it occurred. Let’s say, for example, that her rock hit the right side of the bottle, and that it shattered to the left. But if she had thrown a bit to the left, the bottle would have shattered towards the right. The same is not true of Bob’s throw. If he had thrown a bit to the left, the bottle still would have shattered in the way that it did, since Alice’s rock would still have hit it in the way that it did. This allows Lewis to say that Alice’s throw caused the bottle to shatter, despite the fact that Bob’s rock was on its way to ensure that it shatters in case Alice’s aim happened to be off.

Another sort of problem that gives Lewis trouble involves absences. It is not clear how Lewis’s view can deal with cases like when an absence of light causes a plant to die. There is no event in terms of which we can formulate any counterfactuals of the form ‘if x had not occurred, then y would not have occurred’ in such cases. Lewis (for example, 2000a, sec. X) deals with absences by admitting that there are some instances of causation in which no event serves as the cause. Instead, he thinks that it is true to say that the absence of light caused the plant to die as long as the right sorts of counterfactuals are true, for example, ‘if there had been more light over the past few weeks, the plant would have survived’.

7. Counterfactuals

Lewis makes use of some of the tools of his theory of modality in his contributions to the literature on the semantics of counterfactuals. A counterfactual is a certain type of conditional. A conditional is a sentence synonymous with one of the form ‘if…, then…’. An indicative conditional is a conditional whose verbs are in the indicative mood, for example:

(1) If Tom is skiing, then he is not in his office.

Other conditionals are in the subjunctive mood, for example:

(2) If Tom were a skiing instructor, then he would be in great shape.

Many of the subjunctive conditionals that we use on a day-to-day basis, such as (2), are counterfactual conditionals, that is, conditionals whose antecedents express statements that are contrary to what is actually the case. (Suppose Tom is in fact an accountant.) The material conditional ‘\rightarrow’ from propositional logic can be used to adequately translate many natural language conditionals. Recall that, since ‘\rightarrow’ is truth-functional, all there is to the meaning of ‘p \rightarrow q’ is its truth conditions as given by its truth table, according to which it is true if either p is false or q is true, and it is false otherwise (that is, when p is true and q is false).

\begin{tabular}{c c | c}
$p$ & $q$ & $p \rightarrow q$ \\
\hline
T & T & T \\
T & F & F \\
F & T & T \\
F & F & T \\
\end{tabular}

But there are many other natural language conditionals which cannot be adequately translated with the material conditional. Counterfactuals form an important class of such conditionals.

Before Lewis, the most well-worked-out accounts of counterfactuals construed them as strict conditionals meeting certain conditions (in particular, see Goodman 1947 and 1955). A strict conditional is just a material conditional that holds of necessity, that is, a statement of the form ‘\Box (p \rightarrow q)’. The simplest strict-conditional account of counterfactuals (which is admittedly simpler than Goodman’s, but will be sufficient to motivate Lewis’s account) analyzes each counterfactual in terms of the corresponding strict conditional, that is,

‘p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q’ is true iff \Box (p \rightarrow q).

(Following Lewis in Counterfactuals (1973b: 1–2), ‘if it had been the case that p then it would have been the case that q’ is abbreviated with ‘p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q’.) This account is inadequate because a strict conditional is like a material conditional insofar as strengthening its antecedent cannot take the entire conditional from being true to being false, whereas this is not so for counterfactuals (see Lewis 1973b: ch. 1, Nolan 2005: 74 ff., and Weatherson 2016: sec. 3.1). Recall from propositional logic that the following inference pattern is valid.

\(
\begin{array}{l}
p \rightarrow q \\
\hline
(p \land r) \rightarrow q
\end{array}
\)

The analogous inference pattern involving the strict conditional is also valid:

\(
\begin{array}{l}
\Box (p \rightarrow q) \\
\hline
\Box [(p \land r) \rightarrow q]
\end{array}
\)

But the analogous inference for the counterfactual conditional is not valid:

\(
\begin{array}{l}
p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q \\
\hline
(p \land r) \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q
\end{array}
\)

Suppose that the counterfactual (2) above is true, and consider the following strengthening of it:

(3) If Tom were a skiing instructor and he always wore a robotic exoskeleton so that he did not ever expend any energy, then he would be in great shape.

(3) appears to be false. If he never expended any energy, he would not be in great shape. But (3) follows from (2) on the strict conditional account because of the validity of the above inference pattern involving the strict conditional. It does not, however, follow on Lewis’s account.

Lewis analyzes counterfactuals in terms of possible worlds, and the basic idea behind his analysis is similar to that of Robert Stalnaker (1968). Stalnaker proposed the following analysis of counterfactuals in terms of the similarity of worlds:

‘p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q’ is true iff the most similar p-world to the actual world is also a q-world, where a p-world is just a world at which p is true.

(Technically this only specifies the truth conditions for counterfactuals that are non-vacuously true, that is, when there is at least one p-world most similar to the actual world. But we can ignore vacuously true counterfactuals.) Lewis has a helpful metaphor which he employs when thinking about the similarity between worlds. He thinks of possible worlds as if they were arranged in a space, with the actual world at the center, and with degrees of similarity to the actual world represented by distance from it: the more similar a world is to the actual world, the closer it is. Counterfactual (2) above, for example, is true, on Stalnaker’s account, because the most similar (closest) world to the actual world at which Tom is a skiing instructor is one at which he is in great shape. A world in which Tom wears a robotic exoskeleton while teaching people to ski (thus keeping him in poor shape) is plausibly less similar to (farther away from) the actual world than one in which he teaches people to ski using his own muscles. (3), however, requires one to look at the closest world at which both Tom is a skiing instructor and Tom wears a robotic exoskeleton. And in that world, plausibly, Tom is not in great shape. It would require even more changes in the actual facts to ensure that Tom would be in great shape in such a world (for example, Tom has taken a pill—the result of a medical breakthrough that has not occurred at the actual world—that keeps his body in great shape even if he does not exercise).

There are important differences between the analysis Lewis ultimately settles on and Stalnaker’s. For one, Lewis rejects Stalnaker’s assumption that there will always be a unique p-world that is most similar to the actual world. As a result, the analysis that Lewis adopts is closer to the following:

‘p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q’ is true iff all p-worlds that are most similar to the actual world are also q-worlds.

Lewis also challenges the tempting assumption that there is a closest “sphere” of p-worlds to the actual world (this is the Limit Assumption—see 1973b: 19 ff.). Without it, counterfactuals are best analyzed as follows:

‘p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q’ is true iff there is a (p \land q)-world that is more similar to the actual world than any (p \land \neg q)-world.

Finally, Lewis questions the tempting assumption that each world is more similar to itself than any other world (1973b: 28 ff.). Making this assumption results in p \land q entailing p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q. So, for instance, ‘Tom is a skiing instructor and Tom is in great shape’ would entail (2). But it would seem odd for this counterfactual to be true if its antecedent were not in fact false. In the end, Lewis sticks with this assumption for technical reasons (cf. Weatherson 2016: sec. 3.2).
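
The analysis just stated can also be made concrete in a small computational sketch (a toy model with invented worlds, facts, and similarity distances; nothing in it comes from Lewis’s text). Each world carries a stipulated distance from the actual world, and a counterfactual is evaluated by checking whether some (p \land q)-world is closer than every (p \land \neg q)-world.

# Toy evaluation of 'p box-arrow q': true iff some (p and q)-world is
# closer to the actual world than any (p and not-q)-world. Worlds,
# facts, and distances are all stipulated for illustration.

worlds = [
    # (name, distance from the actual world, facts true there)
    ("actual", 0.0, {"accountant"}),
    ("w1", 1.0, {"instructor", "great_shape"}),
    ("w2", 2.0, {"instructor", "exoskeleton"}),
]

def counterfactual(p, q):
    p_worlds = [(d, facts) for (_, d, facts) in worlds if p(facts)]
    if not p_worlds:
        return True   # vacuously true: no p-worlds in the model
    pq = [d for (d, facts) in p_worlds if q(facts)]
    p_not_q = [d for (d, facts) in p_worlds if not q(facts)]
    if not pq:
        return False  # every p-world is a not-q-world
    if not p_not_q:
        return True   # every p-world is a q-world
    return min(pq) < min(p_not_q)

instructor = lambda facts: "instructor" in facts
great_shape = lambda facts: "great_shape" in facts
exoskeleton = lambda facts: "exoskeleton" in facts

# (2) comes out true: the closest instructor-world, w1, is a great-shape world.
print(counterfactual(instructor, great_shape))  # True
# (3) comes out false, illustrating the failure of antecedent strengthening:
# the closest instructor-plus-exoskeleton world, w2, is not a great-shape world.
print(counterfactual(lambda f: instructor(f) and exoskeleton(f), great_shape))  # False

Strengthening the antecedent shifts which worlds are relevant, which is why (3) does not follow from (2) on this style of analysis.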

Lewis’s analysis of counterfactuals is not without problems. Kit Fine (1975), for instance, argues that Lewis’s account, as it stands, makes the following counterfactual false, though it is presumably true:

(4) If Nixon had pressed the button, there would have been nuclear war.

It seems that any of the worlds in which Nixon pressed the button that are most similar to the actual world are ones in which there was no nuclear war, but in which instead some relatively minor miracle occurred—some violation of the natural laws of our world, perhaps specific to the exact location of the button and the specific time at which Nixon pressed it—which renders the button momentarily useless. To surmount this problem, Lewis says more about similarity in ‘Counterfactual Dependence and Time’s Arrow’ (1979b). He had already noted that similarity would be context-sensitive in his book Counterfactuals. That is, he had already noted that the “distance” that possible worlds are from the actual world might be different for the same counterfactual when it is uttered in different contexts. If, for example, (2) were uttered in a context in which it had already been established that Tom owned a robotic exoskeleton and was considering using it, the closest worlds to the actual world would include those in which he wore it and thus maintained a poor physique, thus rendering the counterfactual false instead of true. But Lewis says little else about similarity there.

To deal with Fine’s challenge, Lewis outlines a number of rules which one should abide by while measuring similarity given a context:

(1) It is of the first importance to avoid big, widespread diverse violations of law.

(2) It is of the second importance to maximize the spatiotemporal region throughout which a perfect match of particular fact prevails.

(3) It is of the third importance to avoid even small, localized, simple violations of law.

(4)  It is of little or no importance to secure approximate similarity of particular fact, even in matters that concern us greatly. (1979b: 472)

Lewis assumes determinism throughout his discussion. That is, he assumes that everything that occurs is necessitated by the events which occurred earlier together with the laws of nature. Lewis thinks that determinism better explains, in comparison to indeterminism, the fact that counterfactuals which concern events which occur at different times exhibit an asymmetry which encodes the fixedness of the past and the openness of the future (1979b: 460). Given the assumption of determinism, and the assumption that Nixon did not press the button in the actual world, any world in which Nixon did press the button must either (i) be a world in which a small miracle occurred to enable Nixon to press the button despite having the same history as the actual world or (ii) be a world that has a completely different history than our own world, to enable Nixon’s pressing of the button to be necessitated by that history. By Lewis’s rules above, type (i) worlds are more similar to the actual world than type (ii) worlds, since the latter violate the more important rule (2). Type (i) worlds are identical to the actual world up to the point at which Nixon is considering pressing the button. Type (ii) worlds have completely different histories. Type (i) worlds violate only the less important rule (3), since they feature a small miracle. Lewis grants that there will be worlds with the same history as the actual world in which Nixon presses the button but no nuclear war ensues because another miracle causes a malfunction in the button, preventing the warheads from launching. But these worlds will have to involve miracles in addition to the one which enables Nixon to press the button. This is a further violation of rule (3). In contrast, a world in which Nixon presses the button and nuclear war ensues will violate the less important rule (4). As a result, Lewis concludes, the most similar worlds to the actual world are worlds in which Nixon presses the button and nuclear war ensues. Lewis’s account, therefore, makes the above counterfactual (4) true, as it should be.

8. Convention

Lewis’s earliest work is devoted to developing an account of what it is for a group of individuals to use a language. The lion’s share of his work on this issue can be found in his first book, Convention (1969) (see also ‘Languages and Language’ (1975)). Lewis makes use of the notion of a convention in his analysis of language use, and a significant part of the importance of this book is due to the account of conventions that he offers. Conventions about language use are by no means the only ones around. It is, for example, a convention in the United States to drive on the right-hand side of the road. An initial picture of convention that one might have is one of convention as the result of agreement. That is, one might think that a convention among some individuals is the result of an agreement they make with one another. However, individuals appear able to make an agreement only in a language. Thus one cannot give an analysis of what it is for a group of individuals to speak a language in terms of convention, understood in terms of agreement, since it would be circular; it would presuppose that these individuals speak a language (cf. Weatherson 2016: sec. 2). Lewis’s analysis of conventions avoids this problem.

What motivates the implementation of conventions are coordination problems. Roughly, a coordination problem is a problem facing two or more people where the best outcome for each person can result only from the coordination of their actions. Suppose, for example, that each member of a group of people is trying to decide which side of the road to drive on. Consider one such individual, Carol. Carol might have her own basic unconditional preference about which side to drive on. She might, for instance, prefer to drive on the right-hand side of the road because the steering wheel of her car is situated on the right-hand side, and she would like to place herself as far from oncoming traffic as possible. Still, she has a conditional preference concerning driving on the left-hand side of the road. She would prefer to drive on the left-hand side of the road on the condition that everyone else drives on the left-hand side of the road. This is rooted in Carol’s desire to minimize the chances she is hit by oncoming traffic. We can suppose that everyone (or at least almost everyone) in the group has conditional preferences of this sort: each prefers to drive on the left (right) side of the road on the condition that everyone else drives on the left (right) side of the road. Notice that there are two ways to solve these individuals’ coordination problem: (1) they might adopt the convention that everyone drive on the left side of the road, and (2) they might adopt the convention that everyone drive on the right side of the road. When everyone in the group settles on one of these options, what results is a coordination equilibrium.
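
The structure of Carol’s situation can be displayed in a standard game-theoretic payoff matrix (an illustration in the spirit of Lewis (1969), with made-up payoffs; higher numbers are preferred, and Carol’s payoff is listed first):

\begin{tabular}{c | c c}
 & others drive left & others drive right \\
\hline
Carol drives left & 1, 1 & 0, 0 \\
Carol drives right & 0, 0 & 1, 1 \\
\end{tabular}

Both (left, left) and (right, right) are coordination equilibria: given what everyone else is doing, no driver does better by unilaterally switching sides.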

It is important to note that there is more than one equilibrium which the members of the group can adopt to create the best outcome for all of them. It is in such circumstances that a convention must be adopted. In other words, some coordination problems will have only a single solution, in which case there is no need for a convention. People will act in such a way just because it creates the best outcome for them (and for everyone else). Suppose, for example, that there is a group of farmers that sell a certain product, say, coffee, to a population. We can suppose that there is a certain price p below which each farmer will fail to make an adequate profit on each item, which would ultimately drive them out of business. And we can suppose that there is a certain price p′ above which consumers will forgo the product, substituting it with another less expensive product, like chicory or tea, available from others, or changing their habits altogether to eliminate a bitter morning drink from their diet. Assuming that p′ > p, we can expect these farmers (each of whom, we are supposing, is acting in her own self-interest) to offer their product somewhere within the price range bounded by p and p′. This outcome is not the result of the adoption of a convention among these farmers. It is instead a result of each farmer acting in her own self-interest, of there being only one way for each farmer to achieve the best outcome for herself, and of her accurately observing the character of her market. Solving other coordination problems, however, such as the question of which side of the road everyone should drive on, requires a convention, since there are two possible ways to achieve the best outcome for everyone involved.

Of course, everyone in Carol’s group could get together and have a vote to decide which side of the road everyone in their group should drive on, in effect making an explicit agreement with one another. Perhaps the majority of car owners have an unconditioned preference like Carol’s, and prefer, for whatever reason, to drive on the right-hand side of the road. In this case, the result will be that everyone agrees to drive on the right-hand side of the road. But, importantly, agreement is not the only way to establish a convention (1969: 33–34). It might be that, as a matter of pure chance, the first handful of people on the road with their cars happened to share Carol’s unconditioned preference to drive on the right, and this effectively forced the latecomers to drive on the right in order to avoid the preexisting oncoming traffic.

In the spirit of the above considerations, Lewis ultimately settles on the following analysis of a convention:

A regularity R in the behavior of members of a population P when they are agents in a recurrent situation S is a convention if and only if it is true that, and it is common knowledge in P that, in almost any instance of S among members of P,

        1. almost everyone conforms to R;
        2. almost everyone expects almost everyone else to conform to R;
        3. almost everyone has approximately the same preferences regarding all possible combinations of actions;
        4. almost everyone prefers that any one conform to R, on condition that almost everyone conform to R;
        5. almost everyone would prefer that any one conform to R′, on condition that almost everyone conform to R′,

where R′ is some possible regularity in the behavior of members of P in S, such that no one in almost any instance of S among members of P could conform both to R and to R′. (1969: 78)

One aspect of this analysis worth noting immediately is its tolerance for a certain number of exceptions (embodied by the consistent appearance of occurrences of ‘almost’). This is to prevent the analysis from failing to count as a convention what we would think should be counted as one. Of course, from time to time, there are, unfortunately, those who drive on the wrong side of the road. But these isolated incidents should not preclude the existence of a convention in the population to which these individuals belong, even if it did not come about as a result of an agreement. Suppose that the convention to drive on the right side of the road in Carol’s group arose by chance as described above, with all later drivers conforming to the preference of the first few drivers to drive on the right-hand side of the road. After weeks of this, we would not expect a single individual driving a single time on the left side of the road, for whatever reason (whether the result of negligence or an intentional act of rebellion), to prevent the regularity that had emerged in the behavior of drivers in the group from being a convention. The convention is still there. It is just that this individual has failed, on this occasion, to act in accordance with it.
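Lewis’s definition lends itself to a simple formal gloss. The following Python sketch is a toy formalization, not anything found in Convention (1969): it models a population as a list of agents, approximates ‘almost everyone’ with an arbitrary 90 percent threshold, and idealizes condition (3) as exact agreement among preference profiles. All of the names in it are invented for illustration.

    # A toy formalization of Lewis's five conditions (1969: 78). The
    # population model and the 90 percent threshold are assumptions made
    # for illustration; they are not Lewis's.

    THRESHOLD = 0.9  # a stand-in for "almost everyone"

    def almost_all(agents, predicate):
        """True if at least THRESHOLD of the agents satisfy the predicate."""
        return sum(1 for a in agents if predicate(a)) / len(agents) >= THRESHOLD

    def is_convention(agents):
        """Check toy versions of conditions (1)-(5) for a regularity R,
        with R' as the alternative regularity. Condition (3) is idealized
        here as exact agreement among the agents' preference profiles."""
        profiles = {tuple(sorted(a["preferences"].items())) for a in agents}
        return (
            almost_all(agents, lambda a: a["conforms_to_R"])            # (1)
            and almost_all(agents, lambda a: a["expects_conformity"])   # (2)
            and len(profiles) == 1                                      # (3)
            and almost_all(agents, lambda a: a["prefers_R_given_R"])    # (4)
            and almost_all(agents, lambda a: a["prefers_Rp_given_Rp"])  # (5)
        )

    # One hundred drivers; R is "drive on the right", R' is "drive on the
    # left". Agent 0 is today's lone rebel who drives on the left.
    drivers = [
        {
            "conforms_to_R": i != 0,
            "expects_conformity": True,
            "preferences": {"right_given_right": True, "left_given_left": True},
            "prefers_R_given_R": True,
            "prefers_Rp_given_Rp": True,
        }
        for i in range(100)
    ]

    print(is_convention(drivers))  # True: one rebel does not destroy the convention

As the final line illustrates, a single non-conforming driver leaves the check satisfied, which matches the tolerance for exceptions that the occurrences of ‘almost’ are meant to secure.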

Another thing worth noting about Lewis’s analysis of convention is that, by ‘common knowledge that p’, Lewis does not require that p be true (1969: 52 ff.). Instead, it is enough that everyone has reason to believe that p, everyone has reason to believe that everyone has reason to believe that p, and so on. Whether or not anyone in fact believes that p, or in fact believes that everyone has reason to believe that p, and so on, is inconsequential to the analysis. This is why Lewis must specify separately that it is true that conditions (1)–(5) hold. Lewis adopts this characterization of common knowledge because he does not want to require, effectively, that, for a convention to hold, everyone believes that it holds. While he expects many people to be adept enough reasoners that they will come to believe the things they have reason to believe, he wants to allow for exceptions—individuals who never explicitly represent to themselves all of the various conditions which must hold for a convention to be present. But the presence of such individuals, of course, should not prevent a convention from being present (1969: 60 ff.).

Conditions (1) and (2) of Lewis’s analysis of convention are relatively straightforward, and they have been discussed above. Condition (4) is relatively straightforward as well. It requires, for example, that the vast majority of Carol’s group prefers that everyone in the group drives on the right-hand side of the road on the condition that almost everyone drives on the right-hand side of the road. If a substantial portion of the population did not desire that a convention be observed, the convention could easily collapse at any time, even if almost everyone had been observing it up to that time.  This sort of situation is often exactly what is present just before a convention is abandoned. Consider public order—the tendency for people in many societies to act in an orderly and organized way while out in public. It is not implausible to say that public order is a convention which exists in these societies. And when it does, it is often, at least in part, the result of people wanting to live in a peaceful and orderly environment. But a sufficient number of grievances can develop within a population to the point where their preference for those grievances to be addressed trumps their preference for a peaceful and orderly environment. In such circumstances, the convention of public order can disappear. Condition (5) is what distinguishes conventions from cases where only one coordination equilibrium is possible, as in the example with the farmers selling their coffee. In that case, there existed no other regularity in the behavior of the farmers other than selling their coffee in the price range between p and p′ that would have resulted in the best outcome for each of them.

Condition (3) is a bit trickier to understand. It is connected to formal issues of game theory—particularly to the question of whether a coordination equilibrium is possible. The basic idea behind it can be illustrated with an example. For simplicity, suppose that Carol and Diane are the only people in the group. There are four possible combinations of actions in the coordination problem of which side of the road to drive on:

(a)  Carol drives on the left and Diane drives on the left.

(b)  Carol drives on the left and Diane drives on the right.

(c)  Carol drives on the right and Diane drives on the left.

(d)  Carol drives on the right and Diane drives on the right.

And there are, in principle, twenty-four (4!) possible ways for each of Carol and Diane to order these actions according to her preference. By adopting condition (3), Lewis aims to ensure that there is enough agreement between the preferences of Carol and Diane to make a coordination equilibrium possible. If, for example, Carol prefers (d) to (a), and (a) to either (b) or (c), then an equilibrium will be unreachable if, for example, Diane prefers either of (b) or (c) to either of (a) or (d). (This is in part because Diane represents a significant portion of the group.)
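The point can be made concrete with a bit of elementary game theory. The following sketch is illustrative only: the payoff numbers are invented, and the function checks for pure-strategy equilibria in the ordinary game-theoretic sense (Lewis’s coordination equilibria are a slightly stronger notion, though the two coincide in this game).

    from itertools import permutations, product

    # The four action combinations (a)-(d): (Carol's side, Diane's side).
    outcomes = [("L", "L"), ("L", "R"), ("R", "L"), ("R", "R")]

    # There are 4! = 24 ways to order four outcomes by preference:
    print(len(list(permutations(outcomes))))  # 24

    # Invented payoffs: coordinating is good, colliding is bad; Carol
    # mildly prefers the right-hand side (her steering wheel is on the right).
    carol = {("L", "L"): 2, ("L", "R"): 0, ("R", "L"): 0, ("R", "R"): 3}
    diane = {("L", "L"): 2, ("L", "R"): 0, ("R", "L"): 0, ("R", "R"): 2}

    def equilibria(u_carol, u_diane):
        """Pure-strategy equilibria: no player gains by deviating unilaterally."""
        eqs = []
        for c, d in product("LR", repeat=2):
            if (u_carol[(c, d)] >= max(u_carol[(x, d)] for x in "LR")
                    and u_diane[(c, d)] >= max(u_diane[(c, y)] for y in "LR")):
                eqs.append((c, d))
        return eqs

    print(equilibria(carol, diane))  # [('L', 'L'), ('R', 'R')]

Because the game has two equilibria rather than one, self-interest alone does not settle which will be realized; that is precisely the kind of case in which, on Lewis’s account, a convention is needed.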

Now that Lewis’s analysis of convention has been introduced, one can appreciate how he employs it in his account of what it is for a group of individuals to speak a language. Lewis provides an in-depth discussion of what he takes a language to be (1969: 160 ff.). But it should be noted that, for Lewis, a language is not just a collection of basic vocabulary items (a lexicon) and a set of rules for arranging them into more complex elements of the language, including sentences of arbitrary complexity (a grammar). It also includes an interpretation, that is, a function which assigns to each sentence of the language a set of conditions under which that sentence is true (and false). (Technically, the function assigns truth conditions to each possible utterance of each sentence, since Lewis wants to accommodate the possibility of ambiguous sentences, which are standard features of natural languages. Lewis makes allowance for imperative sentences as well, which are “true” just in case they are obeyed.) So, a language that is just like English except that ‘p or q’ is true iff p is true and q is true and ‘p and q’ is true iff p is true or q is true would not be English, but some other language. Though it consists of the same basic vocabulary items and grammar as English, and thus the same sentences, it supplies interpretations of some of those sentences that are different from those that English supplies. In particular, it switches the truth conditions of ‘and’ and ‘or’ in English. As a result of this conception of languages, a sentence can only be true or false in a language. Another language could also have that same sentence as one of its elements, but it could supply different truth conditions for it.
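A toy model may make this vivid. The following sketch is a loose illustration, not Lewis’s own formalism: two ‘languages’ share the same sentences (the same lexicon and mini-grammar, invented here for the example) but differ in the interpretations they assign to ‘and’ and ‘or’, so the same sentence can be true in one language and false in the other.

    # Two toy languages with the same sentences but different
    # interpretations. The mini-grammar (splitting on ' and ' / ' or ')
    # is an invented simplification.

    def interpret_english(sentence, facts):
        """Evaluate a sentence with the usual truth conditions for 'and'/'or'."""
        if " and " in sentence:
            p, q = sentence.split(" and ")
            return facts[p] and facts[q]
        if " or " in sentence:
            p, q = sentence.split(" or ")
            return facts[p] or facts[q]
        return facts[sentence]

    def interpret_twisted(sentence, facts):
        """Same sentences, but with the truth conditions of 'and'/'or' swapped."""
        if " and " in sentence:
            p, q = sentence.split(" and ")
            return facts[p] or facts[q]   # 'and' now behaves like 'or'
        if " or " in sentence:
            p, q = sentence.split(" or ")
            return facts[p] and facts[q]  # 'or' now behaves like 'and'
        return facts[sentence]

    facts = {"it rains": True, "it snows": False}
    sentence = "it rains or it snows"
    print(interpret_english(sentence, facts))  # True in the English-like language
    print(interpret_twisted(sentence, facts))  # False in the twisted language

On Lewis’s conception, these count as two different languages, despite containing exactly the same sentences.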

For Lewis, what it is for a population P to use a language L is for there to be a convention in P of truthfulness in L, that is, a convention whereby almost all individuals almost always utter sentences of L only if they believe them to be true (1969: 177, cf. 1975: 7). That is, it is true that, and common knowledge in P that, in almost any instance of verbal communication among members of P:

    1. almost everyone is truthful in L;
    2. almost everyone expects almost everyone else to be truthful in L;
    3. almost everyone has approximately the same preferences regarding all possible combinations of utterances of L;
    4. almost everyone prefers that any one person is truthful in L, given that everyone else is truthful in L; and
    5. there is some other possible language L′ which almost everyone would prefer that any one be truthful in, on condition that almost everyone is truthful in L′.

But Lewis is careful to note that a person must occasionally use or respond appropriately to utterances of sentences of L in order to be a member of a population that uses L. If, at some point, she stops using and responding appropriately to such utterances, she will eventually not belong to any population that uses L (1969: 178).

9. Mind

There are two major respects in which Lewis contributes to the philosophy of mind. The first concerns his theory of mind, which is a version of the identity theory. The second is his theory of mental content, that is, an account of the contents of certain mental states like what is believed when one has a belief, and what is desired when one has a desire. This article discusses only the former (aside from the brief discussion of the latter included in section 2). As indicated in section 4, Lewis is a materialist insofar as he believes that everything in the actual world is material. As a result, he rejects idealism, that is, the view that everything is mental, and dualism, the view that there are fundamentally two different types of entity, mental and physical. Thus, he is a physicalist, and, as mentioned above, an identity theorist. He is a type-type identity theorist, and as such, identifies each type of mental state (each type of experience we can have) with a type of neurophysiological state. So, for example, for Lewis, pain is identical to, say, c-fiber firing. (C-fibers are nerve fibers in the human central nervous system, activation of which is responsible for certain types of pain.) Such views are typically contrasted with token-token identity theories, which say only that each token mental state is identical to some token physical state. A token-token identity theorist will reject the rather general identity between pain and c-fiber firing, though they will recognize an identity between, say, the specific token of pain that Ronald Reagan felt when he was struck by John Hinckley Jr.’s bullet on March 30, 1981 and the appropriate token neurophysiological event which occurred in Reagan’s brain and which was caused by his nerves firing as a result of the bullet strike.

Lewis’s commitment to his theory of mind can be found in his earliest published work, in ‘An Argument for the Identity Theory’ (1966). Given the title, the reader will not be surprised that his main argument for it can be found there too. He argues that because mental states are defined in terms of their causal roles, being caused by certain stimuli and causing certain behaviors, and because every physical phenomenon’s occurrence can be explained by appeal only to physical phenomena, the phenomena to which we appeal to explain our behaviors, which are usually rendered in the vocabulary of folk psychology (for example, Alice felt/believed x, so she did y), must themselves be physical phenomena. Folk psychology is the largely unscientific theory that each of us uses in order to explain and predict the behavior of others, by appealing to such things as pleasure, pain, beliefs, and desires. We are using folk psychology, for example, when we say that Alice screamed because she was in pain.

Concerning his first premise, Lewis thinks that, for instance, pain is defined by a set of pairs of causal inputs and behavioral outputs that is characteristic of it alone. That set might include, for example, the causal input of a live electrode being put into contact with a human being, and the behavioral output of that human being vocalizing loudly. If this sounds behaviorist, that is because the view has its roots in behaviorism. But, unlike the behaviorist, Lewis does not think that that is all there is to say about mentality. He thinks that each mental state must still be a physical entity. While each is definable in terms of causal roles, each is a neurophysiological state. Furthermore, Lewis thinks that the mental concepts afforded to us by folk psychology pick out real mental states—at least for the most part. Thus Lewis expects that, by and large at least, each mental state that is part of our folk psychological theory will be definable in terms of a unique set of causal inputs and outputs. This sets Lewis (and other reductionists about the mind) apart from eliminativists, who expect no such accuracy in our folk psychological theory, and, indeed, often argue against its adequacy (as in, for example, Churchland 1981).

Lewis’s second premise is that the physical world is explanatorily closed. For any (explicable) physical phenomenon, there are some phenomena in terms of which it can be explained that are themselves physical. (Lewis leaves room for physical phenomena that have no explanations because they depend on chance, such as why a particular atom of uranium-235 decayed at a particular time t.) What is important for Lewis’s project is that this means we will never have to appeal to any non-physical (read: mental) entity in order to explain any physical phenomenon. And, because the causes and effects in the characteristic set that defines any given mental state are always physical (things like the placement of live electrodes and vocalizations), we will never need to invoke mental phenomena in order to explain any of these phenomena. We will be able to find some physical phenomena in terms of which to do so.

Very often, token-token identity theorists are role functionalists, who identify each type of mental state with a type of functional role. This role can, in principle, be realized by more than one type of physical state. And hence each type of mental state can, in principle, be realized by more than one type of physical state. But, according to role functionalists, a mental state itself is not identical to any physical state. So, for example, a role functionalist might identify pain with the functional state of bodily damage detection. That functional state is (we are supposing) realized in humans by c-fiber firings. As a result, pain is realized in humans by c-fiber firings. But it is something more abstract than just c-fiber firings; it is just whatever plays the role of bodily damage detection. It just so happens that what plays that role in humans is (we are supposing) c-fiber firings. Lewis was not a role functionalist. As stated, he identified each type of mental state with some type of physical state. So he identified pain with c-fiber firings, rather than saying that the former is realized by the latter.
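The contrast between role functionalism and Lewis’s view is sometimes glossed with a software analogy, and the following sketch spells that analogy out. It is only an analogy, not anything in Lewis’s texts: the functional role is modeled as an abstract interface, and its realizers as concrete implementations of it.

    # A software analogy (an illustration, not Lewis's own machinery):
    # role functionalists identify pain with the role (the interface
    # below); Lewis identifies pain with the realizer (a concrete class).

    from abc import ABC, abstractmethod

    class BodilyDamageDetector(ABC):
        """The functional role: caused by damage, causes avoidance behavior."""
        @abstractmethod
        def respond_to_damage(self):
            ...

    class CFiberFiring(BodilyDamageDetector):
        """The (supposed) human realizer of the role."""
        def respond_to_damage(self):
            return "vocalize loudly and withdraw"

    class MartianHydraulics(BodilyDamageDetector):
        """A hypothetical non-human realizer of the same role."""
        def respond_to_damage(self):
            return "inflate hydraulic cavities and withdraw"

    # The role functionalist's 'pain' is the interface BodilyDamageDetector,
    # so both realizers count as pain. Lewis's 'pain' (as applied to humans)
    # is CFiberFiring itself, which is why, on his view, states realized
    # differently come out as distinct states associated with similar roles.
    print(CFiberFiring().respond_to_damage())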

This opens Lewis’s view up to the problem of the multiple realizability of the mental. This is the idea that human beings (or, more generally, organisms in which the role of bodily damage detection is played by c-fibers) are presumably not the only sorts of creatures that can be in pain. There may be animals on earth which lack c-fibers but which, when subjected to an electric shock, behave in the sort of way human beings behave, vocalizing loudly, moving away from the source of the shock, and so on. And even if there are not, we can imagine beings, perhaps Martians, that meet these conditions. What of them? Presumably, they can be in pain. But if they do not have c-fibers, then Lewis is forced to say that they, in fact, cannot be in pain.

In ‘Mad Pain and Martian Pain’ (1980a), Lewis deals with this problem by essentially biting the bullet. He recognizes that there will be distinct mental states associated with similar causal roles like human pain, jellyfish pain, Martian pain, and so forth. But he does not think this is too big a bullet to bite. The debate is, ultimately, just one about which state—realizer or role—we refer to when we use our folk psychological terminology to refer to mental states (such as ‘pleasure’, ‘pain’, ‘belief’, ‘desire’, and so on). But Lewis also thinks there is good reason to prefer his view. Remember that he identifies mental states by their causal roles. Pain is whatever both is caused by certain sorts of stimuli (electric shocks, pricks with a needle, and so forth) and causes certain sorts of behavior (vocalizing loudly, moving away from the stimulus, and so forth). But an abstract functional role is not apt to play this causal role. There must be something physical that does so—that is actually involved in the push-and-pull of each causal chain of physical events. On Lewis’s account, according to which each type of mental state is a type of physical state, and in which each token mental state is a token physical state, there is always a physical state to play the needed causal role, and, moreover, to play that role while keeping the world at large completely material. One cannot help but appreciate how neatly this reply is connected to the argument he originally gives for his identity theory in his 1966 paper.

Another problem Lewis addresses in ‘Mad Pain and Martian Pain’ is, in a certain sense, the reverse of the problem of the multiple realizability of the mental. He calls this ‘the problem of mad pain.’ The basic idea is that it is possible for there to be individual human beings (and as such, individuals we want to count as being capable of being in human pain) who lack the behavioral outputs that are typically associated with certain environmental inputs among humans, or who have atypical behavioral outputs associated with certain environmental inputs. So, for example, when subjected to an electric shock, rather than screaming or moving away from its source, such an individual might sigh, relax her posture, and smile pleasantly. And when eating a piece of cake, she might scream and move away from it. Call such an individual a madman.

Even as early as his 1966 paper, Lewis is careful to characterize the characteristic causal role of a mental state as a set of typical associated environmental stimuli and behaviors (1966: 19–20). So the existence of a madman here or there does not cause problems for Lewis’s view. But, of course, one immediately wonders relative to what group these stimuli and behaviors are typically associated. He says, of the group relative to which we should characterize ‘pain’:

Perhaps (1) it should be us; after all, it’s our concept and our word. On the other hand, if it’s X we’re talking about, perhaps (2) it should be a population that X himself belongs to, and (3) it should preferably be one in which X is not exceptional. Either way, (4) an appropriate population should be a natural kind—a species, perhaps. (1980a: 219–20)

In the case of representative individuals of a population, all four criteria pull together. In the case of the Martian, criterion (1) is outweighed by the other three (whether the characteristic set for pain in Martians is exactly the same as it is in humans or whether there are some differences between them). And in the case of the madman, it is criterion (3) that is outweighed by the other three. There will be certain cases with which Lewis’s account will have difficulties, to be sure. If a lightning strike hits a swamp and produces a one-off creature that is a member of no population apart from that consisting of just itself, Lewis’s account would provide no direction about how to regard a set of associated stimuli and behaviors which are correlated in the creature. That is, it would not tell us which mental state the set is associated with. But Lewis is prepared to live with such difficult cases, as he thinks our intuitions would not be reliable in such a situation anyway. As a result, he thinks that the fact that his theory provides no definitive answers in such cases is not a drawback of it, but, in fact, is in line with our pre-theoretic estimation of such cases.

A final issue worth mentioning is qualia—the subjective nature of an experience, for example, what it feels like to be in the sort of pain caused by a live electrode being put into contact with one’s left thumb. Identity theorists, and physicalists in general, often face the problem of qualia, that is, the allegation that their theory cannot make sense of the idea that there is something that it feels like to be in a particular mental state. One of the most famous statements of this problem is by Frank Jackson, in his paper, ‘Epiphenomenal Qualia’ (1982). He asks us to consider an individual, Mary, who has spent her entire life in a black and white room, never seeing any color other than black and white. Nonetheless, she has devoted herself to learning everything she can about color from (black and white) textbooks, television programs, and so forth, and is, at this point, perfectly knowledgeable about the subject. We can suppose she knows every piece of physical information there is to know about electromagnetism, optics, physiology, neuroscience, and so forth, that is related to color and its perception. Jackson then asks us to imagine that one day, Mary steps outside for the first time, and sees a red rose. He maintains that she learns something upon doing so that she did not know before, namely, what it is like to see red. Thus, Jackson concludes, not all information is physical information. This poses a problem for the physicalist because, according to the physicalist, this should not be possible. There is nothing to know about color and its perception outside of the complete collection of physical information associated with color and its perception.

Lewis’s response to the qualia problem can be found in his Postscript to ‘Mad Pain and Martian Pain’ (1983b: 130–32), ‘What Experience Teaches’ (1988c), ‘Reduction of Mind’ (1994b), and ‘Should a Materialist Believe in Qualia?’ (1995). He credits it to Laurence Nemirow (1979, 1980, and 1990), and, in short, it is the idea that when Mary exits the room and sees a rose, she does not learn a new piece of information; instead, she gains new abilities. In particular, she gains the abilities to make certain comparisons and to imagine certain sorts of objects, abilities she lacked before. Now that she has seen the rose, she can go further out into the world and distinguish between things that are the same color as the rose and those which are not. And she can imagine what a red car would look like, even if she has not seen one. These are things she was not able to do before. But such know-how is not propositional knowledge, in the sense that it is not something that can be expressed by a sentence of a language.

10. Other Work and Legacy

There are numerous aspects of Lewis’s work which this article has not discussed. He has influential views about the nature of dispositions, a discussion of which can be found in ‘Finkish Dispositions’ (1997b). He writes on free will in ‘Are We Free to Break the Laws?’ (1981a). And his discussions of his theory of mental content can be found in, for example, ‘Attitudes De Dicto and De Se’ (1979a) and ‘Reduction of Mind’ (1994b: 421 ff.). In addition to metaphysics, the philosophy of language, and the philosophy of mind, Lewis contributed to other subfields, including epistemology and philosophy of mathematics. The reader can find what Lewis has to say about knowledge in ‘Elusive Knowledge’ (1996b). His main focus in the philosophy of mathematics is on squaring his materialistic commitments with his liberal use of set theory (in, for example, his theory of properties). After all, sets are, prima facie, abstract objects. Lewis’s strategy is to provide an analysis of set theory in mereological terms, letting the parthood relation do much of the work that the membership relation does in set theory. On his account, the subclasses of a class are parts of it, and a class is the mereological fusion of the singletons of its members. With this idea in place, Lewis is able to make sense of much set-theoretic talk in terms of parthood relationships. The interested reader can find discussions of this issue in his book Parts of Classes (1991) and his articles ‘Nominalistic Set Theory’ (1970c) and ‘Mathematics is Megethology’ (1993b).

Lewis discusses central issues in the philosophy of religion, including the ontological argument in ‘Anselm and Actuality’ (1970a), and the problem of evil in ‘Evil for Freedom’s Sake’ (1993a) and the posthumous ‘Divine Evil’ (2007). In the philosophy of science, he discusses inter-theoretic reduction in ‘How to Define Theoretical Terms’ (1970b) and verificationism in ‘Statements Partly About Observation’ (1988b). Lewis also writes extensively on chance and probabilistic reasoning in, for example, ‘Prisoners’ Dilemma Is a Newcomb Problem’ (1979c), ‘A Subjectivist’s Guide to Objective Chance’ (1980b), ‘Causal Decision Theory’ (1981b), ‘Why Ain’cha Rich?’ (1981c), ‘Probabilities of Conditionals and Conditional Probabilities’ (1976a), ‘Probabilities of Conditionals and Conditional Probabilities II’ (1986d), ‘Humean Supervenience Debugged’ (1994a), and ‘Why Conditionalize?’ (1999b). And he discusses certain issues that fall at the intersection of probabilistic and practical reasoning in ‘Desire as Belief’ (1988a) and ‘Desire as Belief II’ (1996a).

Lewis makes contributions to deontic logic, a formal modal logic whose operators are interpreted to mean ‘it is obligatory that’ and ‘it is permissible that’, in, for example, ‘Semantic Analyses for Dyadic Deontic Logic’ (1974). Lewis also has well-developed views about ethics, metaethics, and applied ethics. In ‘Dispositional Theories of Value’ (1989d), Lewis develops a materialism-friendly theory of value in terms of things’ dispositions to affect us in appropriate ways (or to generate appropriate attitudes in us) in ideal conditions. These attitudes are certain (intrinsic, as opposed to instrumental) second-order desires. That is, one values something only if one desires to desire it. As a result, Lewis is officially a subjectivist about value. But he thinks (or at least hopes) that there is enough commonality among moral agents that a more-or-less fixed set of values can be discerned. Lewis does not develop a systematic ethical system. But he delivers critiques of consequentialist ethical theories (according to which what makes an action right or wrong is determined by the nature of its consequences) like utilitarianism (according to which what makes an action right/wrong is that it maximizes/fails to maximize the benefit to the largest number of people). See, for example, ‘Reply to McMichael’ (1978), ‘Devil’s Bargains and the Real World’ (1984), and Plurality (1986b: 128). One general constraint Lewis does make explicit about his positive view is that an ethical theory should be compatible with there being multiple, potentially conflicting, moral values. Similarly, he thinks it might be impossible to provide a binary evaluation of someone’s character as good or bad, overall. It might be that we can only point to respects in which an individual has good or bad character. Nolan (2005: 189) takes it to be likely that Lewis’s positive ethical theory, to the extent it can be discerned in his writings, is a version of virtue ethics, and thus that he bases the rightness or wrongness of a particular act on whether a moral agent with appropriate virtues and in appropriate circumstances would perform it (see, for example, Lewis 1986b: 127). Lewis focuses on several issues in applied ethics, including punishment in ‘The Punishment that Leaves Something to Chance’ (1987) and ‘Do We Believe in Penal Substitution?’ (1997a), tolerance in ‘Academic Appointments: Why Ignore the Advantage of Being Right?’ (1989a) and ‘Mill and Milquetoast’ (1989c), and nuclear deterrence in ‘Devil’s Bargains and the Real World’ (1984), ‘Buy Like a MADman, Use Like a NUT’ (1986a), and ‘Finite Counterforce’ (1989b).

Truly, then, Lewis’s contributions to philosophy range much more widely than his best-known work. It is difficult to summarize Lewis’s legacy. He makes important contributions to understanding probability and probabilistic reasoning, and his work on conditionals—counterfactuals in particular—can only be described as foundational. His work on causation is very important as well. In particular, his move from a simpler counterfactual analysis of causation to one invoking the notion of influence is reflected in more recent interventionist accounts of causation, according to which a cause of an event E is something the manipulation of which (for example, slightly changing the time at which it occurs or the manner in which it occurs) would modify E. And, as Woodward (2016, sec. 9) notes, interventionist accounts are ultimately counterfactual accounts, and so they are also in this way indebted to Lewis’s earlier work on causation as well as to his work on counterfactuals. While dualism about the mind is much more popular in the first two decades of the twenty-first century than in Lewis’s day, his argument for his identity theory, which appeals to the explanatory closure of the physical world, is an important foil for the dualists who emerged in the 1980s and 90s. And his and Nemirow’s response to the problem of qualia was also a position those dualists had to address.

Lewis’s discussion of time and perdurance in Plurality generated a large debate in that area, and to a great extent set its parameters. Recall (see section 4) that he sets out three ways of solving the problem of temporary intrinsics: regarding intrinsic properties like shape to be relations to times, presentism, and his own worm theory. A lot of work was done exploring the tenability of each of these options, and exploring other nearby options. In addition, Lewis’s paper ‘The Paradoxes of Time Travel’ (1976b) is arguably responsible for an entire sub-literature on that topic.

Lewis’s metaphysics is, by and large, nominalist. But realism about universals is much more popular today than it was in the mid-20th century. As nominalistic as his views are, Lewis makes important moves away from the ideas which formed the environment in which his philosophical development took place. Quine, of course, believed that there is “no entity without identity” (for example, 1969: 23). What he intended by this is that we must have clear identity conditions for any entity whose existence we posit. This is one of the reasons why Quine was happy to recognize the existence of sets, which are individuated extensionally, that is, according to which members they have, but was skeptical of such things as properties. Lewis makes properties extensional by identifying them with sets, but goes a step further by allowing their extensions to range across all possibilia, rather than just actual entities. Lewis then goes even further in conceding, in ‘New Work for a Theory of Universals’ (1983a), that universals can do things which properties, as conceived by Lewis, cannot do. His basic distinction between properties which are perfectly natural and those which are not is rather anti-nominalistic, and this position can be understood as a bridge connecting the Quinean extensional picture of the world with the new hyperintensional picture of it, which allows for distinctions amongst entities, such as properties or propositions, that are not only extensionally equivalent, in that they apply to the same things or are all true or false at the actual world, but even intensionally equivalent, that is, that do so or are so at every possible world. Examples are the properties, mentioned in section 3, of being a triangular polygon and being a trilateral (three-sided) polygon. Sider (2011) generalizes Lewis’s idea from properties, which are the worldly correlates of predicates, to other sorts of entities, including the worldly correlates of predicate modifiers, sentential connectives, and quantifiers. He ends up with a very general notion of joint-carvingness, which is a feature of certain of our linguistic expressions, and he uses the notion to characterize the notion of fundamentality, as Lewis does with naturalness (for Lewis, the perfectly natural properties are the fundamental properties, all other properties being definable in terms of them—see, for example, 1994a: 474). It is hard to say exactly what the philosophical world today would be like without Lewis. But we can be sure that it would be very different from the way it is.

11. References and Further Reading

Note: Many of the papers below have been reprinted, sometimes with postscripts, in one of the collections Lewis 1983b, 1986c, 1998, 1999a, and 2000b; below, only the first appearance is cited.

a. Primary Sources

  • Lewis, David K. 1966. An Argument for the Identity Theory. Journal of Philosophy 63, 17–25.
  • Lewis, David K. 1968. Counterpart Theory and Quantified Modal Logic. Journal of Philosophy 65, 113–26.
  • Lewis, David K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
  • Lewis, David K. 1970a. Anselm and Actuality. Noûs 4, 175–88.
  • Lewis, David K. 1970b. How to Define Theoretical Terms. Journal of Philosophy 67, 427–46.
  • Lewis, David K. 1970c. Nominalistic Set Theory. Noûs 4, 225–40. Reprinted in Lewis 1998, 186–202.
  • Lewis, David K. 1971. Counterparts of Persons and Their Bodies. Journal of Philosophy 68, 203–11.
  • Lewis, David K. 1973a. Causation. Journal of Philosophy 70, 556–67.
  • Lewis, David K. 1973b. Counterfactuals. Oxford: Blackwell.
  • Lewis, David K. 1974. Semantic Analyses for Dyadic Deontic Logic. In Sören Stenlund (ed.), Logical Theory and Semantic Analysis: Essays Dedicated to Stig Kanger on His Fiftieth Birthday. Dordrecht: Reidel.
  • Lewis, David K. 1975. Languages and Language. In Keith Gunderson (ed.), Minnesota Studies in the Philosophy of Science. University of Minnesota Press, 3–35.
  • Lewis, David K. 1976a. Probabilities of Conditionals and Conditional Probabilities. Philosophical Review 85, 297–315.
  • Lewis, David K. 1976b. The Paradoxes of Time Travel. American Philosophical Quarterly 13, 145–52.
  • Lewis, David K. 1978. Reply to McMichael. Analysis 38, 85–86.
  • Lewis, David K. 1979a. Attitudes De Dicto and De Se. The Philosophical Review 88, 513–43.
  • Lewis, David K. 1979b. Counterfactual Dependence and Time’s Arrow. Noûs 13, 455–76.
  • Lewis, David K. 1979c. Prisoners’ Dilemma Is a Newcomb Problem. Philosophy and Public Affairs 8, 235–40.
  • Lewis, David K. 1980a. Mad Pain and Martian Pain. In Ned Block (ed.), Readings in Philosophy of Psychology, Vol. 1. Cambridge, MA: Harvard University Press, 216–22.
  • Lewis, David K. 1980b. A Subjectivist’s Guide to Objective Chance. In Richard C. Jeffrey (ed.), Studies in Inductive Logic and Probability, Vol. II. Berkeley, CA: University of California Press, 263–93.
  • Lewis, David K. 1981a. Are We Free to Break the Laws? Theoria 47, 113–21.
  • Lewis, David K. 1981b. Causal Decision Theory. Australasian Journal of Philosophy 59, 5–30.
  • Lewis, David K. 1981c. Why Ain’cha Rich? Noûs 15, 377–80.
  • Lewis, David K. 1983a. New Work for a Theory of Universals. Australasian Journal of Philosophy 61, 343–77.
  • Lewis, David K. 1983b. Philosophical Papers, Vol. I. Oxford: Oxford University Press.
  • Lewis, David K. 1984. Devil’s Bargains and the Real World. In Douglas MacLean (ed.), The Security Gamble: Deterrence in the Nuclear Age. Totowa, NJ: Rowman and Allenheld, 141–154.
  • Lewis, David K. 1986a. Buy Like a MADman, Use Like a NUT. QQ 6: 5–8.
  • Lewis, David K. 1986b. On the Plurality of Worlds. Oxford: Blackwell.
  • Lewis, David K. 1986c. Philosophical Papers, Vol. II. Oxford: Oxford University Press.
  • Lewis, David K. 1986d. Probabilities of Conditionals and Conditional Probabilities II. Philosophical Review 95, 581–89.
  • Lewis, David K. 1987. The Punishment that Leaves Something to Chance. In Proceedings of the Russellian Society (University of Sydney) 12, 81–97. Also in Philosophy and Public Affairs 18, 53–67.
  • Lewis, David K. 1988a. Desire as Belief. Mind 97, 323–32.
  • Lewis, David K. 1988b. Statements Partly About Observation. Philosophical Papers 17, 1–31.
  • Lewis, David K. 1988c. What Experience Teaches. Proceedings of the Russellian Society (University of Sydney) 13, 29–57.
  • Lewis, David K. 1989a. Academic Appointments: Why Ignore the Advantage of Being Right? In Ormond Papers, Ormond College, University of Melbourne.
  • Lewis, David K. 1989b. Finite Counterforce. In Henry Shue (ed.), Nuclear Deterrence and Moral Restraint. Cambridge: Cambridge University Press, 51–114.
  • Lewis, David K. 1989c. Mill and Milquetoast. Australasian Journal of Philosophy 67, 152–71.
  • Lewis, David K. 1989d. Dispositional Theories of Value. Proceedings of the Aristotelian Society, Supplementary Volume 63, 113–37.
  • Lewis, David K. 1991. Parts of Classes. Oxford: Blackwell.
  • Lewis, David K. 1993a. Evil for Freedom’s Sake. Philosophical Papers 22, 149–72.
  • Lewis, David K. 1993b. Mathematics is Megethology. Philosophia Mathematica 3, 3–23.
  • Lewis, David K. 1994a. Humean Supervenience Debugged. Mind 103, 473–90.
  • Lewis, David K. 1994b. Reduction of Mind. In Samuel Guttenplan (ed.), A Companion to the Philosophy of Mind. Oxford: Blackwell, 412–31.
  • Lewis, David K. 1995. Should a Materialist Believe in Qualia? Australasian Journal of Philosophy 73, 140–44.
  • Lewis, David K. 1996a. Desire as Belief II. Mind 105, 303–13.
  • Lewis, David K. 1996b. Elusive Knowledge. Australasian Journal of Philosophy 74, 549–67.
  • Lewis, David K. 1997a. Do We Believe in Penal Substitution? Philosophical Papers 26, 203–09.
  • Lewis, David K. 1997b. Finkish Dispositions. The Philosophical Quarterly 47, 143–58.
  • Lewis, David K. 1998. Papers in Philosophical Logic. Cambridge: Cambridge University Press.
  • Lewis, David K. 1999a. Papers on Metaphysics and Epistemology. Cambridge: Cambridge University Press.
  • Lewis, David K. 1999b. Why Conditionalize? In Lewis 1999a. (Written in 1972.)
  • Lewis, David K. 2000a. Causation as Influence. Journal of Philosophy 97, 182–97.
  • Lewis, David K. 2000b. Papers in Ethics and Social Philosophy. Cambridge: Cambridge University Press.
  • Lewis, David K. 2002. Tensing the Copula. Mind 111, 1–13.
  • Lewis, David K. 2004. Causation as Influence (extended version). In John Collins, Ned Hall, and L. A. Paul (eds),  Causation and Counterfactuals. Cambridge, MA: MIT Press, 75–106.
  • Lewis, David K. 2007. Divine Evil. In Louise M. Antony (ed.), Philosophers without Gods: Meditations on Atheism and the Secular Life. Oxford: Oxford University Press.

b. Secondary Sources

  • Armstrong, David M. 1978a. Universals and Scientific Realism, Vol. I: Nominalism and Realism. Cambridge: Cambridge University Press.
  • Armstrong, David M. 1978b. Universals and Scientific Realism, Vol. II: A Theory of Universals. Cambridge: Cambridge University Press.
  • Armstrong, David M. 1983. What Is a Law of Nature? Cambridge: Cambridge University Press.
  • Churchland, Paul 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy 78, 67–90.
  • Fine, Kit 1975. Critical Notice of Counterfactuals. Mind 84, 451–58.
  • Goodman, Nelson 1947. The Problem of Counterfactual Conditionals. Journal of Philosophy 44, 113–28.
  • Goodman, Nelson 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press.
  • Hall, Ned. 2004. Two Concepts of Causation. In John Collins, Ned Hall, and L.A. Paul (eds), Causation and Counterfactuals. Cambridge, MA: The MIT Press, 225–76.
  • Jackson, Frank 1982. Epiphenomenal Qualia. Philosophical Quarterly 32, 127–36.
  • Kripke, Saul A. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
  • Nemirow, Laurence 1979. Functionalism and the Subjective Quality of Experience. Doctoral Dissertation, Stanford University.
  • Nemirow, Laurence 1980. Review of Thomas Nagel, Mortal Questions. Philosophical Review 89, 475–76.
  • Nemirow, Laurence 1990. Physicalism and the Cognitive Role of Acquaintance. In William G. Lycan (ed.), Mind and Cognition. Oxford: Blackwell.
  • Nolan, Daniel 2005. David Lewis. Chesham: Acumen.
  • Quine, William Van Orman. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Sider, Theodore. 1996. All the World’s a Stage. Australasian Journal of Philosophy 74, 433–53.
  • Sider, Theodore. 2001. Four-Dimensionalism: An Ontology of Persistence and Time. Oxford: Oxford University Press.
  • Sider, Theodore. 2011. Writing the Book of the World. Oxford: Oxford University Press.
  • Stalnaker, Robert C. 1968. A Theory of Conditionals. In Nicolas Rescher (ed.), Studies in Logical Theory, American Philosophical Quarterly Monograph Series, Vol. 2. Oxford: Blackwell, 98–112.
  • van Inwagen, Peter. 1990. Material Beings. Ithaca, NY: Cornell University Press.
  • Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.
  • Woodward, James. 2016. Causation and Manipulability. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.

c. Further Reading

  • Nolan, Daniel 2005. David Lewis. Chesham: Acumen.
  • Jackson, Frank and Graham Priest. 2004. Lewisian Themes: The Philosophy of David K. Lewis. Oxford: Oxford University Press.
  • Loewer, Barry and Jonathan Schaffer. 2015. A Companion to David Lewis. Oxford: Blackwell.
  • Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.

 

Author Information

Scott Dixon
Email: ts.dixon@ashoka.edu.in
Ashoka University
India

Meaning and Context-Sensitivity

Truth-conditional semantics explains meaning in terms of truth-conditions. The meaning of a sentence is given by the conditions that must obtain in order for the sentence to be true. The meaning of a word is given by its contribution to the truth-conditions of the sentences in which it occurs.

What a speaker says by the utterance of a sentence depends on the meaning of the uttered sentence. Call what a speaker says by the utterance of a sentence the content of the utterance. Natural languages contain many words whose contribution to the content of utterances varies depending on the contexts in which they are uttered. The typical example of words of this kind is the pronoun ‘I’. Utterances of the sentence ‘I am hungry’ change their contents depending on who the speaker is. If John is speaking, the content of his utterance is that John is hungry, but if Mary is speaking, the content of her utterance is that Mary is hungry.

The words whose contribution to the contents of utterances depends on the context in which the words are uttered are called context-sensitive. Their meanings guide speakers in using language in particular contexts to express particular contents.

This article presents the main theories in philosophy of language that address context-sensitivity. Section 1 presents the orthodox view in truth-conditional semantics. Section 2 presents linguistic pragmatism, also known as ‘contextualism’, which comprises a family of theories that converge on the claim that the orthodox view is inadequate to account for the complexity of the relations between meanings and contexts. Sections 3 and 4 present indexicalism and minimalism, which from two different perspectives try to resist the objections raised by linguistic pragmatism against the orthodox view. Section 5 presents relativism, which provides a newer conceptualization of the relations between meanings and contexts.

Table of Contents

  1. The Orthodox View in Truth-Conditional Semantics
    1. Context-Sensitive Expressions and the Basic Set
    2. Following Kaplan: Indexes and Characters
    3. Context-Sensitivity and Saturation
    4. Grice on What is Said and the Syntactic Constraint
    5. Semantic Contents of Utterances
  2. Departing from the Orthodox View: Linguistic Pragmatism
    1. Underdetermination of Semantic Contents
    2. Completions and Expansions
    3. Saturation and Modulation
    4. Core Ideas and Differences among Linguistic Pragmatists
  3. Defending the Orthodox View: Indexicalism
    1. Extending Indexicality and Polysemy
    2. Two Objections to Linguistic Pragmatism: Overgeneration and Normativity
    3. Hidden Variables and the Binding Argument
    4. Objections to the Binding Argument
  4. Defending the Autonomy of Semantics: Minimalism
    1. Distinguishing Semantic Content from Speech Act Content
    2. Rebutting the Arguments for Linguistic Pragmatism
    3. Motivation and Tenets of Minimalism
    4. Testing Context-Sensitivity
  5. Defending Invariant Semantic Contents: Relativism
    1. Indexicality, Context-Sensitivity, and Assessment-Sensitivity
    2. The Intelligibility of Assessment-Sensitivity
    3. Faultless Disagreement
  6. References and Further Reading
    1. References
    2. Further Reading

1. The Orthodox View in Truth-Conditional Semantics

a. Context-Sensitive Expressions and the Basic Set

The orthodox view in truth-conditional semantics maintains that the content (proposition, truth-condition) of an utterance of a sentence is the result of assigning contents, or semantic values, to the elements of the sentence uttered in accord with their meanings and combining them in accord with the syntactic structure of the sentence. The content of the utterance is determined by the conventional meanings of the words that occur in the sentence.

Conventional meanings are divided into two kinds. Meanings of the first kind determine semantic values that remain constant in all contexts of utterance. Meanings of the second kind provide guidance for the speaker to exploit information from the context of utterance to express semantic values. Linguistic expressions governed by meanings of the second kind are context-sensitive and can be used to express different semantic values in different contexts of utterance. The following is a list of some context-sensitive expressions (Donaldson and Lepore 2012):

Personal pronouns: I, you, she

Demonstratives: this, that

Adjectives: present, local, foreign

Adverbs: here, now, today

Nouns: enemy, foreigner, native

Cappelen and Lepore (2005) call the set of expressions that exhibit context-sensitivity in their conventional meaning the Basic Set. Compare the following pair of utterances:

(1) I am hungry (uttered by John).

(2) John is hungry (uttered by Mary).

Utterance (1) and utterance (2) have the same truth-conditional content. Both are true if and only if John is hungry. Yet, the sentence ‘I am hungry’ and the sentence ‘John is hungry’ have different meanings. The meaning of the first-person pronoun ‘I’ prescribes that only the speaker can utter it to refer to herself. Only John can utter the sentence ‘I am hungry’ to say that John is hungry. In a context where the speaker is not John, the sentence ‘I am hungry’ cannot be uttered to say that John is hungry. The meaning of the proper name ‘John’, instead, allows for speakers in different contexts of utterance to refer to John. In all contexts of utterance the sentence ‘John is hungry’ can be uttered to say that John is hungry.

b. Following Kaplan: Indexes and Characters

Since David Kaplan’s works (1989a, 1989b), formal semantics has treated the conventional meaning of a word as a function from an index, which represents features of the context of utterance, to a semantic value. The features of the context of utterance include who is speaking, when, where, the object the speaker refers to with a demonstrative, and the possible world where the utterance takes place. Adopting Kaplan’s terminology, philosophers call the function from indexes to semantic values character. The semantic values of the words in a sentence relative to an index are composed into a function that distributes truth-values over points of evaluation: pairs of possible worlds and times. The formal semantic machinery determines the condition under which a sentence relative to a given index is true at a world and a time. For example, John’s utterance (1) is represented as the pair formed of the sentence ‘I am hungry’ and an index i that contains John as speaker. The semantic machinery determines the truth-condition of this pair so that the sentence ‘I am hungry’ at the index i is true at a possible world w and a time t if and only if the speaker of i is hungry in w at t; that is, if and only if John is hungry in w at time t. If Mary uttered the sentence ‘I am hungry’, another index i* with Mary as speaker would be needed to represent her utterance. The semantic machinery would ascribe to the sentence ‘I am hungry’ at the index i* the content that is true at a possible world w and a time t if and only if Mary is hungry in w at time t.

In formal semantics, then, context-sensitive meanings are characters that vary depending on the indexes that represent features of the contexts of utterance, where indexes are tuples of slots, or parameters, to be filled in in order for sentences at indexes to have a truth-conditional content. The meanings of context-insensitive expressions, instead, are characters that remain constant in all indexes. For example, the meaning of the proper name ‘John’ is a constant character that returns John as semantic value in all indexes. No matter who is speaking, when, or where, John is the semantic value of the proper name ‘John’, and the sentence ‘John is hungry’, relative to all indexes, is true at a world w and time t if and only if John is hungry in w at time t.

It is convenient here to introduce an aspect that will be relevant to section 5. Since the indexes that are used to represent features of contexts of utterance contain possible worlds and times, the semantic machinery distributes unrelativised truth-values to index-sentence pairs. A sentence S at index i is true (simpliciter) if and only if S is true at the possible world and the time of the index i itself; see Predelli (2005: 22). For example, if John utters the sentence ‘I am hungry’ at noon on 23 November 2019, the index that represents the features of John’s context of utterance contains the time noon on 23 November 2019 and the actual world. John’s utterance is true (simpliciter) if and only if John is hungry at noon on 23 November 2019 in our actual world.
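The two-stage machinery described in this subsection can be rendered as a small program. The following sketch is a loose illustration under simplifying assumptions (worlds and times are plain strings, and the ‘hungry’ table is an invented stand-in for the facts): a character maps an index to a content, a content maps a world-time point of evaluation to a truth-value, and truth simpliciter evaluates the content at the index’s own world and time.

    from dataclasses import dataclass

    @dataclass
    class Index:
        """A toy index: a tuple of the contextual parameters in the text."""
        speaker: str
        time: str
        world: str

    # An invented model of the facts: who is hungry at which world and time.
    hungry = {("w_actual", "noon"): {"John"}}

    def character_I_am_hungry(i):
        """Variable character: the content depends on who speaks at index i."""
        return lambda w, t: i.speaker in hungry.get((w, t), set())

    def character_John_is_hungry(i):
        """Constant character: the same content at every index."""
        return lambda w, t: "John" in hungry.get((w, t), set())

    def true_simpliciter(character, i):
        """A sentence at index i is true iff its content holds at i's own
        world and time (cf. Predelli 2005: 22)."""
        return character(i)(i.world, i.time)

    i_john = Index(speaker="John", time="noon", world="w_actual")
    i_mary = Index(speaker="Mary", time="noon", world="w_actual")

    print(true_simpliciter(character_I_am_hungry, i_john))     # True
    print(true_simpliciter(character_I_am_hungry, i_mary))     # False
    print(true_simpliciter(character_John_is_hungry, i_mary))  # True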

c. Context-Sensitivity and Saturation

The orthodox truth-conditional view in semantics draws the distinction between the meaning of an expression type and the content of an utterance of the expression. The meaning of the expression type is the linguistic rule that governs the use of the expression. Context-insensitive expressions are governed by linguistic rules that determine their contents (semantic values), which remain invariant in all contexts of utterance. Context-sensitive expressions, instead, are governed by linguistic rules that prescribe how the speaker can use them to express contents in contexts of utterance.

The meanings of context-sensitive expressions specify what kinds of contextual factors play certain roles with respect to utterances. More precisely, the meanings of context-sensitive expressions fix the parameters that have to be filled in in order for utterances to have contents. Philosophers and linguists use the technical term saturation for what the speaker does by filling in the demanded parameters with values taken from contextual factors. Indexicals are typical examples of context-sensitive expressions. For example, the meaning of the pronoun ‘I’ establishes that an utterance of it refers to the agent that produces it. The meaning of the demonstrative ‘that’ establishes that an utterance of it refers to the object that plays the role of demonstratum in the context of utterance. Thus, the meaning of ‘I’ demands that the speaker fill in an individual, typically herself, as the value of the parameter speaker of the utterance. And the meaning of ‘that’ demands that the speaker fill in a particular object she has in mind as the value of the parameter demonstratum.

In formal semantics the parameters that are filled in with values are represented with indexes, and the meanings of expressions are functions—characters—from indexes to contents. The meanings of context-insensitive expressions are constant characters, while the meanings of context-sensitive expressions are variable characters. If a sentence contains no context-sensitive expressions, it can be uttered to express the same content in all contexts of utterance. On the contrary, if a sentence contains context-sensitive expressions, it can be used to express different contents in different contexts of utterances.

d. Grice on What is Said and the Syntactic Constraint

One of the main tenets of the orthodox truth-conditional view is that all context-sensitivity is linguistically triggered in sentences or in their logical forms. The presence of each component of the truth-conditional content of an utterance of a sentence must be licensed by a linguistic element occurring in the uttered sentence or in its logical form. For this reason, some philosophers equate the distinction between the meanings of expression types and the contents of utterances with Paul Grice’s (1989) distinction between sentence meaning and what is said by an utterance of a sentence. The sentence meaning is given by the composition of the meanings of the words that occur in the sentence. What is said corresponds to the truth-conditional content that the speaker expresses by undertaking the processes of disambiguation, reference assignment, and saturation that are required by her linguistic and communicative intentions and by the meanings of the uttered words.

Grice held that what is said is part of the speaker’s meaning. The speaker’s meaning is the content that the speaker intends to communicate by an utterance of a sentence. In Grice’s view, the speaker’s meaning comprises two parts: what is said and what is implicated. What is said is the content that the speaker explicitly and directly communicates by the utterance of a sentence. What is implicated is the content the speaker intends to convey indirectly. Grice called the contents that are indirectly conveyed implicatures. Implicatures can be inferred from what is said and general principles governing communication: the cooperative principle and its maxims. To illustrate Grice’s distinctions, suppose that at a party A, pointing to Bob and speaking to B, utters the following sentence:

(3) That guest is thirsty.

Following Grice, the utterance of (3) can be analysed at three distinct levels. (i) The level of sentence meaning is given by the linguistic conventions that govern the use of the words in the sentence. Due to linguistic competence alone, the hearer B understands that A’s utterance is true if and only if the individual, to whom A refers with the complex demonstrative ‘that guest’, is thirsty. (ii) The second level is given by what A says, that is, the truth-conditional content A’s utterance expresses. What is said—the content of A’s utterance—is that Bob is thirsty. To understand this content, B must consider A’s expressive and communicative intentions. B must understand that A has Bob in mind and wants to refer to him. To do so, B needs to rely on his pragmatic competence and contextual information. Mere linguistic competence is not enough. (iii) Finally, there is the level of what is meant through a conversational implicature. A intends that B offer Bob some champagne. Grice’s idea is that to understand what A intended to communicate, B must first understand the content of what A said—that Bob is thirsty—and then understand the implicature that it would be nice to offer Bob some champagne.

One very important aspect of Grice’s view is that each element that enters the content of what is said corresponds to some linguistic expression in the sentence. Grice maintained that what is said is “closely related to conventional meanings of words” (1989: 25). Grice imposed a syntactic constraint on what is said, according to which each element of what is said must correspond to an element of the sentence uttered. Carston (2009) speaks of the ‘Isomorphism Principle’, which states that if an utterance of a sentence S expresses the propositional content P, then the constituents of P must correspond to the semantic values of some constituents of S or of its logical form.
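
Applied to the earlier example (3), the Isomorphism Principle requires that each constituent of the proposition expressed correspond to some constituent of the sentence: roughly, Bob is contributed by the complex demonstrative ‘that guest’, and the property of being thirsty is contributed by the predicate ‘is thirsty’. No constituent of what is said lacks a syntactic trigger.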

e. Semantic Contents of Utterances

Some philosophers reject the equation of the notion of content of an utterance with Grice’s notion of what is said. For example, Korta and Perry (2007) maintain that the content of an utterance is determined by the conventional meanings of the words the speaker utters and by the fact that the speaker undertakes all the semantic burdens that are demanded by those meanings, in particular disambiguation, reference assignment, and saturation of context-sensitive expressions. Korta and Perry call the content of an utterance so determined locutionary content (see also Bach 2001) and argue that there are clear cases in which the locutionary content does not coincide with Grice’s what is said, which is always part of what the speaker intends to communicate, that is, the speaker’s meaning. Irony is a typical example of this distinction. When, pointing to X, a speaker utters the sentence:

(4) He is a fine friend

ironically, the speaker does not intend to communicate that X is a fine friend, but the opposite. Nonetheless, without identifying the referent of ‘he’ and the literal content of ‘is a fine friend’, that is, without understanding the locutionary content of (4), the hearer is not able to understand the speaker’s ironic comment.

Illustrating in detail the debate over Grice’s notion of what is said goes beyond the purpose of this article. It is important to remark here that, according to the orthodox truth-conditional view—at least when speakers use language literally—what is said by an utterance of a sentence corresponds to the content that is determined by the conventional meanings of the words in the uttered sentence: The speaker undertakes all the semantic burdens that are demanded by those meanings, such as disambiguation, reference assignment, and saturation of context-sensitive expressions. When a speaker uses language literally, the content of an utterance of a sentence is what one gets by composing the semantic values of the expressions that occur in it, in accordance with their conventional meanings and the syntactic structure of the sentence. This content is a fully propositional one with a determinate truth-condition. This picture, which underlies the orthodox truth-conditional view in semantics, has been challenged by philosophers who call for a new theoretical approach. This new approach, called linguistic pragmatism, expands the truth-conditional roles of pragmatics. The following section presents it.

2. Departing from the Orthodox View: Linguistic Pragmatism

a. Underdetermination of Semantic Contents

Neale (2004) coined the term ‘linguistic pragmatism’, though some philosophers and linguists prefer the term ‘contextualism’. Linguistic pragmatism comprises a family of theories (Bach 1994, 2001, Carston 2002, Recanati 2004, 2001, Sperber and Wilson 1986) that converge on one main thesis, that of semantic underdetermination. Linguistic pragmatists maintain that the meanings of most expressions—perhaps all, according to radical versions of linguistic pragmatism—underdetermine their contents in contexts, and pragmatic processes that are not linguistically governed are required to determine them. The main point of linguistic pragmatism is the distinction between semantic underdetermination and indexicality.

The orthodox view accepts that context-sensitivity is codified in the meanings of indexical expressions, which demand saturation processes. Linguistic pragmatists too accept this form of context-sensitivity, but according to them indexicality does not exhaust context-sensitivity. Linguistic pragmatists say that the variability of contents in contexts of many expressions is not codified in linguistic conventions. Rather, the variability of contents in contexts is due to the fact that the meanings of the expressions underdetermine their contents. Speakers must complete the meanings of the expressions with contents that are not determined by linguistic conventions codified in those meanings. The pragmatic operations that intervene in the process of completing the contents in context are not governed by conventions of the language, that is, by linguistic information, but work on more general contextual information.

Linguistic pragmatists make use of three kinds of arguments to support their view:

(i) Context-shifting arguments test people’s intuitions about the content of sentences in actual or hypothetical contexts of utterance. If people have the intuition that a sentence S expresses differing contents in different contexts, despite the fact that no overt context-sensitive expression occurs in S, it is evidence that some expression that occurs in S is semantically underdetermined. Consider the following example. Mark is 185 cm tall, and George utters the sentence:

(5) Mark is short

in a conversation about the average height of basketball players and then in a conversation about the average height of American citizens. People have the intuition that what George said in the first context is true while what he said in the second context is false. Linguistic pragmatists draw the conclusion that the content of (5) varies through contexts of utterance, despite the fact that the adjective ‘short’ is not an overt context-sensitive expression. They argue that the content of ‘short’ is underdetermined by its conventional meaning and explain the variation in content from context to context as a result of pragmatic processes that are not linguistically governed but nonetheless complete the meaning of ‘short’.

(ii) Incompleteness arguments too test people’s intuitions about the contents of sentences in context, pointing to people’s inability to evaluate the truth-value of a sentence without taking into account contextual information. Suppose George utters the sentence:

(6) Anna is ready.

People cannot say whether George’s utterance is true or false without considering what Anna is said to be ready for. The conclusion now is that (6) does not express a full propositional content with determinate truth-conditions. There is no such thing as Anna’s being ready simpliciter. The explanation is semantic underdetermination: The adjective ‘ready’ does not provide an invariant contribution to a full propositional content and it does not provide guidance to determine such a contribution either, because it is not an overt context-sensitive expression. The enrichment that is required to determine a full truth-conditional content is the result of a pragmatic process that is not governed by the meaning of ‘ready’.

(iii) Inappropriateness arguments spot the difference between the content that is strictly encoded in a sentence and the contents that are expressed by utterances of that sentence in different contexts. Suppose a math teacher utters the following sentence in the course of a conversation about her class:

(7) There are no French girls.

People usually understand the math teacher to say that there are no French girls attending the math class. Some philosophers say that in this case there is an invariant semantic content composed out of the meanings of the words in the sentence: French girls do not exist. However, it seems awkward both to claim that in uttering (7) the speaker says that French girls do not exist and to claim that hearers understand (7) as denying the existence of French girls in general. Rather, it seems better to suppose that both speakers and hearers restrict the interpretation of (7) to a particular domain, such as the students attending the math class.

b. Completions and Expansions

The claim on which all versions of linguistic pragmatism agree is that very often the content of an utterance is richer than the content obtained by composing the semantic values of the expressions in the uttered sentence. Adopting terminology from Bach (1994), it is common to distinguish two kinds of pragmatic enrichment: completions and expansions.

With completions, the content determined by the meanings of the expressions that occur in a sentence is incomplete because it lacks full truth-conditions. These cases often recur in context-shifting arguments and incompleteness arguments:

(5) Mark is short.

(6) Anna is ready.

People do not know what conditions a person must meet to be short or ready simpliciter, so it appears there are no determinate conditions making a person so. To obtain a truth-conditional content it is necessary to add elements that do not correspond to any expression in (5) and (6). Linguistic pragmatists maintain that what is said is a completion of the content obtained by composing the meanings of the expressions in the sentence, where the completing material is taken from the context. For instance, the contents of (5) and (6) could be completed in ways that might be expressed as follows:

(5*) Mark is short with respect to the average height of basketball players.

(6*) Anna is ready to climb Eiger’s North Face.

With expansions, the content of an utterance of a sentence is an enrichment of the literal content obtained by composing the semantic values of the expressions in the sentence. Some interesting cases of expansions are employed in inappropriateness arguments. Consider the following examples:

(8) All the students got an A.

(9) Anna has nothing to wear.

In these cases, there is a complete content that does not correspond to the content of the utterance. (8) expresses the content that all students in existence got an A, and (9) expresses the content that Anna has no clothes to wear at all. However, these sentences are usually used to express different contents. For example, (8) can be used by the logic professor to say that all students in her class got an A, and (9) can be used to say that Anna has no appropriate dress for a particular occasion.

c. Saturation and Modulation

Linguistic pragmatists maintain that completions and expansions are obtained through pragmatic processes that are not linguistically driven by conventional meanings. Recanati draws a distinction between saturation and modulation: Processes of saturation are mandatory pragmatic processes required to determine the semantic contents of linguistic expressions (bottom-up or linguistically driven processes). Processes of modulation are optional pragmatic processes that yield completions and expansions (top-down or ‘free’ processes).

Pragmatic processes of saturation are directed and governed by the linguistic meanings of context-sensitive expressions. For instance, the linguistic meaning of the demonstrative ‘that’ demands the selection of a salient object in the context of utterance to determine the referent of the demonstrative. In contrast, pragmatic processes of modulation are optional because they are not activated by linguistic meanings. They are not activated for the simple reason that the elements that form completions and expansions do not match any linguistic expression in the sentence. Recanati distinguishes three types of pragmatic processes of modulation:

(i) Free enrichment is a process that narrows the conditions of application of linguistic expressions. Some of the above examples are cases of free enrichment. In (8) the domain of the quantifier ‘all students’ is restricted to the logic class and in (9) the domain of ‘nothing to wear’ is restricted to appropriate dresses for a given occasion. In (5) the conditions of application of the adjective ‘short’ are restricted to people whose height is lower than the average height of basketball players. In (6) the conditions of application of the adjective ‘ready’ are restricted to people who have acquired the technical and physical ability to climb Eiger’s North Face.

(ii) Loosening is a process that widens the conditions of application of words by allowing a degree of approximation. Here is one example used by Recanati:

(10) The ATM swallowed my credit card.

Literally speaking, an ATM cannot swallow anything because it does not have a digestive system. In this case, the conditions of application of the verb ‘swallow’ are loosened so as to include a wider range of actions. Another example of loosening is the following:

(11) France is hexagonal.

This sentence does not say that the borders of France draw a perfect hexagon, but that they do so approximately.

(iii) Semantic transfer is a process that maps the meaning of an expression onto another meaning. The following is an example of semantic transfer. Suppose a waiter in a bar says to his boss:

(12) The ham sandwich left without paying.

Through a process of modulation, the meaning of the phrase ‘the ham sandwich’ is mapped onto the meaning of the phrase ‘the customer who ordered the ham sandwich’.

d. Core Ideas and Differences among Linguistic Pragmatists

The orthodox truth-conditional view distinguishes two kinds of pragmatic processes, primary ones and secondary ones. Primary pragmatic processes contribute to determine the contents of utterances for context-sensitive expressions. Secondary pragmatic processes contribute to conversational implicatures and are activated after the composition of the contents of utterances has been accomplished. The fundamental aspect of the orthodox view that linguistic pragmatists reject is the idea that primary pragmatic processes are only processes of saturation, which are activated and driven by conventional meanings of words. Linguistic pragmatists affirm that primary pragmatic processes also include processes of modulation that are not encoded in linguistic meanings. According to linguistic pragmatism, the process of truth-conditional composition that gives the contents of utterances is systematically underdetermined by linguistic meanings.

The different versions of linguistic pragmatism are all unified by their criticism of the orthodox view. Recanati calls the content of an utterance in the pragmatist conception ‘pragmatic truth-conditions’, Bach speaks of ‘implicitures’, and Carston of ‘explicatures’. There are important and substantive differences among these notions. For Bach an impliciture is a pragmatic enrichment of the strict semantic content that is determined by linguistic meanings alone and can be truth-conditionally incomplete. The strict semantic content is like a template that needs to be filled. Recanati argues that Bach’s strict semantic content is only a theoretical abstraction that does not perform any proper role in the computation of what is said. Carston and relevance theorists like Sperber and Wilson adopt a similar view, but—in contrast with Recanati—they affirm that primary and secondary pragmatic processes are, from a cognitive point of view, processes of the same kind, both explained by the principle of relevance, according to which one accepts the interpretation that satisfies the expectation of relevance with the least effort.

However, there is something on which Bach, Recanati, Carston, Sperber and Wilson all agree: Very often, semantic interpretation alone gives at most semantic schemata, and only with the help of pragmatic processes of modulation can a complete propositional content be obtained.

Finally, the most radical linguistic pragmatists, such as Searle (1978), Travis (2008), and Unnsteinsson (2014), claim that conventional meanings do not exist. Speakers rely upon models of past applications of words, and any new interpretation of a word arises from a process of modulation from one of its past applications. Carston’s latest work (2019) tends to develop a similar view. Radical linguistic pragmatists reject even the idea that semantics provides schemata to be pragmatically enriched by modulation processes. In their view, the difficulty is to explain what such an incomplete semantic content might be for many expressions. Think, for example, of ‘red’. It is difficult to individuate a semantic content, no matter how incomplete, that is shared in ‘red car’, ‘red hair’, ‘red foliage’, ‘red rashes’, ‘red light’, ‘red apple’, and so on. It is even more difficult to explain how this alleged incomplete content could be enriched into the contents that people convey with those expressions.

The next section is devoted to indexicalism, a family of theories that react against linguistic pragmatism.

3. Defending the Orthodox View: Indexicalism

a. Extending Indexicality and Polysemy

Indexicalists attempt to recover the orthodox truth-conditional approach in semantics from the charge of semantic underdetermination raised by linguistic pragmatists. Indexicalists reject the thesis of semantic underdetermination and explain the variability of utterances’ contents in contexts with the resources of the orthodox truth-conditional view, mainly by enlarging the range of indexicality and the range of polysemy. The typical examples of variability of contents in contexts invoked by linguistic pragmatists are the following:

(13) John is tall.

(14) Mary is ready.

(15) It is raining.

(16) Everybody got an A.

(17) Mary and John got married and had a child.

In the course of a conversation about basketball players, an utterance of (13) might express the content that John is tall with respect to the average height of basketball players. In the course of a conversation about the next logic exam, an utterance of (14) might express the content that Mary is ready to take the logic exam. If Mary utters (15) while in Rome, her utterance might express the content that it is raining in Rome at the time of the utterance. If the professor of logic utters (16), her utterance might express the content that all the students in her class got an A. Typically, if a speaker utters (17), she expresses the content that Mary and John got married before having a child.

Linguistic pragmatists argue that, in order for utterances of sentences like (13)-(17) to express those contents, the conventional meanings encoded in the sentences are not sufficient. Linguistic pragmatists hold that the presence in the content expressed of a comparison class for ‘tall’, of a course of action for ‘ready’, of a location for weather reports, of a restricted domain for quantified noun phrases, and of the temporal/causal order for ‘and’ is not the result of a process that is governed by a semantic convention. Linguistic pragmatists generalize this claim and argue that what is true of expressions like ‘tall’, ‘ready’, ‘it rains’, ‘everybody’, and ‘and’, is true of nearly all expressions in natural languages. According to linguistic pragmatists, semantic conventions provide at most propositional schemata—propositional radicals—that lack determinate truth-conditions.

The indexicalists’ strategy for resisting the call for a new theoretical approach raised by linguistic pragmatists is to enlarge both the range of indexicality, thought of as the result of linguistically governed processes of saturation, and the range of polysemy. Michael Devitt says that there is more linguistically governed context-sensitivity and polysemy in our language than linguistic pragmatists think. Indexicalists try to explain examples like (13)-(16) by conventions of saturation: It is by linguistic conventions codified in language that people use ‘tall’ having in mind a class of comparison, ‘ready’ a course of action, ‘it rains’ a location, and ‘everyone’ a domain of quantification. Some indexicalists explain examples like (17) by polysemy: ‘And’ is a polysemous word having multiple meanings, one for the truth-functional conjunction and one for the temporally/causally ordered conjunction.

Indexicalism too comprises a family of theories, and there are deep and fundamental differences among them. As said, on an orthodox semantic theory the meaning of context-sensitive expressions sets up the parameters, or slots, that must be loaded with contextual values. Sometimes the parameters are explicitly expressed in the sentence, as with indexicals. Sometimes, instead, the parameters do not figure at the level of surface syntax. Philosophers and linguists disagree on where the parameters that do not show up at the level of surface syntax are hidden. Some (Stanley 2005a, Stanley and Williamson 1995, Szabo 2001, Szabo 2006) hold that such parameters are associated with elements that occur in the logical form. Taylor (2003) advances a different theory and argues that hidden parameters are represented in the syntactic basement of the lexicon. They are constituents not of sentences but of words. On Taylor’s view, the lexical representations of words specify the parameters that must be filled in with contextual values in order for utterances of sentences to have determinate truth-conditions. In a different version of indexicalism, some authors (Rothschild and Segal 2009) argue that the expressions that are regularly used to express different contents in different contexts ought to be treated as ordinary context-sensitive expressions and added to the Basic Set.

What all indexicalist theories have in common is the view that the variability of contents in contexts is always linguistically governed by conventional meanings of expressions. In all versions of indexicalism the phenomenon of semantic underdetermination is denied: The presence of each component of the content of an utterance of a sentence is mandatorily governed by a linguistic element occurring in the sentence either at the level of surface syntax or at the level of logical form.

b. Two Objections to Linguistic Pragmatism: Overgeneration and Normativity

There are two connected motivations that underlie the indexicalists’ defence of the orthodox view. One is a problem of overgeneration; the other is a problem concerning the normativity of meaning.

Linguistic pragmatists aim at keeping in place the distinctions among the level of linguistic meaning, the level of the contents of utterances, and the level of what speakers communicate indirectly by means of implicatures. To this end, linguistic pragmatists need a principled way to distinguish the contents of utterances (Sperber and Wilson’s and Carston’s explicatures, Bach’s implicitures, Recanati’s pragmatic truth-conditions) from implicatures. The canonical definition of explicature—and from now on this article adopts the term ‘explicature’ for pragmatically enriched contents of utterances—is the following:

An explicature is a pragmatically inferred development of logical form, where implicatures are held to be purely pragmatically inferred—that is, unconstrained by logical form.

The difficulty arises because explicatures are taken to be pragmatic developments of logical forms, but not all pragmatic developments of logical forms count as explicatures. Linguistic pragmatists need to keep developments of logical forms that are explicatures apart from those that are not. Since explicatures result from pragmatic processes that are not linguistically driven, a problem of overgeneration arises. As Stanley points out, if explicatures are linguistically unconstrained, then there is no explanation of why an utterance of sentence (18) can never have the same content as an utterance of sentence (19), or why an utterance of sentence (20) can have the same content as an utterance of sentence (21) but never the same content as an utterance of sentence (22):

(18) Everyone likes Sally.

(19) Everyone likes Sally and her mother.

(20) Every Frenchman is seated.

(21) Every Frenchman in the classroom is seated.

(22) Every Frenchman or Dutchman in the classroom is seated.

Carston and Hall (2012) try to answer Stanley’s objection of overgeneration from within the camp of linguistic pragmatists. For an assessment and criticism of their attempts, see Borg (2016). However, the point of Stanley’s objection of overgeneration is clear: Once pragmatic processes are allowed to contribute to direct contents of utterances in ways that are not linguistically governed by conventional meanings, it is difficult to draw the distinction between what speakers directly say and what they indirectly convey, so that the distinction between explicatures and implicatures collapses.

The other objection against linguistic pragmatism concerns the normativity of meaning. According to indexicalists, the explanation of contents of utterances supplied by semantics in the orthodox approach is superior to the explanation supplied by linguistic pragmatism because the former accounts for the normative aspect of meaning while the latter does not. Normativity is constitutive of the notion of meaning. If there are meanings, there must be such things as going right and going wrong with the use of language. The use of an expression is right if it conforms with its meaning, and wrong otherwise. If literal contents of speech acts are thought of in truth-conditional terms, conformity with meaning amounts to constraints on truth-conditions. In the case of expressions with one meaning, the speaker undertakes the semantic burden of using them to express their conventional semantic values. In cases of polysemy, the speaker undertakes the semantic burden of selecting a convention that fixes a determinate contribution to the truth-conditional contents expressed by utterances of sentences. In the case of expressions governed by conventions of saturation, the speaker undertakes the semantic burden of loading the demanded parameters with contextual values. Whenever the speaker fulfils these semantic burdens, she goes right with her use of language; otherwise she goes wrong, unless she is speaking figuratively. As said above, the speaker who utters sentences (13)-(16) undertakes the semantic burden of loading a comparison class for ‘tall’, a course of action for ‘ready’, a location for ‘it rains’, and a restricted domain of quantification for ‘everybody’. And a speaker who utters (17) undertakes the semantic burden of selecting either the convention for ‘and’ that fixes the truth-functional conjunction or the convention that fixes the temporally/causally ordered conjunction.

Indexicalists say that the problem for linguistic pragmatism is to provide an account of how the meanings of expressions constrain truth-conditional contents of utterances, if the composition of truth-conditions is not governed by linguistic conventions, and how, lacking such an explanation, linguistic pragmatism can preserve the distinction between going right and going wrong with the use of language.

The remainder of this section gives a short illustration of the version of indexicalism that tries to explain the variability of contents in contexts by adding hidden variables in the logical form of sentences. The next two sections introduce some technicalities, and the reader who is content with a general introduction to context-sensitivity can skip to section 4.

c. Hidden Variables and the Binding Argument

Some indexicalists (Stanley, Szabo, Williamson) reinstate the Gricean syntactic constraint, rejected by linguistic pragmatists, at the level of logical form. They maintain that every pragmatic process that contributes to the determination of the truth-conditional content of a speech act is a process of saturation that is always activated by the linguistic meaning of an expression. If there is no trace of such an expression in the surface syntactic structure, then there must be an element in the logical form that triggers a saturation process. The variables in the logical form work as indexicals that require contextual assignments of values. The pragmatic processes that assign the values of those variables are governed by linguistic rules; they are not optional.

Here are some examples, with some simplifications, given that a correct rendering of the logical form would require more technicalities. Suppose that, while on the phone to Mary on 25 November 2019, answering a question about the weather in London, George says:

(15) It’s raining.

People tend to agree that George said that it is raining in London on that date. Linguistic pragmatists concede that the reference to the day is due to the present tense of the verb, which works as an indexical expression that refers to the time of the utterance. However, the reference to the place, the city of London, is given by free enrichment. For linguistic pragmatists (15) can be represented as follows:

(15*) It’s raining (t).

The variable ‘t’ corresponds to the present tense of the verb. In the logical form there is no variable taking London as value. By contrast, indexicalists claim that (15) can be represented as follows:

(15**) It’s raining (t, l).

In (15**) the variable ‘l’ takes London as a value. The process that assigns London to the variable ‘l’ is of the same kind as the process that assigns a referent to the indexical ‘here’ and it is linguistically driven because it is activated by an element of the logical form.

The variables that indexicalists insert in logical forms have a more complex structure. In (15**) the variable ‘t’ has the structure ‘ƒ(x)’ and the variable ‘l’ has the structure ‘ƒ*(y)’. ‘x’ is a variable that takes contextually salient entities as values and ‘ƒ’ is a variable that ranges over functions from entities to temporal intervals. The variable ‘y’ also takes contextually salient entities as values, and ‘ƒ*’ ranges over functions from entities to locations. The reason for this complexity will be explained in the next section. For now, it suffices to note that in simple cases like (15**), ‘x’ takes instants as values and ‘ƒ’ takes the identity function, so that ƒ(x) = x. Likewise, ‘y’ takes locations as values and ‘ƒ*’ takes the identity function, so that ƒ*(y) = y.
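
To fix ideas (a schematic illustration in the article’s own notation, not a full formal derivation), the truth-condition delivered by the indexicalist analysis (15**) in George’s context can be displayed as follows:

‘It’s raining (ƒ(x), ƒ*(y))’ is true iff it is raining at the time ƒ(x) at the location ƒ*(y)

where ‘x’ is assigned the time of the utterance, ‘y’ is assigned London, and ‘ƒ’ and ‘ƒ*’ are both assigned the identity function. The utterance therefore comes out true iff it is raining in London at the time of the utterance.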

Here is another example. Consider Mark, the basketball player whose coach makes the following assertion:

(5) Mark is short.

The coach said that Mark is short with respect to the average height of basketball players. Indexicalists explain this case by inserting a variable in (5):

(5*) Mark is short (h).

‘h’ is a variable that takes standards of height as values. The variable ‘h’ too has a structure of the kind ‘ƒ(x)’, where ‘x’ ranges over contextually salient entities (for example, the set of basketball players) and ‘ƒ’ over functions that map the salient entities to other entities (for instance, the subset of the basketball players that are shorter than the average height of basketball players).

Here is an example with quantifiers. Consider the following sentence, asserted by the professor of logic:

(8) All students got an A.

The professor said that all students that took the logic class got an A. Indexicalists claim that in the quantifier ‘all students’ there is a variable that assigns domains of quantification:

(8*) [all x: student x ∧ x ∈ ƒ(y)] (got an A x).

In this example the value of the variable ‘y’ is the professor of logic and the value of ‘ƒ’ is a function that maps y onto the set of students who took the logic class taught by y. This set becomes the domain of the quantifier ‘all students’.
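
For concreteness (a schematic paraphrase rather than a formal derivation), the domain restriction in (8*) can be displayed as follows:

ƒ(y) = the set of students who took the logic class taught by y, with y = the professor of logic

On this assignment (8*) is true iff every student in ƒ(y) got an A. Were the restrictor ‘ƒ(y)’ absent, the quantifier would range over all students whatsoever, yielding the implausible reading that every student in existence got an A.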

Stanley and Szabo present a strategy for justifying the insertion of hidden variables in logical forms, the so-called binding argument: to show that an element of the truth-conditional content of an utterance of a sentence is the result of a process of saturation, it is enough to show that it can vary in accordance with the values of a variable bound by a quantifier.

Consider the following sentence:

(23) Whenever Bob lights a cigarette, it rains.

An interpretation of (23) is the following: Whenever Bob lights a cigarette, it rains where Bob lights it. In this interpretation, the location where it rains varies in relation to the time when Bob lights a cigarette. Therefore, the value of the variable ‘l’ in ‘it rains (t, l)’ depends on the value of the variable ‘t’ that is bound by a quantifier that ranges over times. This interpretation can be obtained if (23) is represented as follows:

(23*) [every t: temporal interval t ∧ Bob lights a cigarette at t] (it rains (ƒ(t), ƒ*(t))).

The value of ‘ƒ’ is the identity function so that ƒ(t) = t, and the value of ‘ƒ*’ is a function that assigns to the time that is the value of ‘t’ the location where Bob lights a cigarette at that time.
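
Spelled out as a truth-condition (again schematically), the bound reading captured by (23*) is the following:

(23*) is true iff for every temporal interval t at which Bob lights a cigarette, it rains at t at the location where Bob lights the cigarette at t.

The crucial point is that the location argument co-varies with the bound variable ‘t’; on the indexicalist view, it could not do so unless a location variable were present in the logical form of ‘it rains’.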

d. Objections to the Binding Argument

Some philosophers (Cappelen and Lepore 2002, Breheny 2004) raise an objection of overgeneration against the binding argument. In their view, the binding argument forces the introduction of too many hidden variables, even when there is no need for them. The strongest objection against the binding argument has been raised by Recanati (2004: 110), who argues that the binding argument is fallacious. Recanati summarizes the binding argument as follows:

    1. Linguistic pragmatism maintains that in ‘it rains’ the implicit reference to the location is the result of a process of modulation that does not require any covert variable.
    2. In the sentence ‘whenever Bob lights a cigarette, it rains’, the reference to the location varies according to the value of the variable bound by the quantifier ‘whenever Bob lights a cigarette’.
    3. There can be no binding without a variable in the logical form.
    4. In the logical form of ‘it rains’ there is a variable for locations, although phonologically not realized.

Therefore:

    5. Linguistic pragmatism is wrong: In ‘it rains’, the reference to the location is mandatory, because it is articulated in the logical form.

Recanati argues that this argument is fallacious because of an ambiguity in premise 4, where the sentence ‘it rains’ can be taken either in isolation or as part of a compound sentence. According to Recanati, the sentence ‘it rains’ contains a covert variable when it occurs as part of the compound sentence ‘whenever Bob lights a cigarette, it rains’, but it does not contain any variable when it occurs alone.

Recanati proposes a theory that admits that binding requires variables in the logical form but at the same time rejects indexicalism. Recanati makes use of expressions that modify predicates. Given an n-place predicate, a modifier can form an (n+1)-place or an (n-1)-place predicate. A modifier expresses a function from properties/relations to other properties/relations. For example, Recanati says that ‘it rains’ expresses the property of raining, which is predicated of temporal intervals. Expressions like ‘at’, ‘in’, and so forth, transform the predicate ‘it rains (t)’ from a one-place predicate into a two-place predicate: ‘it rains (t, l)’. Expressions like ‘here’ or ‘in London’ are special modifiers that transform the predicate ‘it rains’ from a one-place predicate into a two-place predicate and also provide a value for the new argument place. Recanati argues that expressions like ‘whenever Bob lights a cigarette’ are modifiers of the same kind as ‘here’ and ‘in London’. They change the number of predicate places and provide a value for the new argument place through the value of the variable they bind. Recanati’s conclusion is that although binding requires variables in the logical form of compound sentences, there is no need to insert covert variables in sub-sentential expressions or sentences in isolation.

The next section presents a different approach to semantics, one that distinguishes between semantic contents and speech act contents.

4. Defending the Autonomy of Semantics: Minimalism

a. Distinguishing Semantic Content from Speech Act Content

Indexicalists and linguistic pragmatists share the view that the goal of semantics is to explain the explicit contents of speech acts performed by utterances of sentences. They both agree that there must be a close explanatory connection between the meaning encoded in a sentence S and the contents of speech acts performed by utterances of S. One important corollary of this conception is that if a sentence S is systematically uttered to perform speech acts with different contents in different contexts, this phenomenon calls for an explanation from semantics. The point of disagreement between indexicalists and linguistic pragmatists is that the former think that semantics can provide such an explanation while the latter think that semantics alone is not sufficient and a new theoretical model is needed, one in which pragmatic processes, semantically unconstrained, contribute to determine the contents of speech acts. As said above, indexicalists explain the variability of contents in contexts in terms of context-sensitivity by enlarging the range of indexicality and polysemy, whereas linguistic pragmatists explain it in terms of semantic underdetermination. The debate between indexicalists and linguistic pragmatists starts by taking for granted the explanatory connection between semantics and contents of speech acts.

Minimalism in semantics is a family of theories that reject the explanatory connection between semantics and contents of speech acts. Minimalists (Borg 2004, 2012, Cappelen and Lepore 2005, Soames 2002) maintain that semantics is not in the business of explaining the contents of speech acts performed by utterances of sentences. Minimalists work with a notion of semantic content that does not play the role of speech act content. According to them, the semantic content of a sentence is a full truth-conditional content that is obtained compositionally from the syntactic structure of the sentence and the semantic values of the expressions in the sentence, which are fixed by conventional meanings. Moreover, they claim that the Basic Set of genuinely context-sensitive expressions, which are governed by conventions of saturation, comprises only overt indexicals (pronouns, demonstratives, tense markers, and a few other words). Minimalists call the semantic content of a sentence its minimal content.

The above statement that minimal contents are not contents of speech acts requires qualification. Cappelen and Lepore indeed argue for speech act pluralism: speech acts have a plurality of contents, and the minimal content of a sentence is always one of the many contents that its utterances express. In order to protect speech act pluralism from the objection that very often speakers are not aware of having made an assertion with the minimal content, and, if asked, would deny having made an assertion with that content, Cappelen and Lepore argue that speakers can sincerely assert a content without believing it and without knowing they have asserted it. For example, if Mary looks into the refrigerator and says ‘there are no beers’, Mary herself would deny that she asserted that there are no beers in existence and deny that she believes it, although that there are no beers in existence is the minimal content that the sentence ‘there are no beers’ semantically expresses.

The main line of the minimalists’ attack on indexicalism and linguistic pragmatism is methodological. Minimalists argue that both indexicalists and linguistic pragmatists adhere to the methodological principle that says that a semantic theory is adequate just in case it accounts for the intuitions people have about what speakers say, assert, claim, and state by uttering sentences. Minimalists claim that this principle is mistaken just because it conflates semantic contents and contents of speech acts. Semantics is the study of the semantic values of the lexical items and their contribution to the semantic contents of complex expressions. Contents of speech acts, instead, are contents that can be used to describe what people say by uttering sentences in particular contexts of utterance.

b. Rebutting the Arguments for Linguistic Pragmatism

Minimalists dismiss context-shifting arguments and inappropriateness arguments just on the grounds that they conflate intuitions about semantic contents of sentences and intuitions about contents of speech acts. Incompleteness arguments are a subtler matter and require more articulated responses. Cappelen and Lepore’s (2005) response and Borg’s (2012) response are presented in what follows. An incompleteness argument aims at showing that there is no invariant content that a sentence S expresses in all contexts of utterance. For example, with respect to:

(14) Mary is ready,

an incompleteness argument starts from the observation that if (14) is taken separately from contextual information specifying what Mary is ready for, people are unable to evaluate it as true or false. This evidence leads to the conclusion that there is no minimal content—that Mary is ready (simpliciter)—that is invariant and semantically expressed by (14) in all contexts of utterance. In general, then, the conclusions of incompleteness arguments are that minimal contents do not exist: without pragmatic processes, many sentences in our language do not express full propositional contents with determinate truth-conditions.

Cappelen and Lepore accept the premises of incompleteness arguments, that is, that people are unable to truth-evaluate certain sentences, but they argue that from these premises it does not follow that minimal contents do not exist. Borg adopts a different strategy. Borg tries to block incompleteness arguments by rejecting their premises and explaining away people’s inability to truth-evaluate certain sentences.

Cappelen and Lepore raise the objection that incompleteness arguments try to establish metaphysical conclusions, for example about the existence of the property of being ready (simpliciter) as a building block of the minimal content that Mary is ready (simpliciter), from premises that concern psychological facts regarding people’s ability to evaluate sentences as true or false. They point out that psychological data are not relevant in metaphysical matters. The data about people’s dispositions to evaluate sentences might reveal important facts about psychology and communication but have no weight at all in metaphysics. Cappelen and Lepore say that people’s inability to evaluate sentences like (14) as true or false independently of contextual information does not provide evidence against the claim that the property of being ready exists and is the semantic content of the adjective ‘ready’. On the one hand, they acknowledge that the problem of giving the analysis of the property of being ready is a difficult one, but it is for metaphysicians, not for semanticists. On the other hand, they argue that semanticists have no difficulty at all in stating what invariant minimal content is semantically encoded in (14). Sentence (14) semantically expresses the minimal content that Mary is ready. There is no difficulty in determining its truth-conditions either: ‘Mary is ready’ is true if and only if Mary is ready.

Cappelen and Lepore address the immediate objection that if the truth-condition of (14) is represented by a disquotational principle like the one reported above, then nobody is able to verify whether such a truth-condition is satisfied. This fact is witnessed by people’s inability to evaluate (14) as true or false independently of information specifying what Mary is ready for. Cappelen and Lepore respond that it is not a task for semantics to ascertain how things are in the world. For example, it is not a task for semantics to say whether (14) is true or false. That a semantic theory for a language L does not provide speakers with a method of verifying sentences of L is not a defect of that semantic theory. Cappelen and Lepore say that those theorists who think otherwise indulge in verificationism. For objections to Cappelen and Lepore, see Recanati (2004), Clapp (2007), and Penco and Vignolo (2019).

In Pursuing Meaning Borg offers a different strategy for blocking incompleteness arguments. Borg’s strategy is to explain away the intuitions of incompleteness. Borg agrees that speakers have intuitions of incompleteness with respect to sentences like ‘Mary is ready’, but she argues that these intuitions emerge from some overlooked covert and context-insensitive syntactic structure. Borg says that ‘ready’ is lexically marked as an expression with two argument places. On Borg’s view ‘ready’ always denotes the same relation, the relation of readiness, which holds between a subject and the thing for which they are held to be ready. When only one argument place is filled at the surface level, the other is marked by an existentially bound variable in the logical form. Thereby ‘ready’ makes exactly the same contribution in any context of utterance to any propositional content literally expressed. For example, Borg says that in a context where the property of being ready to join the fire service is salient, the sentence ‘Mary is ready’ literally expresses the minimal content that Mary is ready for something, not that Mary is ready to join the fire service. As Borg points out, the minimal content that Mary is ready for something is almost trivially true. Yet Borg warns not to conflate intuitions about the informativeness of a propositional content with intuitions about its semantic completeness.
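
In a schematic logical form (a simplified rendering of the idea, not Borg’s own notation), the proposal can be displayed as follows:

Mary is ready ⇒ ∃x (ready (Mary, x))

The existential quantifier binds the argument place for what Mary is ready for, so the sentence semantically expresses the nearly trivial minimal content that Mary is ready for something.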

Borg’s explanation of the intuitions of incompleteness is that speakers are aware of the need for two arguments, which is in tension with the phonetic delivery of only one argument. Speakers are uneasy about truth-evaluating sentences like ‘Mary is ready’, not because the sentence is semantically incomplete and lacks a truth-condition, but because their expectation that the second argument be expressed is frustrated and the minimal content that is semantically expressed, when the argument role corresponding to the direct object is not filled at the surface level, is barely informative. For a critical assessment of Borg’s strategy, see Clapp (2007) and Penco and Vignolo (2019).

The following subsection illustrates the tenets that characterise minimalism and the central motivation for it.

c. Motivation and Tenets of Minimalism

Minimalism is characterised by four main theses (Borg 2007) and one main motivation. The first thesis is propositionalism. Propositionalism states that sentence types, relativized to indexes representing contexts of utterance, express full propositional contents with determinate truth-conditions. These semantic contents are the minimal ones, which are invariant through contexts of utterance when sentence types do not contain overt context-sensitive expressions. Propositionalism distinguishes minimalism from radical minimalism, a philosophical view defended by Bach (2007). Bach acknowledges the existence of semantic contents of sentence types, but he rejects the view that such contents are always fully propositional with determinate truth-conditions. According to Bach, most semantic contents are propositional radicals. As Borg points out, despite Bach’s insistence that he is not a linguistic pragmatist, it is not easy to spot substantial differences between his view and linguistic pragmatism. Although Bach’s semantically incomplete sentences are not context-sensitive unless they contain overt context-sensitive expressions, linguistic pragmatists need not deny that semantic theories are possible. They simply maintain that in most cases semantic theories deal with sub-propositional contents. Bach and linguistic pragmatists agree that, in many if not most cases, in order to reach full propositional contents theorists need to focus on speech acts and not on sentence types.

The second important thesis of minimalism is the Basic Set assumption. The Basic Set assumption states that the only genuine context-sensitive expressions that trigger and drive pragmatic processes for the determination of semantic values are those that are listed in the Basic Set, that is, overt indexicals like ‘I’, ‘here’, ‘now’, ‘that’, plus or minus a bit. Expressions like ‘ready’, ‘tall’, ‘green’, quantified noun phrases, and so on, are not context-sensitive.

The third tenet of minimalism is the distinction between semantic contents and speech act contents: Semantic contents are not what speakers intend to explicitly and directly communicate. The contents explicitly communicated are pragmatic developments of semantic contents. As said, this move serves to disarm batteries of arguments advanced by indexicalists and linguistic pragmatists. Even if in almost all cases semantic contents are not the contents of speech acts, they nonetheless play an important theoretical role in communication. Semantic contents are fallback contents that people are able to understand on the sole basis of their linguistic competence, when they do not know or mistake the intentions of the speakers and the contextual information needed for understanding what speakers are trying to communicate. Minimal contents can play this role in communication just because they can be grasped in virtue of linguistic competence alone.

The fourth and last thesis of minimalism is a commitment to formalism. Formalism is the view that the processes that compute the truth-conditional contents of sentence types are entirely formal and computational. There is an algorithmic route to the semantic content of each sentence (relative to an index representing contextual features), and all contextual contributions to semantic contents are formally tractable. More precisely, all contextual contributions that depend on speakers’ intentions must be kept apart from semantic contents. This last claim puts a further constraint on context-sensitive expressions, which ought to be responsive only to objective aspects of contexts of utterance, like who is speaking, when, and where. These are the features that Bach (1994, 1999, 2001) and Perry (2001) termed narrow features of context; they play a semantic role, as opposed to wide features, which depend on speakers’ intentions and play a pragmatic role. This claim also relates to Kaplan’s distinction between pure (automatic) indexicals, which refer semantically by picking out objective features of the context of utterance, and intentional indexicals, which refer pragmatically in virtue of the intentions of speakers (Kaplan 1989a, Perry 2001).

Formalism is related to one of the main motivations for minimalism. Minimalism is compatible with a modular account of meaning understanding. The modularity theory of mind is the view that the mind is constituted of discrete and relatively autonomous modules, each of which is dedicated to the computation of particular cognitive functions. A module possesses a specific body of information and specific rules working computationally on that body of information. Among such modules there is one, the module of the faculty of language, which is dedicated to the comprehension of literal contents of sentences. This module includes phonetic/orthographic information and related rules, syntactic information and related rules, and semantic information and related rules.

A minimalist semantics fits well as part of the language module, since it derives the truth-conditional contents of sentences, relative to indexes, computationally: it operates on representations of the semantic properties of the lexicon with formal rules defined over such representations. Thus, if linguistic comprehension is modular, minimalism offers a theory that is consistent with the existence of the language module.

The following data are often invoked as evidence that linguistic comprehension is modular. The understanding of literal meanings of sentences seems to be the result of domain-specific and encapsulated processes. Linguistically competent people understand the literal meaning of a sentence even when they do not know salient aspects of the context of utterance or the communicative intentions of the speaker. Moreover, the understanding of literal meaning is carried out independently of any sort of encyclopaedic information. The processes that yield literal truth-conditional contents of sentences are mandatory, very fast, and mostly unavailable to consciousness. People cannot help reading certain signs and hearing certain sounds as utterances of sentences in languages they understand. Competent speakers interpret those signs and sounds straightforwardly and very quickly as sentences with literal contents, without being aware of the information, and the rules operating on it, that yield such an understanding. Finally, linguistic understanding is associated with localized neuronal structures that exhibit regularities in acquisition and development and regularities of breakdown due to neuronal damage. In conclusion, for those who believe that this is good evidence that comprehension of literal meaning is modular, minimalism offers a semantic theory that can be coherently taken to be part of the language faculty module.

The presentation of minimalism closes with a discussion of the tests that Cappelen and Lepore propose in order to identify the context-sensitive expressions that belong in the Basic Set. The following subsection contains some technicalities. The reader who is mainly interested in an overview of context-sensitivity can skip to section 5.

d. Testing Context-Sensitivity

Cappelen and Lepore propose different tests for distinguishing the expressions in the Basic Set that are genuinely context-sensitive from those that are not. Here only one of their tests is illustrated, but it is sufficient to give a hint of their work.

Test of inter-contextual disquotational indirect reports: Suppose that Anna, who had planned to climb Eiger’s North Face on July 1 but cancelled, utters the following sentence on July 2:

(24) Yesterday I was not ready.

Suppose that on July 3 Mary indirectly reports what Anna said on July 2. Mary cannot use the same words as Anna used. If she did, she would make the following report:

(25) Anna said that yesterday I was not ready.

From this example it is clear that context-sensitive expressions like ‘I’ and ‘yesterday’ generate inter-contextual disquotational indirect reports that are false or inappropriate.

Cappelen and Lepore say that it is possible to make inter-contextual disquotational indirect reports with the adjective ‘ready’, and this fact provides evidence that ‘ready’ is not context-sensitive. Assume that on July 5 Mary utters the following sentence:

(26) On July 1 Anna was not ready.

Then, on July 6 George might report what Mary said with the utterance of the following sentence:

(27) Mary said that on July 1 Anna was not ready.

These results generalize to all expressions that do not belong to the Basic Set.

Another case is the following. Suppose Mary utters ‘Anna is ready’ in a context C1 to say that Anna is ready to climb Eiger’s North Face and makes a second utterance of it in a context C2 to say that Anna is ready to sit her logic exam. Cappelen and Lepore argue that in a context C3 the following reports are true:

(28) Mary said that Anna is ready (with respect to the utterance in C1).

(29) Mary said that Anna is ready (with respect to the utterance in C2).

(30) In C1 and C2 Mary said that Anna is ready.

Cappelen and Lepore say that linguistic pragmatism and indexicalism have difficulty explaining the truth of the above inter-contextual disquotational indirect reports. It is not obvious, however, that the difficulty Cappelen and Lepore point to is insurmountable. The context C3 might differ from C1 and C2 because the speaker, the time, and the place of the utterance are different, but the same contextual information might be available in C3 and be relevant for the interpretation of the utterance in C1 or C2. In C3 the speaker (and the audience too) might be aware that Mary was talking about alpinism in C1 and about logic exams in C2.

According to a suggestion by Stanley (2005b), and Cappelen and Hawthorne (2009), sentence (30) might be represented as follows:

(30*) C1 and C2 λx (in x Mary said that Anna is ready ƒ(x)).

Here the variable ‘x’ takes contexts as values and the variable ‘ƒ’ takes a function that maps contexts to kinds of actions or activities salient in those contexts. This analysis yields the interpretation that the report (30) is true if and only if in C1 Mary said that Anna is ready to climb Eiger’s North Face and in C2 Mary said that Anna is ready to take her logic exam. On the other hand, if one supposes that the speaker in C3 has the erroneous belief that Mary was talking about Anna’s readiness to go out with friends, linguistic pragmatists and indexicalists will doubt the truth of the reports (28)-(30) and reduce the debate to a conflict of intuitions.

The test of inter-contextual disquotational indirect reports and the other tests that Cappelen and Lepore present, such as the test of inter-contextual disquotation and the test of collective descriptions, have generated an intense debate. For critical assessments of these tests, see Leslie (2007) and Taylor (2007). Cappelen and Hawthorne (2009) present the test of agreement, while Donaldson and Lepore (2012) add the test of collective reports. Limits of space prevent a deeper treatment of the debate on tests for context-sensitivity. The foregoing suffices to give an idea of the kind of arguments that philosophers involved in that debate deal with.

While minimalism is a strong alternative to linguistic pragmatism and indexicalism, another approach develops the idea of invariant semantic contents in a new way: relativism. The next section presents relativism, which reconceptualizes the relations between meaning and context.

5. Defending Invariant Semantic Contents: Relativism

a. Indexicality, Context-Sensitivity, and Assessment-Sensitivity

Relativism in semantics provides a new conceptualization of context dependence. Relativists (Kolbel 2002, MacFarlane 2014, Richard 2008) recover invariant semantic contents and explain some forms of context dependence not in terms of variability of contents in contexts of utterance but in terms of variability of extensions in contexts of assessment. A context of utterance is a possible situation in which a sentence might be uttered and a context of assessment is a possible situation in which a sentence might be evaluated as true or false.

As said in section 1b, Kaplan represents meanings as functions that return contents in contexts of utterance. Contents are functions that distribute extensions in circumstances of evaluation. The content of a sentence in a context of utterance is a function that returns truth-values at standard circumstances of evaluation composed of a possible world and a time. MacFarlane shows that the technical machinery of Kaplan’s semantics is apt to draw conceptual distinctions among what he calls indexicality, context-sensitivity, and assessment-sensitivity. MacFarlane’s notion of indexicality covers the standard variability of contents in contexts. His notions of context-sensitivity and assessment-sensitivity cover new semantic phenomena, according to which expressions might change extensions while maintaining the same contents. MacFarlane’s notions are defined as follows:

Indexicality:

An expression E is indexical if and only if its content at a context of utterance depends on features of the context of utterance.

Context-sensitivity:

An expression E is context-sensitive if and only if its extension at a context of utterance depends on features of the context of utterance.

Assessment-sensitivity:

An expression E is assessment-sensitive if and only if its extension at a context of utterance depends on features of a context of assessment.
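The Kaplanian machinery behind these definitions, and one way to compress them, can be sketched as follows. Write char_E(c) for the content of E at a context of utterance c (Kaplan’s character applied to c), cont for a content mapping circumstances ⟨w, t⟩ to extensions, and ext_E(c, a) for the extension of E at c as assessed from a. The notation is purely expository:

% Stage 1: the meaning (character) of an expression E maps contexts of
% utterance to contents. Stage 2: a content maps circumstances of
% evaluation (a world w and a time t) to extensions.
\[
\mathrm{char}_E : \text{Contexts} \to \text{Contents}, \qquad \mathrm{cont} : \langle w, t \rangle \mapsto \text{Extension}
\]

\[
\begin{aligned}
\text{Indexicality:} \quad & \mathrm{char}_E(c) \text{ varies with } c\\
\text{Context-sensitivity:} \quad & \mathrm{ext}_E(c) \text{ varies with } c \text{ while } \mathrm{char}_E \text{ is constant}\\
\text{Assessment-sensitivity:} \quad & \mathrm{ext}_E(c, a) \text{ varies with } a \text{ for fixed } c
\end{aligned}
\]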

For example, consider two utterances of (5): a true utterance in a conversation about basketball players and a false utterance in a conversation about the general population.

(5) Mark is short.

Indexicality: The standard account in terms of indexicality affirms that the two utterances have different contents because the adjective ‘short’ is treated as an expression that expresses different contents in different contexts of utterance. According to indexicalism, the meaning of ‘short’ demands that the speaker fill in a standard of height that is operative in the context of utterance in order to determine the content of the utterance. Thus, the speaker in the first conversation expresses a different content than that expressed in the second conversation. Since the difference in truth-values between the two utterances is explained in terms of a difference in contents, the context of utterance—in our example, the speaker’s intentions referring to different standards of height—has a content-determinative role.

Context-sensitivity: Context-sensitivity, in MacFarlane’s sense, explains the difference in truth-values in terms of a difference in the circumstance of evaluation. The circumstance of evaluation is enriched with non-standard parameters. In our example, the circumstance of evaluation is enriched with a parameter concerning the standard of height. The meaning of ‘short’ returns the same content in all contexts of utterance. The content of ‘short’ is invariant across contexts of utterance, but it returns different extensions in circumstances of evaluation that comprise a possible world, a time, and a standard of height. The standard of height that is operative in the first conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does belong to the extension of ‘short’. The standard of height that is operative in the second conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does not belong to the extension of ‘short’. With context-sensitivity (in MacFarlane’s sense) the context of utterance has a circumstance-determinative role, since it fixes the non-standard parameters that enter the circumstance of evaluation with respect to which expressions have extensions at the context of utterance.
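A toy lexical clause for ‘short’ illustrates the circumstance-determinative picture. The clause is only a sketch, and the height threshold s is assumed here purely for illustration:

% One invariant content for ‘short’; only the standard-of-height
% parameter s in the circumstance of evaluation shifts between uses.
\[
\mathrm{ext}(\text{short}, \langle w, t, s \rangle) = \{\, x : \mathrm{height}(x, w, t) < s \,\}
\]

In the conversation about basketball players s is the (high) average height of basketball players, so Mark falls in the extension; in the conversation about the general population s is lower, so he does not.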

Context-sensitivity so defined is not relativism. For any context of utterance, expressions have just one, if any, extension at that context. In particular, sentences in contexts have absolute truth-values. Truth for sentences in contexts is defined as follows:

A sentence S at a context of utterance i is true if and only if S is true in iw at it and with respect to ih1, …, ihn, where iw and it are the world and the time of the context of utterance i, and ih1, …, ihn are all the non-standard parameters, demanded by the expressions in S, which are operative in i (in the above example the standard of height demanded by ‘short’, that is, the average height of basketball players in the first context and the average height of American citizens in the second context).

On the contrary, relativism holds that the extensions of expressions at contexts of utterance are relative to contexts of assessment. So, if contexts of assessment change, extensions too might change. In particular, sentences are true or false at contexts of utterance relative to contexts of assessment. Relative truth is defined as follows:

A sentence S at a context of utterance i is true relative to a context of assessment a if and only if S is true in iw at it and with respect to ah1, …, ahn, where iw and it are the world and the time of the context of utterance i, and ah1, …, ahn are all the non-standard parameters, demanded by the expressions in S, that are operative in the context of assessment a.
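Setting the two definitions side by side makes the single point of difference visible; ‘True’ here merely abbreviates the definitions just given:

\[
\begin{aligned}
\text{Absolute truth:} \quad & \mathrm{True}(S, i) \iff S \text{ is true in } i_w \text{ at } i_t \text{ with respect to } i_{h_1}, \dots, i_{h_n}\\
\text{Relative truth:} \quad & \mathrm{True}(S, i, a) \iff S \text{ is true in } i_w \text{ at } i_t \text{ with respect to } a_{h_1}, \dots, a_{h_n}
\end{aligned}
\]

The world and time are still supplied by the context of utterance i; only the non-standard parameters are supplied by the context of assessment a.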

Relativism requires small revisions of the technical machinery of standard truth-conditional semantics in order to define the notion of relative truth, but it provides a radical reconceptualization of the ways in which meaning, contents, and extensions are context-dependent. Different authors apply relativism to different parts of language. MacFarlane (2014) presents a relativistic semantics for predicates of taste, knowledge attributions, epistemic modals, deontic modals, and future contingents. Kompa (2002) and Richard (2008) offer a relativist treatment of comparative adjectives like ‘short’. Predelli (2005) suggests a view close to relativism for colour words like ‘green’.

The major difficulty for relativists is not technical but conceptual. Relativism must explain what it is for a sentence at a context of utterance to be true relative to a context of assessment. The next subsection presents MacFarlane’s attempt to answer this conceptual difficulty. The final subsection discusses the case of faultless disagreement, which many advocates of relativism employ to show it superior to rival theories in semantics.

b. The Intelligibility of Assessment-Sensitivity

Many philosophers, following Dummett, say that the conceptual grasp of the notion of truth is due to a clarification of its role in the overall theory of language. In particular, the notion of truth has been clarified by its connection with the notion of assertion. One way to get this explication is to take the norm of truth as constitutive of assertion. The norm of truth can be stated as follows:

Norm of truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S is true. (Remember that a sentence S at a context of utterance C is true if and only if S is true in the world of C at the time of C.)

Relativism needs to provide the explication of what it is for a sentence at a context of utterance to be true relative to a context of assessment. If the clarification of the notion of relative truth is to proceed along with its connection to the notion of assertion, what is needed is a norm of relative truth that relates the notion of assertion to the notion of relative truth. It would seem intuitive to employ the following norm of relative truth that privileges the context of utterance and selects it as the context of assessment:

Norm of relative truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S at context C is true as assessed from context C itself.

The problem, as MacFarlane points out, is that if the adoption of the norm of relative truth is all that can be said in order to explicate the notion of relative truth, then assessment-sensitivity is an idle wheel with no substantive theoretical role. Relativism becomes a notational variant of standard truth-conditional semantics. The point is that when the definition of relative truth is combined with the norm of relative truth, which picks out the context of utterance and makes it the context of assessment, relativism has the same prescriptions for the correctness of assertions as standard truth-conditional semantics, which works with the definition of truth (simpliciter) combined with the norm of truth.

MacFarlane argues that in order to clarify the notion of relative truth, the norm of relative truth is necessary but not sufficient. In order to gain a full explication of the notion of relative truth, a norm for retraction of assertions must be added to the norm of relative truth. MacFarlane presents the norm for retraction as follows:

Norm for retraction: An agent at a context of assessment C2 must retract an assertion of the sentence S, uttered at a context of utterance C1, if S uttered at C1 is not true as assessed from C2.
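In schematic deontic notation (an expository sketch, not MacFarlane’s own formalism), the two norms read:

\[
\begin{aligned}
\text{Norm of relative truth:} \quad & \mathrm{Permitted}(\mathrm{Assert}(S, C)) \rightarrow \mathrm{True}(S, C, C)\\
\text{Norm for retraction:} \quad & \neg \mathrm{True}(S, C_1, C_2) \rightarrow \mathrm{Must}(\mathrm{Retract}_{C_2}(\mathrm{Assert}(S, C_1)))
\end{aligned}
\]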

Relativism together with the norm of relative truth and the norm for retraction predicts cases of retraction of assertions that other semantic theories are not able to predict. Consider the following example: Let C1 be the context of utterance consisting of John, a time t in the year 1982, and the actual world @; let C2 be the context of utterance consisting of John, a time t´ in 2019, and the actual world @. Let C3 be the context of assessment in which John’s taste in 1982 is operative and C4 the context of assessment in which John’s taste in 2019 is operative.

Suppose John did not like green tea in 1982, when he was ten years old, but he likes green tea a lot in 2019, when he is forty-seven years old. Green tea is not in the extension of ‘tasty’ at C1 as assessed from C3, but it is in the extension of ‘tasty’ at C2 as assessed from C4. Suppose John utters:

(31) ‘Green tea is not tasty’ at C1

and

(32) ‘Green tea is tasty’ at C2.

Relativism predicts that both assertions are correct. John does not violate the norm of relative truth. However, relativism also predicts that in 2019 John must retract the assertion he made in 1982, because in 1982 John uttered a sentence that is false as assessed from C4.
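In the notation sketched above, the case works out as follows, with C3 and C4 supplying John’s 1982 and 2019 tastes respectively:

\[
\begin{aligned}
& \mathrm{True}((31), C_1, C_3) && \text{John’s 1982 assertion satisfies the norm of relative truth}\\
& \mathrm{True}((32), C_2, C_4) && \text{so does his 2019 assertion}\\
& \neg \mathrm{True}((31), C_1, C_4) && \text{so the norm for retraction requires John to retract (31) in 2019}
\end{aligned}
\]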

Notice that John’s retraction of his assertion made in 1982 is predicted only by relativism, which treats the adjective ‘tasty’ as assessment-sensitive. If ‘tasty’ is treated as an indexical expression, then John’s assertions in 1982 and in 2019 have two distinct contents, and there is no reason why in 2019 John ought to retract his assertion made in 1982, because his assertion made in 1982 is true. There is no reason why John ought to retract his assertion if ‘tasty’ is treated as a context-sensitive expression. In this case too John’s assertion made in 1982 is true, because the circumstance of evaluation of his 1982 assertion contains the taste that is operative for John in 1982. Retraction is made possible only if ‘tasty’ is assessment-sensitive, making it possible to assess an assertion made in a context of utterance with respect to parameters that are operative in another context (the context of assessment).

c. Faultless Disagreement

Even if one accepts MacFarlane’s explanation of the intelligibility of relativism, it remains an open question whether languages contain assessment-sensitive expressions. It is important, then, to clarify whether there are linguistic phenomena that relativism explains better than linguistic pragmatism, indexicalism, or minimalism. Relativists address a representative phenomenon: faultless disagreement. In a pre-theoretic sense there is faultless disagreement between two parties when they disagree about a speech act or an attitude and neither of them violates any epistemic or constitutive norm governing speech acts or attitudes.

Faultless disagreement is very helpful for modeling disputes about non-objective matters, for instance, disputes about aesthetic values like tastes. Such disputes show the distinctive linguistic traits of genuine disagreement when the parties involved say ‘No, that is false’, ‘What you are saying is false’, ‘You are wrong, I disagree with you’, and so on. However, many philosophers feel compelled to avoid the account of disagreement that characterizes matters of objective fact, which in subjective areas of discourse would impute implausible cognitive errors and chauvinism to the parties in disagreement.

First, it is important to identify what kinds of disagreement are made intelligible in different semantic theories. Then, given an area of discourse, one must ask which of these kinds of disagreement can be found in it. Thus, semantic theories can be assessed on the basis of which of them predicts the kind of disagreement that is present in that area of discourse.

By employing the notion of relative truth, MacFarlane defines the following notion of accuracy for attitudes and speech acts:

Assessment-sensitive accuracy: An attitude or speech act occurring at a context of utterance C1 is accurate, as assessed from a context of assessment C2, if and only if its content at C1 is true as assessed from C2.

Based on the notion of assessment-sensitive accuracy, MacFarlane defines the following notion of disagreement:

Preclusion of joint accuracy: Agent A disagrees with agent B if and only if the accuracy of the attitudes or speech acts of A, as assessed from a given context, precludes the accuracy of the attitudes or speech acts of B, as assessed from the same context.
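Formally, and again only as a sketch, writing Accurate(φ, a) for the accuracy of an attitude or speech act φ as assessed from a context a:

\[
\mathrm{Disagree}(A, B) \iff \text{for every context of assessment } a, \ \neg(\mathrm{Accurate}(\varphi_A, a) \wedge \mathrm{Accurate}(\varphi_B, a))
\]

That is, there is no single context of assessment from which both parties’ attitudes or speech acts are accurate.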

There are also different senses in which an attitude or speech act can be faultless. One of them is the absence of violation of constitutive norms governing attitudes or speech acts. According to MacFarlane, the kind of faultless disagreement given by preclusion of joint accuracy together with absence of violation of constitutive norms of attitudes or speech acts is typical of disputes in non-objective matters like taste.

Consider the sentence ‘Green tea is tasty’. Relativism accommodates the idea that its truth depends on the subjective taste of the assessor. Whether green tea is tasty is not an objective state of affairs. Suppose John utters the sentence ‘Green tea is tasty’ and George utters the sentence ‘Green tea is not tasty’. John and George disagree to the extent that there is no context of assessment from which both John’s and George’s assertions are accurate, but neither of them violates the norm of relative truth and the norm of retraction. John’s assertion is accurate if assessed from John’s context of assessment where John’s standard of taste is operative. George’s assertion is accurate if assessed from George’s context of assessment where George’s standard of taste is operative. They are both faultless. Moreover, George will acknowledge that ‘Green tea is tasty’ is true if assessed from John’s standard of taste and vice versa. Finally, suppose that after trying green tea several times, George starts appreciating it. George now says:

(33) Green tea is tasty.

George must retract his previous assertion and say:

(34) What I said (about green tea) is false.

Relativism predicts this pattern of linguistic uses of the adjective ‘tasty’. On the contrary, other semantic theories cannot describe the dispute between John and George as a case of faultless disagreement defined as preclusion of joint accuracy and absence of violation of constitutive norms governing attitudes/speech acts.

Linguistic pragmatism and indexicalism affirm that John’s and George’s tastes have a content-determinative role. Uttered by John, ‘tasty’ means tasty in relation to John’s standard of taste, and uttered by George it means tasty in relation to George’s standard of taste. Therefore, the sentence ‘Green tea is tasty’ has a different content in John’s context of utterance than in George’s, with the consequence that disagreement is lost.

Minimalism says that the content of ‘tasty’, the objective property of tastiness, is invariant through all contexts of utterance and its extension in a given possible world is invariant through all contexts of assessment. Therefore, either green tea is in the extension of ‘tasty’ or is not. In this case, John and George are in disagreement but at least one of them is at fault.

6. References and Further Reading

a. References

  • Bach, Kent, 1994. ‘Conversational Impliciture’, Mind and Language, 9: 124-162.
  • Bach, Kent, 1999. ‘The Semantics-Pragmatics Distinction: What It Is and Why It Matters’, in K. Turner (ed.), The Semantics-Pragmatics Interface from Different Points of View, Oxford: Elsevier, pp. 65-84.
  • Bach, Kent, 2001. ‘You Don’t Say?’, Synthese, 128: 15-44.
  • Bach, Kent, 2007. ‘The Excluded Middle: Minimal Semantics without Minimal Propositions’, Philosophy and Phenomenological Research, 73: 435-442.
  • Borg, Emma, 2004. Minimal Semantics, Oxford: Oxford University Press.
  • Borg, Emma, 2007. ‘Minimalism versus Contextualism in Semantics’, in Preyer & Peter (2007), pp. 339-359.
  • Borg, Emma, 2012. Pursuing Meaning, Oxford: Oxford University Press.
  • Borg, Emma, 2016. ‘Exploding Explicatures’, Mind and Language, 31(3): 335-355.
  • Breheny, Richard, 2004. ‘A Lexical Account of Implicit (Bound) Contextual Dependence’, in R. Young, and Y. Zhou (eds.), Semantics and Linguistic Theory (SALT) 13, pp. 55-72.
  • Cappelen, Herman, and Lepore, Ernest, 2002. ‘Indexicality, Binding, Anaphora and A Priori Truth’, Analysis, 62, 4: 271-281.
  • Cappelen, Herman, and Lepore, Ernest, 2005. Insensitive Semantics: A Defence of Semantic Minimalism and Speech Act Pluralism, Oxford: Blackwell.
  • Cappelen, Herman, and Hawthorne, John, 2009. Relativism and Monadic Truth, Oxford: Oxford University Press.
  • Carston, Robyn, 2002. Thoughts and Utterances: The Pragmatics of Explicit Communication, Oxford: Blackwell.
  • Carston, Robyn, 2009. ‘Relevance Theory: Contextualism or Pragmaticism?’, UCL Working Papers in Linguistics, 21: 19-26.
  • Carston, Robyn, 2019. ‘Ad Hoc Concepts, Polysemy and the Lexicon’, in K. Scott, R. Carston, and B. Clark (eds.), Relevance, Pragmatics and Interpretation, Cambridge: Cambridge University Press, pp. 150-162.
  • Carston, Robyn, and Hall, Alison, 2012. ‘Implicature and Explicature’, in H. J. Schmid and D. Geeraerts (eds.), Cognitive Pragmatics, Vol. 4. Berlin: Mouton de Gruyter, 47–84.
  • Clapp, Lenny, 2007. ‘Minimal (Disagreement about) Semantics’, in Preyer & Peter (2007) below, pp. 251-277.
  • Donaldson, Tom, and Lepore, Ernest, 2012. ‘Context-Sensitivity’, in D. G. Fara, and G. Russell (eds.), 2012, pp. 116-131.
  • Grice, Herbert Paul, 1989. Studies in the Way of Words, Cambridge, MA: Harvard University Press.
  • Kaplan, David, 1989a. ‘Demonstratives’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 481-563.
  • Kaplan, David, 1989b. ‘Afterthoughts’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 565-614.
  • Korta, Kepa, and Perry, John, 2007. ‘Radical Minimalism, Moderate Contextualism’, in Preyer & Peter (2007), pp. 94-111.
  • Kolbel, Max, 2002. Truth without Objectivity, London: Routledge.
  • Kompa, Nikola, 2002. ‘The Context-Sensitivity of Knowledge Ascriptions’, Grazer Philosophische Studien, 64: 79-96.
  • Leslie, Sarah-Jane, 2007. ‘How and Why to be a Moderate Contextualist’, in Preyer & Peter (2007), pp. 133-168.
  • MacFarlane, John, 2014. Assessment Sensitivity, Oxford: Oxford University Press.
  • Neale, Stephen, 2004. ‘This, That, and the Other’, in A. Bezuidenhout, and M. Reimer (eds.), Descriptions and Beyond, Oxford: Oxford University Press, pp. 68-182.
  • Penco, Carlo, and Vignolo, Massimiliano, 2019. ‘Some Reflexions on Conventions’, Croatian Journal of Philosophy, Vol. XIX, No. 57: 375-402.
  • Perry, John, 2001. Reference and Reflexivity, Stanford: CSLI Publications.
  • Predelli, Stefano, 2005. Contexts: Meaning, Truth, and the Use of Language, Oxford: Oxford University Press.
  • Recanati, Francois, 2004. Literal Meaning, New York: Cambridge University Press.
  • Recanati, Francois, 2010. Truth-Conditional Pragmatics, Oxford: Clarendon Press.
  • Richard, Mark, 2008. When Truth Gives Out, Oxford: Oxford University Press.
  • Rothschild, Daniel, and Segal, Gabriel, 2009. ‘Indexical Predicates’, Mind and Language, 24, 4: 467-493.
  • Searle, John, 1978. ‘Literal Meaning’, Erkenntnis, 13: 207-224.
  • Soames, Scott, 2002. Beyond Rigidity: The Unfinished Semantic Agenda of Naming and Necessity, Oxford: Oxford University Press.
  • Sperber, Dan, and Wilson, Deirdre, 1986. Relevance: Communication and Cognition, Oxford: Blackwell.
  • Stanley, Jason, 2005a. Language in Context, Oxford: Oxford University Press.
  • Stanley, Jason, 2005b. Knowledge and Practical Interests, Oxford: Oxford University Press.
  • Stanley, Jason, and Williamson, Timothy, 1995. ‘Quantifiers and Context-Dependence’, Analysis, 55: 291-295.
  • Szabo, Zoltan Gendler, 2001. ‘Adjectives in Context’, in I. Kenesei, and R. Harnish (eds.), Perspectives on Semantics, Pragmatics, and Discourse, Amsterdam: John Benjamins, pp. 119-146.
  • Szabo, Zoltan Gendler, 2006. ‘Sensitivity Training’, Mind and Language, 21: 31-38.
  • Taylor, Kenneth, 2003. Reference and the Rational Mind, Stanford, CA: CSLI Publications.
  • Taylor, Kenneth, 2007. ‘A Little Sensitivity Goes a Long Way’, in Preyer & Peter (2007), pp. 63-92.
  • Travis, Charles, 2008. Occasion-Sensitivity: Selected Essays, Oxford: Oxford University Press.
  • Unnsteinsson, Elmar Geir, 2014. ‘Compositionality and Sandbag Semantics’, Synthese, 191: 3329–3350.

b. Further Reading

  • Bianchi, Claudia (ed.), 2004. The Semantic/Pragmatic Distinction, Stanford: CSLI.
    • A collection on context-sensitivity.
  • Domaneschi, Filippo, and Penco, Carlo (eds.), 2013. What is Said and What is Not, Stanford: CSLI.
    • A collection on context-sensitivity.
  • Fara, Delia Graff, and Russell, Gillian (eds.), 2012. The Routledge Companion to Philosophy of Language, New York: Routledge.
    • A companion to the philosophy of language that covers many of the topics that are discussed in this encyclopedia article.
  • Garcia-Carpintero, Manuel, and Kolbel, Max (eds.), 2008. Relative Truth, Oxford: Oxford University Press.
    • A collection on relativism.
  • Preyer, Gerhard, and Peter, George (eds.), 2007. Context-Sensitivity and Semantic Minimalism: New Essays on Semantics and Pragmatics, Oxford: Oxford University Press.
    • A collection on minimalism.
  • Recanati, Francois, Stojanovic, Isidora, and Villanueva, Neftali (eds.), 2010. Context Dependence, Perspective, and Relativity, Berlin: De Gruyter.
    • A collection on context-sensitivity.
  • Szabo, Zoltan Gendler (ed.), 2004. Semantics versus Pragmatics, Oxford: Oxford University Press.
    • A collection on context-sensitivity.


Author Information

Carlo Penco
Email: penco@unige.it
University of Genoa
Italy

and

Massimiliano Vignolo
Email: massimiliano.vignolo@unige.it
University of Genoa
Italy

Constructivism in Metaphysics

Although there is no canonical view of “Constructivism” within analytic metaphysics, here is a good starting definition:

Constructivism: Some existing entities are constructed by us in that they depend substantively on us.

Constructivism is a broad view with many, more specific, iterations. Versions of Constructivism will vary depending on who does the constructing (for example, all humans, an ideal subject, certain groups). They will also vary depending on what is constructed (for example, concrete objects, abstract objects, facts) and on what the constructed entity is constructed out of (for example, natural objects, nonmodal stuff, concepts). Most Constructivists take the constructing relation to be constitutive, that is, it is part of the very nature of constituted objects that they depend substantively on humans. Some, however, take the constructing relation to be merely causal. Some versions of Constructivism are relativistic; others are not. Another key difference between versions of Constructivism concerns whether they take the constructing relation to be global in scope (so everything—or, at least, every object we have epistemic access to—is a constructed object) or local (so there are unconstructed objects, as well as constructed ones).

Given the many dimensions along which versions of Constructivism differ, one might wonder what unites them—what, that is, do all versions of Constructivism have in common that marks them out as versions of Constructivism? Constructivists are united first in their opposition to certain forms of Realism—namely, those that claim that x exists and is suitably independent of us. Constructivists about x agree that x exists, but they deny that it is suitably independent of us. Constructivism is distinguished from other versions of anti-Realism by the emphasis it places on the constructing relation. Constructivists are united by all being anti-Realists about x and by believing this is due to x’s being, in some way, constructed by us.

Table of Contents

  1. What Is Constructivism?
  2. 20th-Century Global Constructivism in Analytic Metaphysics
  3. 21st-Century Local Constructivism in Analytic Metaphysics
  4. Criticisms of Constructivism in Analytic Metaphysics
    1. Coherence Criticisms
    2. Substantive Criticisms
  5. Evaluating Constructivism within Analytic Metaphysics
  6. Timeline of Constructivism in Analytic Metaphysics
  7. References and Further Reading
    1. Constructivism: General
    2. Constructivism: Analytic Metaphysics
    3. Critics of Analytic Metaphysical Constructivism

1. What Is Constructivism?

There is no canonical definition of “Constructivism” within philosophy. The following, however, can serve as a good starting point definition for understanding constructivism:

Constructivism: Some extant entities are constructed by us in that they depend substantively on us. (Exactly what it is for an entity to “depend substantively on us” varies between views.)

Constructivism can be further elucidated by noting that constructing is a three-place relation Cxyz (x constructs y out of z) which involves a constructor x (generally taken to be humans), a constructed entity y, and a more basic entity z which serves as a building block for the constructed entity. (Some would take constructing to be a four-place relation Cxyzt: x constructs y out of z at time t. To simplify, the time variable is left out of the relation; it is straightforward to add it back in.) Each of the terms that are related is examined below, before the examination of the constructing relation itself.
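Written out, the two versions of the relation are as follows (the notation simply transcribes the text):

\[
C(x, y, z): x \text{ constructs } y \text{ out of } z, \qquad C(x, y, z, t): x \text{ constructs } y \text{ out of } z \text{ at time } t
\]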

Regarding x, who does the constructing? There is no orthodox view regarding which humans do the constructing; different constructivists give different answers. Constructivists frequently (though not always) emphasize the role language and concepts play in constructing entities. Since language and concepts both arise at the level of the group, rather than the level of the individual, it is generally the group (for example, of language speakers or concept users) rather than the individual which is taken to be the constructor. (Lynne Rudder Baker, for example, is typical of Constructivists when she argues that constructed objects rely on our societal conventions as a whole, rather than on the views of any lone individual: “I would not have brought into existence a new thing, a bojangle; our conventions and practices do not have a place for bojangles. It is not just thinking that brings things into existence” (Baker 2007, 44). See also Thomasson (2003, 2007) and Remhof (2017).) Some Constructivists (for example, Kant) take the constructor to be all human beings; other Constructivists (for example, Goodman, Putnam) take the constructor to be a subset of all human beings (for example, society A, society B). There are some versions of Constructivism which take it to be individuals, rather than groups, which do the constructing. (See Goswick, 2018a, 2018b.) These views are more likely to rely on overt responses (for example, how Sally responds when presented with some atoms arranged rockwise) than on language and concepts.

Regarding y, what is constructed? Versions of Constructivism within analytic philosophy can be distinguished based on which entities they focus on. Constructivism in the philosophy of science, for instance, tends to focus on the construction of scientific knowledge. (Scientific “Constructivists maintain that … scientific knowledge is ‘produced’ primarily by scientists and only to a lesser extent determined by fixed structures in the world” (Downes 1-2). See also Kuhn (1996) and Feyerabend (2010).) Constructivism in aesthetics focuses on the construction of an artwork’s meaning and/or on the construction of aesthetic properties more generally. (Aesthetic Constructivists argue that “rather than uncovering the meaning or representational properties of an artwork, an interpretation instead generates an artwork’s meaning” (Alward 247). See also Werner (2015).) Constructivism in the philosophy of mathematics focuses on mathematical objects. (Mathematical Constructivists argue that, when we claim a mathematical object exists, we mean that we can construct a proof of its existence (Bridges and Palmgren 2018).) Constructivism within ethics concerns the origin and nature of our ethical judgments and of ethical properties. (Ethical Constructivists argue that “the correctness of our judgments about what we ought to do is determined by facts about what we believe, or desire, or choose and not, as Realism would have it, by facts about a prior and independent normative reality” (Jezzi 1). Ethical Constructivism has been defended by Korsgaard, Scanlon, and Rawls. For an explication of their views, see Jezzi (2019) and Street (2008, 2010).) Social Constructivism focuses on the construction of distinctly social categories such as race, gender, and sexuality. (See Hacking (1986, 1992, 1999) and Haslanger (1995, 2003, 2012).) Constructivism in metaphysics focuses on the construction of physical objects. (See, for example, Baker (2004, 2007), Goodman (1978, 1980), Putnam (1982, 1987), Thomasson (2003, 2007).)

Regarding z, what is the more basic entity that serves as a building block for the constructed entity? There is no general answer to this question, as different versions of Constructivism give different answers. Some Constructivists (for example, Goswick) take physical stuff to be the basic building blocks of constructed entities. Goswick argues that modal objects are composite objects which have physical stuff and sort-properties as their parts (Goswick 2018b). Some Constructivists (for example, Goodman) take worlds to be the basic building blocks of constructed entities. Goodman argues that it is constructivism all the way down, so each world we construct is itself built out of other worlds.

Regarding C, what is the relation of constructing? Constructivists vary widely regarding the exact details of the constructing relation. In particular, versions of Constructivism vary with regard to whether the constructing relation is (1) global or local, (2) causal or constitutive, (3) temporally and counterfactually robust or not, and (4) relative or absolute. Each of these dimensions of difference are examined in turn.

Regarding 1, is the constructing relation global or local? Historically, the term “constructivism” has been associated with the global claim that every entity to which we have epistemic access is constructed. (Ant Eagle (personal correspondence) points out that there could be an even more global form of Constructivism which claims that all entities, even those to which we do not have epistemic access, are constructed. This is an intriguing view. However, since it has not yet been defended in analytic metaphysics, it is not discussed here.) Kant held this view; as did the main 20th-century proponents of Constructivism (Goodman and Putnam). In the 21st century, philosophers have explored a more local constructing relation in which only some of the entities we have epistemic access to are constructed. Searle, for instance, argues that social objects (for example, money, bathtubs) are constructed but natural objects (for example, trees, rocks) are not. Einheuser argues that modal objects are constructed but nonmodal stuff is not.

Regarding 2, is the constructing relation causal or constitutive? For example, when an author claims that we construct money does she mean that we bear a causal relation to money (that is, we play a causal role in bringing about the existence of money or in money’s having the nature it has) or does she mean that we bear a constitutive relation to money (that is, part of what it is for money to exist or for money to have the nature it has is for us to bear the constitutor-of relation to it)? We can define the distinction as follows: (See also Haslanger (2003, pp. 317-318) and Mallon (2019, p. 4).)

y is causally constructed by x iff x caused y to exist or to have the nature it has.

For example, we caused that $20 bill to come into existence when we printed it at the National Mint and we caused that $20 bill to have the nature it has when we embedded it in the American currency system.

y is constitutively constructed by x iff what it is for y to exist is for x to F or what it is for y to have the nature it has is for x to F.

For example, what it is for a stop sign to exist is for something with physical features P1, …, Pn to play role r in a human society, and what it is for y to have stop-sign-nature is, in part, for humans to treat y as a stop sign.

Some Constructivists (for example, Goodman, Putnam) do not discuss whether they intend their constructing to be causal or constitutive. (Presumably because the central aims they intend to accomplish by endorsing Constructivism can be satisfied via either a causal or a constitutive version. We can easily modify their views to be explicitly causal or explicitly constitutive. For a Constructivism that is causal, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken causally. For a Constructivism that is constitutive, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken constitutively.) Other Constructivists are explicit about whether the constructing relation they utilize is causal or constitutive. Thomasson, for example, notes that

The sort of dependence relevant to [Constructivism] is logical dependence, i.e. dependence which is knowable a priori by analyzing the relevant concepts, not a mere causal or nomological dependence. The very idea of an object’s being money presupposes collective agreement about what counts as money. The very idea of something being an artifact requires that it have been produced by a subject with certain intentions. (Thomasson 2003, 580)

Remhof argues that an object is constructed “iff the identity conditions of the object essentially depend on (i.e., are partly constituted by) our intentional activities” (Remhof 2014, 2). And Searle notes that “part of being a cocktail party is being thought to be a cocktail party; part of being a war is being thought to be a war. This is a remarkable feature of social facts; it has no analogue among physical facts” (Searle 33-34). (For more on constitutive versions of Constructivism, see Haslanger (2003) and Baker (2007, p. 12). For examples of Constructivisms which are causal, see Hacking (1999) and Goswick (2018b). Regarding Hacking, Haslanger notes: “The basis of Hacking’s social constructivism is the historical [constructivist] who claims that, ‘Contrary to what is usually believed, x is the contingent result of historical events and forces, therefore x need not have existed, is not determined by the nature of things, etc.’ … He says explicitly that construction stories are histories and the point, as he sees it, is to argue for the contingency or alterability of the phenomenon by noting its social or historical origins” (Haslanger 2003, 303).)

Regarding 3, is the constructing relation temporally and counterfactually robust or not? Temporal robustness concerns whether constructed entity e exists and has the nature it has prior to and posterior to our constructing it. If yes, then e is temporally robust; otherwise, e is not temporally robust. Counterfactual robustness concerns whether constructed entity e would exist and have the nature it has if certain things were different than they actually are, for example, if we had never existed or had had different conventions/responses/intentions/systems of classification than we actually have. If it would, then the constructing relation is counterfactually robust; otherwise, it is not. Some Constructivists (for example, Putnam, Goodman) deny that the constructing relation is temporally/counterfactually robust. They believe that before we existed there were no stars and that, if we employed different systems of classification, there would be no stars. Other Constructivists take the constructing relation to be temporally/counterfactually robust. Remhof, for instance, argues that even “if there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3). Schwartz adds that:

In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order. (Schwartz 1986, 436)

An advantage of taking the constructing relation to be temporally/counterfactually robust is that many find it difficult to believe that, for example, there were no stars before there were people or that there would not have been stars had people employed different systems of classification. A disadvantage of endorsing a temporally/counterfactually robust Constructivism is that it is difficult to give an account which is temporally/counterfactually robust but still respects the genuine role Constructivists take humans to play in constructing. After all, if the stars would have been there even if we never existed, why think we play any substantial role in constructing them? At the very least, any role we do play must be non-essential.

Regarding 4, is the constructing relation relative or absolute? Some philosophers (for example, Kant) take the constructing relation to be absolute. Kant thought that all humans, by virtue of being human, employed the same categories and thus created the same phenomena. Other philosophers (for example, Goodman and Putnam) take the constructing relation to be relative. Both argued that worlds exist only relative to a conceptual scheme. Although relativism is often associated with Constructivism (presumably because the most prominent Constructivists of the 20th century also happened to be relativists), the two views are orthogonal. There are relativist and absolutist versions of Constructivism. Moreover, it is easy to slightly tweak relativist views to make them absolutist, or to slightly tweak absolutist views to make them relativist.

At this point, four ways in which constructing relations can differ from one another have been examined: with regard to whether they are (i) global or local, (ii) causal or constitutive, (iii) temporally/counterfactually robust or not, and (iv) relativistic or absolute. The starting point definition of Constructivism is:

Constructivism: Some extant entities are constructed by us in that they depend substantively on us.

Exactly what it is for an entity to “depend substantively on us” varies between views. This definition holds up well to scrutiny. It captures the commonalities one finds across a wide swath of views across sub-disciplines of philosophy (for example, the philosophy of mathematics, aesthetics, metaphysics) and is general enough to accommodate the many differences between views (for example, some Constructivists take constructing to be constitutive, others take it to be merely causal; some Constructivists take the scope of Constructivism to be global, others take it to be very limited in scope and claim there are very few constructed entities). There is some worry, however, that—being so general—the given definition is too broad: are there any views that do not fall under the Constructivist umbrella?

Constructivism has historically been developed in opposition to Realism; and examining the tension between Constructivism and Realism can help us further understand Constructivism. Although the word “realism” is used widely within philosophy and different philosophers take it to mean different things, several fairly canonical uses have evolved: (i) the linguistic understanding of Realism advocated by Dummett which sees the question of Realism as concerning whether sentences have evidence-transcendent truth conditions or verificationist truth conditions, (ii) an understanding of Realism developed within the philosophy of science which centers on whether the aim of scientific theories is truth understood as correspondence to an external world, and (iii) an understanding of Realism developed within metaphysics which centers on whether x exists and is suitably independent of humans. The understanding of Realism relevant to elucidating Constructivism is this final one:

Ontological Realism (about x): x exists and is suitably independent of us.

Constructivism (about x) stands in opposition to Ontological Realism (about x). The Ontological Realist takes x to be “suitably independent of us,” whereas the Constructivist takes x to “depend substantively on us for either its existence or its nature.” Whatever suitable independence is, it rules out depending substantially on us. Although one does still hear philosophers talk simply of “Realism,” it has become far more common, within analytic metaphysics, to talk of “Realism about x” and to take Realism to be a first-order metaphysical view concerning the existence and/or human independence of specific types of entities (for example, properties, social objects, numbers, ordinary objects) rather than a general stance one has (concerning, for example, the purpose of philosophical investigation). Following this trend in the literature on Realism (that is, the move away from talking about Realism and anti-Realism in general to talking specifically of Realism about x) can help us make more precise the definition of Constructivism.

Constructivism (about x): x exists and depends substantively on us for either its existence or its nature.

This definition of Constructivism is still very general (that is, because it does not spell out what “depends substantively on” entails/requires). However, given that it is standard within the literature on Realism to give a definition which is general enough to encompass many different understandings of “suitably independent of” and that Constructivism has historically been developed in opposition to Realism, it makes sense to mimic this level of generality in defining Constructivism.

One last precisification is in order before we move on to discussing the details of specific versions of Constructivism. A wide array of differences track whether the constructing relation is taken to be global or local. Global and local versions of Constructivism differ with regard to when they were/are endorsed (global: in the 20th century versus local: subsequently), why they are endorsed (global: thinks Realism itself is somehow defective versus local: likes Realism in general but thinks there is at least one sort of object it can’t account for), and what the best objections to the view are (global: general objections to constructing versus local: specific objections regarding whether some x really is constructed). Given this, it is useful to separate our discussion of Constructivism into Global Constructivism and Local Constructivism.

Global Constructivism: For all existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.

Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.
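In quantificational form, with Dep(x) abbreviating ‘x depends substantively on us for either its existence or its nature’ and the domain restricted to existing things to which we have epistemic access (the abbreviation is introduced here for exposition):

\[
\begin{aligned}
\text{Global Constructivism:} \quad & \forall x\, \mathrm{Dep}(x)\\
\text{Local Constructivism:} \quad & \exists x\, \mathrm{Dep}(x) \ \wedge\ \exists x\, \neg \mathrm{Dep}(x)
\end{aligned}
\]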

2. 20th-Century Global Constructivism in Analytic Metaphysics

Who are the global constructivists? Who is it, that is, who argues that

[All physical objects we have epistemic access to are] constructed in a way that reflects our contingent needs and interests. [Global Constructivists think that we] can only make sense of there being a fact of the matter about the world after we have agreed to employ some descriptions of it as opposed to others, that prior to the use of those descriptions, there can be no sense to the idea that there is a fact of the matter “out there” constraining which of our descriptions are true and which false. (Boghossian 25, 32)

The number of Global Constructivists within analytic metaphysics is small. (Constructivism has a long and healthy history within Continental philosophy and is still much more widely discussed within contemporary Continental metaphysics than it is within contemporary analytic metaphysics. See Kant (1965), Foucault (1970), and Remhof (2017).) Scouring the literature will yield only a handful. The best-known proponents are Goodman and Putnam. Schwartz supported Goodman’s view in the 1980s and most recently wrote an article supporting the view in 2000. Kant (late 1700s) and James (early 1900s) were early proponents of the view. Rorty and Dummett each endorse the view in passing. These seven authors exhaust the list of analytic Global Constructivists. (Al Wilson (personal communication) suggests this list might be expanded to include Rudolf Carnap, Simon Blackburn, and Huw Price.) Their motivation for endorsing Global Constructivism stems from worries about the cogency of Realism. They think that, if Realism were true, we would have no way to denote objects or to know about them. Since we can denote objects and do have knowledge of them, Realism must not be the correct account of them. The correct account is, rather, Constructivism. Although their number is small, their influence—especially that of Goodman and Putnam—has reverberated within analytic metaphysics. The remainder of this section examines the views of each of the central defenders of Global Constructivism.

Goodman defended Global Constructivism in a series of articles and books clustering around the 1980s: Ways of Worldmaking (1978), “On Starmaking” (1980), “Notes on the Well-Made World” (1983), “On Some Worldly Worries” (1993). Goodman himself described his view as “a radical relativism under rigorous restraints, that eventuates in something akin to irrealism” (1978 x). He believed that there were many right worlds, that these worlds exist only relative to a set of concepts, and that the building blocks of constructed objects are other constructed objects: “Worldmaking as we know it always starts from worlds already on hand; the making is a remaking” (1978 6-7). Goodman thought that there is “no sharp line to be drawn between the character of the experience and the description given by the subject” (Putnam 1979, 604). Goodman is perhaps the most earnest and sincere defender of the global scope of Constructivism. Whereas others tend to find the idea that we construct, for example, stars nearly incoherent, Goodman finds the idea that we did not construct the stars nearly incoherent:

Scheffler contends that we cannot have made the stars. I ask him which features of the stars we did not make, and challenge him to state how these differ from features clearly dependent on discourse. … We make a star as we make a constellation, by putting its parts together and marking off its boundaries. … The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally. … That we can make the stars dance, as Galileo and Bruno made the earth move and the sun stop, not by physical force but by verbal invention, is plain enough. (Goodman 1980 213 and 1983 103)

Goodman takes the constructors of reality to be societies (rather than lone individuals). He takes constructing to be relative, so, for example, society A constructs books and plants, whereas, faced with the same circumstances, society B constructs food and fuel (Goodman 1983, 103). He does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Goodman’s motivation for endorsing Global Constructivism is that he thinks it is clear that we can denote and know about, for example, stars and he thinks we would not be able to do this were Realism true.

Schwartz defends Goodmanian Global Constructivism in two articles: “I’m Going to Make You a Star” (1986) and “Starting from Scratch: Making Worlds” (2000). Since Goodman’s writings on constructivism can often be difficult to understand, examining Schwartz’s writings can serve to give us further insight into Goodman’s view. Schwartz writes that:

In shaping the concepts and classification schemes we employ in describing our world, we do take part in constituting what that reality is. Whether there are stars, and what they are like, … are facts that are carved out in the very process of devising perspicuous theories to aid in understanding our world. … Until we fashion star concepts and related categories, and integrate them into ongoing theories and speculations, there is no interesting sense in which the facts about stars are really one way rather than another. (Schwartz 1986, 429)

Schwartz emphasizes the role we play in making it the case that certain properties are instantiated and, thus, in drawing out ordinary objects from the mass of undifferentiated stuff which exists independently of people:

In natura rerum there are no inherent facts about the properties [x] has. It is no more a star, than it is a Big Dipper star and belongs to a constellation. … From the worldmaker’s perspective, the unmade world is a world without determinate qualities and shape. Pure substance, thisness, or Being may abound, but there is nothing to give IT specific character. (Schwartz 2000, 156)

Schwartz notes that, “no argument is needed to show that we do have some power to create by conceptualization and symbolic activity. Poems, promises, and predictions are a few obvious examples” (Schwartz 1986, 428). For example, it is uncontroversial that part of what it is to be a Scrabble joker (one of those blank pieces of wood that you can use as any letter when playing the game of Scrabble) is to be embedded in a certain human context: “These bits of wooden reality could no more be Scrabble jokers without the cognitive carving out of the features and dimensions of the concept, than they could be Scrabble jokers had they never been carved from the tree” (Schwartz 1986, 430-431). Schwartz, and Global Constructivists in general, differ from non-constructivists in that they think all ordinary objects (and, in fact, all the objects we have epistemic access to) are like Scrabble jokers. Of course, there is something that exists independently of us. But this something is amorphous, undefined, and plays no role in our epistemic lives. What we are aware of is the objects we create out of this mass by the (often unconscious) imposition of our concepts.

The other key defender of Global Constructivism is Putnam. Like Goodman, Putnam defended Global Constructivism in a series of articles and books which cluster around the 1980s; see, for example, “Reflections on Goodman’s Ways of Worldmaking” (1979), Reason, Truth, and History (1981), “Why There Isn’t a Ready-Made World” (1982), and The Many Faces of Realism (1987). Putnam thinks philosophy should look to science, and he shares the Positivists’ skepticism about traditional metaphysics:

There is … nothing in the history of science to suggest that it either aims at or should aim at one single absolute version of “the world”. On the contrary, such an aim, which would require science itself to decide which of the empirically equivalent successful theories in any given context was “really true”, is contrary to the whole spirit of an enterprise whose strategy from the first has been to confine itself to claims with clear empirical significance. … Metaphysics, or the enterprise of describing the “furniture of the world”, the “things in themselves” apart from our conceptual imposition, has been rejected by many analytic philosophers. … apart from relics, it is virtually only materialists [i.e. physicalists] who continue the traditional enterprise. (Putnam 1982 144 and 164)

Contrary to Putnam’s hopes, in the twenty-first century the materialists have won, and most metaphysicians recognize the sharp subject/object divide that Putnam rejected. Putnam argues that objects “do not exist independently of conceptual schemes. We cut up the world into objects when we introduce one scheme or another” (Cortens 41). Putnam takes the constructors of reality to be societies, the constructing to be relative, and does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Putnam’s motivation for endorsing Global Constructivism is that he rejects the sharp division between object and subject which Realism presupposes. He thinks analytic philosophy erred when it responded to 17th-century science by introducing a distinction between primary and secondary qualities (Putnam 1987). He argues that we should instead have taken everything that exists to be a muddled combination of the objective and subjective; there is no way to neatly separate out the two. By recognizing the role we play in constructing objects, Global Constructivism pays homage to this lack of separation; Realism does not. Thus, Putnam prefers Global Constructivism to Realism. (See Hale and Wright (2017) for further discussion of Putnam’s rejection of Realism.)

Other adherents of Global Constructivism include Kant, James, Rorty, and Dummett. (See Kant (1965), James (1907), Rorty (1972), and Dummett (1993).) In “The World Well Lost” (1972), Rorty argues that “the realist true believer’s notion of the world is an obsession rather than an intuition” (Rorty 661). He endorses an account of alternative conceptual frameworks which draws heavily on continental philosophers (Hegel, Kant, Heidegger), as well as on Dewey. Ultimately, he concludes that we should stop focusing on trying to find an independent world that is not there and should recognize the role we play in constructing the world. In Frege: Philosopher of Language (1993), Dummett argues that the “picture of reality as an amorphous lump, not yet articulated into discrete objects, thus proves to be a correct one. [The world does not present] itself to us as already dissected into discrete objects” (Dummett 577). Rather, in the process of developing language, we develop the criterion of identity associated with each term and then, with this in place, the world is individuated into distinct objects.

The heyday of analytic Global Constructivism was the 1980s. No one in analytic metaphysics has defended the view since Schwartz’s defense in 2000; it has now more or less been abandoned. Remhof discussed the view in 2014, but he did not endorse it. However, Global Constructivism continues to figure in discussions, where it serves primarily as a rallying point for the Realists who argue against it—see, for example, Devitt (1997) and Boghossian (2006). Although there are no contemporary Global Constructivists, Local Constructivism—an heir to Global Constructivism—is alive and well. The next section examines the many versions of Local Constructivism that proliferate in the twenty-first century.

3. 21st-Century Local Constructivism in Analytic Metaphysics

You will not find the term “constructivism” bandied about within contemporary analytic metaphysics with anything approaching the frequency with which it is used in other sub-disciplines of analytic philosophy or within Continental philosophy. (Why is the term “constructivism” not used more frequently in contemporary analytic metaphysics? The reluctance probably stems from the current sociology of the field. Realism has a strong grip on analytic metaphysics. Moreover, much anti-Realist writing in metaphysics is strikingly bad, and most philosophers currently working within analytic philosophy can easily recall the criticism that was directed toward Global Constructivism: “Barring a kind of anti-realism that none of us should tolerate” (Hawthorne 2006, 109). “[Constructivism] is such a bizarre view that it is hard to believe that anyone actually endorses it” (Boghossian 25). “We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine” (Devitt 2010, 105). These factors conspire to make contemporary analytic metaphysics a particularly unappealing place to launch any theory which might smell of anti-Realism, and to be a Constructivist about x is to be an anti-Realist about x.) However, if one looks at the content of views within analytic metaphysics rather than at what the views are labeled, it quickly becomes apparent that many of them meet the definition of Local Constructivism.

Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.

Although they may be Realists about many kinds of entities (and may self-identify as “Realists”), many metaphysicians of the twenty-first century are Constructivists about at least some kinds of entities. (See, for example, Baker (2004 and 2007), Einheuser (2011), Evnine (2016), Goswick (2018a), Kriegel (2008), Searle (1995), Sidelle (1989), Thomasson (2003 and 2007), Varzi (2011).) Let’s consider the views of several of these metaphysicians. In particular, let’s look at Local Constructivism with regard to vague objects (Heller), modal objects (Sidelle, Einheuser, Goswick), composite objects (Kriegel), artifacts (Searle, Thomasson, Baker, Devitt), and objects with conventional boundaries (Varzi).

Although not himself a Constructivist, Heller presents, in The Ontology of Physical Objects (1990), a view which is a close ancestor of contemporary Local Constructivism. Since a minor tweak turns his view into Local Constructivism, since his work was among the earliest in this general area, and since it has been so influential on contemporary Local Constructivists, it is worth taking a quick look at exactly what Heller says and why Local Constructivists have found inspiration in his book. Heller distinguishes between what he calls “real objects” and what he calls “conventional objects.” Real objects are four-dimensional hunks of matter which have precise spatiotemporal boundaries; we generally do not talk or think about real objects (since we tend not to individuate so finely as to denote objects with precise spatiotemporal boundaries). “Conventional object” is the name Heller gives to objects which we think exist but do not really exist (because, if they did exist, they would have vague spatiotemporal boundaries, and nothing that exists has vague spatiotemporal boundaries) (Heller 47). For example, Heller thinks there is no statue and no lump of clay:

The [purported] difference [between the statue and the clay] is a matter of convention. … This difference cannot reflect a real difference in the objects. There is only one object in the spatiotemporal region claimed to be occupied by both the statue and the lump of clay. There … are no coincident entities; there are just … different conventions applicable to a single physical object. (Heller 32)

What really exists (in the rough vicinity we intuitively think contains the statue) are many precise hunks of matter. None of these hunks is a statue or a lump of clay (because “statue” and “lump of clay” are both ordinary language terms which are not precise enough to distinguish between, for example, two hunks of matter which differ only with regard to the fact that one includes, and the other excludes, atom a), but we mistakenly think there is a statue (where really there are just these various hunks of matter). Heller is an Eliminativist about conventional objects: there are none. However, it is a short step from Heller’s Eliminativism about vague objects to Constructivism about vague objects. The framework is in place; Heller has already provided a thorough account of the difference between nonconventional objects (hunks of matter) and conventional objects (objects—such as rocks, dogs, mountains, and necklaces—which have vague spatiotemporal boundaries) and of how our causal interaction with nonconventional objects gives rise to our belief that there are conventional objects. To be a Constructivist rather than an Eliminativist about Heller’s conventional objects, one need only argue, contra Heller, that our conventions in fact bring new objects—objects which are constructed out of hunks of matter and our conventions—into existence. (Just to reiterate, Heller is opposed to this: “There are other alternatives that can be quickly discounted. For instance, the claim that we somehow create a new physical object by passing legislation involves the absurd idea that without manipulating or creating any matter we can create a physical object” (Heller 36). However, by so thoroughly examining nonconventional objects, conventional objects, and the relationship between them, he laid the groundwork for the Local Constructivists who would come after him.)

Local Constructivists about modal objects share Heller’s skepticism about the ability of Realism to account for ordinary objects. However, whereas Heller worries that ordinary objects have vague spatiotemporal boundaries while all objects that really exist have precise spatiotemporal boundaries, and resolves this worry by being an Eliminativist about ordinary objects, Local Constructivists about modal objects worry that ordinary objects have “deep” modal properties while all objects that Realism is true of have at most “shallow” modal properties. (A “deep” modal property is any de re necessity or de re possibility which is non-trivial; a “shallow” modal property is any modal property which is not “deep.” See Goswick (2018b) for a more detailed discussion.) Rather than being Eliminativists about ordinary objects, they resolve this worry by endorsing Local Constructivism about objects which have at least one “deep” modal property (henceforth, such objects will be referred to as “modal objects”).

Sidelle and Einheuser both defend Local Constructivism about modal objects. Sidelle’s goal in his (1989) is to defend a conventionalist account of modality. He argues that conventionalism about modality requires Constructivism about modal objects (1989, 77). He relies on (nonmodal) stuff as the basic building block out of which modal objects are constructed: “[The] conventionalist should … say that what is primitively ostended is ‘stuff’, stuff looking, of course, just as the world looks, but devoid of modal properties, identity conditions, and all that imports. For a slogan, one might say that stuff is preobjectual” (1989, 54-55). Modal objects come to exist when humans provide individuating conditions. It is because we respond to stuff s as if it is a chair and apply the label “chair” to it that there is a chair with persistence conditions c rather than just some stuff. Einheuser’s goal in her (2011) is to ground modality. She argues that the best way to do this is to endorse a conceptualist account of modality and that so doing requires endorsing Constructivism about modal objects. Like Sidelle, she endorses preobjectual stuff: “the content of the spatio-temporal region of the world occupied by an object [is] the stuff of the object” (Einheuser 303). She argues that this stuff “does not contain … built-in persistence criteria. … It is ‘objectually inarticulate’” (Einheuser 303). Modal objects are created out of such mere stuff by the imposition of our concepts:

Concepts like statue and piece of alloy impose persistence criteria on portions of material stuff and thereby “configure” objects. That is, they induce objects governed by these persistence criteria. Our concept statue is associated with one set of persistence criteria. Applied to a suitable portion of stuff, the concept statue configures an object governed by these criteria. (Einheuser 302)

Einheuser emphasizes the fact that what we are doing is creating a new object (a piece of alloy) rather than adding modal properties to pre-existing stuff. (Einheuser on why we must be Local Constructivists about modal objects rather than Local Constructivists about only modal properties: “There is the view that our concepts project modal properties onto otherwise modally unvested objects. This view appears to imply that objects have their modal properties merely contingently. [The piece of alloy may be necessarily physical] but that is just a contingent fact about [it] for our concepts might have projected a different modal property [on to it]. That seems tantamount to giving up on the idea of de re necessity. … The conceptualist considered here maintains conceptualism not merely about modal properties but about objects: Concepts don’t project modal properties onto objects. Objects themselves are, in a sense to be clarified, projections of concepts” (302).)

Kriegel endorses Local Constructivism about composite objects. He takes Realism to be true of non-composite objects and uses them as the basic building blocks of his composite objects. He worries that, given Realism, there is simply no fact of the matter regarding whether the xs compose an o (Kriegel 2008). He argues that we should be conventionalists about composition: “the xs compose an o iff the xs are such as to produce the response that the xs compose an o in normal intuiters under normal forced-choice conditions” (Kriegel 10). A side effect of this conventionalism about composition is Local Constructivism about composite objects, namely, Kriegel is a Realist about some physical entities r (the non-composite objects) to which we have epistemic access, and he thinks that by acting in some specified way (having the composition intuition) with regard to these physical entities we thereby bring new physical objects (the composite ones) into existence.

Local Constructivism about artifacts is the most widespread form of Local Constructivism. It is endorsed by Searle, Thomasson, Baker, and Devitt, among others. (See also Evnine (2016).) Searle is a Realist about natural objects such as Mt. Everest, bits of metal, land, stones, water, and trees (Searle 153, 191, 4). He is a Constructivist about artifactual objects such as money, cars, bathtubs, restaurants, and schools (Searle xi, 4). He takes the natural objects to be the basic building blocks of the artifactual ones:

[The] ontological subjectivity of the socially constructed reality requires an ontological objective reality out of which it is constructed, because there has to be something for the construction to be constructed out of. To construct money, property, and language, for example, there have to be the raw materials of bits of metal, paper, land, sounds, and marks. And the raw materials cannot in turn be socially constructed without presupposing some even rawer materials out of which they are constructed, until eventually we reach a bedrock of brute physical phenomena independent of all representations. (Searle 191)

Thomasson’s Local Constructivism about artifacts arises from her easy ontology. She claims that terms have application and co-application conditions and that, when these conditions are satisfied, the term denotes an object of kind k (Thomasson 2007). Although humans set the application and co-application conditions for natural kind terms such as “rock,” humans play no role in making it the case that these conditions are satisfied. Thus, Realism about natural objects is true. However, with regard to artifactual kind terms such as “money,” humans both set the application and co-application conditions for the term and play a role in making it the case that these conditions are satisfied: “The very idea of something being an artifact requires that it have been produced by a subject with certain intentions” (Thomasson 2003, 580). Intentions alone, however, are not enough:

Although artifacts depend on human beliefs and intentions regarding their nature and their existence, the way they are also partially depends on real acts, e.g. of manipulating things in the environment. Many of the properties of artifacts are determined by physical aspects of the artifacts without regard for our beliefs about them. (Thomasson 2003, 581)

Every concrete artifact includes unconstructed properties which serve as the basis for the object’s constructed properties.

Baker distinguishes between what she calls “ID objects” and non-ID objects. ID objects are objects—such as stop signs, tables, houses, driver’s licenses, and hammocks—that could not exist in a world lacking beings with beliefs, desires, and intentions (Baker 2007, 12). Non-ID objects are objects which could exist in a world which lacked such beliefs, desires, and intentions, for example, dinosaurs, planets, rocks, trees, dogs. Artifacts are ID objects. They are constructed out of our doing certain things to and having certain attitudes toward non-ID objects.

When a thing of one primary kind is in certain circumstances, a thing of another primary kind—a new thing, with new causal powers—comes to exist. [Sometimes this new thing is an ID object.] For example, when an octagonal piece of metal is in circumstances of being painted red with white marks of the shape S-T-O-P, and is in an environment that has certain conventions and laws, a new thing—a traffic sign—comes into existence. (Baker 2007, 13)

Baker advocates a constitution theory according to which coinciding objects stand in a hierarchical relation of constitution. Aggregates are fundamental, non-ID objects, and serve as the ground-level building blocks out of which all ID objects, including artifacts, are built: “Although … thought and talk make an essential contribution to the existence of certain objects [e.g., artifacts], … thought and talk alone [do not] bring into existence any physical objects: conventions, practices, and pre-existing materials [i.e., non-ID aggregates] are also required” (Baker 2007, 46). (Unlike nearly all the other advocates of Local Constructivism about artifacts, Baker does not take constructed objects to be inferior to non-constructed ones: “An artifact has as great a claim as a natural object to be a genuine substance. This is so because artifactual kinds are primary kinds. Their functions are their essences” (Baker 2004, 104).)

Devitt is another defender of Local Constructivism about artifacts. He distinguishes between artifactual objects whose “natures are functions that involve the purposes of agents” (Devitt 1997, 247) and natural objects whose nature is not such a function: “A hammer is a hammer in virtue of its function for hammerers. A tree is not a tree in virtue of its function” (Devitt 1997, 247). Devitt argues that every constructed artifact can also be described as a natural object which is not constructed: “Everything that is [an artifact] is also a [natural object]; thus, a fence may also be a row of trees” (Devitt 1997, 248). He is at pains to distance his Local Constructivism from Global Constructivism and emphasizes the role unconstructed objects play in bringing about the existence of constructed objects:

No amount of thinking about something as, say, a hammer is enough to make it a hammer. … Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer. [Only] things of certain physical types could be [hammers]. In this way [artifacts] are directly dependent on the [unconstructed] world. (Devitt 1997, 248-249)

The final version of Local Constructivism to be examined is Varzi’s Local Constructivism about objects with conventional boundaries. Varzi distinguishes between objects with natural boundaries and those with conventional boundaries. He argues that, “If a certain entity enjoys natural boundaries, it is reasonable to suppose that its identity and survival conditions do not depend on us; it is a bona fide entity of its own” (Varzi 137). On the other hand, if an entity’s “boundaries are artificial—if they reflect the articulation of reality that is effected through human cognition and social practices—then the entity itself is to some degree a fiat entity, a product of our world-making” (Varzi 137). Varzi is quick to point to the role objects with natural boundaries play in our construction of objects with conventional boundaries: “the parts of the dough [the objects with natural boundaries] provide the appropriate real basis for our fiat acts. [They] are whatever they are [independently of us] and the relevant mereology is a genuine piece of metaphysics” (Varzi 145). Varzi also emphasizes the compatibility of Local Constructivism with a generally Realist picture:

It is worth emphasizing that even a radical [constructivist] stance need not yield the nihilist apocalypse heralded by postmodern propaganda. [Constructed objects] lack autonomous metaphysical thickness. But other individuals may present themselves. For instance, on a Quinean metaphysics, there is an individual corresponding to “the material content, however heterogeneous, of some portion of space-time, however disconnected and gerrymandered”. … Such individuals are perfectly nonconventional, yet the overall [Quinean] picture is one that a [constructivist] is free to endorse. (Varzi 147-148)

Having examined five versions of Local Constructivism—constructivism about vague objects, modal objects, composite objects, artifacts, and objects with conventional boundaries—I turn now to describing what all these views have in common that marks them out as constructivist views. Taking note of what each view takes to be unconstructed and what each view takes to be constructed can provide insight into what all the views have in common:

Author | Unconstructed Entities | Constructed Entities
neo-Hellerian | 4D hunks of matter | vague objects
Sidelle/Einheuser/Goswick | nonmodal stuff | modal objects
Kriegel | simple objects | composite objects
Searle/Thomasson/Baker/Devitt | natural objects | artifactual objects
Varzi | natural boundaries | conventional boundaries

What each version of Local Constructivism has in common, and what makes each a Local Constructivist view, is that (i) each takes there to be something unconstructed to which we have epistemic access, and (ii) each thinks that by acting in some specified way with regard to these unconstructed entities we thereby bring new physical objects (the constructed ones) into existence. The views differ with regard to what they think the unconstructed entities are and with regard to what they think we have to do in order to utilize these unconstructed entities to construct new entities, but they are all alike in endorsing (i) and (ii). This is what marks them out as local and constructivist. They are local—rather than global—in scope because they all think only some of the entities that we have epistemic access to are constructed. They are Constructivist—rather than Realist—about vague objects or modal objects or … objects because they take these entities to depend substantively (either causally or constitutively) on us for either their existence or nature.
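
The local/global contrast just described can be put schematically. What follows is an informal sketch rather than anything the above authors themselves offer; the predicate letters are introduced here purely for illustration. Let Ax abbreviate “we have epistemic access to x” and Cx abbreviate “x depends substantively on us for its existence or nature.” Then:

\[
\textit{Global Constructivism:}\quad \forall x\,(Ax \rightarrow Cx)
\]
\[
\textit{Local Constructivism:}\quad \exists x\,(Ax \wedge Cx)\ \wedge\ \exists x\,(Ax \wedge \neg Cx)
\]

So put, Local Constructivism entails the falsity of Global Constructivism: its second conjunct guarantees at least one unconstructed entity to which we have epistemic access, which is just what condition (i) above requires.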

Broadly speaking, all Local Constructivists share the same motivation for endorsing Constructivism—namely, they think that although Realism is generally a good theory there are little bits of the world that it cannot account for. Although Local Constructivists tend to be fond of Realism, they are even fonder of certain entities which they take Realism to be unable to accommodate. They resolve this tension (that is, between the desire to be Realists and the desire to have entities e in their ontology) by endorsing Local Constructivism about entities e. The appeal of Local Constructivism springs from an inherent tension between naturalism and Realism. Most analytic metaphysicians of the twenty-first century are naturalists: they think that metaphysics should be compatible with our best science, that philosophy has much to learn from studying the methods used in science, and that, at root, the basic entities philosophy puts in its ontology had better be ones that are scientifically respectable (quarks, leptons, and forces are in; God, dormitive powers, and Berkeleyan ideas are out). It is not obvious, however, that there is a place within our best science for the ordinary objects we know and love. (“We have already seen that ordinary material objects tend to dissolve as soon as we acknowledge their microscopic structure: this apple is just a smudgy bunch of hadrons and leptons whose exact shape and properties are no more settled than those of a school of fish” (Varzi 140).) Metaphysicians’ naturalism inclines them to be Realists only about those entities our best science countenances. (Searle, for example, wonders how there can “be an objective world of money, property, marriage, governments, elections, football games, cocktail parties, and law courts in a world that consists entirely of physical particles in fields of force” (Searle xi).) They worry that there is no room within this naturalistic picture of the world for, for example, modal objects, composite objects, or artifacts. This places them in a bind: they do not want to abandon naturalism or Realism, but they also do not want to exclude entities e (whose existence/nature is not countenanced by naturalistic Realism) from their ontology. This underlying situation means analytic metaphysicians will often end up endorsing Local Constructivism about some entities, since doing so allows them to include such objects in their ontology whilst recognizing that they are defective in a way many other objects included in their ontology are not (that is, their existence or nature depends on us in some way the existence/nature of other objects does not). (This discussion of Local Constructivism has focused on concrete objects. There is also a literature concerning the construction of abstract objects. See, for example, Levinson (1980), Thomasson (1999), Irmak (2019), Korman (2019).)

4. Criticisms of Constructivism in Analytic Metaphysics

The previous two sections examined two central versions of Constructivism within analytic metaphysics and provided overviews of the works of their most prominent adherents. The article concludes by asking what—all things considered—we should make of Constructivism in analytic metaphysics. Before the question can be answered, there must be an examination of the central criticisms of Constructivism. These criticisms can be divided into two main sorts: (1) coherence criticisms—which argue that Constructivism is in some way internally flawed to the extent that we cannot form coherent, evaluable versions of the view, and (2) substantive criticisms—which take Constructivism to be coherent and evaluable, but argue that we have good reason to think it is false.

a. Coherence Criticisms

Consider these four coherence criticisms: (i) Constructivism is not a distinct view, (ii) the term “constructivism” is too over-used to be valuable, (iii) Constructivism is too metaphorical, and (iv) Constructivism is incoherent.

Consider, first, whether Constructivism is a distinct view within the anti-Realist family of metaphysical views. Meta-ethicists, for instance, sometimes worry about whether Ethical Constructivism is sufficiently distinct from other views (for example, emotivism or response-dependence) within ethics. (See, for example, Jezzi (2019) and Street (2008 and 2010).) Does a similar worry arise with regard to Constructivism in analytic metaphysics? It does not. Constructivism is a broad view within anti-Realism; there are many more specific versions of it, but Constructivism is sufficiently distinct from other anti-Realist views. It is not, for example, Berkeleyan Idealism (that is, because Berkeleyan Idealism requires that God play a central role in determining what exists and Constructivism has no such reliance on God) or Eliminativism (that is, because Eliminativists about x deny that x exists, whereas Constructivists about x claim that x exists).

Consider, next, whether the term “constructivism” is too over-used to be valuable. Haslanger notes that, “The term ‘social construction’ has become commonplace in the humanities. [The] variety of different uses of the term has made it increasingly difficult to determine what claim authors are using it to assert or deny” (Haslanger 2003, 301-302). The term “constructivism” certainly is not over-used within analytic metaphysics. If anything, it is underused; authors only very rarely use the term “constructivism” to refer to their own views. We need not fear that the variety of uses which plagues the humanities in general will be an issue in analytic metaphysics. The term is uncommon within analytic metaphysics, and there is value in introducing the label there, as such labels serve to emphasize the similarity both in content and in underlying motivation between views whose authors use quite disparate terms to identify their own views.

Consider, third, whether “constructivism,” as used in analytic metaphysics, is too metaphorical. This criticism has been directed primarily at Global Constructivism. Understandably, when, for instance, Goodman writes, “The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally” (Goodman 1980, 213), we want to know exactly what it is to literally make a world with words—it is difficult to parse this phrase if we do not take either the making or the world to be metaphorical. Global Constructivists, themselves, often stress—as Goodman does in the above passage—that they mean their views to be taken non-metaphorically: we really do construct the stars, the planets, and the rocks. Critics of Global Constructivism, however, often find it almost irresistible to take the writings of Global Constructivists to be metaphorical: “The anti-realist [Constructivist] is of course speaking in metaphor. If we took him to be speaking literally, what he says would be wildly false—so much so that we would question his sanity” (Devitt 2010, 237—quoting Wolterstorff). There is something to the worry that what Global Constructivists say is just so radical (and frequently, so convoluted) that the only way we can make any sense of it at all is to take it metaphorically (regardless of whether its proponents intend us to take it this way).

A final coherence criticism is that Constructivism is simply incoherent: we cannot make enough sense of what the view is to be in a position to evaluate it. This criticism takes various forms, including that Constructivism (a) is incompatible with what we know about our terms; (b) relies on a notion of a conceptual scheme which is, itself, incoherent; (c) requires unconstructed entities of a sort Global Constructivism cannot accept; (d) relies on a notion of unconstructed objects which is itself contradictory; and (e) allows for the construction of incompatible objects.

Consider, first, the claim that Constructivism is incompatible with what we know about our terms. Boghossian, for example, writes:

Isn’t it part of the very concept of an electron, or of a mountain, that these things were not constructed by us? Take electrons, for example. Is it not part of the very purpose of having such a concept that it is to designate things that are independent of us? If we insist on saying that they were constructed by our descriptions of them, don’t we run the risk of saying something not merely false but conceptually incoherent, as if we hadn’t quite grasped what an electron was supposed to be? (Boghossian 39)

The idea behind Boghossian’s worry is that linguistic and conceptual competence reveal to us that the term “electron” and the concept electron denote something which is independent of us. If so, then any theory that proposes that electrons depend on us is simply confused about the meaning of the term “electron” or, more seriously, about the nature of electrons. There are a variety of ways one can address this concern. One could argue that externalism is true and, thus, that competent users can be radically mistaken about what their terms refer to and still successfully refer. Historically, we have often been mistaken both about what exists and about what the nature of existing objects is. We were able to successfully refer to water even when we thought it was a basic substance (rather than the compound H2O), and we can refer successfully to electrons even if we are deeply mistaken about their nature, that is, even if we think they are independent entities when they are really dependent entities. The more serious version of Boghossian’s worry casts it as a worry about changing the subject matter rather than as a worry about reference. It may be that electrons-which-depend-on-us are so radically different from what we originally thought electrons were that Constructivists (who claim electrons so depend) are (i) proposing Eliminativism about electrons-which-are-independent-of-us, and (ii) introducing an entirely new ontology, namely electrons-which-depend-on-us. (See Evnine (2016) for arguments that taking electrons to depend on humans changes the subject matter so radically that Eliminativism is preferable.) The critic could press this point, but it is not very convincing. To see this, hold a rock in your hand. On the most reasonable way of casting the debate, the Realist to your right and the Constructivist to your left can both point to the rock and utter, “we have different accounts of the existence and nature of that rock.” It is uncharitable to interpret them as talking about different objects, rather than as having different views about the same object. Boghossian overestimates the extent of our knowledge of, for example, the term “electron,” the concept electron, and electrons themselves. We are not so infallible with regard to such terms, concepts, and objects that views which dissent from the mainstream Realist position are simply incoherent.

Consider, next, the criticism that Constructivism relies on a notion of a conceptual scheme which is, itself, incoherent. Goodman and Putnam both endorsed relativistic versions of Global Constructivism which rely on different cultures having different conceptual schemes and on the idea that truth can be relative to a conceptual scheme. Davidson (1974) attacks the intelligibility of truth relative to a conceptual scheme. Cortens (2002) argues that, “Many relativists run into serious trouble on this score; rarely do they provide a satisfactory explanation of just what sort of thing a conceptual scheme is” (Cortens 46). Although there are responses to this criticism, they are not presented here. (See the entries for Goodman, Putnam, and Schwartz in the bibliography.) Goodman/Putnam’s Global Constructivism is a dated view, and contemporary versions of Constructivism do not utilize the old-fashioned notion of a conceptual scheme or of truth relative to a conceptual scheme.

Another criticism which attacks the coherence of Constructivism is the claim that Constructivism requires unconstructed entities of a sort Global Constructivism cannot accept. Boghossian (2006) and Scheffler (2009) argue that Constructivism presupposes the existence of at least some unconstructed objects which we have epistemic access to. If this is correct, then Global Constructivism is contradictory, since it would require unconstructed objects we have epistemic access to (to serve as the basis of our constructing) whilst also claiming that all objects we have epistemic access to are constructed:

If our concepts are cutting lines into some basic worldly dough and thus imbuing it with a structure it would not otherwise possess, doesn’t there have to be some worldly dough for them to work on, and mustn’t the basic properties of that dough be determined independently of all this [constructivist] activity? (Boghossian 2006, 35)

There are various answers Constructivists can give to this worry. Goodman, for instance, insists that everything is constructed:

The many stuffs—matter, energy, waves, phenomena—that worlds are made of are made along with the worlds. But made from what? Not from nothing, after all, but from other worlds. Worldmaking as we know it always starts from worlds already on hand; the making is a remaking. (Goodman 1978, 6-7)

Goodman’s view may be hard to swallow, but it is not internally inconsistent. Another approach is to argue that although all objects are constructed, there are other types of entities (for example, Sidelle’s nonmodal stuff, Kant’s noumena) which are not constructed. (See also Remhof (2014).)

A fourth incoherence criticism is that Constructivism relies on a notion of unconstructed objects which is itself (at worst) contradictory or (at best) underexplained. How cutting a worry this is depends on what a particular version of Constructivism takes to be unconstructed. Kriegel’s Local Constructivism about composite objects, for instance, allows that all mereologically simple objects are unconstructed—such simples provide a rich building base for his constructivism. Similarly, Local Constructivists about artifacts claim that natural objects are unconstructed. They are, that is, Realists about all the objects Realists typically give as paradigms. This, too, provides a rich and uncontroversially non-contradictory building base for their constructed objects. Other views—such as Global Constructivism and Local Constructivism about modal objects—do face a difficulty regarding how to allow unconstructed entities to have enough structure that we can grasp what they are, without claiming they have so much structure that they become constructed entities. Wieland and Elder give voice to this common Realist complaint against Constructivism:

When it comes to [the question of what unconstructed entities are], those who are sympathetic to [Constructivism] are remarkably vague. … The problem [is that constructivists] want to reconcile our freedom of carving with serious, natural constraints. … [The] issue is about the elusive nature of non-perspectival facts in a world full of facts which do depend on our perspective. (Wieland 22)

[Constructivists] are generally quite willing to characterize the world as it exists independently of our exercise of our conceptual scheme. It is simply much stuff, proponents say, across which a play of properties occurs. … But just which properties is it that get instantiated in the world as it mind-independently exists? (Elder 14)

Global Constructivists are quite perplexing when they try to explain how they can construct in the absence of any unconstructed entities to which we have epistemic access. This is a central problem with Global Constructivism and one reason it lacks contemporary adherents. The situation is different with Local Constructivism. Local Constructivists are vocal about the fact that they endorse the existence of unconstructed entities to which we have epistemic access and that such entities play a crucial role in our constructing. (Baker, for example, notes that, “I do not hold that thought and talk alone bring into existence any physical objects … pre-existing materials are also required” (2007, 46). Devitt argues that, “Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer … only things of certain physical types could be [hammers]” (1997, 248-249). Einheuser emphasizes that the application of our concepts to stuff is only object creating when our concepts are directed at independently existing stuff which has the right nonmodal properties (Einheuser 2011).) Local Constructivists—even those such as Sidelle who think unconstructed entities have no “deep” modal properties—can provide an account of unconstructed entities which is coherent. There are a variety of ways to do this. (See, for example, Sidelle (1989), Goswick (2015, 2018a, 2018b), Remhof (2014).) Rather than presenting any one of them, here are a few general points which should enable the reader to see for herself that Local Constructivists about modal objects can provide a coherent view of unconstructed entities. The easiest way to see this is to note two things: (1) The Local Constructivist about modal objects does not think that every entity which has a modal property is constructed; they think that only objects which have “deep” modal properties are constructed. So, for example, arguments such as the following will not work: Let F denote some property that the purportedly unconstructed entity e has. Every entity that is actually F is possibly F. So, e is possibly F. Thus, e has a modal property—which contradicts the Local Constructivists’ claim that unconstructed entities do not have modal properties. But, of course, Local Constructivists are happy for unconstructed objects to have a plethora of modal properties, so long as they are “shallow” modal properties. (A “deep” modal property, remember, is any de re necessity or de re possibility which is non-trivial. A “shallow” modal property is any modal property which is not “deep.”) (2) Most of us have no trouble understanding Quine when he defines objects as “the material content of a region of spacetime, however heterogeneous or gerrymandered” (Quine 171). But, of course, Quine rejected “deep” modality. The Local Constructivist about modal objects can simply point to Quine’s view and use Quine’s objects as their unconstructed entities. (See Blackson (1992) and Goswick (2018c).)
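
The argument rejected in point (1) can be made fully explicit in standard modal notation. The following is a sketch added here for clarity (the numbering and notation are not drawn from any of the cited texts); the second premise is the familiar principle that actuality entails possibility:

\[
\begin{array}{lll}
1. & Fe & \text{(the purportedly unconstructed entity $e$ is actually $F$)}\\
2. & \forall x\,(Fx \rightarrow \Diamond Fx) & \text{(whatever is actually $F$ is possibly $F$)}\\
3. & \therefore\ \Diamond Fe & \text{(from 1 and 2)}
\end{array}
\]

The inference is valid, but the conclusion ascribes to e only the “shallow” modal property of being possibly F, which is why the Local Constructivist about modal objects can grant it without abandoning the claim that unconstructed entities lack “deep” modal properties.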

A final coherence criticism of Constructivism is the claim that Constructivism licenses the construction of incompatible objects, for example, society A constructs object o (which entails the non-existence of object o*), whilst society B constructs object o* (which entails the non-existence of object o). (Suppose, for example, that there are no coinciding objects, so at most one object occupies region r. Then, society A’s constructing a statue (at region r) rules out the existence of a mere-lump (at region r) and society B’s constructing a mere-lump (at region r) rules out the existence of a statue (at region r).) What, then, are we to say with regard to the existence of o and o*? Do both exist, neither, one but not the other? Boghossian puts the worry this way:

[How could] it be the case both that the world is flat (the fact constructed by pre-Aristotelian Greeks) and that it is round (the fact constructed by us)? [Constructivism faces] a problem about how we are to accommodate the possible simultaneous construction of logically incompatible facts. (Boghossian 39-40)

Different versions of Constructivism will have different responses to this worry, but every version is able to give a response that dissolves the worry. Relativists will say that o exists only relative to society A, whereas o* exists only relative to society B. Constructivists who are not relativists will pick some subject to privilege, for example, society A gets to do the constructing, so what they say goes—o exists and o* does not.

b. Substantive Criticisms

Now that Constructivism has been shown to satisfactorily respond to the coherence criticisms, let’s turn to presenting and evaluating the eight main substantive criticisms of Constructivism: (i) if Constructivism were true, then multiple systems of classification would be equally good, but they are not, (ii) Constructivism is undermotivated, (iii) Constructivism is incompatible with naturalism, (iv) Constructivism should be rejected outright because Realism is so obviously true, (v) Constructivism requires constitutive dependence, but really, insofar as objects do depend on us, they depend on us only causally, (vi) Constructivism is not appropriately constrained, (vii) Constructivism is crazy, and (viii) Constructivism conflicts with obvious empirical facts.

Consider, first, the criticism that if Constructivism were true, then multiple systems of classification would be equally good; but they are not, so Constructivism is not true. The main proponent of this criticism is Elder. He expresses the concern in the following way:

If there were something particularly … unobjective about sameness in natural kind, one might expect that we could prosper just as well as we do even if we wielded quite different sortals for nature’s kinds. (Elder 10)

The basic idea is that, as a matter of fact, dividing up the world into rocks and non-rocks works better for us than does dividing up the world into dry-rocks, wet-rocks, and non-rocks: the sortal rock is better than the alternative sortals dry-rock and wet-rock. Why is this? Elder’s explanation is that rock is a natural kind sortal which traces the existence of real objects. Dry-rock and wet-rock do not work as well as rock because there are rocks and there are not dry-rocks and wet-rocks. Since we cannot empirically distinguish between a rock that is (accidentally) dry and an (essentially dry) dry-rock or between a rock that is (accidentally) wet and an (essentially wet) wet-rock, Elder provides no empirical basis for his claim. The Constructivist will point out that she is not arguing that any set of constructed objects is as good as any other. It may very well be the case that rock works better for us than do dry-rock and wet-rock. The Constructivist attributes this to contingent facts about us (for example, our biology and social history) rather than to its being the case that Realism is true of rocks and false of dry-rocks and wet-rocks. Nothing Elder says blocks this way of describing the facts. Pending some argument showing that the only way (or, at least, the best way) we can explain the fact that rock works better for us than do dry-rock and wet-rock is if Realism is true of rocks, Elder has no argument against the Constructivist.

Another argument one sometimes hears is that Constructivism is undermotivated. Global Constructivism is seen as an overly radical metaphysical response to minor semantic and epistemic problems with Realism. (See, for example, Devitt (1997) and Wieland (2012).) How good a criticism this is depends on how minor the semantic and epistemic problems with Realism are and how available a non-metaphysical solution to them is. This issue is not explored further here because this sort of criticism cannot be evaluated in general but must be looked at with regard to each individual view, for example, is Goodman’s Global Constructivism undermotivated, is Sidelle’s Local Constructivism about modal objects undermotivated, is Thomasson’s Local Constructivism about artifacts undermotivated? Whether the criticism is convincing will depend on how well each view does at showing there is a real problem with Realism and that its own preferred way of resolving the problem is compelling. If Sidelle is really correct that the naturalist/empiricist stance most analytic philosophers embrace in the twenty-first century is incompatible with the existence of ordinary objects with “deep” modal properties, then we should be strongly motivated to seek a non-Realist account of ordinary objects. If Thomasson is really right that existence is easy and that some terms really are such that anything that satisfies them depends constitutively on humans, then we should be strongly motivated to seek a non-Realist account of the referents of such terms.

Another argument one sometimes hears is that Constructivism is incompatible with the naturalized metaphysics which is in vogue. Most contemporary metaphysicians are heavily influenced by Lewisian naturalized metaphysics: they believe that there is an objective reality, that science has been fairly successful in examining this reality, that the target of metaphysical inquiry is this objective reality, and that our metaphysical theorizing should be in line with what our best science tells us about reality. If Constructivism really is incompatible with naturalized metaphysics it will ipso facto be unattractive to most contemporary metaphysicians. However, although one frequently hears this criticism, upon closer examination it is seen to lack teeth. The crucial issue—with regard to compatibility with naturalistic metaphysics—is whether one’s view is adequately constrained by an independent, objective reality which is open to scientific investigation. All versions of Realism are so constrained, so Realism wears its compatibility with naturalistic metaphysics on its sleeve. Not all versions of Constructivism are so constrained, for example, Goodman’s and Putnam’s Global Constructivisms are not. But it would be overly hasty to throw out all of Constructivism simply because some versions of Constructivism are incompatible with naturalistic metaphysics. Some versions of Constructivism are more compatible with naturalized metaphysics than is Realism. Suppose Ladyman and Ross are correct when they say our best science shows there are no ordinary objects (2007). Suppose Einheuser is correct when she says our best science shows there are no objects with modal properties (2011). Suppose, however, that in daily human life we presuppose (as we seem to) the existence of ordinary objects with modal properties. Then, Local Constructivism about ordinary objects is motivated from within the perspective of naturalistic metaphysics. One’s naturalism prevents one from being a Realist about ordinary objects, that is, because all the subject-independent world contains is ontic structure (if Ladyman and Ross are correct) or nonmodal stuff (if Einheuser is correct). One’s desire to account for human behavior prevents one from being an Eliminativist about ordinary objects. A constructivism which builds ordinary objects out of human responses to ontic structure/nonmodal stuff is the natural position to take. Although some versions of Constructivism (for example, Global Constructivism) may be incompatible with naturalistic metaphysics, there is no argument from naturalized metaphysics against Constructivism per se.

A fourth substantive criticism levied against Constructivism is that it should be rejected outright because Realism is so obviously true:

A certain knee-jerk realism is an unargued presupposition of this book. (Sider 2011, 18)

Realism is much more firmly based than these speculations that are thought to undermine it. We have started the argument in the wrong place: rather than using the speculations as evidence against Realism, we should use Realism as evidence against the speculations. We should “put metaphysics first.” (Devitt 2010, 109)

[Which] organisms and other natural objects there are is entirely independent of our beliefs about the world. If indeed there are trees, this is not because we believe in trees or because we have experiences as of trees. (Korman 92)

For example, facts about mountains, dinosaurs or electrons seem not to be description-dependent. Why should we think otherwise? What mistake in our ordinary, naive realism about the world has the [Constructivist] uncovered? What positive reason is there to take such a prima facie counterintuitive view seriously? (Boghossian 28)

All that the Constructivist can say in response to this criticism—which is not an argument against Constructivism but rather a sharing of the various authors’ inclinations—is that she does not think Realism is so obviously true. She can, perhaps, motivate others to see it as less obviously true by casting the debate not as a global one, in which we must choose whether the stance to adopt toward the world as a whole is Global Constructivist or Global Realist, but rather as a more local debate concerning the ontological status of, for example, tables, rocks, money, and dogs. We are no longer playing a global game; one can be an anti-Realist about, for example, money without thereby embracing global anti-Realism.

Another criticism of Constructivism is that Constructivism is only true if objects constitutively depend on us, but really, insofar as objects do depend on us, they depend on us only causally. As this article has defined “Constructivism,” it has room for both causal versions and constitutive versions. (Hacking (1999) and Goswick (2018b) present causal versions of Constructivism. Baker (2007) and Thomasson (2007) present constitutive versions of Constructivism.) One could, instead, define “Constructivism” more narrowly so that it only included constitutive accounts. This would be a mistake. Consider a (purported) causal version of Local Constructivism about modal objects: Jane is a Realist about nonmodal stuff and claims we have epistemic access to it. She thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. Jane does not think that rocks depend constitutively on us—it is not part of what it is to be a rock that we have to F in order for rocks to exist. But we do play a causal role in bringing about the existence of rocks. If there were some modal magic, then rocks could have existed without us (nothing about the nature of rocks bars this from being the case); but there is no modal magic, so all the rocks that exist do causally depend on us. Now consider a (purported) constitutive version of Local Constructivism about modal objects: James is a Realist about nonmodal stuff and claims we have epistemic access to it. He thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. James thinks that rocks depend constitutively on us—it is part of what it is to be a rock that we have to F in order for rocks to exist. Even if there were modal magic, rocks could not have existed without us. Do Jane and James’ views differ to the extent that one of them deserves the label “Constructivist” and the other does not? Their views are very similar—after all, they both take rocks to be composite objects which come to exist when we F in circumstances c, that is, they tell the same origin story for rocks. What they differ over is the nature of rocks: is their dependence on us constitutive of what it is to be a rock (as James says) or is it just a feature that all rocks in fact have (as Jane says)? Jane and James’ views are so similar (and the objections that will be levied against them are so similar) that taking both to be versions of the same general view (that is, Constructivism) is more perspicuous than not so doing. More generally, causal constructivism is similar enough to constitutive constructivism that defining “constructivism” in such a way that it excludes the former would be a mistake.

A sixth substantive criticism of Constructivism is that it is not appropriately constrained.

Putnam does talk, in a Kantian way, of the noumenal world and of things-in-themselves [but] he seems ultimately to regard this talk as “nonsense” … This avoids the facile relativism of anything goes by fiat: we simply are constrained, and that’s that. … [But to] say that our construction is constrained by something beyond reach of knowledge or reference is whistling in the dark. (Devitt 1997, 230)

The worry here is that it is not enough just to say “our constructing is constrained”; what does the constraining and how it does so must be explained. Global Constructivists have fared very poorly with regard to this criticism. They (for example, Goodman, Putnam, Schwartz) certainly intend their views to be so constrained. What is less clear, however, is whether they are able to accomplish this aim. They provide no satisfactory account of how, given that we have no epistemic access to them, the unconstructed entities they endorse are able to constrain our constructing. This is a serious mark against Global Constructivism. Local Constructivists fare better in this regard. They place a high premium on our constructing being constrained by the (subject-independent) world and each Local Constructivist is able to explain what constrains constructing on her view and how it does so. Baker, for example, argues that all constructed objects stand in a constitution chain which eventuates in an unconstructed aggregate. These aggregates constrain which artifacts can be in their constitution chains, namely (i) an artifact with function f can only be constituted by an aggregate which contains enough items of suitable structure to enable the proper function of the artifact to be performed, and (ii) an artifact with function f can only be constituted by an aggregate which is such that the items in the aggregate are available for assembly in a way suitable for enabling the proper function of the artifact to be performed (Baker 2007, 53). For another example, consider Einheuser’s explanation of what constrains her Local Constructivism about modal objects: Every (constructed) modal object coincides with some (unconstructed) nonmodal stuff. A modal object of sort s (for example, a rock) can only exist at region r if the nonmodal stuff that occupies region r has the right nonmodal properties (Einheuser 2011). This ensures that, for example, we cannot construct a rock at a region that contains only air molecules.

A seventh substantive criticism of Constructivism is the claim that Constructivism is crazy. Consider:

We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine. … How could dinosaurs and stars be dependent on the activities of our minds? It would be crazy to claim that there were no dinosaurs or stars before there were people to think about them. [The claim that] there would not have been dinosaurs or stars if there had not been people (or similar thinkers) seems essential to Constructivism: unless it were so, dinosaurs and stars could not be dependent on us and our minds. [So Constructivism is crazy.] (Devitt 2010, 105 and Devitt 1997, 238)

The idea that we in any way determine whether there are stars and what they are like seems so preposterous, if not incomprehensible, that any thesis that leads to this conclusion must be suspect. … And a forceful, “But people don’t make stars” is often thought to be the simplest way to bring proponents of such metaphysical foolishness back to their senses. For isn’t it obvious that … there were stars long before sentient beings crawled about and longer still before the concept star was thought of or explicitly formulated? (Schwartz 1986, 429 and 427)

The “but Constructivism is crazy” locution is not a specific argument but is rather an expression of the utterer’s belief that Constructivism has gone wrong in some serious way. Arguments lie behind the “Constructivism is crazy” utterance, and the arguments, unlike the emotive outburst, can be defused. Behind Devitt’s “it’s crazy” utterance is the worry that Constructivism simply gets the existence conditions for natural objects wrong. It is just obvious that dinosaurs and stars existed before any people did, and it follows from this that they must be unconstructed objects. There are two ways to respond to this objection: (1) argue that even if humans construct dinosaurs and stars it can still be the case that dinosaurs and stars existed prior to the existence of humans. (For this approach, see Remhof, “If there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3); Searle, “From the fact that a description can only be made relative to a set of linguistic categories, it does not follow that the objects described can only exist relative to a set of categories. … Once we have fixed the meaning of terms in our vocabulary by arbitrary definitions, it is no longer a matter of any kind of relativism or arbitrariness whether representation-independent features of the world that satisfy or fail to satisfy the definitions exist independently of those or any other definitions” (Searle 166); and Schwartz, “In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order” (Schwartz 1986, 436).) (2) Bite the bullet. Agree that—if Constructivism is true—dinosaurs and stars did not exist before there were any people. Defuse the counter-intuitiveness of this claim by, for example, arguing that, although dinosaurs per se did not exist, entities that were very dinosaur-like did exist. (For this approach, see Goswick (2018b):

The [Constructivist] attempts to mitigate this cost by pointing out that which ordinary object claims are false is systematic and explicable. In particular, we’ll get the existence and persistence conditions of ordinary objects wrong when we confuse the existence/persistence of an s-apt n-entity for the existence/persistence of an ordinary object of sort s. We think dinosaurs existed because we mistake the existence of dinosaur-apt n-entities for the existence of dinosaurs (Goswick 2018b, 58).

Behind Schwartz’s “Constructivism is crazy” utterance is the same worry Devitt has, namely, that Constructivism simply gets the existence conditions for natural objects wrong. It can be defused in the same way Devitt’s utterance was.

The final substantive criticism of Constructivism to be considered is the claim that Constructivism conflicts with obvious empirical facts.

It is sometimes said, for example, that were it not for the fact that we associated the word “star” with certain criteria of identity, there would be no stars. It seems to me that people who say such things are guilty of [violating well-established empirical facts]. Are we to swallow the claim that there were no stars around before humans arrived on the scene? Even the dimmest student of astronomy will tell you that this is nonsense. (Cortens 2002, 45)

This worry has largely been addressed in responding to the previous criticism. However, Cortens makes one point beyond those of Devitt and Schwartz: namely, that it is not just our intuitions that tell us stars existed before humans, but also our best science. Any naturalist who endorses Constructivism about stars will be skeptical that our best science really tells us this. Even the brightest student of astronomy is unlikely to make the distinctions metaphysicians make, for example, between a star and the atoms that compose it. Does the astronomy student really study whether there are stars or only atoms-arranged-starwise? If not, how can she be in a position to tell us whether there were stars before there were humans or whether there were only atoms-arranged-starwise? The distinction between stars and atoms-arranged-starwise is not an empirical one. In general, the issues Constructivists and Realists differ over are not ones that can be resolved empirically. Given this, it is implausible that Constructivism conflicts with obvious empirical facts. It would conflict with an obvious empirical fact (or, at least, with what our best science takes to be the history of our solar system) if, for example, Constructivists denied that there was anything star-like before there were humans. But Constructivists do not do this; rather, they replace the Realists’ pre-human stars with entities which are empirically indistinguishable from stars but which lack some of the metaphysical features (for example, being essentially F) they think an entity must have to be a star.

5. Evaluating Constructivism within Analytic Metaphysics

Having explicated what Constructivism within analytic metaphysics is and what the central criticisms of it are, let’s examine what, all things considered, should be made of Constructivism within analytic metaphysics.

Global Constructivism is no longer a live option within analytic metaphysics. Our understanding of Realism, and our ability to clearly state various versions of it, has expanded dramatically since the 1980s. Realists have found answers to the epistemic and semantic concerns which originally motivated Global Constructivism, so the view is no longer well motivated. (See, for example, Devitt (1997) and Devitt (2010).) Moreover, there are compelling objections to Global Constructivism concerning, in particular, how we can construct entities if we have no epistemic access to any unconstructed entities to construct them from, and what can constrain our constructing: given that we have epistemic access only to the constructed, it appears that nothing unconstructed can constrain our constructing.

Local Constructivism fares better for reasons both sociological and philosophical. Sociologically, Local Constructivism has not been around for long and, rather than being one view, it is a whole series of loosely connected views, so it has not yet drawn the sort of detailed criticism that squashed Global Constructivism. Additionally, being a Local Constructivist about x is compatible with being a Realist about y, z, a, b, … (all non-x entities). As such, it is not a global competitor to Realism and has not drawn the Realists’ ire in the way Global Constructivism did. Philosophically, Local Constructivism is also on firmer ground than was Global Constructivism. By endorsing unconstructed entities which we have epistemic access to and which constrain our constructing, Local Constructivists are able to side-step many of the central criticisms which plague Global Constructivism. Local Constructivism looks well poised to provide an intuitive middle ground between a naturalistic Realism (which often unacceptably alters either the existence or the nature of the ordinary objects we take ourselves to know and love) and an overly subjective anti-Realism (which fails to recognize the role the objective world plays in determining our experiences and the insights we can gain from science).

6. Timeline of Constructivism in Analytic Metaphysics


1781 Kant’s Critique of Pure Reason distinguishes between noumena and phenomena, thereby laying the groundwork for future work on constructivism
1907 James’ Pragmatism: A New Name for Some Old Ways of Thinking defends Global Constructivism
1978-1993 Goodman and Putnam publish a series of books and papers defending Global Constructivism
1986 and 2000 Schwartz defends Global Constructivism
1990 Heller defends an eliminativist view of vague objects; along the way, he shows how to be a constructivist about vague objects
1990s-2000s Baker, Thomasson, Searle, and Devitt endorse Local Constructivism about artifacts
Post-1988 Sidelle, Einheuser, and Goswick argue that objects having “deep” modal properties are constructed
2008 Kriegel argues that composite objects are constructed
2011 Varzi argues that objects with conventional boundaries are constructed


7. References and Further Reading

a. Constructivism: General

  • Alward, Peter. (2014) “Butter Knives and Screwdrivers: An Intentionalist Defense of Radical Constructivism,” The Journal of Aesthetics and Art Criticism, 72(3): 247-260.
  • Boyd, R. (1992) “Constructivism, Realism, and Philosophical Method” in Inference, Explanation, and Other Frustrations: Essays in the Philosophy of Science (ed. Earman). Los Angeles: University of California Press: 131-198.
  • Bridges and Palmgren. (2018) “Constructive Mathematics” in The Stanford Encyclopedia of Philosophy.
  • Chakravartty, Anjan. (2017) “Scientific Realism” in The Stanford Encyclopedia of Philosophy.
  • Downes, Stephen. (1998) “Constructivism” in the Routledge Encyclopedia of Philosophy.
  • Feyerabend, Paul. (2010) Against Method. USA: Verso Publishing.
  • Foucault, Michel. (1970) The Order of Things. USA: Random House.
  • Hacking, Ian. (1986) “Making Up People,” in Reconstructing Individualism: Autonomy, Individuality, and the Self in Western Thought (eds. Heller, Sosna, Wellbery). Stanford: Stanford University Press, 222-236.
  • Hacking, Ian. (1992) “World Making by Kind Making: Child-Abuse for Example,” in How Classification Works: Nelson Goodman among the Social Sciences (eds. Douglas and Hull). Edinburgh: Edinburgh University Press, 180-238.
  • Hacking, Ian. (1999) The Social Construction of What? Cambridge: Harvard University Press.
  • Haslanger, Sally. (1995) “Ontology and Social Construction,” Philosophical Topics, 23(2): 95-125.
  • Haslanger, Sally. (2003) “Social Construction: The ‘Debunking’ Project,” Socializing Metaphysics: The Nature of Social Reality (ed. Schmitt). Lanham: Roman & Littlefield Publishers, 301-326.
  • Haslanger, Sally. (2012) Resisting Reality: Social Construction and Social Critique, New York: Oxford University Press.
  • Jezzi, Nathaniel. (2019) “Constructivism in Metaethics,” Internet Encyclopedia of Philosophy. https://iep.utm.edu/con-ethi/
  • Kuhn, Thomas. (1996) The Structure of Scientific Revolutions. Chicago: Chicago University Press.
  • Mallon, Ron. (2019) “Naturalistic Approaches to Social Construction” in the Stanford Encyclopedia of Philosophy.
  • Rawls, John. (1980) “Kantian Constructivism in Moral Theory,” Journal of Philosophy, 77: 515-572.
  • Remhof, J. (2017) “Defending Nietzsche’s Constructivism about Objects,” European Journal of Philosophy, 25(4): 1132-1158.
  • Street, Sharon. (2008) “Constructivism about Reasons,” Oxford Studies in Metaethics, 3: 207-245.
  • Street, Sharon. (2010) “What Is Constructivism in Ethics and Metaethics?” Philosophy Compass, 5(5): 363-384.
  • Werner, Konrad. (2015) “Towards a PL-Metaphysics of Perception: In Search of the Metaphysical Roots of Constructivism,” Constructivist Foundations, 11(1): 148-157.

b. Constructivism: Analytic Metaphysics

  • Baker, Lynne Rudder. (2004) “The Ontology of Artifacts,” Philosophical Explorations, 7: 99-111.
  • Baker, Lynne Rudder. (2007) The Metaphysics of Everyday Life: An Essay in Practical Realism. USA: Cambridge University Press.
  • Bennett, Karen. (2017) Making Things Up. Oxford: Oxford University Press.
  • Dummett, Michael. (1993) Frege: Philosophy of Language. Cambridge: Harvard University Press.
  • Einheuser, Iris. (2011) “Towards a Conceptualist Solution to the Grounding Problem,” Nous, 45(2): 300-314.
  • Evnine, Simon. (2016) Making Objects and Events: A Hylomorphic Theory of Artifacts, Actions, and Organisms. Oxford: Oxford University Press.
  • Goodman, Nelson. (1978) Ways of Worldmaking. USA: Hackett Publishing Company.
  • Goodman, Nelson. (1980) “On Starmaking,” Synthese, 45(2): 211-215.
  • Goodman, Nelson. (1983) “Notes on the Well-Made World,” Erkenntnis, 19: 99-108.
  • Goodman, Nelson. (1993) “On Some Worldly Worries,” Synthese, 95(1): 9-12.
  • Goswick, Dana. (2015) “Why Being Necessary Really Isn’t the Same As Being Not Possibly Not,” Acta Analytica, 30(3): 267-274.
  • Goswick, Dana. (2018a) “A New Route to Avoiding Primitive Modal Facts,” Brute Facts (eds. Vintiadis and Mekios). Oxford: OUP, 97-112.
  • Goswick, Dana. (2018b) “The Hard Question for Hylomorphism,” Metaphysics, 1(1): 52-62.
  • Goswick, Dana. (2018c) “Ordinary Objects Are Nonmodal Objects,” Analysis and Metaphysics, 17: 22-37.
  • Goswick, Dana. (2019) “A Devitt-Proof Constructivism,” Analysis and Metaphysics, 18: 17-24.
  • Hale, Bob and Wright, Crispin. (2017) “Putnam’s Model-Theoretic Argument Against Metaphysical Realism” in A Companion to the Philosophy of Language (eds. Hale, Wright, and Miller). USA: Wiley-Blackwell, 703-733.
  • Heller, Mark. (1990) The Ontology of Physical Objects. Cambridge: CUP.
  • Irmak, Nurbay. (2019) “An Ontology of Words,” Erkenntnis, 84: 1139-1158.
  • James, William. (1907) Pragmatism: A New Name for Some Old Ways of Thinking. New York: Longmans Green Publishing (especially lectures 6 and 7).
  • James, William. (1909) The Meaning of Truth: A Sequel to Pragmatism. New York: Longmans Green Publishing.
  • Kant, Immanuel. (1965) The Critique of Pure Reason. London: St. Martin’s Press.
  • Kitcher, Philip. (2001) “The World As We Make It” in Science, Truth and Democracy. Oxford: Oxford University Press, ch. 4.
  • Korman, Daniel. (2019) “The Metaphysics of Establishments,” The Australasian Journal of Philosophy, DOI: 10.1080/00048402.2019.1622140.
  • Kriegel, Uriah. (2008) “Composition as a Secondary Quality,” Pacific Philosophical Quarterly, 89: 359-383.
  • Ladyman, James and Ross, Don. (2007) Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
  • Levinson, Jerrold. (1980) “What a Musical Work Is,” The Journal of Philosophy, 77(1): 5-28.
  • McCormick, Peter. (1996) Starmaking: Realism, Anti-Realism, and Irrealism. Cambridge: MIT Press.
  • Putnam, Hilary. (1979) “Reflections on Goodman’s Ways of Worldmaking,” Journal of Philosophy, 76: 603-618.
  • Putnam, Hilary. (1981) Reason, Truth, and History. Cambridge: Cambridge University Press.
  • Putnam, Hilary. (1982) “Why There Isn’t a Ready-Made World,” Synthese, 51: 141-168.
  • Putnam, Hilary. (1987) The Many Faces of Realism. LaSalle: Open Court Publishing.
  • Quine, W.V.O. (1960) Word and Object. Cambridge: MIT Press.
  • Remhof, J. (2014) “Object Constructivism and Unconstructed Objects,” Southwest Philosophy Review, 30(1): 177-186.
  • Rorty, Richard. (1972) “The World Well Lost,” The Journal of Philosophy, 69(19): 649-665.
  • Schwartz, Robert. (1986) “I’m Going to Make You a Star,” Midwest Studies in Philosophy, 11: 427-438.
  • Schwartz, Robert. (2000) “Starting from Scratch: Making Worlds,” Erkenntnis, 52: 151-159.
  • Searle, John. (1995) The Construction of Social Reality. USA: Free Press.
  • Sidelle, Alan. (1989) Necessity, Essence, and Individuation. London: Cornell University Press.
  • Thomasson, Amie. (1999) Fiction and Metaphysics. Cambridge: Cambridge University Press.
  • Thomasson, Amie. (2003) “Realism and Human Kinds,” Philosophy and Phenomenological Research, 67(3): 580-609.
  • Thomasson, Amie. (2007) Ordinary Objects. Oxford: OUP.
  • Varzi, Achille. (2011) “Boundaries, Conventions, and Realism” in Carving Nature at Its Joints (eds. Campbell et al.). Cambridge: MIT Press, 129-153.

c. Critics of Analytic Metaphysical Constructivism

  • Blackson, Thomas. (1992) “The Stuff of Conventionalism,” Philosophical Studies, 68(1): 65-81.
  • Boghossian, Paul. (2006) Fear of Knowledge: Against Relativism and Constructivism. New York: Oxford University Press.
  • Cortens, Andrew. (2002) “Dividing the World Into Objects” in Realism and Antirealism. (ed. Alston). Ithaca: Cornell University Press.
  • Davidson, Donald. (1974) “On the Very Idea of a Conceptual Scheme,” Proceedings and Addresses of the American Philosophical Association, 47: 5-20.
  • Devitt, Michael. (1997) Realism and Truth. Princeton: Princeton University Press.
  • Devitt, Michael. (2010) Putting Metaphysics First: Essays on Metaphysics and Epistemology. Oxford: Oxford University Press.
  • Elder, Crawford. (2011) “Carving Up a Reality in Which There Are No Joints” in A Companion to Relativism (ed. Hales). London: Blackwell, 604-620.
  • Korman, Daniel. (2016) Objects: Nothing Out of the Ordinary. Oxford: Oxford University Press.
  • Scheffler, Israel. (1980) “The Wonderful Worlds of Goodman,” Synthese, 45(2): 201-209.
  • Sider, Ted. (2011) Writing the Book of the World. Oxford: Oxford University Press.
  • Wieland, Jan. (2012) “Carving the World as We Please,” Philosophica, 84: 7-24.


Author Information

Dana Goswick
Email: dgoswick@unimelb.edu.au
University of Melbourne
Australia

Precautionary Principles

The basic idea underlying a precautionary principle (PP) is often summarized as “better safe than sorry.” Even if it is uncertain whether an activity will lead to harm, for example, to the environment or to human health, measures should be taken to prevent harm. This demand is partly motivated by the consequences of regulatory practices of the past. Often, chances of harm were disregarded because there was no scientific proof of a causal connection between an activity or substance and the harm in question, for example, between asbestos and lung diseases. When this connection was finally established, it was often too late to prevent severe damage.

However, it is highly controversial how the vague intuition behind “better safe than sorry” should be understood as a principle. As a consequence, we find a multitude of interpretations, ranging from decision rules and epistemic principles to procedural frameworks. To acknowledge this diversity, it makes sense to speak of precautionary principles (PPs) in the plural. PPs are not without critics. For example, it has been argued that they are paralyzing, unscientific, or promote a culture of irrational fear.

This article systematizes the different interpretations of PPs according to their functions, gives an overview of the main lines of argument in favor of PPs, and outlines the most frequent and important objections made to them.

Table of Contents

  1. The Idea of Precaution and Precautionary Principles
  2. Interpretations of Precautionary Principles
    1. Action-Guiding Interpretations
      1. Decision Rules
      2. Context-Sensitive Principles
    2. Epistemic Interpretations
      1. Standards of Evidence
      2. Type I and Type II Errors
      3. Precautionary Defaults
    3. Procedural Interpretations
      1. Argumentative, or “Meta”-PPs
      2. Transformative Decision Rules
      3. Reversing the Burden of Proof
      4. Procedures for Determining Precautionary Measures
    4. Integrated Interpretations
      1. Particular Principles for Specific Contexts
      2. An Adjustable Principle with Procedural Instructions
  3. Justifications for Precautionary Principles
    1. Practical Rationality
      1. Ordinary Risk Management
      2. PPs in the Framework of Ordinary Risk Management
      3. Reforming Ordinary Risk Management
    2. Moral Justifications for Precaution
      1. Environmental Ethics
      2. Harm-Based Justifications
      3. Justice-Based Justifications
      4. Rights-Based Justifications
      5. Ethics of Risk and Risk Impositions
  4. Main Objections and Possible Rejoinders
    1. PPs Cannot Guide Our Decisions
    2. PPs are Redundant
    3. PPs are Irrational
  5. References and Further Reading

1. The Idea of Precaution and Precautionary Principles

We can identify three main motivations behind the postulation of a PP. First, it stems from a deep dissatisfaction with how decisions were made in the past: Often, early warnings have been disregarded, leading to significant damage which could have been avoided by timely precautionary action (Harremoës and others 2001). This motivation for a PP rests on some sort of “inductive evidence” that we should reform (or maybe even replace) our current practices of risk regulation, demanding that uncertainty must not be a reason for inaction (John 2007).

Second, it expresses specific moral concerns, usually pertaining to the environment, human health, and/or future generations. This second motivation is often related to the call for sustainability and sustainable development: the demand not to destroy important resources for short-term gains, but to leave future generations with an intact environment.

Third, PPs are discussed as principles of rational choice under conditions of uncertainty and/or ignorance. Typically, rational decision theory is well suited for situations where we know the possible outcomes of our actions and can assign probabilities to them (a situation of “risk” in the decision-theoretic sense). However, the situation is different for decision-theoretic uncertainty (where we know the possible outcomes, but cannot assign any, or at least no meaningful and precise, probabilities to them) or decision-theoretic ignorance (where we do not know the complete set of possible outcomes). Although there are several suggestions for decision rules under these circumstances, it is far from clear how it is most rational to decide when we lack important information and the stakes are high. PPs are one proposal to fill this gap.

Although they are often asserted individually, these motivations also complement each other: If, as the first motivation demands, uncertainty is not allowed to be a reason for inaction, then we need some guidance for how to decide under such circumstances, for example, in the form of a decision principle. And in many cases, it is the second motivation—concerns for the environment or human health—which makes the demand for precautionary action before obtaining scientific certainty especially pressing.

Many existing official documents cite the demand for precaution. One often-quoted example for a PP is principle 15 of the Rio Declaration on Environment and Development, a result of the United Nations Conference on Environment and Development (UNCED) in 1992. It refers to a “precautionary approach”:

Rio PP—In order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation. (United Nations Conference on Environment and Development 1992, Principle 15)

Another prominent example is the formulation that resulted from the Wingspread Conference on the Precautionary Principle 1998, where around 35 scientists, lawyers, policy makers and environmentalists from the United States, Canada and Europe met to define a PP:

Wingspread PP—When an activity raises threats of harm to human health or the environment, precautionary measures should be taken even if some cause and effect relationships are not fully established scientifically. In this context the proponent of an activity, rather than the public, should bear the burden of proof. The process of applying the precautionary principle must be open, informed and democratic and must include potentially affected parties. It must also involve an examination of the full range of alternatives, including no action. (Science & Environmental Health Network (SEHN) 1998)

Both formulations are often cited as paradigmatic examples of PPs. Although they both mention uncertain threats and measures to prevent them, they also differ in important points, for example, their strength: The Rio PP makes a weaker claim, stating that uncertainty is not a reason for inaction, whereas the Wingspread PP puts more emphasis on the fact that measures should be taken. They both give rise to a variety of questions: What counts as “serious or irreversible damage”? What does “(lack of) scientific certainty” mean? How plausible does a threat have to be in order to warrant precaution? What counts as precautionary measures? Additionally, PPs face many criticisms, like being too vague to be action-guiding, paralyzing the decision-process, or being anti-scientific and promoting a culture of irrational fear.

Thus, inspired by these regulatory principles in official documents, a lively debate has developed around how PPs should be interpreted in order to arrive at a version applicable in practical decision-making. This resulted in a multitude of PP proposals that are formulated and defended (or criticized) in different theoretical and practical contexts. Most of the existing PP formulations share the elements of uncertainty, harm, and (precautionary) action. Different ways of spelling out these elements result in different PPs (Sandin 1999, Manson 2002). For example, they can vary in how serious a harm has to be in order to trigger precaution, or which amount of evidence is needed. Additionally, PP interpretations differ with respect to the function they are intended to fulfill. They are typically classified based on some combination of the following categories according to their function (Sandin 2007, 2009; Munthe 2011; Steel 2014):

  • Action-guiding principles tell us which course of action to choose given certain circumstances;
  • (sets of) epistemic principles tell us what we should reasonably believe under conditions of uncertainty;
  • procedural principles express requirements for decision-making, and tell us how we should choose a course of action.

These categories can overlap, for example, when action- or decision-guiding principles come with at least some indication for how they should be applied. Some interpretations explicitly aim at integrating the different functions, and warrant their own category:

  • Integrated PP interpretations: Approaches that integrate action-guiding, epistemic, and procedural elements associated with PPs. Consequently, they tell us which course of action should be chosen through which procedure, and on what epistemic base.

This article starts in Section 2 with an overview of different PP interpretations according to this functional categorization. Section 3 describes the main lines of arguments that have been presented in favor of PPs, and Section 4 presents the most frequent and most important objections that PPs face, along with possible rejoinders.

2. Interpretations of Precautionary Principles

a. Action-Guiding Interpretations

Action-guiding PPs are often seen on a par with decision rules from rational decision theory. On the one hand, authors formalize PPs by using decision rules already established in decision theory, like maximin. On the other hand, they formulate new principles. While not necessarily located within the framework of decision theory, those are intended to work at the same level. Understood as principles of risk management, they are supposed to help to determine a course of action given our knowledge and our values.

i. Decision Rules

The terms used for decision-theoretic categories of non-certainty differ. In this article, they are used as follows: Decision-theoretic risk denotes situations in which we know the possible outcomes of actions and can assign probabilities to them. Decision-theoretic uncertainty refers to situations in which we know the possible outcomes, but either no or only partial or imprecise probability information is available (Hansson 2005a, 27). When we do not even know the full set of possible outcomes, we have a situation of decision-theoretic ignorance. When formulated as decision rules, the “(scientific) uncertainty” component of PPs is often spelled out as decision-theoretic uncertainty.

Maximin
The idea to operationalize a PP with the maximin decision rule occurred early within the debate and is therefore often associated with PPs (for example, Hansson 1997; Sunstein 2005b; Gardiner 2006; Aldred 2013).

In order to be able to apply the maximin rule, we have to know the possible outcomes of our actions and be able to at least rank them on an ordinal scale (meaning that for each outcome, we can tell whether it is better than, worse than, or equally as good as every other possible outcome). The rule then tells us to select the option with the best worst case, in order to “maximize the minimum”. Thus, the maximin rule seems like a promising candidate for a PP. It pays special attention to the prevention of threats, and is applicable under conditions of uncertainty. However, as has repeatedly been pointed out, maximin is not a plausible rule of choice in general. Consider the decision matrix in Table 1.

               Scenario1    Scenario2
Alternative1       7            6
Alternative2      15            5

Table 1: Simplified Decision-Matrix with Two Alternative Courses of Action.

Maximin selects Alternative1. This seems excessively risk-averse: the best case of Alternative2 is much better, and its worst case only slightly worse, as long as we assume (a) that the utilities in this example are cardinal utilities, and (b) that no relevant threshold is passed between them. If we knew that the probability of Scenario1 is 0.99 and the probability of Scenario2 only 0.01, then it would arguably be absurd to apply maximin. Proponents of interpreting a PP with maximin have thus stressed that it needs to be qualified by some additional criteria in order to provide a plausible PP interpretation.
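
For concreteness, the two verdicts can be reproduced in a few lines of Python. This is only an illustrative sketch: the utilities come from Table 1, and the 0.99/0.01 probabilities are the ones assumed above.

    # Maximin vs. expected utility for the decision matrix in Table 1.
    options = {"Alternative1": [7, 6], "Alternative2": [15, 5]}

    # Maximin: choose the option whose worst outcome is best.
    print(max(options, key=lambda o: min(options[o])))  # -> Alternative1

    # With the probabilities assumed above, expected utility favors
    # Alternative2 (14.9 vs. 6.99).
    probs = [0.99, 0.01]
    expected = {o: sum(p * u for p, u in zip(probs, us))
                for o, us in options.items()}
    print(max(expected, key=expected.get))  # -> Alternative2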

The most prominent example is Gardiner (2006), who draws on criteria suggested by Rawls to determine conditions under which the application of maximin is plausible:

  1. Knowledge of likelihoods for the possible outcomes of the actions is impossible or at best extremely insecure;
  2. the decision-makers care relatively little for potential gains that might be made above the minimum that can be guaranteed by the maximin approach;
  3. the alternatives that will be rejected by maximin have unacceptable outcomes; and
  4. the outcomes considered are in some adequate sense “realistic”, that is, only credible threats should be considered.

Condition (3) makes it clear that the guaranteed minimum (condition 2) needs to be acceptable to the decision-makers (see also Rawls 2001, 98). What it means that ‘gains above the guaranteed minimum are relatively little cared for’ (condition 2) has been spelled out by Aldred (2013) in terms of incommensurability between outcome values, that is, that some outcomes are so bad that they cannot be outweighed by potential gains. It is thus better to choose an option that promises only modest gains but guarantees that the extremely bad outcome cannot materialize.

Gardiner argues that a maximin rule that is qualified by these criteria fits well with some core cases where we agree that precaution is necessary and calls it the “Rawlsian Core Precautionary Principle (RCPP)”. He names the purchase of insurance as an everyday example where his RCPP fits well with our intuitive judgments and where precaution seems already justified on its own. According to Gardiner, it also fits well with often-named paradigmatic cases for precaution like climate change: The controversy over whether or not we should take precautions in the climate case is not a debate about the right interpretation of the RCPP but rather about whether the conditions for its application are fulfilled—for example, which outcomes are unacceptable (Gardiner 2006, 56).

Minimax Regret
Another decision rule that is discussed in the context of PPs is the minimax regret rule. Whereas maximin selects the course of action with the best worst case, minimax regret selects the course of action with the lowest maximal regret. The regret of an outcome is calculated by subtracting its utility from the highest utility one could have achieved under this state by selecting another course of action. This strategy tries to minimize one’s regret for not having made the superior choice in hindsight. Like the maximin rule, the minimax regret rule does not presuppose any probability information. However, while for the maximin rule it is enough if outcomes can be ranked on an ordinal scale, the minimax regret rule requires that we are able to assign cardinal utilities to the possible outcomes. Otherwise, regret cannot be calculated.

Take the following example from Hansson (1997), in which a lake seems to be dying for reasons that we do not fully understand: “We can choose between adding substantial amounts of iron acetate, and doing nothing. There are three scientific opinions about the effects of adding iron acetate to the lake. According to opinion (1), the lake will be saved if iron acetate is added, otherwise not. According to opinion (2), the lake will self-repair anyhow, and the addition of iron acetate makes no difference. According to opinion (3), the lake will die whether iron acetate is added or not.” The consensus is that the addition of iron acetate will have certain negative effects on land animals that drink water from the lake, but that effect is less serious than the death of the lake. Assigning the value -12 to the death of the lake and -5 to the negative effects of iron acetate in the drinking water, we arrive at the utility matrix in Table 2.

                    (1)     (2)     (3)
Add iron acetate    -5      -5     -17
Do nothing         -12       0     -12

Table 2: Utility-Matrix for the Dying-Lake Case

We can then obtain the regret table by subtracting the utility of each outcome from the highest utility in each column, the result being Table 3. Minimax regret then selects the option to add iron acetate to the lake.

                    (1)     (2)     (3)
Add iron acetate     0       5       5
Do nothing           7       0       0

Table 3: Regret-Matrix for the Dying-Lake Case
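
The regret computation can be checked with a short Python sketch (the utilities are those of Table 2; the code itself is only an illustration):

    # Compute the regret matrix for Hansson's dying-lake case and apply
    # minimax regret.
    utilities = {"add iron acetate": [-5, -5, -17],
                 "do nothing":       [-12, 0, -12]}

    # Regret = (best utility in the column) - (utility of the outcome).
    col_max = [max(col) for col in zip(*utilities.values())]
    regret = {act: [m - u for m, u in zip(col_max, us)]
              for act, us in utilities.items()}
    print(regret)  # {'add iron acetate': [0, 5, 5], 'do nothing': [7, 0, 0]}

    # Minimax regret: choose the act with the smallest maximal regret.
    print(min(regret, key=lambda a: max(regret[a])))  # -> add iron acetate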

Chisholm and Clarke (1993) strongly support the minimax regret rule. They argue that it is better suited for a PP than maximin, since it gives some weight to foregone benefits. They also show that even if it is uncertain whether precautionary measures will be effective, minimax regret still recommends them as long as the expected damage from not implementing them is large enough. They advocate so-called “dual purpose” policies, where precautionary measures have other positive effects, even if they do not fulfill their main purpose. One example is measures that are aimed at abating global climate change, but at the same time have direct positive effects on local environmental problems. By contrast, Hansson (1997) argues that to take precautions means to avoid bad outcomes, and especially to avoid worst cases. Consequently, he defends maximin, not minimax regret, as the adequate PP interpretation. Maximin would, as Table 2 shows, select not adding iron acetate to the lake. According to Hansson, this is the precautionary choice, since adding iron acetate could lead to a worse outcome than not adding it.

ii. Context-Sensitive Principles

Other interpretations of PPs as action-guiding principles differ from stand-alone if-this-then-that decision rules. They stress that principles have to be interpreted and concretized depending on the specific context (Fisher 2002; Randall 2011).

A Virtue Principle
Sandin (2009) argues that one can reinterpret a PP as an action-guiding principle not by reference to decision theory, but by using cautiousness as a virtue. He formulates an action-guiding virtue principle of precaution (VPP):

VPP—Perform those, and only those, actions that a cautious agent would perform in the circumstances. (Sandin 2009, 98)

Although virtue principles are commonly criticized as not being action-guiding, Sandin argues that understanding a PP in this way actually makes it more action-guiding. “Cautious” is interpreted as a virtue term that refers to a property of an agent, like “courageous” or “honest”. Sandin states that it is often possible to identify what the virtuous agent would do: Either because it is obvious, or because at least some agreement can be reached. Even the uncertain cases the VPP deals with belong to classes of situations with which we have experience, for example, failed regulations of the past, and we can therefore assess what the cautious agent would (not) have done and extrapolate from that to other cases (Sandin 2009, 99). According to Sandin, interpreting a PP as a virtue principle avoids both objections of extremism and of paralysis. It is unlikely that the virtuous agent will choose courses of action which will, in the long run, have overall negative effects or which are self-refuting (like “ban activity a and do not ban activity a!”). However, even if one accepts that it makes sense to interpret “cautious” as a virtue, “the circumstances” under which one should choose the course of action that the cautious agent would choose are not specified in the VPP as Sandin formulates it. This makes it an incomplete proposal.

Reasonableness and Plausibility
Another important example is the PP interpretation by Resnik (2003, 2004), who defends a PP as an alternative to maximin and other strategies for decision-making in situations where we lack the type of empirical evidence that one would need for a risk management that uses probabilities obtained from risk assessment. His PP interpretation, which we can call the “reasonable measures precautionary principle” (RMPP), reads as follows:

RMPP—One should take reasonable measures to prevent or mitigate threats that are plausible and serious.

The seriousness of a threat relates to its potential for harm, as well as to whether or not the possible damage is seen as reversible (Resnik 2004, 289). Resnik emphasizes that reasonableness is a highly pragmatic and situation-specific concept. He names some neither exhaustive nor necessary criteria for reasonable responses: They should be effective, proportional to the nature of the threat, take a realistic attitude toward the threat, be cost-effective, and be applied consistently (Resnik 2003, 341–42). Lastly, that threats have to be plausible means that there have to be scientific arguments for the plausibility of a hypothesis. These can be based on epistemic and/or pragmatic criteria, including, for example, coherence, explanatory power, analogy, precedence, precision, or simplicity. Resnik stresses that a threat being plausible is not the same as a threat being even minimally probable: We might accept threats as plausible that we think are all but impossible to come to fruition (Resnik 2003, 341).

This shows that the question when a threat should count as plausible enough to warrant precautionary measures is very important for the application of an action-guiding PP. Consequently, such PPs are often very sensitive to how a problem is framed. Some authors took these aspects—the weighing of evidence and the description of the decision problem—to be central points of PPs, and interpreted them as epistemic principles, that is, principles at the level of risk assessment.

b. Epistemic Interpretations

Authors that defend an epistemic PP interpretation argue that we should accept that PPs are not principles that can guide our actions, but that this is neither a problem nor against their spirit. Instead of telling us how to act when facing uncertain threats of harm, they propose that PPs tell us something about how we should perceive these threats, and what we should take as a basis for our actions, for example, by relaxing the standard for the amount of evidence required to take action.

i. Standards of Evidence

One interpretation of an epistemic PP is to give more weight to evidence suggesting a causal link between an activity and threats of serious and irreversible harm than one gives to evidence suggesting less dangerous, or beneficial, effects. This could mean assigning a higher probability to an effect occurring than one would in other circumstances based on the same evidence. Arguably, the underlying idea of this PP can be traced back to the German philosopher Hans Jonas, who proposed a “heuristic of fear”, that is, to give more weight to pessimistic forecasts than to optimistic ones (Jonas 2003). However, this PP interpretation has been criticized on the basis that it systematically discounts evidence pointing in one direction, but not in the other. This could lead to distorted beliefs about the world in the long run, which would be detrimental to our epistemic and scientific progress and might eventually do more harm than good (Harris and Holm 2002).

However, other authors point out that we might have to distinguish between “regulatory science” and “normal science”. Different epistemic standards are appropriate for the two contexts since they have different aims: In normal science, we are searching for truth; in regulatory science, we are primarily interested in reducing risk and avoiding harm (John 2010). Accordingly, Peterson (2007a) refers in his epistemic PP interpretation only to decision makers—not scientists—who find themselves in situations involving risk or uncertainty. He argues that in such cases, decision-makers should strive to acquire beliefs that are likely to protect human health, and that it is less important whether they are also likely to be true. One principle that has been promoted in order to capture this idea is the preference for false positives, that is, for type I errors over type II errors.

ii. Type I and Type II Errors

Is it worse to assert a relationship between two classes of events that does not in fact exist (a false positive), or to fail to assert such a relationship when it does exist (a false negative)? For example, would you prefer virus software on your computer which classifies a harmless program as a virus (a false positive), or rather one that misses a malicious program (a false negative)? Statistical hypothesis testing tests the so-called null hypothesis, which is the default view that there is no relationship between two classes of events, or groups. Rejecting a true null hypothesis is called a type I error, whereas failing to reject a false null hypothesis is a type II error. Which type of possible error should we try to minimize, if we cannot minimize both at once?

In (normal) science, it is valued higher not to include false assertions into the body of knowledge, which would distort it in the long term. Thus, the default assumption—the null hypothesis—is that there is no connection between two classes of events, and typically statistical procedures are used that minimize type I errors (false positives) even if this might mean that an existing connection is missed (at least at first, or for a long time) (John 2010). To believe that a certain existing deterministic or probabilistic connection between two classes of events does not exist might slow down the scientific progress in normal science aiming at truth. However, in regulatory contexts it might be disastrous to believe falsely that a substance is safe when it is not. Consequently, a prominent interpretation of an epistemic PP takes it to entail a preference for type I errors over type II errors in regulatory contexts (see for example Lemons, Shrader-Frechette, and Cranor 1997; Peterson 2007a; John 2010).
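
The trade-off can be made vivid with a small simulation. The following Python sketch is only an illustration, and the effect size, sample size, and critical values are arbitrary assumptions; it shows that tolerating a higher type I rate (a larger alpha) lowers the type II rate, which is the trade a precaution-oriented regulatory standard of evidence makes.

    # Approximate one-sided z-test: H0 says the true effect is 0.
    import random
    import statistics

    random.seed(0)
    CRITICAL = {0.05: 1.645, 0.20: 0.842}  # standard-normal critical values

    def rejection_rate(true_effect, alpha, n=30, trials=2000):
        """Fraction of simulated studies in which H0 is rejected."""
        rejections = 0
        for _ in range(trials):
            sample = [random.gauss(true_effect, 1.0) for _ in range(n)]
            z = statistics.mean(sample) / (statistics.stdev(sample) / n ** 0.5)
            if z > CRITICAL[alpha]:
                rejections += 1
        return rejections / trials

    for alpha in (0.05, 0.20):
        type1 = rejection_rate(0.0, alpha)      # false-positive rate
        type2 = 1 - rejection_rate(0.3, alpha)  # missed real effect
        print(f"alpha={alpha}: type I ~ {type1:.2f}, type II ~ {type2:.2f}")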

Merely favoring one type of error over another might not be enough. It has been argued that the underlying methodology of either rejecting or accepting hypotheses does not sufficiently allow for identifying and tracking uncertainties. If a PP is understood as a principle that relaxes the standard for the amount of evidence required to take action, then a new epistemology might be needed: One that allows integrating the uncertainty about the causal connection between, for example, a drug and a harm, in the decision (Osimani 2013).

iii. Precautionary Defaults

The use of precautionary regulatory defaults is one proposal for how to deal with having to make regulatory decisions in the face of insufficient information (Sandin and Hansson 2002; Sandin, Bengtsson, and others 2004). In regulatory contexts, there are often situations in which a decision has to be made on how to treat a potentially harmful substance that also has some (potential) benefits. Unlike in normal science, it is not possible to wait and collect further evidence before a verdict is made. The substance has to be treated one way or another while waiting for further evidence. Thus, it has been suggested that we should use regulatory defaults, that is, assumptions that are used in the absence of adequate information and that should be replaced if such information were obtained. They should be precautionary defaults by building in special margins of safety in order to make sure that the environment and human health get sufficient protection. One example is the use of uncertainty factors in toxicology. Such uncertainty factors play a role in estimating reference doses which are acceptable for humans by dividing a level of exposure found acceptable in animal experiments by a number (usually 100) (Steel 2011, 356). This takes into account that there are significant uncertainties, for example, in extrapolating the results from animals to humans. Such defaults are a way to handle uncertain threats. Nevertheless, they should not be confused with actual judgments about what properties a particular substance has (Sandin, Bengtsson, and others 2004, 5). Consequently, an epistemic PP does not have to be understood as a belief-guiding principle, but as saying something about which methods for risk assessment are legitimate, for example, for quantifying uncertainties (Steel 2011). According to this view, precautionary defaults like uncertainty factors in toxicology are methodological implications of a PP that allow one to apply it in a scientifically sound way while protecting human health and the environment.
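
As a toy illustration of such a default, consider the arithmetic of an uncertainty factor. The NOAEL value and the 10 × 10 decomposition below are hypothetical assumptions, though the total factor of 100 is the figure mentioned above.

    # Derive a reference dose by dividing an animal-study NOAEL
    # (no-observed-adverse-effect level) by uncertainty factors.
    def reference_dose(noael, interspecies=10, intraspecies=10):
        """NOAEL in mg/kg/day divided by safety factors (here 10 * 10 = 100)."""
        return noael / (interspecies * intraspecies)

    # A hypothetical NOAEL of 50 mg/kg/day yields a reference dose of 0.5.
    print(reference_dose(50.0))  # -> 0.5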

Given this, it might be misleading to interpret a PP as a purely epistemic principle, if it is not guiding our beliefs but telling us what assumptions to accept, that is, to act as if certain things were true, as long as we do not have more information. Thus, it has been argued that a PP is better interpreted as a procedural requirement, or as a principle that imposes several such procedural requirements (Sandin 2007, 103–4).

c. Procedural Interpretations

It has been argued that we should shift our attention when interpreting PPs from the question of what action to take to the question of what is the best way to reach decisions.

i. Argumentative, or “Meta”-PPs

Argumentative PPs are procedural principles specifying what kinds of arguments are admissible in decision-making (Sandin, Peterson, and others 2002). They are different from prescriptive, or action-guiding, PPs in that they do not directly prescribe actions that should be taken. Take principle 15 of the Rio Declaration on Environment and Development. On one interpretation, it states that arguments for inaction which are based solely on the ground that we are lacking full scientific certainty, are not acceptable arguments in the decision-making procedure:

Rio PP—“In  order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation.” (United Nations Conference on Environment and Development 1992, Principle 15)

Such an argumentative PP is seen as a meta-rule that places real constraints on what types of decision rules should be used: For example, by entailing that decision-procedures should be used that are applicable under conditions of uncertainty, it recommends against some of the traditional approaches in risk regulation like cost-benefit analysis (Steel 2014). Similarly, it has been proposed that the idea behind PPs is best interpreted as a general norm that demands a fundamental shift in our way of risk regulation, based on an obligation to learn from regulatory mistakes of the past (Whiteside 2006).

ii. Transformative Decision Rules

Similar to argumentative principles, an interpretation of a PP as a transformative decision rule doesn’t tell us which action should be taken, but it puts constraints on which actions can be considered as valid options. Informally, a transformative decision rule is defined as a decision rule that takes one decision problem as input, and yields a new decision problem as output (Sandin 2004, 7). For example, the following formulation of a PP as a transformative decision rule (TPP) has been proposed by Peterson (2003):

TPP—If there is a non-zero probability that the value of the outcome of an alternative act is very low, that is, below some constant c, then this act should be removed from the decision-maker’s list of options.

Thus, the TPP excludes courses of action that could lead, for example, to catastrophic outcomes from the options available to the decision-maker. However, it does not tell us which of the remaining options should be chosen.
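
A minimal sketch of the TPP’s transformative behavior (a paraphrase for illustration, not Peterson’s own formalism; listing an outcome for an act is taken to mean that the outcome has non-zero probability):

    # A transformative decision rule maps a decision problem to a new one.
    # Here, every act with a possible outcome value below c is removed.
    def apply_tpp(problem, c):
        """problem maps each act to its list of possible outcome values."""
        return {act: outcomes for act, outcomes in problem.items()
                if min(outcomes) >= c}

    problem = {"alternative1": [7, 6], "alternative2": [15, -100]}
    print(apply_tpp(problem, c=-50))  # alternative2 is no longer an option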

iii. Reversing the Burden of Proof

The requirement of reversing the burden of proof is one of the most prominent specific procedural requirements named in connection with PPs. For example, the influential communication on the PP from the Wingspread Conference on the Precautionary Principle (1998) states that “the proponent of an activity, rather than the public, should bear the burden of proof.”

One common misconception is that the proponent of a potentially dangerous activity would have to prove with absolute certainty that the activity is safe. This gave rise to the objection that PPs are too demanding and would therefore bring all progress to a halt (Harris and Holm 2002). However, the idea is rather that we have to change our approach to regulatory policy: Proponents of an activity have to prove to a certain threshold that it is safe in order to employ it, instead of the opponents having to prove to a certain threshold that it is harmful in order to ban it.

Thus, whether or not the situation is one in which the burden of proof is reversed depends on the status quo. Instead of speaking of shifting the burden of proof, it seems more sensible to ask what has to be proven, and who has to provide what kind of evidence for it. The important point that then remains to be clarified is what standards of proof are accepted.

An alternative proposal to shifting the burden of proof is that regulators and proponents of an activity should share it (Arcuri 2007): If opponents want to regulate an activity, they should at least provide some evidence that the activity might lead to serious or irreversible harm, even though we are lacking evidence to prove it with certainty. Proponents, on the other hand, should provide certain information about the activity in order to get it approved. Who has the burden of proof can play an important role in the production of information: If proponents have to show (to a certain standard) that their activity is safe, this generates an incentive to gather information about the activity, whereas in the other case—“safe until proven otherwise”—they might deliberately refrain from this (Arcuri 2007, 15).

iv. Procedures for Determining Precautionary Measures

Interpreted in a procedural way, a PP puts constraints on how a problem should be described or how a decision should be made. It does not dictate a specific decision or action. This is in line with one interpretation of what it means to be a principle as opposed to a rule. While rules specify precise consequences that follow automatically when certain conditions are met, principles are understood as guidelines whose interpretation will depend on specific contexts (Fisher 2002; Arcuri 2007).

Developing a procedural precautionary framework that integrates different procedural requirements is a way to enable the context-dependent specification and implementation of such a PP. One example is Tickner’s (2001) “precautionary assessment” framework, which consists of six steps that are supposed to guide decision-making as a heuristic device. The first five steps—(1) Problem Scoping, (2) Participant Analysis, (3) Burden/Responsibility Allocation Analysis, (4) Environment and Health Impact Analysis, and (5) Alternatives Assessment—serve to describe the problem, identify stakeholders, and assess possible consequences as well as available alternatives. In the final step, (6) Precautionary Action Analysis, the appropriate precautionary measure(s) are determined based on the results from the other steps. These decisions are not permanent, but should be part of a continuous process of increasing understanding and reducing overall impacts.

That the components are clarified on a case-by-case basis is a big advantage of such procedural implementations of PPs. It avoids an oversimplification of the decision process and takes the complexity of decisions under uncertainty into account. However, they are criticized for losing the “principle” part of PPs: For example, Sandin (2007) argues that procedural requirements form a heterogeneous category. A procedural PP would soon dissolve beyond recognition because it is intermingled with other (rational, legal, moral, and so forth) principles and rules. In response, some authors try to preserve the “principle” in PPs, while also taking into account procedural as well as epistemic elements.

d. Integrated Interpretations

We can find two main strategies for formulating a PP that is still identifiable as an action-guiding principle while integrating procedural as well as epistemic considerations: Either (1) developing particular principles that are specific to a certain context, and accompanied by a procedural framework for this context; or (2) describing the structure and the main elements of a PP plus naming criteria for adjusting those elements on a case-by-case basis.

i. Particular Principles for Specific Contexts

It has been argued that the general talk of “the” PP should be abandoned in favor of formulating distinct precautionary principles (Hartzell-Nichols 2013). This strategy aims to arrive at action-guiding and coherent principles by formulating particular PPs that apply to a narrow range of threats and express a specific obligation. One example is the “Catastrophic Harm PP (CHPP)” of Hartzell-Nichols (2012, 2017), which is restricted to catastrophic threats. It consists of eight conditions that specify when precautionary measures have to be taken, spelling out (a) what counts as a catastrophe, (b) the knowledge requirements for taking precaution, and (c) criteria for appropriate precautionary measures. The CHPP is accompanied by a “Catastrophic Precautionary Decision-Making Framework” which guides the assessment of threats in order to decide whether they meet the CHPP’s criteria, and guides decision-makers in determining what precautionary measures should be taken against a particular threat of catastrophe. This framework lists key considerations and steps that should be performed when applying the CHPP, for example, drawing on all available sources of information, assessing likelihoods of potential harmful outcomes under different scenarios, identifying all available courses of precautionary action and their effectiveness, and identifying specific actors who should be held responsible for taking the prescribed precautionary measures.

ii. An Adjustable Principle with Procedural Instructions

Identifying main elements of a PP and accompanying them with rules for adjusting them on a case-by-case basis is another strategy to preserve the idea of a precautionary principle while avoiding both inconsistency as well as vagueness. It has been shown that as diverse as PP formulations are, they typically share the elements of uncertainty, harm, and (precautionary) action (Sandin 1999, Manson 2002). By explicating these concepts and, most importantly, by defining criteria for how they should be adjusted with respect to each other, some authors obtain a substantial PP that can be adjusted on a case-by-case basis without becoming arbitrary.

One example is the PP that Randall (2011) develops in the context of an in-depth analysis of traditional, or as he calls it, ordinary risk management (ORM). Randall identifies the following “general conceptual form of PP”:

If there is evidence stronger than E that an activity raises a threat more serious than T, we should invoke a remedy more potent than R.

Threat, T, is explicated as chance of harm, meaning that threats are assessed and compared according to their magnitude and likelihood. Our knowledge of outcomes and likelihoods is explicated with the concept of evidence, E, referring to uncertainty in the sense of our incomplete knowledge about the world. The precautionary response is conceptualized as remedy, R, which covers a wide range of responses: averting the threat, remediating its damage, mitigating harm, and adapting to changed conditions after other remedies have been exhausted. Remedies should fulfill a double function, (1) providing protection from a plausible threat, while at the same time (2) generating additional evidence about the nature of the threat and the effectiveness of various remedial actions. The main relation between the three elements is this: the more likely it is that the remedy-process will generate additional evidence, the lower the threat-standard and the evidence-standard that should be required before invoking the remedy, even if we have concerns about its effectiveness (Randall 2011, 167).

Having clarified the concepts used in this evidence-threat-remedy (ETR) framework, Randall specifies them in order to formulate a PP that accounts for the weaknesses of ORM:

Credible scientific evidence of plausible threat of disproportionate and (mostly but not always) asymmetric harm calls for avoidance and remediation measures beyond those recommended by ordinary risk management. (Randall 2011, 186)

He then goes on to integrate this PP and ORM together into an integrated risk management framework. Randall makes sure to stress that a PP cannot determine the decision-process on its own. As a moral principle, it has to be weighed against other moral, political, economic, and legal considerations. Thus, he also calls for the development of a procedural framework to ensure that its substantial normative commitments will be implemented on the ground (Randall 2011, 207).

Steel (2014, 2013) develops a comprehensive PP interpretation which is intended to be “a procedural requirement, a decision rule, and an epistemic rule” (Steel 2014, 10). Referring to the Rio Declaration, Steel argues that such a formulation of a PP states that our decision-process should be structured differently, namely that decision-rules should be used that can be applied in an informative way under uncertainty. However, he does not take this procedural element to be the whole PP, but interprets it as a “meta”-rule which guides the application and specification of the precautionary “tripod” of threat, uncertainty, and precautionary action. More specifically, Steel’s proposed PP consists of three core elements:

  • The Meta Precautionary Principle (MPP): Uncertainty must not be a reason for inaction in the face of serious threats.
  • The Precautionary Tripod: The elements that have to be specified in order to obtain an action-guiding precautionary principle version, namely: If there is a threat that meets the harm condition under a given knowledge condition then a recommended precaution should be taken.
  • Proportionality: Demands that the elements of the Precautionary Tripod be adjusted proportionally to each other, understood as two sub-conditions. Consistency: the recommended precaution must not be recommended against by the same PP version. Efficiency: among those precautionary measures that can be consistently recommended by a PP version, the least costly one should be chosen.

An application of this PP requires selecting what Steel calls a “relevant version of PP,” that is, a specific instance of the Precautionary Tripod that meets the constraints of both MPP and Proportionality. To obtain such a version, Steel (2014, 30) proposes the following strategy: (1) select a desired safety target and define the harm condition as a failure to meet this target; (2) select the least stringent knowledge condition that results in a consistently applicable version of PP given the harm condition. To comply with the MPP, uncertainty must neither render the PP version inapplicable nor lead to continual delay in taking measures to prevent harm.
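
The two-step selection strategy can be pictured as a simple search. In the following sketch, the condition names, cost figures, and the consistency test are invented placeholders, not Steel’s notation; consistency and cost are supplied as black-box tests.

```python
# A schematic sketch of the two-step strategy; the condition names,
# cost figures, and consistency test are invented for illustration.

def select_pp_version(knowledge_conditions, precautions, is_consistent, cost):
    """Return the least stringent knowledge condition that admits a
    consistent precaution, paired with the least costly such precaution."""
    for kc in knowledge_conditions:  # ordered least stringent first
        consistent = [p for p in precautions if is_consistent(kc, p)]
        if consistent:  # Efficiency: pick the cheapest consistent measure
            return kc, min(consistent, key=cost)
    return None  # no version satisfies the constraints

knowledge_conditions = ["suggestive evidence", "balance of evidence",
                        "scientific consensus"]
precautions = ["monitoring", "moratorium"]
costs = {"monitoring": 1, "moratorium": 10}

print(select_pp_version(
    knowledge_conditions, precautions,
    # Toy consistency test: a moratorium on merely suggestive evidence
    # would be recommended against by the same PP version.
    is_consistent=lambda kc, p: not (kc == "suggestive evidence"
                                     and p == "moratorium"),
    cost=lambda p: costs[p],
))  # ('suggestive evidence', 'monitoring')
```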

Thus, Steel’s PP proposal guides decision-makers both in formulating the appropriate PP version and in applying it. The process of formulating the particular version already deals with many questions, such as how evidence should be assessed, who has to prove what, to what kinds of threats we should react, and what appropriate precautionary measures would be. Arguably, this PP can thereby be action-guiding, since it helps to select specific measures, without being a rigid prescriptive rule ill-suited to decisions under uncertainty.

Additionally, proposals like those of Randall and Steel have the advantage that they are not rigidly tied to a specific category of decision-theoretic non-certainty, that is, decision-theoretic risk, uncertainty, or ignorance. They can be adjusted with respect to varying degrees of knowledge and available evidence, taking into account that we typically have some imprecise or vague sense of how likely various outcomes are, but not enough of a sense to assign meaningful precise probabilities to them. While these situations do not amount to decision-theoretic risk, they nonetheless include more information than what is often taken to be available in decision-theoretic uncertainty. Arguably, this corresponds better to the notion of “scientific uncertainty” than equating the latter with decision-theoretic uncertainty does (see Steel 2014, Chapter 4).

3. Justifications for Precautionary Principles

This section surveys different normative backgrounds that have been used to defend a PP. It starts by addressing arguments that can be located in the framework of practical rationality, before moving to substantial moral justifications for precautions.

a. Practical Rationality

When PPs are proposed as principles of practical rationality, they are typically seen as principles of risk regulation. This includes, but is not limited to, rational choice theory. When we examine the justifications for PPs in this context, we have to do so against the background of established risk regulation practices. We can identify a rather standardized approach to the assessment and management of risks, which Randall (2011, 43) calls “ordinary risk management” (ORM).

i. Ordinary Risk Management

Although there are different understandings of ORM, we can identify a rather robust “core” of two main parts. First, a scientific risk assessment is conducted, in which potential outcomes are identified and their extent and likelihood estimated (compare Randall 2011, 43–46). Typically, risk assessment is understood as a quantitative endeavor that expresses its results numerically (Zander 2010, 17). Second, on the basis of the data obtained from the risk assessment, the risk management phase takes place. Here, alternative regulatory courses of action in response to the scientifically estimated risks are discussed, and a choice is made between them. While the risk assessment phase should be as objective and value-free as possible, the decisions that take place in the risk management phase should be, although informed by science, based on the values and interests of the parties involved. In ORM, cost-benefit analysis (CBA) is a powerful and widely used tool for making these decisions in the risk-management phase. To conduct a CBA, the results from the risk assessment, that is, what outcomes are possible under which course of action, are evaluated according to the willingness to pay (WTP) or willingness to accept compensation (WTA) of individuals in order to estimate the benefits and costs of different courses of action. That means that non-economic values, like human lives or environmental preservation, are monetized in order to be comparable on a common ratio-scale. Since we rarely if ever face cases of certainty, where each course of action has exactly one outcome which will materialize if we choose it, the utilities so reached are then probability-weighted and summed in order to arrive at the expected utility of the different courses of action. On this basis, it is possible to calculate which regulatory actions have the highest expected net benefits (Randall 2011, 47), that is, to apply the principle of maximizing expected utility (MEU) and to choose the option with the highest expected utility. CBA is seen as a tool that enables decision-makers to rationally compare costs and benefits, helping them to come to an informed decision (Zander 2010, 4).
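
As a toy illustration of the MEU step in a CBA, consider the following sketch; the monetized outcomes and probabilities are invented for illustration.

```python
# Toy CBA/MEU computation with invented, monetized outcomes.

options = {
    # option: list of (probability, monetized net benefit) pairs
    "regulate":        [(1.0, -10.0)],               # certain compliance cost
    "do not regulate": [(0.9, 0.0), (0.1, -200.0)],  # small chance of large harm
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

for name, outcomes in options.items():
    print(f"{name}: expected net benefit {expected_utility(outcomes):.1f}")

best = max(options, key=lambda o: expected_utility(options[o]))
print("MEU choice:", best)  # "regulate" (-10.0 beats -20.0)
```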

In the context of ORM, we can distinguish two main lines of argumentation for PPs: On the one hand, authors argue that PPs are rational by trying to show that they gain support from ORM. On the other hand, authors argue that ORM itself is problematic in some aspects, and propose PPs as a supplement or alternative to it. In both cases, we find justifications for PPs as decision rules for risk management as well as principles that pertain to the risk assessment stage and are concerned with problem-framing (this includes epistemic and value-related questions).

ii. PPs in the Framework of Ordinary Risk Management

To begin, here are some ways in which people propose to locate and defend PPs within ORM.

Expected Utility Theory
Some authors claim that as long as we can assign probabilities to the various outcomes, that is, as long as we are in a situation of decision-theoretic risk, precaution is already “built into” ORM (Chisholm and Clarke 1993; Gardiner 2006; Sunstein 2007). The argument is roughly that no additional PP is necessary because expected utility theory, in combination with the assumption of decreasing marginal utility, allows for risk aversion by placing greater weight on the disutility of large damages. Avoiding options with possibly catastrophic outcomes, even if those outcomes have only a small probability, would thus be recommended by the principle of maximizing expected utility (MEU) as a consequence of their large disutility.
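
The following sketch shows how a concave (decreasing marginal) utility function makes MEU risk-averse; the wealth levels and probabilities are invented for illustration.

```python
# Sketch: with a concave utility function, MEU already penalizes
# catastrophic losses. Wealth levels and probabilities are invented.
import math

def utility(wealth: float) -> float:
    return math.log(wealth)  # concave: decreasing marginal utility

def expected_utility(lottery):
    return sum(p * utility(w) for p, w in lottery)

safe   = [(1.0, 90.0)]                 # sure loss of 10 from wealth 100
gamble = [(0.95, 100.0), (0.05, 1.0)]  # small chance of near-total loss

# The gamble has the higher expected wealth (95.05 vs. 90), yet the safe
# option has the higher expected utility: risk aversion is "built in."
print(expected_utility(safe))    # ~4.50
print(expected_utility(gamble))  # ~4.37
```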

This argumentation does not go unchallenged, as subsection 3.a.iii shows. Additionally, MEU itself is not uncontroversial (see Buchak 2013). Still, even if we accept it, we cannot use MEU under conditions of decision-theoretic uncertainty, since it relies on probability information. Consequently, authors have proposed PPs for decisions under uncertainty in order to fill this “gap” in the ORM framework. They argue that under decision-theoretic uncertainty it is rational to be risk-averse, and they try to demonstrate this with arguments based on rational choice theory. However, it is not always clear whether the discussed decision rule is used to justify a PP that has already been formulated in some way, or whether the decision rule is proposed as a PP itself.

Maximin and Minimax Regret
Both the maximin rule—selecting the course of action with the best worst case—and the minimax regret rule—selecting the course of action whose maximal regret across possible scenarios is smallest—have been proposed and discussed as possible formalizations of a PP within the ORM framework. It has been argued that maximin captures the underlying intuition of PPs (namely, that the worst should be avoided) and that it yields rational decisions in relevant cases (Hansson 1997). Although the rationality of maximin is contested (Harsanyi 1975; Bognar 2011), it is argued that we can qualify it with criteria that single out the cases in which it can—and should—rationally be applied (Gardiner 2006). This is done by showing that a so-qualified maximin rule fits with paradigm cases of precaution and with commonsense decisions that we make, and by arguing that it is therefore plausible to adopt it for further cases as well.

Chisholm and Clarke (1993) argue that the minimax regret rule leads to the prevention of uncertain harm in line with the basic idea of a PP, while also giving some weight to forgone benefits. Against minimax regret and in favor of maximin, Hansson (1997, 297) argues, firstly, that minimax regret presupposes more information, since we need to be able to assign numerical utilities to outcomes. Secondly, he uses a specific example to show that minimax regret and maximin can lead to conflicting recommendations. According to Hansson, the recommendation made by maximin expresses a higher degree of precaution.
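
Both rules, and the fact that they can disagree, can be shown with a small payoff table; the numbers below are invented to produce a divergence and are not Hansson’s own example.

```python
# Maximin and minimax regret on an invented payoff table; the numbers
# are chosen so that the two rules disagree (not Hansson's own example).

payoffs = {        # action -> utility in each possible state
    "A": [0, 10],
    "B": [1, 1],
}
n_states = 2

def maximin(table):
    return max(table, key=lambda a: min(table[a]))

def minimax_regret(table):
    best = [max(table[a][s] for a in table) for s in range(n_states)]
    def max_regret(a):
        return max(best[s] - table[a][s] for s in range(n_states))
    return min(table, key=max_regret)

print(maximin(payoffs))         # B: its worst case (1) beats A's (0)
print(minimax_regret(payoffs))  # A: maximal regret 1, versus 9 for B
```

In this table, maximin recommends B, the option whose worst case is least bad, while minimax regret recommends A; this illustrates the sense in which maximin is the more precautionary of the two rules.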

Quasi-Option Value
Irreversible harm is mentioned in many PP formulations, for example in the Rio Declaration. One proposal for why “irreversibility” justifies precautions refers to the concept of “(quasi-)option value” (Chisholm and Clarke 1993; Sunstein 2005a, 2009), first introduced by Arrow and Fisher (1974). They show that when regulators are confronted with decision problems where (a) they are uncertain about the outcomes of the options, (b) there are chances for resolving or reducing these uncertainties in the future, and (c) one or more of the options might entail irreversible outcomes, then they should attach an extra value, an option value, to the reversible options. This takes into account the value of the options that choosing an alternative with an irreversible outcome would foreclose. To illustrate this, think of the logging of (a part of) a rain forest: it is a very complex ecosystem, which we could use in many ways, but once it is clear-cut, it is almost impossible to restore to its original state. By choosing the option to cut it down, all options to use the rain forest in any other way would practically be lost forever. As Chisholm and Clarke (1993, 115) point out, irreversibility might sometimes be associated with not taking action now: not mitigating greenhouse gas (GHG) emissions means that more and more GHGs accumulate in the atmosphere, where they stay for a century or more. They argue that introducing the concept of quasi-option value supports the application of a PP even if decision makers are not risk-averse.
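
A stylized two-period calculation in the spirit of Arrow and Fisher (1974) can make the idea concrete; all numbers below are invented and the model is a deliberate simplification.

```python
# A stylized two-period illustration of quasi-option value; all numbers
# are invented and the model simplifies Arrow and Fisher (1974).

timber_value = 5.0                 # immediate, irreversible payoff of logging
conservation_values = [1.0, 10.0]  # possible long-run values of intact forest
probs = [0.5, 0.5]                 # equally likely; learned only by waiting

# Naive expected value of preserving, ignoring the chance to learn:
naive_preserve = sum(p * v for p, v in zip(probs, conservation_values))  # 5.5

# Waiting keeps both options open: once the value is learned, we can
# still log if that turns out to be better.
wait = sum(p * max(timber_value, v) for p, v in zip(probs, conservation_values))

print(wait)                   # 7.5
print(wait - naive_preserve)  # 2.0: the quasi-option value of reversibility
```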

iii. Reforming Ordinary Risk Management

After reviewing attempts to justify a PP within the ORM framework without challenging the framework itself, let us now examine justifications for PPs that are partially based on criticisms of ORM.

Deficits of ORM
As a first point, ORM as a regulatory practice tends toward oversimplification that neglects uncertainty and imprecision, leading to irrational and harmful decisions. This is seen as a systematic deficit of ORM itself, not only of its users (see Randall 2011, 77), and not only as a problem under decision-theoretic uncertainty, that is, in situations where no reliable probabilities are available, but already under decision-theoretic risk. First, decision makers tend to ignore low probabilities as irrelevant, focusing on the “more realistic,” higher ones. This means that low but significant probabilities of catastrophe are ignored, for example, so-called “fat tails” in climate scenarios (Randall 2011, 77). Second, decision makers are often “myopic,” placing higher weight on current costs than on future benefits and avoiding high costs today. This often leads to even higher costs in the future. Third, disutilities may be calculated too optimistically, neglecting so-called “secondary effects” or “social amplifications,” for example, the psychological and social effects of catastrophes (see Sunstein 2007, 7). Lastly, since cost-benefit analysis (CBA) provides such a seemingly clear view, there is a tendency to apply it even if the conditions for its application are not fulfilled. We tend to assume more than we know, and to decide according to the MEU criterion although no reliable probability information and/or no precise utility information is available. This so-called “tuxedo fallacy” is seen as dangerous because it creates an “illusion of control” (Hansson 2008, 426–27).

Since PPs are seen as principles that address exactly such problems—drawing our attention to unlikely catastrophic possibilities, demanding action despite uncertainty, requiring us to consider the worst possible outcomes, and forbidding us to assume more than we know—they gain indirect support from these arguments. ORM in its current form lures us into applying it incorrectly and into neglecting rational precautionary action. At least some sort of overarching PP that reminds us of correct practices seems necessary.

As a second point, it is argued that the regulatory practice of ORM has not only a “built-in” tendency to misapply its tools, but also fundamental flaws in itself which should be corrected by a PP. Randall (2011, 46–70) criticizes risk assessment in ORM on the grounds that it is typically built on simple models of the threatened system, for example, the climate system. These neglect systemic risks like the possibility of feedback effects or sudden regime shifts. By depending on the law of large numbers, ORM is also not a decision framework suitable for dealing with potential catastrophes, since they are singular events (Randall 2011, 52). Similarly, Chisholm and Clarke (1993, 112) argue that expected utility theory is only useful as long as “probabilities and possible outcomes are within the normal range of human experience.” Examples of such probabilities and outcomes in the normal range of human experience come from insurance, such as car and fire insurance: we have statistics about the probabilities of accidents or fires, and can calculate reasonable insurance premiums based on the law of large numbers. Furthermore, we have experience with how to handle such events, and have institutions in place like fire departments. None of this is true for singular events like anthropogenic climate change. Consequently, it is argued that we cannot just leave ORM relatively unaltered and support it with a PP for decisions under uncertainty, and perhaps a more general, overarching PP as a normative guideline. Instead, the demand is that we also reform the existing ORM framework in order to include precautionary elements.

Historical Arguments for Revising ORM
In the past, failures to take precautionary measures often resulted in substantial, widespread, and long-term harm to the environment and human health (Harremoës and others 2001, Gee and others 2013). This insight has been used to defend adopting a precautionary principle as a corrective to existing practices: for John (2007, 222), these past failures can be used as “inductive evidence” in an argument for reforming our regulatory policies. Whiteside (2006, 146) defends a PP as a product of social learning from past mistakes. According to Whiteside, these past mistakes reveal (a) that our knowledge about the influences of our actions on complex ecological systems is insufficient, and (b) that how decisions were reached was an important part of their inefficiency, leading to insufficient protection of the environment and human health. As such, to Whiteside, the PP generates a normative obligation to re-structure our decision-procedures (Whiteside 2006, 114). The most elaborate historical argument is made by Steel (2014, Chapter 5). Steel’s argument rests on the following premise:

If a systematic pattern of serious errors of a specific type has occurred, then a corrective for that type of error should be sought. (Steel 2014, 91)

By critically examining not only cases of failed precautions and harmful outcomes, but also counter-examples of allegedly “excessive” precaution, Steel shows that such a pattern of serious errors in fact exists. Cases such as the ones described in “Late Lessons from Early Warnings” (Harremoës and others 2001) demonstrate that continual delays in response to emerging threats have frequently led to serious and persistent harms. Steel (2014, 74–77) goes on to examine cases that have been named as examples of excessive precaution. He finds that, in fact, often no regulation whatsoever was implemented in the first place. And in cases where regulations were put in place, they were mostly very restricted, had only minimal negative effects, and were relatively easily reversible. For example, one of the “excessive precautions” consisted in putting a warning label on products containing saccharin in the US. According to Steel (2014, 82), the historical argument thus supports a PP as a corrective against a systematic bias that is entrenched in our practices. This bias emerges because informational and political asymmetries make continual delay more likely than precautionary measures whenever short-term economic gains for an influential party are traded off against harms that are uncertain, spatially distant, or temporally distant (or all three).

Epistemic Implications
The justifications presented so far all concern PPs aiming at the management of risks, that is, action-guiding interpretations. But we can also find discussions of a PP for the assessment of threats, so-called “epistemic” PPs. It is not enough to just supplement existing practices with a PP; clearly, risk assessment has to be changed, too, in order for a PP to be applicable. This means that uncertainties have to be taken seriously and communicated clearly, that we need to employ more adequate models which take the existence of systemic risks into account (Randall 2011, 77–78), that we need criteria to identify plausible (as opposed to “mere”) possibilities, and so on. However, this concerns the implications of adopting a PP rather than the expression of a genuine PP itself. Thus, these kinds of arguments either state presuppositions of a PP, because we need to identify uncertain harms first in order to do something about them, or implications of a PP, because it is not admissible to conduct a risk assessment that makes it impossible to apply a PP.

Procedural Precaution
Authors who favor a procedural interpretation of PPs stress that PPs are concerned especially with decisions under conditions of uncertainty. They point out that while ORM, with its focus on cost-effectiveness and maximizing benefits, might be appropriate for conditions of decision-theoretic risk, the situation is fundamentally different if we have to make decisions under decision-theoretic uncertainty or even decision-theoretic ignorance. For example, Arcuri (2007, 20) points out that since PPs are principles particularly for decisions under decision-theoretic uncertainty, they cannot be prescriptive rules which tell us what the best course of action is—because the situation is essentially characterized by the fact that we are uncertain about the possible outcomes to which our actions can lead. Tickner (2001, 14) claims that this should lead to redirecting the questions that are asked in environmental decision-making: the focus should be moved from the hazards associated with a narrow range of options to solutions and opportunities. Thus, the assessment of alternatives is a central point of implementing PPs in procedural frameworks:

In the end, acceptance of a risk must be a function not only of hazard and exposure but also of uncertainty, magnitude of potential impacts and the availability of alternatives or preventive options. (Tickner 2001, 122)

Although (economic) efficiency should not be completely dismissed and still has its place in decision-making, proponents of a procedural PP argue that we should shift our aim in risk regulation from maximizing benefits to minimizing threats, especially in the environmental domain where harms are often irreversible (compare Whiteside 2006, 75). They also advocate democratic participation, pointing out that a decision-making process under scientific uncertainty cannot be a purely scientific one (Whiteside 2006, 30–31; Arcuri 2007, 27). They thus see procedural interpretations of PPs as justified with respect to the goal of ensuring that decisions are made in a responsible and defensible way, which is especially important when there are substantial uncertainties about their outcomes.

Challenging the Underlying Value Assumptions
In addition to scientific uncertainty, Resnik (2003, 334) distinguishes another kind of uncertainty, which he calls “axiological uncertainty.” Both kinds make it difficult to implement ORM in making decisions. While scientific uncertainty arises from our lack of empirical evidence, axiological uncertainty concerns our value assumptions. This kind of uncertainty can take different forms: we can be unsure how to measure utilities—in dollars lost or saved, lives lost or saved, species lost or saved, or something else. We can also be uncertain how to aggregate costs and benefits, and how to compare, for example, economic values with ecological ones. Values cannot always be measured on a common ordinal scale, much less on a common cardinal scale (as ORM requires, at least insofar as it relies on a version of cost-benefit analysis). Thus, it is irrational to treat them as if they fulfilled this requirement (Thalos 2012, 176–77; Aldred 2013). This challenges the value assumptions underlying ORM, and it is seen as a problem that should be fixed by a PP.

Additionally, authors like Hansson (2005b, 10) argue that it is fundamentally problematic that costs and benefits are aggregated without regard to who bears them, and that person-related aspects, such as autonomy or whether a risk is taken willingly or imposed by others, are unjustly neglected.

To sum up, when the underlying value assumptions of ORM are challenged, either the criticism pertains to how values are estimated and assigned, or the utilitarian decision criterion of maximizing overall expected utility is criticized. In both cases, we are arguably leaving the framework of rational choice and ORM, and moving toward genuine moral justifications for PPs.

b. Moral Justifications for Precaution

Some authors stress that, regardless of whether a PP is thought to supplement ordinary risk management (ORM) or to make a more substantive claim, a PP is essentially a moral principle and has to be justified on explicitly moral grounds. (Note that, depending on the moral position one holds, many of the considerations in 3.a can also be seen as discussions of PPs from a moral standpoint; most prominently from utilitarianism, since ORM uses the rule of maximizing expected utility.) They argue that taking precautionary measures under uncertainty is morally demanded, because otherwise we risk damages that are in some way morally unacceptable.

i. Environmental Ethics

PPs are often associated with environmental ethics and the concept of sustainable development (O’Riordan and Jordan 1995; Kaiser 1997; Westra 1997; McKinney and Hill 2000; Steele 2006; Paterson 2007). Some authors take environmental preservation to be at the core of PPs. PP formulations such as the Rio or the Wingspread PP emerged in a debate about the necessity of preventing environmental degradation, which explains why many PPs highlight environmental concerns. It seems plausible that a PP can be an important part of a broader approach to environmental preservation and sustainability (Ahteensuu 2008, 47). But it seems difficult to justify a PP by recourse to sustainability, since the concept itself is vague and contested. Indeed, when PPs have been discussed in the context of sustainability, they are often proposed as ways to operationalize the vague concept into a principle for policymaking, along with other principles like the “polluter pays” principle (Dommen 1993; O’Riordan and Jordan 1995). Thus, while PPs are partly motivated by the insight that our way of life is not sustainable and that we should change how we approach environmental issues, it is difficult to justify them solely on such grounds. However, the hope is that a clarification of the normative (moral) underpinnings of PPs will help to justify a PP for sustainable development. In what follows, we will see that it might make sense to take special precautions with respect to ecological issues, not only because they are often complex and might entail unresolvable uncertainties (Randall 2011, 64–70), but also because harm to the environment can affect many other moral concerns, for example, human rights and both international and intergenerational justice. These moral issues might provide justifications for PPs on their own, without explicit reference to sustainability.

ii. Harm-Based Justifications

PPs that apply to governmental regulatory decisions have been defended as an extension of the harm principle. There are different versions of the harm principle, but roughly, it states that the government is justified in restricting citizens’ individual liberty only to avoid harm to others.

The application of the harm principle normally presupposes that certain conditions are fulfilled, for example, that the harms in question are (1) involuntarily incurred, (2) sufficiently severe, and (3) sufficiently probable, and that (4) the prescribed measures are proportional to the harms (compare Jensen 2002, Petrenko and McArthur 2011). If these conditions are fulfilled, the prevention principle can be applied, prescribing proportional measures to prevent the harm in question from materializing. However, PPs apply to cases where we are unsure about the extent and/or the probability of a possible harm. Consequently, PPs are seen as a “clarifying amendment” (Jensen 2002, 44) which extends the normative foundation of the harm principle from prevention to precaution (Petrenko and McArthur 2011, 354): the impossibility of assigning probabilities does not negate the obligation to act, as long as possible harms are severe enough and scientifically plausible. Even for the prevention principle, it holds that the more severe a threat is, the less probable it has to be in order to warrant preventive measures. Thus, it has been argued that the probability of high-magnitude harms becomes almost irrelevant, as long as they are scientifically plausible (Petrenko and McArthur 2011, 354–55). Additionally, some harm is seen as so serious that it warrants special precaution, for example, if it is irreversible or cannot be (fully) compensated (Jensen 2002, 49–50). In such situations, the government is justified in restricting liberties by, for example, prohibiting a technology, even if there remains uncertainty about whether the technology would actually have harmful effects.

A related idea is that governments have an institutional obligation not to harm the population, which overrides the weaker obligation to do good—meaning that it is worse if certain regulatory decisions of the government lead to harm than if they lead to foregone benefits (John 2007).

The question of what exactly makes a threat severe enough to justify the implementation of precautionary measures has also been discussed with reference to justice- and rights-based considerations.

iii. Justice-Based Justifications

McKinnon (2009, 2012) presents two independent arguments for precautions, both of which are justice-based. These arguments are developed with respect to the possibility of a climate change catastrophe (CCC), and concern two alternative courses of action and their worst cases. “Unnecessary Expenditure” is the case of taking precautions which turn out to have been unnecessary, thereby wasting money which could have been spent on other, better purposes. “Methane Nightmare” is the case of not taking precautions, leading to CCCs with catastrophic consequences, making survival on earth very difficult if not impossible. McKinnon argues that CCCs are uncertain in the sense that they are scientifically plausible, even though we cannot assign probabilities to them (McKinnon 2009, 189).

Playing it Safe
McKinnon’s first argument for why uncertain yet plausible harm with the characteristics of CCCs justifies precautionary measures is called the “playing safe” argument. It is based on two Rawlsian commitments about justice (McKinnon 2012, 56): (1) that treating people as equals means (among other things) ensuring a distribution of (dis)advantage among them that makes the worst-off group as well off as possible, and (2) that justice is intergenerational in scope, governing relations across generations as well as within them.

McKinnon (2009, 191–92) argues that the distributive injustice would be so much greater if “Methane Nightmare” materialized than if it came to “Unnecessary Expenditure” that we have to choose to take precautionary measures, even though we do not know how probable “Methane Nightmare” is. That is to say, such a situation warrants the application of the maximin principle, because distributive justice in the sense of making the worst-off as well off as possible has lexical priority over maximizing the overall benefits for all. Choosing an option that has a far better best case but whose worst case would lead to distributive injustice, over another option whose best case is less good but whose worst case does not entail such distributive injustice, would be inadmissible.

Unbearable Strains of Commitment
As McKinnon notes, the “playing safe” justification only holds if one accepts a very specific understanding of distributive (in)justice. However, she claims to have an even more fundamental argument for precautionary measures in this context, which is also based on Rawlsian arguments concerning intergenerational justice but does not rely on a specific conception of distributive justice. It is called the “unbearable strains of commitment” argument and is based on a combination of the “just savings” principle for intergenerational justice with the “impartiality” principle. It states that we should not choose courses of action that impose on future generations conditions which we ourselves could not agree to and which would undermine the bare possibility of justice itself (McKinnon 2012, 61). This justifies taking precautions against CCCs, since the worst case of that option is “Unnecessary Expenditure,” which, in contrast to “Methane Nightmare,” would not lead to justice-jeopardizing consequences.

iv. Rights-Based Justifications

Strict precautionary measures concerning climate change have been demanded on the basis of the possible rights violations that climate change might entail. For example, Caney (2009) claims that although other benefits and costs might be discounted, human rights are so fundamental that they must not be discounted. He argues that the possible harms involved in climate change justify precautions: unmitigated climate change entails possible outcomes which would lead to serious or catastrophic rights violations, while a policy of strict mitigation would not involve a loss of human rights—at least not if it is carried out by the affluent members of the world. Additionally, “business as usual” from the affluent would mean gambling with the conditions of those who already lack fundamental rights protection, because the negative effects of climate change would come to bear especially in poor countries. Moreover, the benefits of running the “risk of catastrophic climate change” would accrue almost entirely to the risk-takers, not the risk-bearers (Caney 2009, 177–79). If we extrapolate from this concrete application, the basic justification for precaution seems to be: if a rights violation is plausibly possible, and there is a way to avoid this possibility by choosing another course of action which does not involve the plausible possibility of rights violations, then we have to choose the second option. It does not matter how likely the rights violations are; as long as they are plausible, we have to treat them as if they would materialize with certainty.

Thus, in this interpretation, precaution means making sure that no rights violations happen, even if we (because of uncertainty) “run the risk” of doing more than would have been necessary—as long as we do not have to jeopardize our own rights in order to do so.

v. Ethics of Risk and Risk Impositions

Some authors see the PP as an expression of a problem with what they call standard ethics (Hayenhjelm and Wolff 2012, e28). According to them, standard ethical theories, with their focus on evaluating actions and their outcomes under conditions of certainty, fail to keep up with the challenges that technological development poses. PPs are then placed in the broader context of developing and defending an ethics of risk, that is, a moral theory about the permissibility of risk impositions. Surprisingly, there are so far few explicit connections between the discussion of the ethics of risk impositions (see, for example, Hansson 2013, Lenman 2008, Suikkanen 2019) and the discussion of PPs.

One exception is Munthe (2011), who argues that before we can formulate an acceptable and intelligible PP, we first need at least the basic structure of an ethical theory that deals directly with issues of creating and avoiding risks of harm. In Chapter 5 of his book, Munthe sets out to develop such a theory, which focuses on responsibility as a property of decisions: decisions and risk impositions may be morally appraised in their own right. When one does not know what the outcome of a decision will be, it is important to make responsible decisions, that is, decisions that can be defended as responsible given the information available at the time the decision was made, even if the outcome turns out badly. However, even though Munthe’s discussion starts out from the PP, he ultimately concludes that we do not need a PP, but a policy that expresses a proper degree of precaution: “What is needed is plausible theoretical considerations that may guide decision makers also employing their own judgement in specific cases. We do not need a precautionary principle, we need a policy that expresses a proper degree of precaution.” Thus, the idea seems to be that while a fully developed ethics of risk will justify demands commonly associated with PPs, it will ultimately replace the need for a PP.

4. Main Objections and Possible Rejoinders

This section presents the most frequent and most important objections and challenges PPs face. They can be roughly divided into three groups. The first argues that there are fundamental conceptual problems with PPs which make them unable to guide our decisions. The second claims that PPs, in any reasonable interpretation, are superfluous and can be reduced to existing practices done right. The third rejects PPs as irrational, claiming that they are based on unfounded fears, that they contradict science, and that they lead to undesirable consequences. While some objections are aimed at specific PP proposals, others are intended as arguments against PPs in general. However, even the latter typically hold only for specific interpretations. This section briefly presents the main points of these criticisms, and then discusses how they might be answered.

a. PPs Cannot Guide Our Decisions

There are two main reasons why PPs are seen as unable to guide us in our decision-making: they are rejected either as incoherent or as vacuous and devoid of normative content.

Objection: PPs are incoherent
One frequent criticism, most prominently advanced by Sunstein (2005b), is that a “strong PP” leads to contradictory recommendations and therefore paralyzes our decision-making. He understands “strong PP” as a very demanding principle which states that “regulation is required whenever there is a possible risk to health, safety, or the environment, even if the supporting evidence remains speculative and the economic costs of regulation are high” (Sunstein 2005b, 24). The problem is that every action poses such a possible risk, and thus both regulation and non-regulation would be prohibited by the “strong PP,” resulting in paralysis (Sunstein 2005b, 31). Hence, the “strong PP” is rejected as an incoherent decision-rule, because it leads to contradictory recommendations.
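
The logical structure of the paralysis charge can be captured in a few lines; note that the sketch deliberately encodes Sunstein’s (contested) assumption that every available option, including inaction, poses some possible risk, and the risk descriptions are invented placeholders.

```python
# The paralysis charge in miniature: if the "strong PP" prohibits every
# option that poses some possible risk, and (by assumption) every option
# does, then nothing is permitted. Data are invented placeholders.

possible_risks = {
    "regulate":   ["economic harm from regulation"],
    "do nothing": ["speculative harm from the activity"],
}

permitted = [option for option, risks in possible_risks.items() if not risks]
print(permitted)  # [] -- the strong PP rules out every available option
```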

Peterson (2006) makes another argument that rejects PPs as incoherent. He claims that he can prove formally as well as informally that every serious PP formulation is logically inconsistent with reasonable conditions of rational choice, and should therefore be given up as a decision-rule (Peterson 2006, 597).

Rejoinder
Both criticisms have been rejected as being based on a skewed PP interpretation. In the case of Sunstein’s argument, he is attacking a straw man. His critique of the “strong PP” as paralyzing relies on two assumptions which are not made explicit, namely (a) that a PP is invoked by any and all risks, and (b) that the risks of action and inaction are typically equally balanced (Randall 2011, 20). However, this is an atypical PP interpretation. Most formulations make explicit reference to severe dangers, meaning that not just any possible harm, no matter how small, will invoke a PP. And, as the case studies in Harremoës and others (2001) illustrate, the possible harms from action and inaction—or, more precisely, regulation and non-regulation—are typically not equally balanced (see also Steel 2014, Chapter 9). Still, Sunstein’s critique calls attention to the important point of risk-risk trade-offs, which every sound interpretation and application of a PP has to take into account: taking precautions against a possible harm should not lead to an overall higher level of threat (Randall 2011, 84–85). Nevertheless, there seems to be no reason why a PP should not be able to take this into account, and the argument thus fails as a general rejection of PPs.

Similarly, it can be contested whether Peterson’s (2006) PP formalization is a plausible PP candidate: he presupposes that we can completely enumerate the list of possible outcomes, that we have rational preferences that allow for a complete ordering of the outcomes, and that we can estimate at least the relative likelihood of the outcomes. As Randall (2011, 86) points out, this is an ideal setup for ordinary risk management (ORM), and the three conditions for rational choice that Peterson cites, and with which he shows his PP to be inconsistent, have their place in the ORM framework. Thus, one can object that it is not very surprising that a PP, which aims especially at situations in which these ideal conditions are not met, does not do very well under the ideal conditions.

Objection: PPs are vacuous
On the other hand, it is argued that if a PP is attenuated so as not to be paralyzing, it becomes such a weak claim that it is essentially vacuous. Sunstein (2005b, 18) claims that weaker formulations of PPs are, although not incoherent, trivial: they merely state that lack of absolute scientific proof is no reason for inaction, which, according to Sunstein, has no normative force because everyone already complies with it. Similarly, McKinnon (2009) takes a weak PP formulation to state that precautionary measures are permissible, which she also rejects as a hollow claim, since everyone could comply with it without ever taking any precautionary action.

Additionally, PPs are rejected as vacuous because of the multitude of formulations and interpretations. Turner and Hartzell (2004), examining different formulations of PPs, come to the conclusion that they are all beset with unclarities and ambiguities. They argue that there is no common core to the different interpretations, and that the plausibility of a PP actually rests on its vagueness. This makes it unsuitable as a guide for decision-making. Similarly, Peterson (2007b, 306) states that such a “weak” PP has no normative content and no implications for what ought to be done. He claims that in order to have normative content, a PP would need to give us a precise instruction for what to do for each input of information (Peterson 2007b, 306). By formulating a minimal normative PP interpretation and showing that it is incoherent, he argues that there cannot be a PP with normative content.

Rejoinder
Firstly, let us address the criticism that PPs are vacuous because they express a claim that is too weak to have any impact on decision-making. Against this, Steel (2013, 2014) has argued that even if these supposedly “weak” or “argumentative” principles do not directly recommend a specific decision, they nonetheless have an impact on the decision-making process if taken seriously. He interprets them as a meta-principle that puts constraints on which decision-rules should be used, namely, none that would lead to inaction in the face of uncertainty. Since, for example, cost-benefit analysis needs numerical probabilities to be applicable, the Meta PP will recommend against it in situations where no such probability information is available. This is a substantial constraint, meaning that the Meta PP is not vacuous. One can also reasonably doubt that Sunstein is right that everyone follows such an allegedly “weak” principle anyway. There are many historical cases where there was some positive evidence that an activity caused harm, but the fact that the activity-harm link had not been irrefutably proven was used to argue against regulatory action (Harremoës and others 2001, Gee and others 2013). Thus, in cases where no proof, or at least no reliable probability information, concerning the possibility of harm is available, uncertainty is often used as a reason not to take precautionary action. Additionally, this criticism clearly does not concern all forms of PPs, and it only amounts to a full-fledged rejection of PPs if combined with the claim that so-called “stronger” PPs, which are not trivial, will always be incoherent. And both Sunstein (2005b) and McKinnon (2009, 2012) do propose other PPs which express a stronger claim, albeit with a restricted scope (for example, pertaining only to catastrophic harm, or to damage which entails specific kinds of injustice). This form of the “vacuous” objection can thus be seen not as an attack on the general idea of PPs, but rather as the demand that the normative obligation they express should be made clear in order to avoid downplaying it.

Let us now consider the other form of the objection, namely the claim that PPs are essentially vague and that there cannot be a precise formulation of a PP that is both action-guiding and plausible. It is true that, so far, there does not seem to exist a “one size fits all” PP that yields clear instructions for every input and that captures all the ideas commonly associated with PPs. However, even if this were a correct interpretation of what a “principle” is (which many authors deny; compare for example Randall 2011, 97), it is not the only one. Peterson (2007b) presumes that only a strict “if this, then that” rule can have normative force and consequently be action-guiding. In contrast, other authors stress the difference between a principle and a rule (Fisher 2002; Arcuri 2007; Randall 2011). According to them, while rules specify precise consequences that follow automatically when certain conditions are met, principles express normative obligations that need to be specified according to different contexts, and that need to be implemented and operationalized in rules, laws, policies, and so on (Randall 2011, 97). Authors who reject PPs as incoherent (see the objection above) might sometimes make the same mistake, confusing a general principle that needs to be specified on a case-by-case basis with a stand-alone decision rule that should fit any and all cases.

As for PPs being essentially vague: this criticism seems to presuppose that in order to formulate a clarified PP, we have to capture and unify everything that is associated with it. However, explicating a concept in a way that clarifies it and captures as many of the associated ideas as possible does not mean that we have to preserve all of the ideas commonly associated with it. The same is true for explicating a principle such as a PP. Additionally, this article shows that many different ways of interpreting PPs in a precise way are possible, and not all of them exclude each other.

b. PPs are Redundant

Some authors reject PPs by arguing that they are just a narrow and complicated way of expressing what is already incorporated into established, more comprehensive approaches. For example, Bognar (2011) compares Gardiner’s (2006) “Rawlsian Core PP” interpretation (RCPP) with what he calls a “utilitarian principle,” which combines the principle of indifference with that of maximizing expected utility. He concludes that this “utilitarian principle” leads to the same results as the RCPP in the cases where the RCPP applies, but, contrary to the RCPP, it is not restricted to such a narrow range of cases. His conclusion is that we can dispose of PPs, at least in maximin formulations (Bognar 2011, 345).
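
Here is a small sketch of the “utilitarian principle” as described above, alongside maximin for comparison. The payoffs are invented, and they are deliberately chosen so that the two rules diverge, which is relevant to the rejoinder below.

```python
# Sketch of the "utilitarian principle" as described here: treat all
# states as equally probable (indifference) and maximize expected
# utility. Payoffs are invented and chosen so that the rule can
# diverge from maximin.

payoffs = {
    "risky option": [100, -10],  # large benefit or serious harm
    "precaution":   [10, 10],    # modest but safe
}

def indifference_meu(table):
    # equal weights on all states stand in for missing probabilities
    return max(table, key=lambda a: sum(table[a]) / len(table[a]))

def maximin(table):
    return max(table, key=lambda a: min(table[a]))

print(indifference_meu(payoffs))  # "risky option" (average 45 beats 10)
print(maximin(payoffs))           # "precaution" (worst case 10 beats -10)
```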

In the same vein, Peterson (2006, 600) asserts that if formulated in a consistent way, a PP would not differ from the “old” rules for risk-averse decision-making, while other authors have shown that we can use existing ordinary risk management (ORM) tools to implement a PP (Farrow 2004; Gollier, Moldovanu, and Ellingsen 2001). This would allegedly make PPs redundant (Randall 2011, 25, 87).

Rejoinder
Against the criticism of Bognar (2011) in particular, one can counter that his “utilitarian principle” falls victim to the so-called “tuxedo fallacy” (Hansson 2008). Using the principle of indifference, that is, treating all outcomes as equally probable when one does not have enough information to assign reliable probabilities, can be seen as creating an “illusion of control.” The principle neither pays special attention to catastrophic harms nor takes the special challenges of decision-theoretic uncertainty adequately into account.

More generally, one can make the following point: even though there might be plausible ways to translate a PP into the ORM framework and implement it using ORM tools, there is more to a PP than that. Even if we use ORM methods to implement precaution, this may still rest on a normative obligation to enact precautionary measures. This obligation has to be spelled out, because ORM can allow for precaution, but does not demand it by itself (and, as a regulatory practice, tends to neglect it).

c. PPs are Irrational

The last line of criticism accuses PPs of being based on unfounded fears, of expressing cognitive biases, and therefore of leading to decisions with undesirable and overall harmful consequences.

Objection: Unfounded Panic
One criticism that is especially frequent in discussions aimed at a broader audience is that PPs lead to unrestrained regulation because they can be invoked by uncertain harm. Therefore, the argument goes, PPs carry the danger of unnecessary expenditures to reduce insignificant risks, of forgoing benefits by regulating or prohibiting potentially beneficial activities, and of being exploited, for example, by interest groups or for protectionism in international trade (Peterson 2006). A PP would stifle innovation, resulting in an overall less safe society: many (risk-reducing) beneficial innovations of the past were only possible because risks were taken (Zander 2010, 9), and technical innovation takes place in a process of trial and error, which would be seriously disturbed by a PP (Graham 2004, 5).

These critics see this as a consequence of PPs because PPs do not require scientific certainty in order to take action, which the critics interpret as making merely speculative harm a reason for strict regulation. Thus, science would be marginalized or even rejected as a basis for decision-making, giving way to the cognitive biases of ordinary people.

Objection: Cognitive biases
Sunstein claims that PPs are based on the cognitive biases of ordinary people, who tend to systematically misassess risks (Sunstein 2005b, Chapter 4). By reducing the importance of scientific risk assessment and marginalizing the role of experts, the criticism goes, decisions resulting from the application of a PP will be influenced by these biases and result in negative consequences.

Rejoinder
As Randall (2011, 89) has pointed out, these criticisms seem to be misguided. Lower standards of evidence do not mean no standards at all. It is surely an important challenge for the implementation of a PP to find a way to define plausible possibilities, but this by no means requires less science. Instead, as Sandin, Bengtsson, and others (2004) point out, more, and different, scientific approaches are needed. Uncertainties need to be communicated more clearly, and tools need to be developed that allow uncertainties to be taken better into account. For decisions where we lack scientific information but great harms are possible, ways need to be found to take public concerns into consideration (Arcuri 2007, 35). This, however, seems to be a question of implementation rather than of the formulation or the justification of a PP.

5. References and Further Reading

  • Ahteensuu, Marko. 2008. “In Dubio Pro Natura? A Philosophical Analysis of the Precautionary Principle in Environmental and Health Risk Governance.” PhD thesis, Turku, Finland: University of Turku.
  • Aldred, Jonathan. 2013. “Justifying Precautionary Policies: Incommensurability and Uncertainty.” Ecological Economics 96 (December): 132–40.
  • Arcuri, Alessandra. 2007. “The Case for a Procedural Version of the Precautionary Principle Erring on the Side of Environmental Preservation.” SSRN Scholarly Paper ID 967779. Rochester, NY: Social Science Research Network.
  • Arrow, Kenneth J., and Anthony C. Fisher. 1974. “Environmental Preservation, Uncertainty, and Irreversibility.” The Quarterly Journal of Economics 88 (2): 312–19.
  • Buchak, Lara. 2013. Risk and Rationality. Oxford University Press.
  • Bognar, Greg. 2011. “Can the Maximin Principle Serve as a Basis for Climate Change Policy?” Edited by Sherwood J. B. Sugden. Monist 94 (3): 329–48. https://doi.org/10.5840/monist201194317.
  • Caney, Simon. 2009. “Climate Change and the Future: Discounting for Time, Wealth, and Risk.” Journal of Social Philosophy 40 (2): 163–86. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9833.2009.01445.x/full.
  • Chisholm, Anthony Hewlings, and Harry R. Clarke. 1993. “Natural Resource Management and the Precautionary Principle.” In Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries, edited by Edward Dommen, 109–22.
  • Dommen, Edward (Ed.). 1993. Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries. Edward Elgar.
  • Farrow, Scott. 2004. “Using Risk Assessment, Benefit-Cost Analysis, and Real Options to Implement a Precautionary Principle.” Risk Analysis 24 (3): 727–35.
  • Fisher, Elizabeth. 2002. “Precaution, Precaution Everywhere: Developing a Common Understanding of the Precautionary Principle in the European Community.” Maastricht Journal of European and Comparative Law 9: 7.
  • Gardiner, Stephen M. 2006. “A Core Precautionary Principle.” Journal of Political Philosophy 14 (1): 33–60.
  • Gee, David, Philippe Grandjean, Steffen Foss Hansen, Sybille van den Hove, Malcolm MacGarvin, Jock Martin, Gitte Nielsen, David Quist, and David Stanners. 2013. Late Lessons from Early Warnings: Science, Precaution, Innovation. European Environment Agency.
  • Gollier, Christian, Benny Moldovanu, and Tore Ellingsen. 2001. “Should We Beware of the Precautionary Principle?” Economic Policy, 303–27.
  • Graham, John D. 2004. The Perils of the Precautionary Principle: Lessons from the American and European Experience. Vol. 818. Heritage Foundation.
  • Hansson, Sven Ove. 1997. “The Limits of Precaution.” Foundations of Science 2 (2): 293–306.
  • Hansson, Sven Ove. 2005a. Decision Theory: A Brief Introduction, Uppsala University class notes.
  • Hansson, Sven Ove. 2005b. “Seven Myths of Risk.” Risk Management 7 (2): 7–17.
  • Hansson, Sven Ove. 2008. “From the Casino to the Jungle.” Synthese 168 (3): 423–32. https://doi.org/10.1007/s11229-008-9444-1.
  • Hansson, Sven Ove. 2013. The Ethics of Risk: Ethical Analysis in an Uncertain World. Palgrave Macmillan.
  • Harremoës, Poul, David Gee, Malcolm MacGarvin, Andy Stirling, Jane Keys, Brian Wynne, and Sofia Guedes Vaz. 2001. Late Lessons from Early Warnings: The Precautionary Principle 1896-2000. Office for Official Publications of the European Communities.
  • Harris, John, and Søren Holm. 2002. “Extending Human Lifespan and the Precautionary Paradox.” Journal of Medicine and Philosophy 27 (3): 355–68.
  • Harsanyi, John C. 1975. “Can the Maximin Principle Serve as a Basis for Morality? A Critique of John Rawls’s Theory.” Edited by John Rawls. The American Political Science Review 69 (2): 594–606. https://doi.org/10.2307/1959090.
  • Hartzell-Nichols, Lauren. 2012. “Precaution and Solar Radiation Management.” Ethics, Policy & Environment 15 (2): 158–71. https://doi.org/10.1080/21550085.2012.685561.
  • Hartzell-Nichols, Lauren. 2013. “From ‘the’ Precautionary Principle to Precautionary Principles.” Ethics, Policy and Environment 16 (3): 308–20.
  • Hartzell-Nichols, Lauren. 2017. A Climate of Risk: Precautionary Principles, Catastrophes, and Climate Change. Taylor & Francis.
  • Hayenhjelm, Madeleine, and Jonathan Wolff. 2012. “The Moral Problem of Risk Impositions: A Survey of the Literature.” European Journal of Philosophy 20 (S1): E26–E51.
  • Jensen, Karsten K. 2002. “The Moral Foundation of the Precautionary Principle.” Journal of Agricultural and Environmental Ethics 15 (1): 39–55. https://doi.org/10.1023/A:1013818230213.
  • John, Stephen. 2007. “How to Take Deontological Concerns Seriously in Risk–Cost–Benefit Analysis: A Re-Interpretation of the Precautionary Principle.” Journal of Medical Ethics 33 (4): 221–24.
  • John, Stephen. 2010. “In Defence of Bad Science and Irrational Policies: An Alternative Account of the Precautionary Principle.” Ethical Theory and Moral Practice 13 (1): 3–18.
  • Jonas, Hans. 2003. Das Prinzip Verantwortung: Versuch Einer Ethik Für Die Technologische Zivilisation. 5th ed. Frankfurt am Main: Suhrkamp Verlag.
  • Kaiser, Matthias. 1997. “Fish-Farming and the Precautionary Principle: Context and Values in Environmental Science for Policy.” Foundations of Science 2 (2): 307–41.
  • Lemons, John, Kristin Shrader-Frechette, and Carl Cranor. 1997. “The Precautionary Principle: Scientific Uncertainty and Type I and Type II Errors.” Foundations of Science 2 (2): 207–36.
  • Lenman, James. 2008. “Contractualism and Risk Imposition.” Politics, Philosophy & Economics 7 (1): 99–122. https://doi.org/10/fqkwg3.
  • Manson, Neil A. 2002. “Formulating the Precautionary Principle.” Environmental Ethics 24 (3): 263–74.
  • McKinney, William J., and H. Hammer Hill. 2000. “Of Sustainability and Precaution: The Logical, Epistemological, and Moral Problems of the Precautionary Principle and Their Implications for Sustainable Development.” Ethics and the Environment 5 (1): 77–87.
  • McKinnon, Catriona. 2009. “Runaway Climate Change: A Justice-Based Case for Precautions.” Journal of Social Philosophy 40 (2): 187–203.
  • McKinnon, Catriona. 2012. Climate Change and Future Justice: Precaution, Compensation and Triage. Routledge.
  • Munthe, Christian. 2011. The Price of Precaution and the Ethics of Risk. Vol. 6. The International Library of Ethics, Law and Technology. Springer.
  • O’Riordan, Timothy, and Andrew Jordan. 1995. “The Precautionary Principle in Contemporary Environmental Politics.” Environmental Values 4 (3): 191–212.
  • Osimani, Barbara. 2013. “An Epistemic Analysis of the Precautionary Principle.” Dilemata: International Journal of Applied Ethics, 149–67.
  • Paterson, John. 2007. “Sustainable Development, Sustainable Decisions and the Precautionary Principle.” Natural Hazards 42 (3): 515–28. https://doi.org/10.1007/s11069-006-9071-4.
  • Peterson, Martin. 2003. “Transformative Decision Rules.” Erkenntnis 58 (1): 71–85.
  • Peterson, Martin. 2006. “The Precautionary Principle Is Incoherent.” Risk Analysis 26 (3): 595–601.
  • Peterson, Martin. 2007a. “Should the Precautionary Principle Guide Our Actions or Our Beliefs?” Journal of Medical Ethics 33 (1): 5–10. https://doi.org/10.1136/jme.2005.015495.
  • Peterson, Martin. 2007b. “The Precautionary Principle Should Not Be Used as a Basis for Decision‐making.” EMBO Reports 8 (4): 305–8. https://doi.org/10.1038/sj.embor.7400947.
  • Petrenko, Anton, and Dan McArthur. 2011. “High-Stakes Gambling with Unknown Outcomes: Justifying the Precautionary Principle.” Journal of Social Philosophy 42 (4): 346–62.
  • Randall, Alan. 2011. Risk and Precaution. Cambridge University Press.
  • Rawls, John. 2001. Justice as Fairness: A Restatement. Belknap Press of Harvard University Press.
  • Resnik, David B. 2003. “Is the Precautionary Principle Unscientific?” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 34 (2): 329–44.
  • Resnik, David B. 2004. “The Precautionary Principle and Medical Decision Making.” Journal of Medicine and Philosophy 29 (3): 281–99.
  • Sandin, Per. 1999. “Dimensions of the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 5 (5): 889–907.
  • Sandin, Per. 2004. “Better Safe Than Sorry: Applying Philosophical Methods to the Debate on Risk and the Precautionary Principle.” PhD thesis, Stockholm.
  • Sandin, Per. 2007. “Common-Sense Precaution and Varieties of the Precautionary Principle.” In Risk: Philosophical Perspectives, edited by Tim Lewens, 99–112. London; New York: Routledge.
  • Sandin, Per. 2009. “A New Virtue-Based Understanding of the Precautionary Principle.” In The Ethics of Protocells: Moral and Social Implications of Creating Life in the Laboratory, edited by Mark A. Bedau and Emily C. Parke, 88–104. MIT Press.
  • Sandin, Per, Bengt-Erik Bengtsson, Åke Bergman, Ingvar Brandt, Lennart Dencker, Per Eriksson, Lars Förlin, and others. 2004. “Precautionary Defaults—A New Strategy for Chemical Risk Management.” Human and Ecological Risk Assessment 10 (1): 1–18.
  • Sandin, Per, and Sven Ove Hansson. 2002. “The Default Value Approach to the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 8 (3): 463–71. https://doi.org/10.1080/10807030290879772.
  • Sandin, Per, Martin Peterson, Sven Ove Hansson, Christina Rudén, and André Juthe. 2002. “Five Charges Against the Precautionary Principle.” Journal of Risk Research 5 (4): 287–99.
  • Science & Environmental Health Network (SEHN). 1998. Wingspread Statement on the Precautionary Principle.
  • Steel, Daniel. 2011. “Extrapolation, Uncertainty Factors, and the Precautionary Principle.” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 42 (3): 356–64.
  • Steel, Daniel. 2013. “The Precautionary Principle and the Dilemma Objection.” Ethics, Policy & Environment 16 (3): 321–40.
  • Steel, Daniel. 2014. Philosophy and the Precautionary Principle. Cambridge University Press.
  • Steele, Katie. 2006. “The Precautionary Principle: A New Approach to Public Decision-Making?” Law, Probability and Risk 5 (1): 19–31. https://doi.org/10.1093/lpr/mgl010.
  • Suikkanen, Jussi. 2019. “Ex Ante and Ex Post Contractualism: A Synthesis.” The Journal of Ethics 23 (1): 77–98. https://doi.org/10/ggjn22.
  • Sunstein, Cass R. 2005a. “Irreversible and Catastrophic.” Cornell Law Review 91: 841–97.
  • Sunstein, Cass R. 2005b. Laws of Fear: Beyond the Precautionary Principle. Cambridge University Press.
  • Sunstein, Cass R. 2007. “The Catastrophic Harm Precautionary Principle.” Issues in Legal Scholarship 6 (3).
  • Sunstein, Cass R. 2009. Worst-Case Scenarios. Harvard University Press.
  • Thalos, Mariam. 2012. “Precaution Has Its Reasons.” In Topics in Contemporary Philosophy 9: The Environment, Philosophy, Science and Ethics, edited by W. Kabasenche, M. O’Rourke, and M. Slater, 171–84. Cambridge, MA: MIT Press.
  • Tickner, Joel A. 2001. “Precautionary Assessment: A Framework for Integrating Science, Uncertainty, and Preventive Public Policy.” In The Role of Precaution in Chemicals Policy, edited by Elisabeth Freytag, Thomas Jakl, G. Loibl, and M. Wittmann, 113–27. Diplomatische Akademie Wien.
  • Turner, Derek, and Lauren Hartzell. 2004. “The Lack of Clarity in the Precautionary Principle.” Environmental Values 13 (4): 449–60.
  • United Nations Conference on Environment and Development. 1992. Rio Declaration on Environment and Development.
  • Westra, Laura. 1997. “Post-Normal Science, the Precautionary Principle and the Ethics of Integrity.” Foundations of Science 2 (2): 237–62.
  • Whiteside, Kerry H. 2006. Precautionary Politics: Principle and Practice in Confronting Environmental Risk. Cambridge, MA: MIT Press.
  • Zander, Joakim. 2010. The Application of the Precautionary Principle in Practice: Comparative Dimensions. Cambridge: Cambridge University Press.

Research for this article was part of the project “Reflective Equilibrium – Reconception and Application” (Swiss National Science Foundation grant no. 150251).

Author Information

Tanja Rechnitzer
Email: tanja.rechnitzer@philo.unibe.ch
University of Bern
Switzerland

Conspiracy Theories

The term “conspiracy theory” refers to a theory or explanation that features a conspiracy among a group of agents as a central ingredient. Popular examples are the theory that the first moon landing was a hoax staged by NASA, or the theory that the 9/11 attacks on the World Trade Center were not (exclusively) conducted by al-Qaeda, but that the US government conspired to let these attacks succeed. Conspiracy theories have long been an element of popular culture; and cultural theorists, sociologists and psychologists have had things to say about conspiracy theories and the people who believe in them. This article focuses on the philosophy of conspiracy theories, that is, on what philosophers have had to say about conspiracy theories. Conspiracy theories meet philosophy when it comes to questions concerning epistemology, science, society and ethics.

After giving a brief history of philosophical thinking about conspiracy theories in section 1, this article considers in more detail the definition of the term “conspiracy theory” in section 2. As it turns out, the definition of the term has received a lot of attention in philosophy, mainly because the common usage of the term has negative connotations (as in, “It’s just a conspiracy theory!”), raising the question whether our definition should reflect these. As there is a great variety of conspiracy theories on offer, section 3 considers ways of classifying conspiracy theories into distinct types. Such a classification may be useful when it comes to identifying possible problems with a conspiracy theory.

The main part of this article, section 4, is devoted to the question of when one should believe in a conspiracy theory. In general, the philosophical literature has been more positive about conspiracy theories than other fields, being careful not to dismiss such theories too easily. Hence, it becomes important to come up with criteria that one may use to evaluate a given conspiracy theory. Section 4 provides such a list of criteria, distilled from the philosophical literature.

Turning from questions about belief to questions about society, ethics and politics, section 5 addresses the societal effects of conspiracy theories that philosophers have identified, also asking to what extent these are positive or negative. Given these effects, the last question this article addresses, in section 6, is what, if anything, we should do about conspiracy theories. Answering this question does not, of course, depend on philosophical thinking alone. For this reason, section 7 briefly mentions some relevant work outside of philosophy.

Table of Contents

  1. History of Philosophizing about Conspiracy Theories
  2. Problems of Definition
  3. Types of Conspiracy Theories
  4. Criteria for Believing in a Conspiracy Theory
    1. Criteria concerning Scientific Methodology
      1. Internal Faults (C1)
      2. Progress: Is the Conspiracy Theory Part of a Progressive Research Program? (C2)
      3. Inference to the Best Explanation: Evidence, Prior, Relative and Posterior Probability (C3)
      4. Errant Data (C4)
    2. Criteria Concerning Motives
      1. Cui Bono: Who Benefits from the Conspiracy? (C5)
      2. Individual Trust (C6)
      3. Institutional Trust (C7)
    3. Other Realist Criteria
      1. Fundamental Attribution Error (C8)
      2. Ontology: Existence Claims the Conspiracy Theory Makes (C9)
      3. Übermensch: Does the Conspiracy Theory Ascribe Superhuman Qualities to Conspirators? (C10)
      4. Scale: The Size and Duration of the Conspiracy (C11)
    4. Non-Realist Criteria
      1. Instrumentalism: Conspiracy Theories as “as if” Theories (C12)
      2. Pragmatism (C13)
  5. Social and Political Effects of Conspiracy Theories
  6. What to Do about Conspiracy Theories?
  7. Related Disciplines
  8. References and Further Reading

1. History of Philosophizing about Conspiracy Theories

Philosophical thinking about conspiracies can be traced back at least as far as Niccolò Machiavelli. Machiavelli discussed conspiracies in his best-known work, The Prince (for example in chapter 19), but more extensively in his Discourses on the First Ten Books of Titus Livius, where he devotes the whole sixth chapter of the third book to a discussion of conspiracies. Machiavelli’s aim in his discussion of conspiracies is to help the ruler guard against conspiracies directed against him. At the same time, he warns subjects not to engage in conspiracies, partly because he believes conspiracies rarely achieve what the conspirators desire.

Where Machiavelli discussed conspiracies as a political reality, Karl Raimund Popper is the philosopher who put conspiracy theories on the philosophical agenda. The philosophical discussion of conspiracy theories begins with Popper’s dismissal of what he calls “the conspiracy theory of society” (Popper, 1966 and 1972). Popper sees the conspiracy theory of society as a mistaken approach to the explanation of social phenomena: It attempts to explain a social phenomenon by discovering people who have planned and conspired to bring the phenomenon about. While Popper thinks that conspiracies do occur, he thinks that few conspiracies are ultimately successful, since few things turn out exactly as intended. It is precisely the unforeseen consequences of intentional human action that social science should explain, according to Popper.

Popper’s comments on the conspiracy theory of society comprised only a few pages, and they did not trigger critical discussion until many years later. It was only in 1995 that Charles Pigden critically examined Popper’s views (Pigden, 1995). Besides Pigden’s critique of Popper, it was Brian Keeley (1999) and his attempt at defining what he called “unwarranted conspiracy theories” that started the philosophical literature on conspiracy theories. The question raised by Keeley’s paper is essentially the demarcation problem for conspiracy theories: Just as Popper’s demarcation problem was to separate science from pseudoscience, within the realm of conspiracy theories, the problem Keeley raised was to separate warranted from unwarranted conspiracy theories. However, Keeley concluded that the problem is a difficult one, admitting that the five criteria he proposed were not adequate for specifying when we are (un)warranted to believe in a conspiracy theory. This article returns to this problem in section 4.

After Popper’s work in the late 1960s and early 1970s, and Pigden’s and Keeley’s in the 1990s, philosophical work on conspiracy theories took off in the first decade of the 21st century. Particularly important in this development is the collection of essays by Coady (2006a), which made the philosophical debate about conspiracy theories visible to a wider audience, as well as within philosophy itself. Since this collection of essays, philosophical thinking has been continuously evolving, as evidenced by special issues of Episteme (volume 4, issue 2, 2007), Critical Review (volume 28, issue 1, 2016), and Argumenta (volume 3, no.2, 2018).

Looking at the history of philosophizing about conspiracy theories, a useful distinction that has been applied to philosophers writing about conspiracy theories is the distinction between generalists and particularists (Buenting and Taylor, 2010). Following in the footsteps of Popper, generalists believe that conspiracy theories in general have an epistemic problem. For them, there is something about a theory being a conspiracy theory that should lower its credibility. It is this kind of generalism which underlies the popular dismissal, “It’s just a conspiracy theory.” Particularists like Pigden, on the other hand, argue that there is nothing problematic about conspiracy theories per se, but that each conspiracy theory needs to be evaluated on its own (de)merits.

2. Problems of Definition

The definition of the term “conspiracy theory” given at the beginning of this article is neutral in the sense that it does not imply that a conspiracy theory is wrong or unlikely to be true. In popular discourse, however, an epistemic deficit is often implied. Tracking this popular use, the Wikipedia entry on the topic (consulted 26 July 2019) defined a conspiracy theory as “an explanation of an event or situation that invokes a conspiracy by sinister and powerful actors, often political in motivation, when other explanations are more probable.”

We can order possible definitions of the term “conspiracy theory” in terms of logical strength. The definition given at the beginning of this article is minimal in this sense; it says that a conspiracy theory is a theory that involves a conspiracy. Slightly more elaborate, but still in line with this weak notion of conspiracy theory, Keeley (1999, p.116) sees a conspiracy theory as an explanation of an event by the causal agency of a small group of people acting in secret. What Keeley has added to the minimal definition is that the group of conspirators is small. Other additions that have been considered are that the group is powerful and/or that it has nefarious intentions. While these additions create a stronger notion of conspiracy theory, they all remain epistemically neutral; that is, they do not state that the explanation is unlikely or otherwise problematic. On the other end of the logical spectrum, definitions like the Wikipedia definition cited above are not only logically stronger than the minimal definition—the conspirators are powerful and sinister—but are also epistemically laden: A conspiracy theory is unlikely.

Within this spectrum of possibilities, philosophers have generally opted for a rather minimal definition that is epistemically neutral. As explicated by Dentith (2016, p.577), the central ingredients of a conspiracy are (a) a group of conspirators, (b) secrecy and (c) a shared goal. Similarly separating out the different ingredients of a conspiracy theory, Mandik (2007, p.206) states that conspiracy theories postulate “(1) explanations of (2) historical events in terms of (3) intentional states of multiple agents (the conspirators) who, among other things, (4) intended the historical events in question to occur and (5) keep their intentions and actions secret.” He sees these five conditions as necessary conditions for being a conspiracy theory, but he remains agnostic as to whether they are jointly sufficient.

A second approach to defining conspiracy theories has been proposed by Coady (2006b, p.2). He sees conspiracy theories as explanations that are opposed to the official explanation of an event at a given time. Coady points out that explanations that are conspiracy theories in this sense are usually also conspiracy theories in the sense discussed earlier, but not vice versa, since official theories can also involve conspiracies, as in the official account of 9/11. Often, according to Coady, an explanation will be a conspiracy theory in both senses.

Which definition to adopt—strong or weak, epistemically neutral or not—is ultimately a question of what purpose the definition is to serve. No matter what definition one chooses, such a choice will have consequences. As an example, Watergate will not count as a conspiracy theory under the Wikipedia definition, but it will under the minimal definition given at the beginning of this article. Furthermore, the minimal definition has the consequence that an explanation of a surprise party counts as a conspiracy theory. Hence, to be put to use, the minimal definition may need to be supplemented by an extra condition like nefariousness.

Finally, besides using the term “conspiracy theory,” some authors also use the term “conspiracism.” This latter term has been used in different ways in the literature. Pipes (1997) has used the term to indicate a particular paranoid style of thinking. Muirhead and Rosenblum (2019) have used it to describe an evolving phenomenon of political culture, distinguishing classic conspiracism from new conspiracism. While classic conspiracism involves the development of conspiracy theories as alternative explanations of phenomena, new conspiracism has shed the interest in explanation and theory building. Instead, it is satisfied with bare assertion or insinuation of a conspiracy and aims at political delegitimation and destabilization.

3. Types of Conspiracy Theories

Conspiracy theories come in great variety, and typologies can help to order this variety and to direct research to types of conspiracy theory that are especially interesting or problematic. Räikkä (2009a, p.186 and 2009b, p.458-9) distinguishes political from non-political conspiracy theories. Räikkä mentions conspiracy theories about the death of Jim Morrison or Elvis Presley as examples of non-political conspiracy theories. He furthermore divides political conspiracy theories into local, global and total conspiracy theories depending on the scale of the event to be explained.

Huneman and Vorms (2018, p.251) provide further useful categories for distinguishing different types of conspiracy theories. They distinguish scientific from non-scientific conspiracy theories—that is, whether or not the theories deal with the domain of science, like the AIDS conspiracy theory—ideological from neutral conspiracy theories—whether there is a strong ideology driving the conspiracy theory, like anti-Semitism—official from anti-institutional conspiracy theories—as exemplified by official versus unofficial conspiracy theories about 9/11—and alternative explanations from denials—providing a different explanation for an event versus denying that the event took place.

A further way to distinguish conspiracy theories is by looking at what kind of theoretical object we are dealing with. In general, a conspiracy theory is an explanation of some event or phenomenon, but one can examine what kind of explanation it is. Some conspiracy theories may be full-blown theories, whereas others may not be theories in the scientific or philosophical sense. Clarke (2002 and 2007) thinks that some conspiracy theories are actually only proto-theories, not worked out sufficiently to count as theories, while others may be degenerating research programs in the sense defined by Imre Lakatos. There is more on the relationship between conspiracy theories and Lakatosian research programs in section 4, but here it is important to realize that while all conspiracy theories are explanations of some sort, certain conspiracy theories may be theories, others may be proto-theories or research programs.

4. Criteria for Believing in a Conspiracy Theory

A number of criteria have been offered, sometimes implicitly, in the philosophical literature to evaluate whether we should believe in a particular conspiracy theory, and these are surveyed below. Partly, such criteria will be familiar from scientific theory choice, but given that we are dealing with a specific type of theory, more can be said and more has been said. Due to the number of criteria, it is useful to group them into categories. There are different ways of grouping these criteria. The one adopted here tries to stay close to the labels and classifications common in the philosophy of science.

Although not explicitly stated, the dominant view in the philosophical literature from which the criteria below are taken is a realist view: Our (conspiracy) theories and beliefs should aim at the truth. Alternatively, one may propose an instrumentalist criterion which advocates a (conspiracy) theory or belief for its usefulness, for example in making predictions. Finally, while instrumentalism still has epistemic aims, we can also identify a more radical pragmatist view which focuses more generally on the consequences, for example political and social consequences, of holding a particular (conspiracy) theory or belief.

As mentioned, most of the criteria from the philosophical literature fit into the realist view. Within this view, we can distinguish three groups of criteria. First, we have criteria coming from the philosophy of science. These criteria have to do with the scientific methodology of theory choice, and here the question is how these play out when applied to conspiracy theories. Second, we have criteria dealing with motives. These can be the motives of the agents proposing a conspiracy theory, the motives of institutions relevant to the propagation of a conspiracy theory, or, finally, the motives of the agents the conspiracy theory is about. Third, there are a number of other criteria neither dealing with motives nor with scientific methodology. The picture arising from this way of organizing the criteria is presented in figure 1. The figure is not intended as a decision tree. Rather, it is more like an organized toolbox from which multiple tools may be chosen, depending, for example, on one’s philosophical commitments and one’s existing beliefs.

Figure 1: Overview of the criteria C1–C13 for evaluating conspiracy theories, grouped into realist criteria (concerning scientific methodology, motives, and other considerations) and non-realist criteria.

a. Criteria concerning Scientific Methodology

i. Internal Faults (C1)

Basham (2001, p.275) advocates skepticism of a conspiracy theory if it suffers from what he calls “internal faults,” among which he lists “problems with self-consistency, explanatory gaps, appeals to unlikely or obviously weak motives and other unrealistic psychological states, poor technological claims, and the theory’s own incongruencies with observed facts it grants (including failed predictions).” Räikkä (2009a, p.196f) also refers to a similar list of criteria. Basham thinks that this criterion, while seemingly straightforward, will already exclude many conspiracy theories. A historical example he mentions is the theory that identifies the antichrist of the biblical Book of Revelation as Adolf Hitler. According to Basham, the fact that Hitler is dead and the kingdom of God nowhere near shows that this theory has internal faults, presumably a big explanatory gap or failed prediction.

Note that the list of things mentioned by Basham as internal faults is rather diverse, and one can debate whether all of these faults should really be considered internal to the theory. More narrowly, one could restrict internal faults to problems with self-consistency. Most of the other elements mentioned by Basham return below as separate criteria. For instance, an appeal to “unlikely or obviously weak motives” is discussed as C5.

ii. Progress: Is the Conspiracy Theory Part of a Progressive Research Program? (C2)

Clarke (2002; 2007) sees conspiracy theories as degenerating research programs in the sense developed by Lakatos (1970). In Clarke’s description of a degenerating research program, “successful novel predictions and retrodictions are not made. Instead, auxiliary hypotheses and initial conditions are successively modified in light of new evidence, to protect the original theory from apparent disconfirmation” (Clarke 2002, p.136). By contrast, a progressive research program would make successful novel predictions and retrodictions. Clarke cites the Watergate conspiracy theory as an example of a progressive research program: It led the journalists to make successful predictions and retrodictions about the behavior of those involved in the conspiracy. By contrast, Clarke uses the conspiracy theory about Elvis Presley’s fake funeral as an example of a degenerating research program (p.136-7), since it did not come up with novel predictions that were confirmed, for example, concerning the unusual behavior of Elvis’s relatives. Going further, Clarke (2007) also views other conspiracy theories—the controlled demolition theory of 9/11, for instance—as only proto-theories, something that is not sufficiently worked out to count as a theoretical core of a degenerating or progressive research program. Proto-theories are similar to what Muirhead and Rosenblum (2019) call new conspiracism.

Pigden (2006, footnote 17 and p.29) criticizes Clarke for not providing any evidence that conspiracy theories are in fact degenerating research programs and points to the many conspiracy theories accepted by historians as counterevidence. In any case, we might consider evaluating a given conspiracy theory by trying to see to what extent it is, or is part of, a progressive or a degenerating research program. Furthermore, as Lakatos’s notion of a research program comes with a hard core—the central characteristic claims not up for modification—and a protective belt—auxiliary hypotheses which can be changed—applying this notion also gives us tools to analyze a conspiracy theory in more detail. Such an analysis might yield, for example, that the problematic aspects of a conspiracy theory all concern its protective belt rather than its hard core.

iii. Inference to the Best Explanation: Evidence, Prior, Relative and Posterior Probability (C3)

Dentith (2016) views conspiracy theories as inferences to the best explanation. To judge such inferences using a Bayesian framework, we need to look at the prior probability of the conspiracy theory, the prior probability of the evidence and its likelihood given the conspiracy theory, thereby allowing us to calculate the posterior probability of the conspiracy theory. Furthermore, we need to look at the relative probability of the conspiracy theory when comparing it to competing hypotheses explaining the same event. Crucial in this calculation is our estimation of the prior probability of the conspiracy theory, which Dentith thinks we usually set too low (p.584) because we tend to underestimate how often conspiracies occur in history.
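
Dentith’s proposal can be given a schematic Bayesian rendering (a minimal sketch; the symbols C for the conspiracy theory, E for the evidence and H for a rival hypothesis are illustrative and not Dentith’s own notation):

\[ P(C \mid E) \;=\; \frac{P(E \mid C)\, P(C)}{P(E)}, \qquad \text{to be compared with } P(H \mid E) \text{ for each rival hypothesis } H. \]

On this rendering, Dentith’s point about priors is that if \(P(C)\) is set too low because the historical frequency of conspiracies is underestimated, the posterior \(P(C \mid E)\) will be underestimated as well, however well the theory accounts for the evidence.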

There is some disagreement between authors about whether conspiracy theories may be selective in their choice of evidence. Hepfer (2015, p.78) warns against the selective acceptance of evidence, which he calls selective coherentism (p.92) and which, for Hepfer, explains, for example, the wealth of different conspiracy theories surrounding the assassination of John F. Kennedy. Dentith (2019, section 2), on the other hand, argues that scientific theories are also selective in their use of evidence, and that conspiracy theories are not different from other theories, such as scientific ones, in the way they use evidence. Dentith compares conspiracy theories about 9/11 to the work that historians usually do. In both cases, says Dentith, we see a selection of only part of the total evidence as salient.

Finally, Keeley (2003, p.106) considers whether lack of evidence for a conspiracy should count against a theory positing such a conspiracy. On the one hand, he points out that it is in general true that we should not confuse absence of evidence for a conspiracy with evidence of absence of a conspiracy. After all, since we are dealing with a conspiracy, we should expect that evidence will be hard to come by. This is also why falsifiability is in general not advocated as a criterion for evaluating conspiracy theories (see, e.g., Keeley 1999, p.121 and Basham 2003, p.93): In the case of conspiracy theories, something approaching unfalsifiability is a consequence of the theory. Nonetheless, Keeley (2003, p.106) thinks that if diligent efforts to find evidence for a conspiracy fail where similar efforts in other similar cases have succeeded, we are justified in lowering the credibility of the conspiracy theory.

iv. Errant Data (C4)

While the previous criterion already discussed how conspiracy theories relate to data, there is a particular kind of data that receives special attention both by conspiracy theorists and in the philosophical literature about conspiracy theories. Many conspiracy theories claim that they can explain “errant data” (Keeley, 1999, p.117), data which either contradicts the official theory or which the official theory leaves unexplained. According to Keeley (1999), conspiracy theories place great emphasis on errant data, an emphasis that also exists in periods of scientific innovation. However, Keeley thinks that conspiracy theories wrongly claim that errant data by itself is a problem for a theory, which Keeley thinks it is not, since not all the available data will in fact be true. Clarke (2002, p.139f) and Dentith (2019, section 3) are skeptical of Keeley’s argument: Clarke points out that the data labelled as “errant” will depend on the theory one adheres to, and Dentith thinks that conspiracy theories are no different from other theories in relation to such data.

Dentith (2014, 129ff), following Coady (2006c), points out that any theory, official or unofficial, will have errant data. While advocates of a conspiracy theory will point to data problematic for the official theory which the conspiracy theory can explain, there will usually also be data problematic to the conspiracy theory which the official theory can explain. As an example of data errant with regard to the official theory, Dentith mentions that the official theory about the assassination of John F. Kennedy does not explain why some witnesses heard more gunshots than the three gunshots Oswald is supposed to have fired. As an example of data errant with regard to a conspiracy theory, Dentith points out that some of the conspiracy theories about 9/11 cannot explain why there is a video of Osama Bin Laden claiming responsibility for the attacks. When it comes to evaluating a specific conspiracy theory, the conclusion is that we should be looking at the errant data of both the conspiracy theory and alternative theories.

b. Criteria Concerning Motives

i. Cui Bono: Who Benefits from the Conspiracy? (C5)

Hepfer (2015, p.98ff) uses the assassination of John F. Kennedy in 1963 to illustrate how motives enter into our evaluation of conspiracy theories. While there seems to be widespread agreement that the assassin was in fact Lee Harvey Oswald, conspiracy theories doubt the official theory that he was acting on his own. There are a number of possible conspirators with plausible motives that may have been behind Oswald: The military-industrial complex, the American mafia, the Russian secret service, the United States Secret Service and Fidel Castro. Which of these conspiracy theories we should accept also depends on how plausible we find the ascribed motives given our other beliefs about the world.

According to Hepfer (2015, p.98 and section 2.3), a conspiracy theory should be (a) clear about the motives or goals of the conspirators and (b) rational in the means-ends sense of rationality; that is, if successful, the conspiracy should further the goals the conspirators are claimed to have. If the goals of the conspirators are not explicitly part of the theory, we should be able to infer these goals, and they should be reasonable. Problematic conspiracy theories are those where the motives or goals of the conspirators are unclear, the goals ascribed to the conspirators conflict with our other knowledge about the goals of these agents, or a successful conspiracy would not further the goals the theory itself ascribes to the conspirators.

ii. Individual Trust (C6)

Trust plays a role in two different ways when it comes to conspiracy theories. First, Räikkä (2009b, section 4) raises the question of whether we can trust the motives of the author(s) or proponents of a conspiracy theory. Some conspiracy theorists may not themselves believe the theory they propose, and instead may have other motives for proposing the theory; for example, to manipulate the political debate or make money. Other conspiracy theorists may genuinely believe the conspiracy theory they propose, but the fact that the alleged conspirators are the political enemy of the theory’s proponent may cast doubt on the likelihood of the theory. The general question here is whether the author or proponent of a conspiracy theory has a motive to lie or mislead. Here, Räikkä uses as an example the conspiracy theory about global warming (p.462). If a person working for the fossil-fuel industry claims that there is a global conspiracy propagating the idea of global warming, the financial motive is clear. Conversely, people who reject a particular theory as “just” a conspiracy theory may also have a motive to mislead. As an example, Pigden discusses the case of Tony Blair, who labeled the idea that the Iraq war was fought for oil a mere conspiracy theory.

A second way in which trust enters into the analysis of conspiracy theories is in terms of epistemic authority. Many conspiracy theories refer to various authorities for the justification of certain claims. For instance, a 9/11 conspiracy theory may refer to a structural engineer who made a certain claim regarding the collapse of the World Trade Center. The question arises as to what extent we should trust claims of alleged epistemic authorities, that is, people who have relevant expertise in a particular domain. Levy (2007) takes a radically socialized view of knowledge: Since knowledge can only be produced by a complex network of inquiry in which the relevant epistemic authorities are embedded, a conspiracy theory conflicting with the official story coming out of this network is “prima facie unwarranted” (p.182, italics in the original). According to Levy, the best epistemic strategy is simply to “adjust one’s degree of belief in an explanation of an event or process to the degree to which the epistemic authorities accept that explanation” (p.190). Dentith (2018) criticizes Levy’s trust in epistemic authority. First, Dentith argues that because a conspiracy theory will usually involve claims connecting various disciplines, there is no obvious group of experts when it comes to evaluating it. Furthermore, Dentith points out that the fact that a theory has authority in the sense of being official does not necessarily mean that it has epistemic authority, a point Levy also makes. Related to our first point about trust, Dentith also points out that epistemic authorities might have a motive to mislead, for example, when funding sources have influenced their research. Finally, our trust in epistemic authority will also depend on the trust we place in the institutions accrediting expertise, and hence questions of individual trustworthiness relate to questions of institutional trustworthiness.

iii. Institutional Trust (C7)

As mentioned when discussing individual trust, when we want to assess the credibility of experts, part of that credibility judgment will depend on the extent to which we trust the institution accrediting the expertise, assuming there is such an institution to which the expert is linked. The question of institutional trust is relevant more generally when it comes to conspiracy theories, and this issue has been discussed at length in the philosophical literature on conspiracy theories.

The starting point of the discussion of institutional trust is Keeley (1999, p.121ff), who argues that the problem with conspiracy theories is that these theories cast doubt on precisely those institutions which are the guarantors of reliable data. If a conspiracy theory contradicts an official theory based on scientific expertise, this produces skepticism not only with regard to the institution of science, but may also produce skepticism with regard to other public institutions: for example, the press, which accepts the official story instead of uncovering the conspiracy, or the parliament and the government, which produce or propagate the official theory in the first place. Thus, the claim is that believing in a conspiracy theory implies a quite widespread distrust of our public institutions. If this implication is true, it can be used in two ways: Either to discredit the conspiracy theory, which is the route Keeley advocates, or to discredit our public institutions. In any case, our trust in our public institutions will influence the extent to which we hold a particular conspiracy theory to be likely. For this reason, both Keeley (1999, p.121ff) and Coady (2006a, p.10) think that conspiracy theories are more trustworthy in non-democratic societies.

Basham (2001, p.270ff) argues that it would be a mistake to simply assume our public institutions to be trustworthy and dismiss conspiracy theories. His position is one he calls “studied agnosticism” (p.275): In general, we are not in a position to decide for or against a conspiracy theory, except—and this is where the “studied” comes in—where a conspiracy theory can be dismissed due to internal faults (see C1). In fact, we are caught in a vicious circle: “We cannot help but assume an answer to the essential issue of how conspirational our society is in order to derive a well justified position on it” (p.274). Put differently, while an open society provides fewer grounds for believing in conspiracy theories, we cannot really know how open our society actually is (Basham 2003, p.99). In any case, an individual who tries to assess a particular conspiracy theory should thus also consider to what extent they trust or distrust our public institutions.

Clarke (2002, p.139ff) questions Keeley’s link between belief in conspiracy theories and general distrust in our public institutions. He claims that conspiracy theories actually do not require general institutional skepticism. Instead, in order to believe in a conspiracy theory, it will usually suffice to confine one’s skepticism to particular people and issues. Räikkä (2009a) also criticizes Keeley’s supposed link between conspiracy theories and institutional distrust, claiming that most conspiracy theories do not entail such pervasive institutional distrust, but that if such pervasive distrust were entailed by a conspiracy theory, it would lower the conspiracy theory’s credibility. A global conspiracy theory like the Flat Earth theory tends to involve more pervasive institutional distrust than a local conspiracy theory like the Watergate conspiracy, since it implicates multiple institutions from various societal domains. According to Clarke, even the latter does not have to engender institutional distrust with regard to the United States government as an institution, since distrust could remain limited to specific agents within the government.

c. Other Realist Criteria

i. Fundamental Attribution Error (C8)

Starting with Clarke (2002; see also his response to criticism in 2006), philosophers have discussed whether conspiracy theories commit the fundamental attribution error (FAE). In psychology, the fundamental attribution error refers to the human tendency to overestimate dispositional factors and underestimate situational factors in explaining the behavior of others. Clarke (p.143ff) claims that conspiracy theories commit this error: They tend to be dispositional explanations whereas official theories often are more situational explanations. As an example, Clarke considers the funeral of Elvis Presley. The official account is situational since it explains the funeral in terms of his death due to heart problems. On the other hand, the conspiracy theory which claims Elvis is still alive and staged his funeral is dispositional since it sees Elvis and his possible co-conspirators as having the intention to deceive the public.

Dentith (2016, p.580) questions whether conspiracy theories are generally more dispositional than other theories. Moreover, as in the case of 9/11, the official theory may itself be dispositional. Pigden (2006, footnotes 27 and 30, and p.29) is critical of the psychological literature about the FAE, claiming that “if we often act differently because of different dispositions, then the fundamental attribution error is not an error” (footnote 30). Pigden is also critical of Clarke’s application of the FAE to conspiracy theories: Given that conspiracies are common, what Pigden calls “situationism” is either false or it does not imply that conspiracies are unlikely. Hence, Pigden concludes, the FAE has no relevant implications for our thinking about conspiracy theories. Coady (2003) is also critical of the existence of the FAE. Furthermore, he claims that belief in the FAE is paradoxical in that it commits the FAE: Believing that people think dispositionally rather than situationally is itself dispositional thinking.

ii. Ontology: Existence Claims the Conspiracy Theory Makes (C9)

Some conspiracy theories claim the existence or non-existence of certain entities. Among the examples Hepfer (2015, p.45) cites is a theory by Heribert Illig that claims that the years between 614 and 911 never actually happened. Another example would be a theory claiming the existence of a perpetual motion machine that is kept secret. Both existence claims go against the scientific consensus of what exists and what does not. Hepfer (2015, p.42) claims that the more unusual a conspiracy theory’s existence claims are, the more we should doubt its truth. This is because of the ontological baggage (p.49) that comes with such existence claims: Accepting these claims will force us to revise a major part of our hitherto accepted knowledge, and the more substantial the revision needed, the more we should be suspicious of such a theory.

iii. Übermensch: Does the Conspiracy Theory Ascribe Superhuman Qualities to Conspirators? (C10)

Hepfer (2015, p.104) and Räikkä (2009a, p.197) note that some conspiracy theories ascribe superhuman qualities to the conspirators that border on divine attributes like omnipotence and omniscience. Examples here might be the idea that Freemasons, Jews or George Soros control the world economy or the world’s governments. Sometimes the superhuman qualities ascribed to conspirators are moral and negative, that is, conspirators are demonized (Hepfer, 2015, p.131f). The antichrist has not only been seen in Adolf Hitler but also in the pope. In general, the more extraordinary the qualities ascribed to the conspirators, the more they should lower the credibility of the conspiracy theory.

iv. Scale: The Size and Duration of the Conspiracy (C11)

The general claim here is that the more agents that are supposed to be involved in a conspiracy—its size—and the longer the conspiracy is supposed to be in existence—its duration—the less likely the conspiracy theory. Hepfer (2015, p.97) makes this point, and similarly Keeley (1999, p.122) says that the more institutions are supposed to be involved in a conspiracy, the less believable the theory should become. To some extent, this point is simply a matter of logic: The claim that A and B are involved in a conspiracy cannot be more likely than the claim that A is involved in a conspiracy. Similarly, the claim that a conspiracy has been going on for at least 20 years cannot be more likely than the claim that it has been going on for at least 10 years. In this sense, conspiracy theories involving many agents over a long period of time will tend to be less likely than conspiracy theories involving fewer agents over a shorter period of time. Furthermore, Grimes (2016) has conducted simulations showing that large conspiracies with 1000 agents or more are unlikely to succeed due to problems with maintaining secrecy.
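
Both points can be stated more formally (a minimal sketch; the independence model below is an illustrative idealization, not Grimes’s actual model). The logical point is just the monotonicity of probability under conjunction:

\[ P(A \wedge B) \;\le\; P(A). \]

For the secrecy point, if each of \(n\) conspirators independently keeps the secret over a given period with probability \(p < 1\), the probability that the conspiracy as a whole stays secret is \(p^n\); for \(p = 0.99\) and \(n = 1000\) this is roughly \(4 \times 10^{-5}\), which illustrates why, on such assumptions, scale counts against a conspiracy theory.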

Basham (2001, p.272; 2003, p.93) takes an opposing view by referring to social hierarchies and mechanisms of control, saying that “the more fully developed and high placed a conspiracy is, the more experienced and able are its practitioners at controlling information and either co-opting, discrediting, or eliminating those who go astray or otherwise encounter the truth” (Basham 2001, p.272). Dentith (2019, section 7) also counters the scale argument by pointing out that any time an institution is involved in a conspiracy, only very few people of that institution are actually involved in the conspiracy. This reduces the total number of conspirators and calls into question the relevance of Grimes’s results, of which Dentith is very critical.

d. Non-Realist Criteria

i. Instrumentalism: Conspiracy Theories as “as if” Theories (C12)

Grewal (2016) has shown how the philosophical opposition between scientific realism and various kinds of anti-realism also shows up in how we evaluate conspiracy theories. While most authors implicitly seem to interpret the claims of conspiracy theories along the lines of realism, Grewal has suggested that adherents of conspiracy theories may interpret or at least use these theories instrumentally. Viewed this way, conspiracy theories are “as if” theories which allow their adherents to make sense of a world that is causally opaque in a way that may often yield quite adequate predictions. “An assumption that the government operated as if it were controlled by a parallel and secret government may fit the historical data…while also providing better predictions than would, say, an exercise motivated by an analysis of constitutional authority or the statutory limitations to executive power” (p.36). As a more concrete example, Grewal mentions that “the most parsimonious way to understand financial decision making in the Eurozone might be to treat it as if it were run by and for the benefit of the Continent’s richest private banks” (p.37). Hence, our evaluation of a given conspiracy theory will also depend on basic philosophical commitments like what we expect our theories to do for us.

ii. Pragmatism (C13)

The previous arguments have mostly been epistemic or epistemological arguments, arguments that bear on the likelihood that a conspiracy theory is true or at least epistemically useful. However, similar to Blaise Pascal’s pragmatic argument for belief in God (Pascal, 1995), some arguments concerning conspiracy theories that have nothing to do with their epistemic value can be reinterpreted pragmatically as arguments about belief: Pragmatically, our belief or disbelief should depend on the consequences the (dis)belief has for us personally or for society more generally.

Basham (2001) claims that epistemic rejection of conspiracy theories will often not work, and we have to be agnostic about their truth. Still, we should reject them for pragmatic reasons because “[t]here is nothing you can do,” given the impossibility of finding out the truth, and “[t]he futile pursuit of malevolent conspiracy theory sours and distracts us from what is good and valuable in life” (p.277). Similarly, Räikkä (2009a) says that “a person who strives for happiness in her personal life should not ponder on vicious conspiracies too much” (p.199). Then again, contrary to Basham’s claim, what you can do with regard to conspiracy theories will depend on your role. As a journalist, you may decide to investigate certain claims, and Räikkä (2009a, p.199f) thinks that “it is important that in every country there are some people who are interested in investigative journalism and political conspiracy theorizing.”

Like journalists, politicians play a special role when it comes to conspiracy theories. Muirhead and Rosenblum (2016) argue that politicians should oppose conspiracy theories if they (1) are fueled by hatred, (2) present political opposition as treason and illegitimate, or (3) undermine epistemic or expert authority generally. Similarly, Räikkä (2018, p.213) argues that we must interfere with conspiracy theories when they include libels or hate speech. The presumed negative consequences of such conspiracy theories would be pragmatic reasons for disbelief.

Räikkä (2009b) lists both positive and negative effects of conspiracy theorizing, and we may apply these to concrete conspiracy theories to see which ones to believe in. The two positive effects he mentions are (a) that “the information gathering activities of conspiracy theorists and investigative journalists force governments and government agencies to watch out for their decisions and practices” (p.460) and (b) that conspiracy theories help to maintain openness in society. As negative effects, he mentions that a conspiracy theory “tends to undermine trust in democratic political institutions and its implications may be morally questionable, as it has close connections to populist discourse, as well as anti-Semitism and racism” (p.461). When a conspiracy theory blames certain people, Räikkä points out that there are moral costs for the people blamed. Furthermore, he thinks that the moral costs will depend on whether the people blamed are private individuals or public figures (p.463f).

5. Social and Political Effects of Conspiracy Theories

Räikkä (2009b, section 3) and Moore (2016, p.5) survey some of the social and political effects of conspiracy theories and conspiracy theorizing. One may look at the positive and negative effects of conspiracy theorizing in general, but it is also useful to consider the effects of a specific conspiracy theory, by looking at which effects mentioned below are likely to obtain for the conspiracy theory in question. Such an evaluation is related to the pragmatist evaluation criterion C13 just discussed, so some of the points mentioned there are revisited in what follows. Also, the effects of a conspiracy theory may be related to the type of conspiracy theory we are dealing with; see section 3 of this article.

On the positive side, conspiracy theories may be tools to uncover actual conspiracies, with the Watergate scandal as the standard example. When these conspiracies take place in our public institutions, conspiracy theories can thereby also help us to keep these institutions in check and to uncover institutional problems. Conspiracy theories can help us to remain critical of those holding power in politics, science and the media. One of the ways they can achieve this is by forcing these institutions to be more transparent. Since conspiracy theories claim the secret activity of certain agents, transparent decision making, open lines of communication and the public availability of documents are possible responses to conspiracy theories which can improve a democratic society, independent of whether they suffice to convince those believing conspiracy theories. We may call this the paradoxical effect of conspiracy theories: Conspiracy theories can help create or maintain the open society whose existence they deny.

Turning from positive to possible negative effects of conspiracy theories, a central point that already came up when discussing criterion C7 is institutional trust. Conspiracy theories can contribute to eroding trust in the institutions of politics, science and the media. The anti-vaccination conspiracy theory which claims that politicians and the pharmaceutical industry are hiding the ineffectiveness or even harmfulness of vaccines is an example of a conspiracy theory which can undermine public trust in science. Huneman and Vorms (2018) discuss how at times it can be difficult to draw the line between rational criticism of science and unwarranted skepticism. One fear is that eroding trust in institutions leads us via unwarranted skepticism to an all-out relativism or nihilism, a post-truth world where it suffices that a claim is repeated by a lot of people to make it acceptable (Muirhead and Rosenblum, 2019). Conspiracy theories have also been linked to increasing polarization, populism and racism (see Moore, 2016). Finally, as alluded to in section 1, Popper’s dislike of conspiracy theories was also because they create wrong ideas about the root causes of social events. By seeing social events as being caused by powerful people acting in secret, rather than as effects of structural social conditions, conspiracy theories arguably undermine effective political action and social change.

Bjerg and Presskorn-Thygesen (2017) have claimed that conspiracy theories cause a state of exception in the sense introduced by Giorgio Agamben. Just as terrorism undermines democracy in such a way that it licenses a state of political exception justifying undemocratic measures, a conspiracy theory undermines rational discourse in such a way that it licenses a state of epistemic exception justifying irrational measures. Those measures consist in placing conspiracy theories outside of official public discourse, labeling them as irrational, as “just” conspiracy theories, and as not worthy of serious critical consideration and scrutiny. Seen in this way, conspiracy theories appear as a form of epistemic terrorism, through their erosion of trust in our knowledge-producing institutions.

6. What to Do about Conspiracy Theories?

Besides deciding to believe or not to believe in a conspiracy theory (section 4), there are other actions one may consider with regard to conspiracy theories. Philosophical discussion has mainly focused on what actions governments and politicians can or should take.

The seminal article concerning the question of government action is by Sunstein and Vermeule (2009). Besides describing different psychological and social mechanisms underlying belief in conspiracy theories, they consider a number of policy and legal responses a government might take when it comes to false and harmful conspiracy theories: banning conspiracy theories, taxing the dissemination of conspiracy theories, counterspeech and cognitive infiltration of groups producing conspiracy theories. While dismissing the first two options, Sunstein and Vermeule consider counterspeech and cognitive infiltration in more detail. First, the government may itself speak out against a conspiracy theory by providing its own account. However, Sunstein and Vermeule think that such official counterspeech will have only limited success, in particular when it comes to conspiracy theories involving the government. Alternatively, the government may try to involve private parties to infiltrate online fora and discussion groups associated with conspiracy theories in order to introduce cognitive diversity, breaking up one-sided discussion and introducing non-conspiratorial views.

The proposals by Sunstein and Vermeule have led to strong opposition, most explicitly by Coady (2018). He points out that Sunstein and Vermeule too easily assume good intentions on the part of the government. Furthermore, these policy proposals, coming from academics who have also been involved in governmental policy making, will only confirm the fears of the conspiracy theorists that the government is involved in conspiratorial activities. If the cognitive infiltration proposed by Sunstein and Vermeule were discovered, conspiracy theorists would be led to believe in conspiracy theories even more. Put differently, we are running the risk of a pragmatic inconsistency: The government would try to deceive, via covert cognitive infiltration, a certain part of the population to make it believe that it does not deceive, that it is not involved in conspiracies.

As mentioned when discussing evaluation criterion C13 in section 4, Muirhead and Rosenblum (2016) consider three kinds of conspiracy theories that should give politicians cause for official opposition. These are conspiracy theories that fuel hatred, equate political opposition with treason, or express a general distrust of expertise. In these cases, politicians are called to speak truth to conspiracy, even though this might create a divide between them and their electorate. Muirhead and Rosenblum (2019) also consider what to do against new conspiracism (see the end of section 2). They note that such conspiracism is rampant in our society despite ever more transparency. As a countermeasure, they not only advocate speaking truth to conspiracy, but also what they call “democratic enactment,” by which they mean “a strenuous adherence to the regular processes and forms of public decision-making” (p.175).

Sunstein and Vermeule, as well as Muirhead and Rosenblum, agree that what we should do about conspiracy theories will depend on the theory we are dealing with. They do not advocate action against all theories about groups acting in secret to achieve some aim. However, when a theory is of a particularly problematic kind—false and harmful, fueling hatred, and so forth—political action may be needed.

7. Related Disciplines

Philosophy is not the only discipline dealing with conspiracy theories, and research from other fields is especially important when it comes to discussing what to do about them. We have already seen some ways in which philosophical thinking about conspiracy theories touches on other disciplines, in particular in the previous section’s discussion of political science and law. As for other related fields, psychologists have done extensive research on conspiratorial thinking and the psychological characteristics of people who believe in conspiracy theories. Historians have presented histories of conspiracy theories in the United States, the Arab world, and elsewhere. Sociologists have studied how conspiracy theories can target racial minorities, as well as the structure and group dynamics of specific conspiratorial milieus. Uscinski (2018) covers many of the relevant disciplines not addressed in this article and also includes an interdisciplinary history of conspiracy theory research.

8. References and Further Reading

To get an overview of the philosophical thinking about conspiracy theories, the best works to start with are Dentith (2014), Coady (2006a), and Uscinski (2018).

  • Basham, L. (2001). “Living with the Conspiracy”, The Philosophical Forum, vol. 32, no. 3, p.265-280.
  • Basham, L. (2003). “Malevolent Global Conspiracy”, Journal of Social Philosophy, vol. 34, no. 1, p.91-103.
  • Bjerg, O. and T. Presskorn-Thygesen (2017). “Conspiracy Theory: Truth Claim or Language Game?”, Theory, Culture and Society, vol. 34, no. 1, p.137-159.
  • Buenting, J. and J. Taylor (2010). “Conspiracy Theories and Fortuitous Data”, Philosophy of the Social Sciences, vol. 40, no. 4, p.567-578.
  • Clarke, St. (2002). “Conspiracy Theories and Conspiracy Theorizing”, Philosophy of the Social Sciences, vol. 32, no. 2, p.131-150.
  • Clarke, St. (2006). “Appealing to the Fundamental Attribution Error: Was it All a Big Mistake?”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.129-132.
  • Clarke, St. (2007). “Conspiracy Theories and the Internet: Controlled Demolition and Arrested Development”, Episteme, vol. 4, no. 2, p.167-180.
  • Coady, D. (2003). “Conspiracy Theories and Official Stories”, International Journal of Applied Philosophy, vol. 17, no. 2, p.197-209.
  • Coady, D., ed. (2006a). Conspiracy Theories: The Philosophical Debate. Ashgate.
  • Coady, D. (2006b). “An Introduction to the Philosophical Debate about Conspiracy Theories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.1-11.
  • Coady, D. (2006c). “Conspiracy Theories and Official Stories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.115-128.
  • Coady, D. (2018). “Cass Sunstein and Adrian Vermeule on Conspiracy Theories”, Argumenta, vol. 3, no. 2, p.291-302.
  • Dentith, M. (2014). The Philosophy of Conspiracy Theories. Palgrave Macmillan.
  • Dentith, M. (2016). “When Inferring to a Conspiracy might be the Best Explanation”, Social Epistemology, vol. 30, nos. 5-6, p.572-591.
  • Dentith, M. (2018). “Expertise and Conspiracy Theories”, Social Epistemology, vol. 32, no. 3, p.196-208.
  • Dentith, M. (2019). “Conspiracy theories on the basis of the evidence”, Synthese, vol. 196, no. 6, p.2243-2261.
  • Grewal, D. (2016). “Conspiracy Theories in a Networked World”, Critical Review, vol. 28, no. 1, p.24-43.
  • Grimes, D. (2016). “On the Viability of Conspiratorial Beliefs”, PLoS ONE, vol. 11, no. 1.
  • Hepfer, K. (2015). Verschwörungstheorien: Eine philosophische Kritik der Unvernunft. Transcript Verlag.
  • Huneman, Ph. and M. Vorms (2018). “Is a Unified Account of Conspiracy Theories Possible?”, Argumenta, vol. 3, no. 2, p.247-270.
  • Keeley, B. (1999). “Of Conspiracy Theories”, The Journal of Philosophy, vol. 96, no. 3, p.109-126.
  • Keeley, B. (2003). “Nobody Expects the Spanish Inquisition! More Thoughts on Conspiracy Theory”, Journal of Social Philosophy, vol. 34, no. 1, p.104-110.
  • Lakatos, I. (1970). “Falsification and the Methodology of Scientific Research Programmes”, in I. Lakatos and A. Musgrave, editors, Criticism and the Growth of Knowledge. Cambridge University Press, p.91-196.
  • Levy, N. (2007). “Radically Socialized Knowledge and Conspiracy Theories”, Episteme, vol. 4, no. 2, p.181-192.
  • Mandik, P. (2007). “Shit Happens”, Episteme, vol. 4, no. 2, p.205-218.
  • Moore, A. (2016). “Conspiracy and Conspiracy Theories in Democratic Politics”, Critical Review, vol. 28, no. 1, p.1-23.
  • Muirhead, R. and N. Rosenblum (2016). “Speaking Truth to Conspiracy: Partisanship and Trust”, Critical Review, vol. 28, no. 1, p.63-88.
  • Muirhead, R. and N. Rosenblum (2019). A Lot of People are Saying: The New Conspiracism and the Assault on Democracy. Princeton University Press.
  • Pascal, B. (1995). Pensées and Other Writings, H. Levi (trans.). Oxford University Press.
  • Pigden, Ch. (1995). “Popper Revisited, or What Is Wrong With Conspiracy Theories?”, Philosophy of the Social Sciences, vol. 25, no. 1, p.3-34.
  • Pigden, Ch. (2006). “Complots of Mischief”, in David Coady (ed.), Conspiracy Theories: The Philosophical Debate. Ashgate, p.139-166.
  • Pipes, D. (1997). Conspiracy: How the Paranoid Style Flourishes and Where It Comes From. Free Press.
  • Popper, K.R. (1966). The Open Society and Its Enemies, vol. 2: The High Tide of Prophecy, 5th edition, Routledge and Kegan Paul.
  • Popper, K.R. (1972). Conjectures and Refutations. 4th edition, Routledge and Kegan Paul.
  • Räikkä, J. (2009a). “On Political Conspiracy Theories”, Journal of Political Philosophy, vol. 17, no. 2, p.185-201.
  • Räikkä, J. (2009b). “The Ethics of Conspiracy Theorizing”, Journal of Value Inquiry, vol. 43, p.457-468.
  • Räikkä, J. (2018). “Conspiracies and Conspiracy Theories: An Introduction”, Argumenta, vol. 3, no. 2, p.205-216.
  • Sunstein, C. and A. Vermeule (2009). “Conspiracy Theories: Causes and Cures”, Journal of Political Philosophy, vol. 17, no. 2, p.202-227.
  • Uscinski, J.E., editor (2018). Conspiracy Theories and the People Who Believe Them. Oxford University Press.

Author Information

Marc Pauly
Email: m.pauly@rug.nl
University of Groningen
The Netherlands

René Descartes: Ethics

This article describes the main topics of Descartes’ ethics through discussion of key primary texts and corresponding interpretations in the secondary literature. Although Descartes never wrote a treatise dedicated solely to ethics, commentators have uncovered an array of texts that demonstrate a rich analysis of virtue, the good, happiness, moral judgment, the passions, and the systematic relationship between ethics and the rest of philosophy. The following ethical claims are often attributed to Descartes: the supreme good consists in virtue, which is a firm and constant resolution to use the will well; virtue presupposes knowledge of metaphysics and natural philosophy; happiness is the supreme contentment of mind which results from exercising virtue; the virtue of generosity is the key to all the virtues and a general remedy for regulating the passions; and virtue can be secured even though our first-order moral judgments never amount to knowledge.

Descartes’ ethics was a neglected aspect of his philosophical system until the late 20th century. Since then, standard interpretations of Descartes’ ethics have emerged, debates have ensued, and commentators have carved out key interpretive questions that anyone must answer in trying to understand Descartes’ ethics. For example: what kind of normative ethics does Descartes espouse? Are the passions representational or merely motivational states? At what point in the progress of knowledge can the moral agent acquire and exercise virtue? Is Descartes’ ethics as systematic as he sometimes seems to envision?

Table of Contents

  1. Methodology
    a. Identifying the Texts
    b. The Tree of Philosophy and Systematicity
    c. The Issue of Novelty
  2. The Provisional Morality
    a. The First Maxim
    b. The Second Maxim
    c. The Third Maxim
    d. The Fourth Maxim
  3. Cartesian Virtue
    a. The Unity of the Virtues
    b. Virtue qua Perfection of the Will
  4. The Epistemic Requirements of Virtue
    a. Knowledge of the Truth
      i. Theoretical Knowledge of the Truth
      ii. Practical Knowledge of the Truth
    b. Intellect, Will, and Degrees of Virtue
  5. Moral Epistemology
    a. The Contemplation of Truth vs. The Conduct of Life
    b. Moral Certainty and Moral Skepticism
    c. Virtue qua Resolution
  6. The Passions
    a. The Definition of the Passions
    b. The Function of the Passions
    c. Whether the Passions are Representational or Motivational
  7. Generosity
    a. Component One: What Truly Belongs to Us
    b. Acquiring Generosity
    c. Generosity and the Regulation of the Passions
    d. The Other-Regarding Nature of Generosity
  8. Love
    a. The Metaphysical Reading
    b. The Practical Reading
  9. Happiness
  10. Classifying Descartes’ Ethics
    a. Virtue Ethics
    b. Deontological Virtue Ethics
    c. Perfectionism
  11. Systematicity Revisited
    a. The Epistemological Reading
    b. The Organic Reading
  12. References and Further Reading
    a. Abbreviations
    b. Primary Sources
    c. Secondary Sources

1. Methodology

a. Identifying the Texts

When one considers the heyday of early modern ethics, the following philosophers come to mind: Hobbes, Hutcheson, Hume, Butler, and, of course, Kant. Descartes certainly does not. Indeed, many philosophers and students of philosophy are unaware that Descartes wrote about ethics. Standard interpretations of Descartes’ philosophy place weight on the Discourse on the Method, Rules for the Direction of the Mind, Meditations on First Philosophy (with the corresponding Objections and Replies), and the Principles of Philosophy. Consequently, Descartes’ philosophical contributions to the early modern period are typically understood as falling under metaphysics, epistemology, philosophy of mind, and natural philosophy. When commentators do consider Descartes’ ethical writings, these writings are often regarded as an afterthought to his mature philosophical system. In fact, his contemporaries often did not think much of his ethics. For example, Leibniz writes: “Descartes has not much advanced the practice of morality” (Letter to Molanus, AG: 241).

This view is understandable. Descartes does not have a treatise devoted solely to ethics. This lack, in and of itself, creates an interpretive challenge for the commentator. Where does one even find Descartes’ ethics? On close inspection of Descartes’ corpus, however, one finds him tackling a variety of ethical themes—such as virtue, happiness, moral judgment, the regulation of the passions, and the good—throughout his treatises and correspondence. The following texts are of central importance in unpacking Descartes’ ethics: the Discourse on the Method, the French Preface to the Principles, the Dedicatory Letter to Princess Elizabeth for the Principles, the Passions of the Soul, and, perhaps most importantly, the correspondence with Princess Elizabeth of Bohemia, Queen Christina of Sweden, and the envoy Pierre Chanut (for more details on these important interlocutors—Princess Elizabeth in particular—and how they all interacted with each other in bringing about these letters, see Shapiro [2007: 1–21]).

These ethical writings can be divided into an early period and a later—and possibly mature—period: the early period of the Discourse (1637), and the later period spanning (roughly) from the French Preface to the Principles through the Passions of the Soul (1644–1649).

b. The Tree of Philosophy and Systematicity

Why should we take seriously Descartes’ interspersed writings on ethics, especially since he did not take the time to write a systematic treatment of the topic? Indeed, one might think that we should not give much weight to Descartes’ ethical musings, given his expressed aversion to writing about ethics. In a letter to Chanut, Descartes writes:

It is true that normally I refuse to write down my thoughts concerning morality. I have two reasons for this. One is that there is no other subject in which malicious people can so readily find pretexts for vilifying me; and the other is that I believe only sovereigns, or those authorized by them, have the right to concern themselves with regulating the morals of other people. (Letter to Chanut 20 November 1647, AT V: 86–7/CSMK: 326)

However, one should take this text with a grain of salt. For in other texts, Descartes clearly does express a deep interest in ethics. Consider the famous tree of philosophy passage:

The whole of philosophy is like a tree. The roots are metaphysics, the trunk is physics, and the branches emerging from the trunk are all the other sciences, which may be reduced to three principal ones, namely, medicine, mechanics, and morals. By ‘morals’ I understand the highest and most perfect moral system, which presupposes a complete knowledge of the other sciences and is the ultimate level of wisdom.

Now just as it is not the roots or the trunk of a tree from which one gathers the fruit, but only the ends of the branches, so the principal benefit of philosophy depends on those parts of it which can only be learnt last of all. (French Preface to the Principles, AT IXB: 14/CSM I: 186)

This passage is surprising, to say the least. Descartes seems to claim that the proper end of his philosophical program is to establish a perfect moral system, as opposed to (say) overcoming skepticism, proving the existence of God, and establishing a mechanistic science. Moreover, Descartes seems to claim that ethics is systematically grounded in metaphysics, physics, medicine, and mechanics. Ethics is not supposed to float free from the metaphysical and scientific foundations of the system.

The tree of philosophy passage is a guiding text for many commentators in interpreting Descartes’ ethics, primarily because of its vision of philosophical systematicity (Marshall 1998, Morgan 1994, Rodis-Lewis 1987, Rutherford 2004, Shapiro 2008a). Indeed, the nature of the systematicity of Descartes’ ethics has been one of the main interpretive questions for commentators. Two distinct questions of systematicity are of importance here, which the reader should keep in mind as we engage Descartes’ ethical writings.

The first question of systematicity is internal to Descartes’ ethics itself. The early period of Descartes’ ethics, that is, the Discourse, is characterized by Descartes’ provisional morality. Broadly construed, the provisional morality seems to be a temporary moral guide—a stopgap, as it were—so that one can still live in the world of bodies and people while simultaneously engaging in hyperbolic doubt for the sake of attaining true and certain knowledge (scientia). As such, one might expect Descartes to revise the four maxims of the provisional morality once foundational scientia is achieved. Presumably, the perfect moral system that Descartes envisions in the tree of philosophy is not supposed to be a provisional morality. However, some commentators have claimed that the provisional morality is actually Descartes’ final moral view (Cimakasky & Polansky 2012). Others take a developmental view, arguing that Descartes’ later period, although related to the provisional morality, makes novel and distinct advancements (Marshall 1998, Shapiro 2008a).

The second question of systematicity concerns how Descartes’ ethics relates to the rest of his philosophy. To fully understand this question, we must distinguish two senses of ethics (la morale) in the tree of philosophy (Parvizian 2016). First, there is ethics qua theoretical enterprise, which concerns a theory of virtue, happiness, the passions, and other areas. Second, there is ethics qua practical enterprise: the exercise of virtue, the attainment of happiness, and the regulation of the passions. Thus, one may distinguish, for example, the question of whether a theory of virtue depends on metaphysics, physics, and the like, from whether exercising virtue depends on knowledge of metaphysics, physics, and the like. Commentators tend to agree that theoretical ethics presupposes the other parts of the tree, although how this is supposed to work out with respect to each field has not been fully fleshed out. For example: what is the relationship between mechanics and ethics? However, there is substantive disagreement about whether exercising virtue presupposes knowledge of metaphysics or contributes to knowledge of metaphysics.

c. The Issue of Novelty

Another broad interpretive question concerns how Descartes’ ethics relates to past ethical theories, and whether Descartes’ ethics is truly novel (as he sometimes claims). It is undeniable that Descartes’ ethics is, in certain respects, underdeveloped. Given that Descartes is well versed in the ethical theories of his predecessors, one might be tempted to fill in the details Descartes does not spell out by drawing on other sources (for example, the Stoics).

This is a complicated matter. As discussed in section 3, Descartes claims that he is advancing beyond ancient ethics, particularly with his theory of virtue. This is in line with Descartes’ more general tendency to claim that his philosophical system breaks from the ancient and scholastic philosophical tradition (Discourse I, AT VI: 4–10/CSM I: 112–115). However, in some texts Descartes suggests that he is building upon past ethical theories. For example, Descartes tells Princess Elizabeth:

To entertain you, therefore, I shall simply write about the means which philosophy provides for acquiring that supreme felicity which common souls vainly expect from fortune, but which can be acquired only from ourselves.

One of the most useful of these means, I think, is to examine what the ancients have written on this question, and try to advance beyond them by adding something to their precepts. For in this way we can make the precepts perfectly our own and become disposed to put them into practice. (Letter to Princess Elizabeth 21 July 1645, AT IV: 252/CSMK: 256; emphasis added)

Given such a text, a commentator would certainly be justified in drawing on other sources to illuminate Descartes’ ethical positions (such as the nature of happiness vis-à-vis the Stoics). Thus, although Descartes claims that he is breaking with the past, one still ought to explore the possibility that his ethics builds on, for example, the Aristotelian and Stoic ethics with which he was surely acquainted. Indeed, some commentators have argued that Descartes’ ethics is indebted to Stoicism (Kambouchner 2009, Rodis-Lewis 1957, Rutherford 2004 & 2014).

2. The Provisional Morality

Descartes’ first treatment of ethics appears in Discourse III. In the Discourse, Descartes lays out a method for conducting reason in order to acquire knowledge. This method requires an engagement with skepticism, which raises the question of how one should live in the world when one has yet to acquire knowledge and must suspend judgment about all dubitable matters. Perhaps to ward off the classic apraxia objection to skepticism, that is, the objection that one cannot engage in practical affairs if one is truly a skeptic (Marshall 2003), Descartes offers a “provisional morality” to help the temporary skeptic and seeker of knowledge still act in the world. Descartes writes:

Now, before starting to rebuild your house, it is not enough simply to pull it down, to make provision for materials and architects (or else train yourself in architecture), and to have carefully drawn up the plans; you must also provide yourself with some other place where you can live comfortably while building is in progress. Likewise, lest I should remain indecisive in my actions while reason obliged me to be so in my judgements, and in order to live as happily as I could during this time, I formed for myself a provisional moral code consisting of just three or four maxims, which I should like to tell you about. (Discourse III, AT VI: 22/CSM I: 122)

Notice that Descartes is ambiguous about whether the provisional morality consists of three or four maxims. There is some interpretive debate about this matter. We will discuss all four candidate maxims. Furthermore, we will bracket the issue of how to understand the provisional nature of this morality (see, for example, LeDoeuff 1989, Marshall 1998 & 2003, Morgan 1994, Shapiro 2008a). However, it should be noted that Descartes does refer to the provisional morality even in his later ethical writings, which suggests that the maxims are not entirely abandoned once skepticism is defeated (see Letter to Princess Elizabeth 4 August 1645, AT IV: 265–6/CSMK: 257–8).

a. The First Maxim

Maxim One can be divided into three claims:

M1a: The moral agent ought to obey the laws and customs of her country.

M1b: The moral agent ought to follow her religion.

M1c: In all other matters not addressed by M1a and M1b, the moral agent ought to follow the most commonly accepted and sensible opinions of her community. (Discourse III, AT VI: 23/CSM I: 122)

Descartes claims that during his skeptical period he found his own “opinions worthless” (Ibid.). In the absence of genuine moral knowledge to guide our practical actions, Descartes claims that the best we can do is conform to the moral guidelines offered in the laws and customs of one’s country, one’s religion, and the moderate and sensible opinions of one’s community. As Vance Morgan notes, M1 is strikingly anti-Cartesian, as it calls the moral agent to an “unreflective social conformism” (1994: 45). But as we see below, M1 does not seem to be entirely abandoned in Descartes’ later ethical writings.

b. The Second Maxim

Maxim Two states:

M2: The moral agent ought to be firm and decisive in her actions, and to follow even doubtful opinions once they are adopted, with no less constancy than if they were certain.

The motivation for M2 seems to be the avoidance of irresolution, which Descartes later characterizes as an anxiety of the soul in the face of uncertainty that prevents or delays the moral agent from taking up a course of action (Passions III.170, AT XI: 459–60/CSM I: 390–1). Descartes writes that, since “in everyday life we must often act without delay, it is a most certain truth that when it is not in our power to discern the truest opinions, we must follow the most probable” (Discourse III, AT VI: 25/CSM I: 123). To illustrate the usefulness of M2, Descartes discusses a traveler lost in a forest who does not know how to get out of the woods. Descartes’ advice is that the traveler should pick a route, even if it is uncertain, and resolutely stick to it:

Keep walking as straight as he can in one direction, never changing it for slight reasons even if mere chance made him choose it in the first place; for in this way, even if he does not go exactly where he wishes, he will at least end up in a place where he is likely to be better off than in the middle of a forest. (Ibid.)

Descartes claims that following M2 prevents the moral agent from undergoing regret and remorse. This is important because regret and remorse prevent the moral agent from attaining happiness. The notion of sticking firmly and constantly to one’s moral judgments, even if they are not certain, is a recurring theme in Descartes’ later ethical writings (it is indeed constitutive of his virtue theory).

c. The Third Maxim

Maxim Three states:

M3: The moral agent ought to master herself rather than fortune, and to change her desires rather than the order of the world.

The justification for M3 is that “nothing lies entirely within our power except our thoughts” (Ibid.). Knowing this truth will lead the moral agent to orient her desires properly, because she will have accepted that “after doing our best in dealing with matters external to us, whatever we fail to achieve is absolutely impossible so far as we are concerned” (Ibid.). To be clear, the claim is that we should consider “all external goods as equally beyond our power” (Discourse III, AT VI: 26/CSM I: 124). Unsurprisingly, Descartes claims that it takes much work to accept M3: “it takes long practice and repeated meditation to become accustomed to seeing everything in this light” (Ibid.). The claim that only our thoughts lie within our power—and that knowing this is a key to regulating the passions—is another recurring theme in Descartes’ ethical writings, particularly in his theory of the passions and generosity (see section 7).

d. The Fourth Maxim

When reading Discourse III, it seems that the provisional morality ends after the discussion of M3. Indeed, in some texts Descartes refers to “three rules of morality” (see, for instance, Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 257). However, Descartes does seem to add a final Fourth Maxim:

M4: The moral agent ought to devote her life to cultivating reason and acquiring knowledge of the truth, according to the method outlined in the Discourse.

M4 has a different status than the other three maxims: it is the “sole basis of the foregoing three maxims” (Discourse III, AT VI: 27/CSM I: 124). It seems that M4 is not truly a maxim of morality, however, but a re-articulation of Descartes’ commitment to acquiring genuine knowledge. The moral agent must not get stuck in skepticism, resorting to a life of provisional morality, but rather must persist in her search for knowledge of the truth (with the hope of establishing a well-founded morality—perhaps the “perfect moral system” of the tree of philosophy).

3. Cartesian Virtue

We now turn to Descartes’ later ethical writings (ca. 1644–1649). Arguably, the centerpiece of these writings is a theory of (moral) virtue. Though he formulates it in different ways, Descartes offers a consistent definition of virtue throughout his later ethical writings, namely, that virtue consists in the firm and constant resolution to use the will well (see Letter to Princess Elizabeth 18 August 1645, AT IV: 277/CSMK: 262; Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258; Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268; Passions II.148, AT XI: 442/CSM I: 382; Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325). This resolution to use the will well has two main features: (1) the firm and constant resolution to arrive at one’s best moral judgments, and (2) the firm and constant resolution to carry out these best moral judgments to the best of one’s abilities. It is important to note that the scope of the discussion here concerns moral virtue, not epistemic virtue (for an account of epistemic virtue see Davies 2001, Shapiro 2013, Sosa 2012).

a. The Unity of the Virtues

Descartes claims that his definition of virtue is wholly novel, and that he is breaking with Scholastic and ancient definitions of virtue:

He should have a firm and constant resolution to carry out whatever reason recommends without being diverted by his passions or appetites. Virtue, I believe, consists precisely in sticking firmly to this resolution; though I do not know that anyone has ever so described it. Instead, they have divided it into different species to which they have given various names, because of the various objects to which it applies. (Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258)

It is unclear what conception of virtue Descartes is criticizing here, but it is not far-fetched to suppose that he has in mind Aristotle’s account of virtue (arete) in the Nicomachean Ethics. For, according to Aristotle, there are a number of virtues—such as courage, temperance, and wisdom—each of which is a distinct characterological trait that consists in a mean between an excess and a deficiency, guided by practical wisdom (phronesis) (Nicomachean Ethics II, 1106b–1107a). For example, the virtue of courage is the mean between rashness and cowardice. Although Descartes is willing to use a similar conceptual apparatus for distinguishing different virtues—for example, he will talk extensively about a “distinct” virtue of generosity—at bottom he thinks that there are no strict metaphysical divisions between the virtues. All of the so-called virtues have one and the same nature—they are reducible to the resolution to use the will well. As he tells Queen Christina:

I do not see that it is possible to dispose it [that is, the will] better than by a firm and constant resolution to carry out to the letter all the things which one judges to be best, and to employ all the powers of one’s mind in finding out what these are. This by itself constitutes all the virtues. (Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)

Similarly, he writes in the Dedicatory Letter to Princess Elizabeth for the Principles:

The pure and genuine virtues, which proceed solely from knowledge of what is right, all have one and the same nature and are included under the single term ‘wisdom’. For whoever possesses the firm and powerful resolve always to use his reasoning powers correctly, as far as he can, and to carry out whatever he knows best, is truly wise, so far as his nature permits. And simply because of this, he will possess justice, courage, temperance, and all the other virtues; but they will be interlinked in such a way that no one virtue stands out among the others. (AT VIIIA: 2–3/CSM I: 191)

In these passages, Descartes is espousing a unique version of the unity of the virtues thesis. An Aristotelian unity of the virtues entails a reciprocity or inseparability among distinct virtues (Nicomachean Ethics VI, 1144b–1145a). According to Descartes, however, there is a unity of the “virtues” because, strictly speaking, there is only one virtue, namely, the resolution to use the will well (Alanen and Svensson 2007: fn. 8; Naaman-Zauderer 2010: 179–181). When the virtues are unified in this way, they exemplify wisdom.

b. Virtue qua Perfection of the Will

But what exactly is the nature of this resolution to use the will well? And how does one go about exercising this virtue? There are three main issues that need to be addressed in order to unpack Cartesian virtue. The first and foundational issue is Descartes’ rationale for locating virtue in a perfection of the will (section 3b). The second concerns the distinct epistemic requirements for virtue (section 4a). The third concerns Descartes’ characterization of virtue as a resolution of the will (section 5c).

According to Descartes, virtue is our “supreme good” (Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268, Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325; see also Svensson 2019b). One avenue for tackling this claim about the supreme good is to think about what we can be legitimately praised or blamed for (Parvizian 2016). According to Descartes, virtue is certainly something that we can be praised for, and vice is certainly something that we can be blamed for. Now, in order to be legitimately praised or blamed for some property, f, f must be fully within our control. If f is not fully within our control, then we cannot truly be praised or blamed for possessing f. For example, Descartes cannot be praised or blamed for being French. This is a circumstantial fact about Descartes that is wholly outside of his control. However, Descartes can be praised or blamed for his choice to join the army of Prince Maurice of Nassau, for this is presumably a decision within his control, and it is either virtuous or vicious.

But what does it mean for f to be within our control? According to Descartes, control needs to be understood vis-à-vis the freedom to dispose of our volitions. The will is the source of our power and control—it is through the will that we affirm and deny perceptions at the cognitive level, and correspondingly act at the bodily level (Fourth Meditation, AT VII: 57/CSM II: 40). We have control over f insofar as f is fully under the purview of the will. As such, the reason our supreme good lies in our will—or more specifically in a virtuous use of our will—is that our will is the only thing we truly have control over. At bottom, everything else—our bodies, historical circumstances, and even intellectual capacities—is beyond the scope of our finite power.
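The structure of this argument can be made fully explicit. What follows is a purely illustrative reconstruction, not Descartes’ own formulation: the predicate names (PraiseworthyFor, Controls, UseOfWill) are invented labels for the notions just discussed, and the two axioms correspond to the two claims of the preceding paragraphs. A minimal sketch in Lean:

    -- Illustrative reconstruction of the praise/blame argument for locating
    -- virtue in the will. The formalization and all names are supplied here
    -- for clarity; they are not Descartes' own.
    axiom Agent : Type
    axiom Property : Type
    axiom PraiseworthyFor : Agent → Property → Prop  -- a is legitimately praised or blamed for f
    axiom Controls : Agent → Property → Prop         -- f is fully within a's control
    axiom UseOfWill : Property → Prop                -- f is a use of the will

    -- P1: one is legitimately praised or blamed for f only if f is fully
    --     within one's control.
    axiom P1 : ∀ (a : Agent) (f : Property), PraiseworthyFor a f → Controls a f

    -- P2: only uses of the will are fully within one's control.
    axiom P2 : ∀ (a : Agent) (f : Property), Controls a f → UseOfWill f

    -- C: hence one is legitimately praised or blamed only for uses of the will.
    theorem onlyWillIsPraiseworthy (a : Agent) (f : Property)
        (h : PraiseworthyFor a f) : UseOfWill f :=
      P2 a f (P1 a f h)

On this reconstruction, the conclusion follows simply by chaining the two premises, which mirrors the way Descartes moves from the praise and blame principle to the claim that our supreme good, virtue, lies in the will alone.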

This is not to deny that things outside of our control might be perfections or goods. Descartes clearly recognizes that wealth, beauty, intelligence, and so forth are perfections, and desirable ones (Passions III.158, AT XI: 449/CSM I: 386). They can certainly contribute, in some sense, to well-being (see section 9). However, they are neither necessary nor sufficient for virtue and happiness. Descartes certainly allows for the possibility of the virtuous moral agent who is tortured “on the rack.” What matters is how we respond to the contingencies of the world, and how we incorporate contingent perfections into our life. Such responses are, of course, dependent on the will. Thus, it is through the will alone that we attain virtue.

As such, the will is also the only legitimate source of our personal value, and thus of justified self-esteem. Indeed, Descartes claims that it is through the will alone that we bear any similarity to God. For it is through the will that we can become masters of ourselves, just as God is a master of Himself (Passions III.152, AT XI: 445/CSM I: 384).

4. The Epistemic Requirements of Virtue

Although virtue is located in a perfection of the will, the intellect does have a role in Cartesian virtue. One cannot use the will well in practical affairs unless the will is guided by the right kinds of perceptions—leaving open for now what we mean by ‘right’ (Morgan 1994: 113–128; Shapiro 2008a: 456–7; Williston 2003: 308–310). Nonetheless, Descartes clearly claims that the virtuous will must be guided by the intellect:

Virtue unenlightened by the intellect is false: that is to say, the will and resolution to do well can carry us to evil courses, if we think them good; and in such a case the contentment which virtue brings is not solid. (Letter to Princess Elizabeth 4 August 1645, AT IV: 267/CSMK: 258)

More specifically, Descartes claims that we need knowledge of the truth to exercise virtue. However, Descartes recognizes that this knowledge cannot be comprehensive given our limited intellectual capacities:

It is true that we lack the infinite knowledge which would be necessary for a perfect acquaintance with all the goods between which we have to choose in the various situations of our lives. We must, I think, be contented with a modest knowledge of the most necessary truths. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

This section tackles the issue of how to judge well based on knowledge of the truth, that is, how to arrive at our best moral or practical judgments. Notice that this seems to mark a departure from the provisional morality of the Discourse, in particular M1, where our moral judgments are not guided by any knowledge, given the background engagement with skepticism.

a. Knowledge of the Truth

According to Descartes, in order to judge (and act) well we need to have knowledge of the truth in both a theoretical and practical sense. That is, we must assent to a certain set of truths at a theoretical level. However, in order to judge well in a moral situation, we need to have these truths ready at hand, that is, we need practical habits of belief.

i. Theoretical Knowledge of the Truth

In a letter to Princess Elizabeth, Descartes identifies six truths that we need in order to judge well in moral situations. Four of these truths are general in that they apply to all of our actions, and two are particular in that they apply to specific moral situations. Let us first examine what these truths are at a theoretical level, before turning to how they must be transformed into practical habits of belief.

Broadly put, the four general truths are:

T1: The existence of God

T2: The real distinction between mind and body

T3: The immensity of the universe

T4: The interconnectedness of the parts of the universe

The two particular truths are:

T5: The passions can misguide us.

T6: One can follow customary moral opinions when it is reasonable to do so.

On T1: Descartes claims that we must know that “there is a God on whom all things depend, whose perfections are infinite, whose power is immense and whose decrees are infallible” (Letter to Princess Elizabeth 15 September 1645, AT IV: 291/CSMK: 265). Knowing T1 is necessary for virtue, because it “teaches us to accept calmly all the things which happen to us as expressly sent by God,” and it engenders love for God in the moral agent (Ibid.).

On T2: Descartes says that we must know the nature of the soul, “that it subsists apart from the body, and is much nobler than the body, and that it is capable of enjoying countless satisfactions not to be found in this life” (Letter to Princess Elizabeth 15 September 1645, AT IV: 292/CSMK: 265–6). Knowing T2 is necessary for virtue because it prevents the moral agent from fearing death and helps her prioritize her intellectual pursuits over her bodily pursuits.

On T3: Descartes says that we must have a “vast idea of the extent of the universe” (Ibid.). He says this vast idea of the universe is conveyed in Principles III, and that it would be useful for moral agents to have read at least that part of his physics. Having knowledge of physics is necessary for virtue, because it prevents the moral agent from thinking that the universe was only created for her, thus wishing to “belong to God’s council” (Ibid.). It is important to note that this is one of the few places where Descartes draws out any connection between his physics and ethics, although he claims in a number of places that there are fundamental connections between these two disparate fields (Letter to Chanut 15 June 1646, AT IV: 441/CSMK: 289; Letter to Chanut 26 February 1649, AT V: 290–1/CSMK: 368).

On T4: Descartes says that “though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe” (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266). Knowing T4 is necessary for virtue, because it helps engender an other-regarding character—perhaps love and generosity—that is particularly relevant to Cartesian virtue. Indeed, virtue requires that the “interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person” (Ibid.).

On T5: Descartes seems to claim that the passions exaggerate the value of the goods they represent (and thus are misrepresentational), and that the passions correspondingly impel us to the pleasures of the body. Knowing T5 is necessary for virtue, because it helps us suspend our judgments when we are in the throes of the passions, so that we are not “deceived by the false appearances of the goods of this world” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–5/CSMK: 267).

On T6: Descartes claims that “one must also examine minutely all the customs of one’s place of abode to see how far they should be followed” (Ibid.). T6 is necessary for virtue because “though we cannot have demonstrations of everything, still we must take sides, and in matters of custom embrace the opinions that seem the most probable, so that we may never be irresolute when we need to act” (Ibid.). T6 seems to be a re-articulation of M1 in the provisional morality, specifically M1a above.

ii. Practical Knowledge of the Truth

T1–T6 must be known at a theoretical level. However, Descartes claims that we also need to transform T1–T6 into habits of belief:

Besides knowledge of the truth, practice is also required if one is to be always disposed to judge well. We cannot continually pay attention to the same thing; and so, however clear and evident the reasons may have been that convinced us of some truth in the past, we can later be turned away from believing it by some false appearances unless we have so imprinted it on our mind by long and frequent meditation that it has become a settled disposition within us. In this sense the Scholastics are right when they say that virtues are habits; for in fact our failings are rarely due to lack of theoretical knowledge of what we should do, but to lack of practical knowledge—that is, lack of a firm habit of belief. (Letter to Princess Elizabeth 15 September 1645, AT IV: 295–6/CSMK: 267)

The idea seems to be this: in order to actually judge well in a moral situation, T1–T6 need to be ready at hand. We need to be able to bring them swiftly and efficiently before the mind in order to respond properly. To do that, we must meditate on T1–T6 until they become habits of belief.

b. Intellect, Will, and Degrees of Virtue

There seems to be an inconsistency between Descartes’ theory of virtue and his account of the epistemic requirements for virtue. Descartes is committed to the following two claims:

  • (1) Theoretical and practical knowledge of T1–T6 is a necessary condition for virtue.
  • (2) One can be virtuous even if one does not have theoretical and practical knowledge of T1–T6.

We have seen that Descartes is committed to claim (1). But why is he committed to claim (2)? Consider the following passage from the Dedicatory Letter to Elizabeth:

Now there are two prerequisites for the kind of wisdom [that is, the unity of the virtues] just described, namely the perception of the intellect and the disposition of the will. But whereas what depends on the will is within the capacity of everyone, there are some people who possess far sharper intellectual vision than others. Those who are by nature somewhat backward intellectually should make a firm and faithful resolution to do their utmost to acquire knowledge of what is right, and always to pursue what they judge to be right; this should suffice to enable them, despite their ignorance on many points, to achieve wisdom according to their lights and thus to find great favour with God. (AT VIIIA: 3/CSM I: 191)

Descartes clearly commits himself to (2) in this passage. But in its continuation he offers a way to reconcile (1) and (2):

Nevertheless, they will be left far behind by those who possess not merely a very firm resolve to act rightly but also the sharpest intelligence combined with the utmost zeal for acquiring knowledge of the truth.

According to Descartes, virtue, at its essence, is a property of the will, not the intellect. Virtue consists in the firm and constant resolution to use the will well, which requires determining one’s best practical judgments and executing them to the best of one’s abilities. However, virtue comes in degrees, in accordance with what these best practical judgments are based on. The more knowledge one has (essentially, the more perfected one’s intellect is), the higher one’s degree of virtue.

In its ideal form, virtue presupposes, at a minimum, theoretical and practical knowledge of T1–T6 (and arguably one’s virtue would be improved by acquiring further relevant knowledge). But Descartes acknowledges that not everyone has the capacity, or perhaps is in a position, to acquire knowledge of the truth (for instance, the peasant). Nonetheless, Descartes does not want to exclude such moral agents from acquiring virtue. Virtue is not just for the philosopher. If such moral agents resolve to acquire as much knowledge as they can, and have a firm and constant resolution to use their will well (according to that knowledge), then they will secure virtue (even if they have the wrong metaphysics, epistemology, natural philosophy, or the like). Claims (1) and (2) are rendered consistent, then, once they are properly revised:

(1)*     Theoretical and practical knowledge of T1–T6 is a necessary condition for ideal virtue.

(2)*     One can be non-ideally virtuous while lacking full theoretical and practical knowledge of T1–T6, so long as one does one’s best to acquire as much relevant knowledge as one can and has the firm and constant resolution to use one’s will well accordingly.

It is clear that, whenever he uses the term ‘virtue,’ Descartes is usually talking about an ideal form of virtue. When he wants to highlight discussion of non-ideal forms of virtue, he is usually clear about his target (see, for example, Dedicatory Letter to Elizabeth, AT VIIIA: 2/CSM I: 190–1). In what follows, then, the reader should assume that the virtue being discussed is of the ideal variety, that is, it is based on some perfection of the intellect. As flagged earlier, there is, of course, disagreement about how much knowledge one must have to acquire certain virtues (for example, generosity).

5. Moral Epistemology

Does Descartes have a distinct moral epistemology? In the epistemology of the Meditations, Descartes distinguishes three different kinds of epistemic states: scientia/perfecte scire (perfect knowledge), cognitio (awareness), and persuasio (conviction or opinion). Broadly construed, the distinction between these three epistemic states is as follows. Scientia is an indefeasible judgment (it is true and absolutely certain), whereas cognitio and persuasio are both defeasible judgments. Nonetheless, cognitio is of a higher status than persuasio because there is—to some degree—better justification for cognitio than for persuasio. Persuasio is mere opinion or belief, whereas cognitio is an opinion or awareness backed by some legitimate justification. For example, the atheist geometer can have cognitio of the Pythagorean theorem, and can justify that cognitio with a geometrical proof. However, this cognitio fails to achieve the status of scientia because the atheist geometer is unaware of God, and thus does not know the Truth Rule, namely, that her clear and distinct perceptions are true because God is not a deceiver (Second Replies, AT VII: 141/CSM II: 101; Third Meditation, AT VII: 35/CSM II: 24; Fourth Meditation, AT VII: 60–1/CSM II: 41).

a. The Contemplation of Truth vs. The Conduct of Life

There is an important question that must be raised about the epistemic status of our best moral judgments. In what sense is a “best moral judgment” the best? Is a best moral judgment the best because it amounts to scientia, or does it fall short, amounting only to the best cognitio or persuasio? In the Meditations, where Descartes is engaged in a sustained hyperbolic doubt, he identifies two jointly necessary and sufficient conditions for knowledge in the strict sense, that is, scientia. On the standard interpretation, a judgment will amount to scientia when it is both true and absolutely certain (Principles I.45, AT VIIIA: 21–22/CSM I: 207). A judgment can meet the conditions of truth and absolute certainty when it is grounded in divinely guaranteed clear and distinct perceptions. Though the details are tricky, it is ultimately clear and distinct perceptions that make scientia indefeasible, because the intellect and its clear and distinct perceptions are epistemically guaranteed, in some sense, by God’s benevolence and non-deceptive nature. According to Descartes, however, the epistemic standards that we must abide by in theoretical matters or “the contemplation of the truth” should not be extended to practical matters or the “conduct of life.” As he writes in the Second Replies,

As far as the conduct of life is concerned, I am very far from thinking that we should assent only to what is clearly perceived. On the contrary, I do not think that we should always wait even for probable truths; from time to time we will have to choose one of many alternatives about which we have no knowledge, and once we have made our choice, so long as no reasons against it can be produced, we must stick to it as firmly as if it had been chosen for transparently clear reasons. (AT VII: 149/CSM II: 106)

This passage tells us that our best practical judgments cannot be the best in virtue of meeting the strict standards for scientia. This is because of the factors distinguishing the contemplation of truth from the conduct of life. First and foremost, unlike the contemplation of truth, where the goal is to arrive at a true and absolutely certain theoretical judgment that amounts to knowledge, the conduct of life is concerned with arriving at a best practical moral judgment for the sake of carrying out a course of action. Given that in morality we are ultimately concerned with action in the conduct of life, we must keep in mind that there is a temporally indexed window of opportunity to act in a moral situation (Letter to Princess Elizabeth 6 October 1645, AT IV: 307/CSMK: 269). If a moral agent tries to obtain clear and distinct perceptions in moral deliberation—something that can take weeks or even months to attain according to Descartes (Second Replies, AT VII: 131/CSM II: 94; Seventh Replies, AT VII: 506/CSM II: 344)—the opportunity to act will pass, and thus the moral agent will have failed to use her will well. In short, seeking clear and distinct perceptions in a moral situation is not advisable.

Second, and perhaps more importantly, it seems that we cannot attain clear and distinct perceptions in the conduct of life. Although our best moral judgments are guided by knowledge of the truth (which is presumably based on clear and distinct perceptions), we also base our best moral judgments, in part, on perceptions of the relevant features of the moral situation. These include information about other mind-body composites, bodies, and the consequences of our action. For example, in the famous trolley problem, the moral agent has to consider her perceptions of the people tied to the track, the train and the rails, and the possible consequences that follow from directing the train one way or another at the fork in the tracks. Such information about the other mind-body composites and bodies in this moral situation is ultimately provided by sensations. And sensations, according to Descartes, provide obscure and confused content to the mind about the nature of bodies (Principles I.45, AT VIIIA: 21–2/CSM I: 207–8; Principles I.66–68, AT VIIIA: 32–33/CSM I: 126–7). As for predicting the consequences of an action, this is done through the imagination, for these consequences do not yet exist. I need to represent to myself, through imagination, the potential consequences my action will produce in the long run. And such fictitious representations can only be obscure and confused. In short, given the imperfect kinds of perceptions that are involved in moral deliberation, our best moral judgments can never be fully grounded in clear and distinct perceptions.

b. Moral Certainty and Moral Skepticism

These perceptual facts help explain why Descartes claims that our best moral judgments can achieve only moral certainty, that is,

[C]ertainty which is sufficient to regulate our behaviour, or which measures up to certainty we have on matters relating to the conduct of life which we never doubt, though we know that it is possible, absolutely speaking, that they may be false. (Principles IV.204, AT VIIIA: 327/CSM I: 289, fn. 1; see also Schachter 2005, Voss 1993)

Given that even our best moral judgments can achieve only moral certainty, Descartes seems to be claiming that we cannot have first-order moral knowledge. That is, when I make a moral judgment of the form “I ought to f in moral situation x,” that moral judgment will never amount to knowledge in the strict sense. Nonetheless, morally certain moral judgments are certainly not persuasio, as they are backed with some justification. Thus, we should regard them as attaining the status of cognitio—just shy of scientia (but for different reasons than the cognitio of the atheist geometer, presuming that the moral agent has completed the Meditations and knows that her faculties—in normal circumstances—are reliable).

However, it is important to note that Descartes is not claiming that first-order moral knowledge is impossible tout court. That is, Descartes is not a non-cognitivist about moral judgments, claiming that moral judgments are neither true nor false. Cartesian moral judgments are truth-evaluable; that is, they are capable of being true or false. Descartes, then, is a cognitivist about moral judgments. As Descartes says, we must recognize that although our best practical moral judgments are morally certain, they may still, “absolutely speaking,” be false. If Descartes is a moral skeptic of some stripe, he should be understood as making a plausible claim about our limitations as finite minds. A finite mind, given its limited and imperfect perceptions, cannot attain first-order moral knowledge because it cannot ultimately know whether its first-order moral judgments are true or false. However, an infinite mind—God—surely knows whether the first-order moral judgments of finite minds are true or false. First-order moral knowledge is possible—finite minds just cannot attain it.

One final remark. One might resist the standard interpretation that we cannot have first-order moral knowledge by claiming that Descartes is not a moral skeptic at all, because the standards for knowledge shift from the contemplation of truth to the conduct of life. That is, Descartes might be an epistemic contextualist. Epistemic contextualism is the claim that the meaning of the term ‘knows’ shifts depending on the context, in the same way the meaning of the indexical ‘here’ shifts depending on the context. If Jones utters the sentence ‘Brown is here,’ the meaning of the sentence will shift depending on where Jones is when he utters it (Rysiew 2007). This kind of contextualist view has been suggested in passing by Lex Newman (2016), who argues that Descartes’ epistemic standards shift depending on whether he is doing metaphysics or science (Principles IV.205–6, AT VIIIA: 327–9/CSM I: 289–291). Although Newman does not extend this contextualist interpretation to Descartes’ moral epistemology, it would take only a few steps to do so. Nonetheless, it strains credulity to think that first-order moral judgments could ever meet the standards of scientia in the Meditations.

c. Virtue qua Resolution

We can now clarify why Descartes characterizes virtue in terms of a resolution. The conduct of life presents us with a unique epistemic challenge that does not arise in the contemplation of truth. That is: (1) we have a short window of opportunity to arrive at a moral judgment and then act, and (2) the perceptions that in part serve as the basis for our judgments are ultimately obscure and confused. These two features can give rise to irresolution. Irresolution, according to Descartes, is a kind of anxiety which causes a person to hold back from performing an action, creating a cognitive space for the person to make a choice (Passions III.170, AT XI: 459/CSM I: 390). As such, irresolution can be a good cognitive trait. However, irresolution becomes problematic when one has “too great a desire to do well” (Passions III.170, AT XI: 460/CSM I: 390). If one wants to arrive at the most perfect moral judgment (say, by grounding one’s moral judgments in clear and distinct perceptions), one will ultimately fall into an excessive kind of irresolution which prevents one from judging and acting at all. Given the nature of moral situations and what is at stake within them (essentially, how we ought to treat other people), the conduct of life is ripe for producing this excessive kind of irresolution. Arguably, we do want perfection in our moral conduct.

This is why Descartes says virtue involves a resolution: we need to establish a firm and constant resolve to arrive at our best moral judgments and to carry them out, even though we realize that these judgments are only morally certain and can be false. So long as we have this firm resolve (which is of course guided by knowledge of the truth), we can be assured that we have done our duty, even if we retrospectively determine that what we did was wrong. For we can control only our will—how our action plays out in the real world is beyond our control, and there is no way we can guarantee that we will always produce the right consequences. As Descartes tells Princess Elizabeth:

There is nothing to repent of when we have done what we judged best at the time when we had to decide to act, even though later, thinking it over at our leisure, we judge that we made a mistake. There would be more ground for repentance if we had acted against our conscience, even though we realized afterwards that we had done better than we thought. For we are only responsible for our thoughts, and it does not belong to human nature to be omniscient, or always to judge as well on the spur of the moment as when there is plenty of time to deliberate.

(Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269; see also Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)

Consistent with Descartes’ grounding of virtue in a perfection of the will, Descartes’ view of moral responsibility is that we are responsible only for what is truly under our control—that is, our thoughts (or more specifically, our volitions). Notice that the seeds of this full analysis of virtue qua resolution are already present in the provisional morality, namely, in M2.

6. The Passions

Strictly speaking, Descartes’ Passions of the Soul is not an ethical treatise. As Descartes writes, “my intention was to explain the passions only as a natural philosopher, and not as a rhetorician or even as a moral philosopher” (Prefatory Letters, AT XI: 326/CSM I: 327). Nonetheless, the passions have a significant status within Descartes’ ethics. At the end of the Passions, Descartes writes: “it is on the passions alone that all the good and evil of this life depends” and “the chief use of wisdom lies in its teaching us to be masters of our passions and to control them with such skill that the evils which they cause are quite bearable, and even become a source of joy” (Passions III.212, AT XI: 488/CSM I: 404). Thus, it is important to discuss the Cartesian passions in order to understand Descartes’ ethics. We will consider (1) the definition of the passions, (2) their function, and (3) whether they are merely motivational or also representational states.

a. The Definition of the Passions

Descartes identifies a general sense of the term ‘passion,’ which covers all states of the soul that are not, in any way, active. That is, passions are passive and thus are perceptions: “all our perceptions, both those we refer to objects outside us and those we refer to the various states of our body, are indeed passions with respect to our soul, so long as we use the term ‘passion’ in its most general sense” (Passions I.25, AT XI: 347–8/CSM I: 337). Thus, a general use of the term ‘passion’ would include the following kinds of perceptions: smells, sounds, and colors, which we refer to objects outside us, as well as hunger, pain, and thirst, which we refer to the states of our body (Passions I.29, AT XI: 350/CSM I: 339). However, the narrower and stricter sense of ‘passion’ examined in the Passions covers “those perceptions, sensations, or emotions of the soul which we refer particularly to it, and which are caused, maintained and strengthened by some movement of the spirits” (Passions I.27, AT XI: 349/CSM I: 338–9). Descartes identifies six primitive passions, out of which all of the other more complex passions are composed: wonder, love, hatred, joy, sadness, and desire. Each primitive and complex passion is distinguished from the others in terms of its physiological and causal basis (roughly, the animal spirits which give rise to it) and its cognitive nature and specific function (Passions II.51–2, AT XI: 371–2/CSM I: 349).

b. The Function of the Passions

Given Descartes’ general resistance to teleology, there is much to be said about how to understand the nature of Cartesian functions in general, and the function of the passions specifically (Brown 2012). Setting aside the issue of reconciling any metaphysical inconsistencies, it is clear that Descartes does think that the passions have some kind of function, and we must be mindful of this in interpreting him.

In Passions II.52, Descartes identifies the general function of the passions:

I observe, moreover, that the objects which stimulate the senses do not excite different passions in us because of differences in the objects, but only because of the various ways in which they may harm or benefit us or in general have importance for us. The function of all the passions consists solely in this, that they dispose our soul to want the things which nature deems useful for us, and to persist in this volition; and the same agitation of the spirits which normally causes the passions also disposes the body to make movements which help us to attain these things. (AT XI: 372/CSM I: 349, cf. Passions I.40, AT XI: 359/CSM I: 343)

Descartes claims that the general function of the passions is to dispose the soul to want the things which nature deems useful for us, and also to dispose the body to move in the appropriate ways so as to attain those things. Put more simply, the passions are designed to preserve the mind-body composite. How exactly that plays out will depend on the kind of passion under consideration. As Descartes writes in Passions I.40, fear disposes the soul to want to flee (a bodily action), and courage disposes the soul to want to fight (likewise a bodily action).

It is important to note that the general function assigned to the passions is similar to, but slightly different from, the one assigned to sensations in the Sixth Meditation. In the context of his sensory theodicy, Descartes writes: “the proper purpose of the sensory perceptions given me by nature is simply to inform the mind of what is beneficial or harmful for the composite of which the mind is a part” (Sixth Meditation, AT VII: 83/CSM II: 57). Supposing that Descartes does not have passions in mind in the Sixth Meditation, and given Descartes’ strict distinction between sensations and passions in the Passions, it seems that passions and sensations have different functions. The function of a passion is to dispose the soul to want what is beneficial for it, while the function of a sensation is to inform the soul of what is beneficial or harmful for it. This would suggest that sensations are perhaps representational states (De Rosa 2007a, Gottlieb & Parvizian 2018, Hatfield 2013, Simmons 1999), whereas the passions are merely motivational.

But matters are more complicated. A vexing issue for commentators has been how the passions fulfill their function of disposing the soul to want certain things. It is clear that the passions are motivational. The interpretive issue is whether they are merely motivational (and thus non-intentional, affective states), or whether they are, to some degree, representational as well. Settling this issue is important, because it helps clarify whether the passions ought to serve as guides to our practical behavior.

c. Whether the Passions are Representational or Motivational

The standard interpretation is that the passions are representational in addition to being motivational (Alanen 2003a & 2003b, Brown 2006, Clarke 2005, Franco 2015). Commentators sometimes describe the passions as being informative, but the best way to cash this out, given Descartes’ philosophy of mind, is in terms of representation. There are broad reasons for claiming that the passions are representational. If one thinks that the passions are a type of idea, then it seems that they must be representational, for Descartes claims in the Third Meditation that all ideas have intentionality: “there can be no ideas which are not as it were of things” (AT VII: 44/CSM II: 30). Moreover, Descartes seems to make a representationalist claim about the passions in T5: “all our passions represent to us the goods to whose pursuit they impel us as being much greater than they really are” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–295/CSMK: 267; see also Passions II.90, AT XI: 395/CSM I: 360). Strictly speaking, the claim here seems to be that the passions have representational content—they represent goods for the mind-body composite—but they are ultimately misrepresentational because they exaggerate the value of those goods. On this reading, however, the passions can become a guide to our survival and preservation once they are regulated by reason. According to John Marshall, once the passions are regulated they can become accurate representations of goods (1998: 119–125). As such, the passions can be reliable guides to our survival and preservation under the right circumstances.

Alternatively, it has been argued that, despite the textual evidence, Descartes’ considered view is that the passions are merely motivational states (Greenberg 2007, Brassfield 2012). Shoshana Brassfield has argued that the passions are motivational states which serve to strengthen and prolong certain thoughts that are good for the soul to cognitively sustain. When Descartes speaks of the passions representing, we need to re-read him as actually saying one of two things. First, he may be identifying a representational content (distinct from the passion itself) that a particular passion strengthens and prolongs. Second, he may be discussing how the passions lead us to exaggerate the value of objects in our judgments: by prolonging and strengthening certain judgments, they make us mistakenly affirm that a particular object is more valuable than it actually is.

The upshot of this type of motivational reading of the passions is that the passions are not intrinsic guides to our survival and preservation, and that we should suspend judgment about how to act when we are moved by the passions. It is reason alone that is the guide to what is good and beneficial for the mind-body composite. The passions are, in some sense, beneficial when they are regulated by reason (and thus lead, for example, to proper experiences of joy and thus happiness), but they are not beneficial when reason is guided by the passions.

7. Generosity

According to Descartes, generosity—a species of wonder—is both a passion and a virtue (Passions III.153, AT XI: 445–6/CSM I: 384; Passions III.161, AT XI: 453–4/CSM I: 387–8). Generosity transitions from a passion to a virtue once the passion becomes a habit of the soul (Passions III.161, AT XI: 453–4/CSM I: 387–8). Having already discussed the passions, we will focus here on generosity qua virtue. Generosity is the chief virtue in Descartes’ ethics because it is the “key to all the virtues and a general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes defines generosity as that:

Which causes a person’s self-esteem to be as great as it may legitimately be, [and] has only two components. The first consists in his knowing that nothing truly belongs to him but this freedom to dispose his volitions, and that he ought to be praised or blamed for no other reason than his using this freedom well or badly. The second consists in his feeling within himself a firm and constant resolution to use it well—that is, never to lack the will to undertake and carry out whatever he judges to be best. To do that is to pursue virtue in a perfect manner. (Passions III.153, AT XI: 445–6/CSM I: 384)

Generosity has two components. The first, broadly construed, consists in the knowledge that the only thing that truly belongs to us is our free will. The second, broadly construed, consists in feeling the firm and constant resolution to use this free will well.

a. Component One: What Truly Belongs to Us

What is particularly noteworthy about Descartes’ definition of generosity is the first component. Descartes claims that the first component of generosity consists in knowledge of the following proposition: the only thing that truly belongs to me is my free will. This is certainly a strong claim, which goes beyond Descartes’ account of the role of the will in virtue, as discussed in section 3. Recall that we claimed that virtue is grounded in a perfection of the will, because only our volitions are under our control. Descartes is taking this a step further here: he now seems to be claiming that the only thing that truly belongs to us is free will. In claiming that free will “truly belongs” to us, Descartes seems to be making a new metaphysical claim about the status of free will within a finite mind. But how exactly should this claim be interpreted?

The locutions “belongs” and “truly belongs” are typically used by Descartes to make a metaphysical claim about the essence of a substance. For example, Descartes claims that his body does not truly belong to his essence (see Sixth Meditation, AT VII: 78/CSM II: 54). If Descartes is making a claim about our metaphysical essence in the definition of generosity, then this claim seems to be in clear contradiction with the account of our metaphysical essence in the Meditations and Principles. There, Descartes claims that he is essentially a thinking thing, res cogitans (Second Meditation, AT VII: 28/CSM II: 19). Although a body also belongs to him in some sense (Sixth Meditation, AT VII: 80/CSM II: 56; see also Chamberlain 2019), he can still draw a real distinction between his mind and body, which implies that what truly belongs to him is thought. Thought, in the Meditations, has a broad scope: in particular, it includes both the intellect and the will, as well as all of the different types of perceptions and volitions that fall under these two faculties (Principles I.9, AT VIIIA: 7–9/CSM I: 195). However, in the first component of generosity, Descartes seems to be claiming that there is a particular kind of thought that truly belongs to us, namely, our free will and its corresponding volitions. As such, the moral agent is not, strictly speaking, a res cogitans; rather, she is a willing thing, res volans (Brown 2006: 25; Parvizian 2016).

Commentators have picked up on this difficulty in Descartes’ definition of generosity. There are two interpretations in the literature. The first takes “truly belongs” metaphysically, such that Descartes is making a claim about our true essence (Boehm 2014: 718–19). The second takes “truly belongs” evaluatively, in a way that approximates the standard account of why virtue is a perfection of the will: Descartes is making a claim about what is under our control—that is, our volitions—and thus what we can truly be praised and blamed for (Parvizian 2016). On this reading, there is a sense in which a human being is truly a res volans, but this does not metaphysically exclude the other properties of a res cogitans from its nature.

b. Acquiring Generosity

How is the chief virtue of generosity acquired? Descartes writes:

If we occupy ourselves frequently in considering the nature of free will and the many advantages which proceed from a firm resolution to make good use of it—while also considering, on the other hand, the many vain and useless cares which trouble ambitious people—we may arouse the passion of generosity in ourselves and then acquire the virtue. (Passions III. 161, AT XI: 453–4/CSM I: 388)

Here, Descartes claims that we need to reflect on two aspects of the will. First, we need to reflect on the very nature of the will, including facts such as its freedom, its being infinite in scope, and its different functional capacities. Second, we need to reflect on the advantages and disadvantages that come from using it well and poorly, respectively. This reflection on the advantages and disadvantages, interestingly, seems to require observation of other people’s behavior. As Descartes writes, we need to observe “the many vain and useless cares which trouble ambitious people,” which will help us appreciate the value and efficacy of the will. Some commentators have claimed that this process for acquiring generosity is exemplified in the Second or Fourth Meditation (Boehm 2014, Shapiro 2005), while others have argued that the meditator cannot engage in the process of acquiring generosity until after the Meditations have been completed (Parvizian 2016).

c. Generosity and the Regulation of the Passions

Throughout the Passions, Descartes indicates different ways to remedy the disorders of the passions. Descartes claims, for example, that the exercise of virtue is a remedy against the disorders of the passions, because then “his conscience cannot reproach him,” which allows the moral agent to be happy amidst “the most violent assaults of the passions” (Passions II.148, AT XI: 441–2/CSM I: 381–2). However, Descartes claims that generosity is a “general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes writes:

They [generous people] have mastery over their desires, and over jealousy and envy, because everything they think sufficiently valuable to be worth pursuing is such that its acquisition depends solely on themselves; over hatred of other people, because they have esteem for everyone; over fear, because of the self-assurance which confidence in their own virtue gives them; and finally over anger, because they have little esteem for everything that depends on others, and so they never give their enemies any advantage by acknowledging that they are injured by them. (Passions III.156, AT XI: 447–8/CSM I: 385)

Generosity is a general remedy for the disorders of the passions because it ultimately leads the moral agent to a proper conception of what she ought to esteem. At bottom, the problem of the passions is that they lead us to misunderstand the value of various external objects, and to place our own self-esteem in them. Once we understand that the only property that is truly valuable is a virtuous will, all the passions will be regulated.

d. The Other-Regarding Nature of Generosity

Although Descartes’ definition of generosity is certainly not standard, his account of how generosity manifests in the world does coincide with our standard intuitions about what generosity looks like. According to Descartes, the truly generous person is fundamentally other-regarding:

Those who are generous in this way are naturally led to do great deeds, and at the same time not to undertake anything of which they do not feel themselves capable. And because they esteem nothing more highly than doing good to others and disregarding their own self-interest, they are always perfectly courteous, gracious, and obliging to everyone. (Passions III.156, AT XI: 447–8/CSM I: 385)

The fundamental reason why the generous person is other-regarding is that she realizes that the very same thing that causes her own self-esteem, a virtuous will, is present or at least capable of being present in other people (Passions III.154, AT XI: 446–7/CSM I: 384). That is, since others have a free will, they are also worthy of value and esteem and thus must be treated in the best possible way. A fundamental task of the generous person is to help secure the conditions for other people to realize their potential to acquire a virtuous will.

8. Love

Love is a passion that has direct ethical implications for Descartes, for in its ideal form love is altruistic, other-regarding, and requires self-sacrifice. Descartes distinguishes between different kinds of love: affection, friendship, devotion, sensory love, and intellectual love (Passions II.83, AT XI: 389–90/CSM I: 357–8; Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). Here we examine love in general, which Descartes defines as follows:

Love is an emotion of the soul caused by a movement of the spirits, which impels the soul to join itself willingly to objects that appear to be agreeable to it. (Passions II.79, AT XI: 387/CSM I: 356)

In explicating what it means for the soul to join itself willingly to objects, Descartes writes:

In using the word ‘willingly’ I am not speaking of desire, which is a completely separate passion relating to the future. I mean rather the assent by which we consider ourselves henceforth as joined with what we love in such a manner that we imagine a whole, of which we take ourselves to be only one part, and the thing loved to be the other. (Passions II.80, AT XI: 387/CSM I: 356)

In short, love involves an expansion of the self. The lover regards herself and the beloved as two parts of a larger whole. But this raises an important question: is there a metaphysical basis for this part-whole relationship? Or is the part-whole relationship merely a product of the imagination and the will?

a. The Metaphysical Reading

One could try to provide a metaphysical basis for love by arguing that people are metaphysical parts of larger wholes. If so, then there would be metaphysical grounds “to justify a very expansive love” (Frierson 2002: 325). Indeed, Descartes seems to claim as much in his account of T4:

Though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe, and more particularly a part of the earth, the state, the society and the family to which we belong by our domicile . . . and the interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person. (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266)

Descartes uses suggestive metaphysical language here. Indeed, he claims that people cannot subsist without the other parts of the universe (which include other people), and that we are parts of a larger whole. Given this metaphysical basis of love, the interests of the whole should be preferred to the interests of any given part.

There are interpretive problems for a metaphysical basis of love, however. For one, Descartes does not spell out this metaphysical relation in any detail. Moreover, such a metaphysical relation seems to fly in the face of Descartes’ account of the independent nature of substances and the real distinction between minds and bodies. To say that persons (mind-body composites) are parts of larger wholes would seem to suggest that (1) mind-body composites are modes and not substances, and consequently that (2) there is no real distinction between mind-body composites.

b. The Practical Reading

Alternatively, one could give a practical basis for love by arguing that we ought to consider or imagine ourselves as parts of larger wholes, even though metaphysically we are not (Frierson 2002). As Descartes writes to Princess Elizabeth:

If we thought only of ourselves, we could enjoy only the goods which are peculiar to ourselves; whereas, if we consider ourselves as parts of some other body, we share also in the goods which are common to its members, without losing any of those which belong only to ourselves. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

There are practical reasons for loving others, because doing so allows us to partake in their joy and perfections. Of course, this raises the problem that we will also partake in their imperfections and suffering. On this issue Descartes writes:

With evils, the case is not the same, because philosophy teaches that evil is nothing real, but only a privation. When we are sad on account of some evil which has happened to our friends, we do not share in the defect in which this evil consists. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

On either the metaphysical or the practical reading, however, it is clear that love has a central role in Descartes’ ethics. According to Descartes, inculcating and exercising love is central to curbing one’s selfishness and to securing the happiness, well-being, and virtue of others (see also Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). For further important work on Cartesian love, see Frigo (2016), Boros (2003), Beavers (1989), and Williston (1997).

9. Happiness

In general, Descartes characterizes happiness as an inner contentment or satisfaction of the mind that results from the satisfaction of one’s desires. However, he draws a distinction between mere happiness (bonheur) and genuine happiness or blessedness (felicitas; félicité/béatitude). Mere happiness, according to Descartes, is a contentment of mind that is acquired through luck and fortune. This occurs through the acquisition of goods—such as honors, riches, and health—that do not truly depend on the moral agent (that is, on her will) but on external conditions. Although the moral agent is satisfying her desires, these desires are not regulated by reason; as such, she seeks things beyond her control. Blessedness, however, is a supreme contentment of mind achieved when the moral agent satisfies desires that are regulated by reason, and reason dictates that we ought to prioritize and desire virtue and wisdom. This is because virtue and wisdom are goods that truly depend on the moral agent, as they truly proceed from the right use of the will and do not depend on any external conditions. As Descartes writes:

We must consider what makes a life happy, that is, what are the things which can give us this supreme contentment. Such things, I observe, can be divided into two classes: those which depend on us, like virtue and wisdom, and those which do not, like honors, riches, and health. For it is certain that a person of good birth who is not ill, and who lacks nothing, can enjoy a more perfect contentment than another who is poor, unhealthy and deformed, provided the two are equally wise and virtuous. Nevertheless, a small vessel may be just as full as a large one, although it contains less liquid; and similarly if we regard each person’s contentment as the full satisfaction of all his desires duly regulated by reason, I do not doubt that the poorest people, least blest by nature and fortune, can be entirely content and satisfied just as much as everyone else, although they do not enjoy as many good things. It is only this sort of contentment which is here in question; to seek the other sort would be a waste of time, since it is not in our own power. (Letter to Princess Elizabeth 4 August 1645, AT IV: 264–5/CSMK: 257)

It is important to note that Descartes is not denying that honors, riches, beauty, health, and so on are genuine goods or perfections. Nor is he claiming that they are not desirable. Rather, he is merely claiming that such goods are neither necessary nor sufficient for blessedness. Virtue alone is necessary and sufficient for blessedness (Svensson 2015).

However, such external goods are conducive to well-being (the quality of life), and for that reason they are desirable (Svensson 2011). Compare a virtuous person, S, who is poor, unhealthy, and ugly with a virtuous person, S*, who is rich, healthy, and beautiful. On Svensson’s reading, S and S* will have the same degree of happiness. However, Descartes has room to acknowledge that S* has more well-being than S, because S* possesses more perfections.

10. Classifying Descartes’ Ethics

We have examined the main features of Descartes’ ethics. But what kind of ethics does Descartes espouse? There are three distinct classifications of Descartes’ ethics in the literature: virtue ethics, deontological virtue ethics, and perfectionism.

a. Virtue Ethics

Given that virtue is the undeniable centerpiece of Descartes’ ethics, it is natural to read Descartes as a virtue ethicist. Broadly construed, according to virtue ethics, the standard of morality is the possession of the right kinds of character traits (virtues), as opposed to producing the right sorts of consequences or following the right kinds of moral laws, duties, or rules.

Lisa Shapiro has argued that Descartes is a virtue ethicist. Her contention is that Descartes’ commitment to virtue (as opposed to happiness) being the supreme good makes Descartes a virtue ethicist (2008a: 454). On this view, the ultimate explanation for why an action is good or bad is whether it proceeds from virtue. This would place Descartes in the tradition of Aristotelian virtue ethics, but Shapiro notes that there are significant differences. For Aristotle, virtue must be successful: “virtue requires the world cooperate with our intentions” (2008a: 455). For Descartes, by contrast, given his moral epistemology, “good intentions are sufficient for virtue” (Ibid.).

b. Deontological Virtue Ethics

Noa Naaman-Zauderer (2010) agrees with Lisa Shapiro that Descartes is a virtue ethicist, due to his commitment to virtue being the supreme good. However, Naaman-Zauderer claims that Descartes has a deontological understanding of virtue, and thus that Descartes is actually a deontological virtue ethicist. Broadly construed, deontological ethics maintains that the standard of morality consists in the fulfillment of imperatives, duties, or ends.

Descartes indeed speaks of virtue in deontological terms. For example, he writes that the supreme good (virtue) is “undoubtedly the thing we ought to set ourselves as the goal of all our actions” (Letter to Princess Elizabeth 18 August 1645, AT IV: 275/CSMK: 261). According to Naaman-Zauderer, Descartes is claiming that we have a duty to practice virtue: “the practice of virtue as a command of reason, as a constitutive moral imperative that we must fulfill for its own sake” (2010: 185).

c. Perfectionism

Frans Svensson (2010; compare 2019a) has argued that Descartes is not a virtue ethicist and that other commentators have mistakenly classified him as such due to a misunderstanding of the criteria of virtue ethics. Recall that Shapiro and Naaman-Zauderer claim that Descartes must be a virtue ethicist (of whatever stripe) due to his claim that virtue is the supreme good. However, Svensson claims that virtue ethics, deontological ethics, and consequentialist ethics alike can, strictly speaking, admit that virtue is the supreme good, in the sense that virtue should be the goal of all of our actions (2010: 217). Descartes’ account of the supreme good, then, does not make him a virtue ethicist.

The criterion for being a virtue ethicist is, rather, that “morally right conduct should be grounded ultimately in an account of virtue or a virtuous agent” (Ibid. 218). This requires an explanation of the nature of virtue that does not depend on some independent account of morally right conduct. The problem, however, is that although Descartes agrees that virtue can be explained without reference to some independent account of morally right conduct, he departs from the virtue ethicist in holding that virtue is not constitutive of morally right conduct.

Instead, Svensson proposes that Descartes is committed to perfectionism. On this view, Descartes’ ethics demands that the moral agent pursue “everything in his power in order to successfully promote his own overall perfection as far as possible” (Ibid. 221). As such, Svensson claims that Descartes’ ethics is “outcome-based, rather than virtue-based, and it is thus best understood as a kind of teleological, or even consequentialist ethics” (Ibid. 224).

11. Systematicity Revisited

Are there systematic connections between Descartes’ ethics and his metaphysics, epistemology, and natural philosophy? There are broadly two answers to this question in the literature: the epistemological reading and the organic reading.

a. The Epistemological Reading

On the epistemological reading, the tree of philosophy conveys an epistemological order to Cartesian philosophy (Marshall 1998: 2–4, 59–60, 72–74; Morgan 1994: 204–211; Rutherford 2004: 190). One must learn philosophy in the following order: first metaphysics and epistemology, then physics, then the various sub-branches of natural philosophy, and finally ethics. As applied to ethics, proponents of the epistemological reading are primarily concerned with an epistemological order to ethics qua practical enterprise, not qua theoretical enterprise. For example, in order to acquire virtue and happiness, one must first have knowledge of metaphysics and epistemology. As Donald Rutherford writes, virtue and happiness “can be guaranteed only if reason itself has been perfected through the acquisition and proper ordering of intellectual knowledge” (2004: 190).

A consequence of the epistemological reading is that one cannot read any ethical practices into the Meditations. While there may be ethical themes in the Meditations, the meditator cannot acquire or exercise any kind of moral virtue (epistemic virtue is a separate matter). Whether virtue has a role in the Meditations has been a topic of contemporary debate; in particular, commentators have debated whether the meditator acquires the virtue of generosity. Recall that the virtue of generosity consists of two components: the knowledge that the only thing that truly belongs to us is free will, and the firm and constant resolution to use the will well. It seems that the meditator, in the Fourth Meditation, acquires both of these components through her reflection on the nature of the will and her resolution to use the will well. Indeed, Lisa Shapiro has argued extensively that this is exactly what is happening, and thus that generosity—and ethics more generally—has a role in the epistemic achievements of the meditator and the regulation of her passions. Omri Boehm (2014) has also argued that the virtue of generosity is actually acquired in the Second Meditation, via the cogito. Parvizian (2016) has argued, against Shapiro and Boehm, that generosity presupposes the knowledge of T1–T6 explained in section 4, to which the meditator does not have access by the Second or Fourth Meditation. Let us turn, then, to the view that ethics does have a role in metaphysics and epistemology.

b. The Organic Reading

On the organic reading, the tree of philosophy does not represent strict divisions between philosophical fields, and there is no strict epistemological order to philosophy, especially to ethics qua practical enterprise. Rather, the tree is organic. This reading is drawn from Lisa Shapiro (2008a), Genevieve Rodis-Lewis (1987), Amy Schmitter (2002), and Vance Morgan (1994) (although Morgan does not draw the same conclusion about ethics as the rest of these commentators). Morgan writes: “in a living organism such as a tree, all the connected parts grow simultaneously, dependent upon one another . . . hence the basic structure of the tree, branches and all, is apparent at the very early stage in its development” (1994: 25). Developing Rodis-Lewis’ interpretation, Shapiro writes:

Generosity is a seed-bearing fruit, and that seed, if properly cultivated, will grow into the tree of philosophy. In this way, morals is not simply one branch among the three branches of philosophy, but provides the ‘ultimate level of wisdom’ by leading us to be virtuous and ensuring the tree of philosophy continues to thrive. (2008a: 459)

Applying this view to generosity, Shapiro claims that generosity is “the key to Cartesian metaphysics and epistemology” (2008a: 459). Placing generosity in the Meditations has interpretive benefits. In particular, it may be able to explain the presence and regulation of the meditator’s passions from the First to the Sixth Meditation (Shapiro 2005). Moreover, it shows the deep systematicity of Descartes’ ethics, for ethical themes are present right at the foundations of the system.

12. References and Further Reading

a. Abbreviations

  • AG: Philosophical Essays (cited by page)
  • AT: Oeuvres de Descartes (cited by volume and page)
  • CSM: The Philosophical Writings of Descartes, vols. 1 & 2 (cited by volume and page)
  • CSMK: The Philosophical Writings of Descartes, vol. 3 (cited by page)

b. Primary Sources

  • Aristotle. (2000). Nicomachean Ethics. Translated by Terence Irwin (Second Edition). Indianapolis: Hackett.
  • Descartes, R. (1996), Oeuvres de Descartes. (C. Adam, & P. Tannery, Eds.) Paris: J. Vrin.
  • Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. I). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
  • Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. II). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
  • Descartes, R. (1991). The Philosophical Writings of Descartes: The Correspondence (Vol. III). (J. Cottingham, R. Stoothoff, D. Murdoch, & A. Kenny, Trans.) Cambridge: Cambridge University Press.
  • Leibniz, G. W. (1989). Philosophical Essays. Trans. Ariew, R. and Garber, D. Indianapolis: Hackett.
  • Princess Elizabeth and Descartes (2007). The Correspondence Between Princess Elizabeth of Bohemia and René Descartes. Edited and Translated by Lisa Shapiro. University of Chicago Press.

c. Secondary Sources

  • Alanen, L. (2003a). Descartes’s Concept of Mind. Harvard University Press.
  • Alanen, L. (2003b). “The Intentionality of Cartesian Emotions,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 107–27.
  • Alanen, L. and Svensson, F. (2007). “Descartes on Virtue,” in Hommage à Wlodek: Philosophical Papers Dedicated to Wlodek Rabinowicz, ed. by T. Rønnow-Rasmussen, B. Petersson, J. Josefsson, and D. Egonsson. http://www.fil.lu.se/hommageawlodek.
  • Ariew, R. (1992). “Descartes and the Tree of Knowledge,” Synthese, 1:101–116.
  • Beardsley, W. (2005), “Love in the Ruins: Passions in Descartes’ Meditations.” In J. Jenkins, J. Whiting, & C. Williams (Eds.), Persons and Passions: Essays in Honor of Annette Baier (pp. 34–47). Notre Dame: University of Notre Dame Press.
  • Beavers, A. F. (1989). “Desire and Love in Descartes’s Late Philosophy.” History of Philosophy Quarterly 6 (3):279–294.
  • Boehm, O. (2014), “Freedom and the Cogito,” British Journal for the History of Philosophy, 22: 704–724.
  • Boros, G. (2003). “Love as a Guiding Principle of Descartes’s Late Philosophy.” History of Philosophy Quarterly, 20(2), 149–163.
  • Brassfield, S. (2013), “Descartes and the Danger of Irresolution.” Essays in Philosophy, 14: 162–78.
  • Brown, D. J. (2006), Descartes and the Passionate Mind. Cambridge: Cambridge University Press.
  • Brown, D. J. (2012). Cartesian Functional Analysis. Australasian Journal of Philosophy 90 (1):75–92.
  • Chamberlain, C. (2019). “The body I call ‘mine’”: A sense of bodily ownership in Descartes. European Journal of Philosophy 27 (1): 3–24.
  • Cimakasky, J. & Polansky, R. (2012). “Descartes’ ‘Provisional Morality’.” Pacific Philosophical Quarterly 93 (3): 353–372.
  • Clarke, D. M. (2005). Descartes’s Theory of Mind. Oxford University Press.
  • Davies, R. (2001), Descartes: Belief, Skepticism, and Virtue. London: Routledge.
  • Des Chene, D. (2012), “Using the Passions,” in M. Pickavé and L. Shapiro (eds.), Emotion and Cognitive Life in Medieval and Early Modern Philosophy. Oxford: Oxford University Press.
  • De Rosa, R. (2007a). ‘The Myth of Cartesian Qualia,’ Pacific Philosophical Quarterly 88(2), pp. 181–207.
  • Franco, A. B. (2015). “The Function and Intentionality of Cartesian Emotions.” Philosophical Papers 44 (3): 277–319.
  • Franco, A. B. (2016). “Cartesian Passions: Our (Imperfect) Natural Guides Towards Perfection.” Journal of Philosophical Research 41: 401–438.
  • Frierson, P. (2002). “Learning to Love: From Egoism to Generosity in Descartes.” Journal of the History of Philosophy 40 (3): 313–338.
  • Frigo, A. (2016). “A Very Obscure Definition: Descartes’s Account of Love in the Passions of the Soul and Its Scholastic Background.” British Journal for the History of Philosophy 24 (6): 1097–1116.
  • Gottlieb, J. & Parvizian, S. (2018). “Cartesian Imperativism.” Pacific Philosophical Quarterly 99: 702–725.
  • Greenberg, S. (2007). “Descartes on the Passions: Function, Representation, and Motivation.” Noûs 41 (4): 714–734.
  • Hatfield, G. (2013). ‘Descartes on Sensory Representation, Objective Reality, and Material Falsity,’ in K. Detlefsen (ed.) Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press, pp. 127–150.
  • Kambouchner, D. (2009). Descartes, la philosophie morale, Paris: Hermann.
  • LeDoeuff, M. (1989). “Red Ink in the Margins,” in The Philosophical Imaginary, trans. C. Gordon. Stanford: Stanford University Press.
  • Marshall, J. (1998), Descartes’s Moral Theory. Ithaca: Cornell University Press.
  • Marshall, J. (2003). “Descartes’ Morale Par Provision,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books.191–238
  • Mihali, A. (2011). “Sum Res Volans: The Centrality of Willing for Descartes.” International Philosophical Quarterly 51 (2):149–179.
  • Morgan, V. G. (1994), Foundations of Cartesian Ethics. Atlantic Highlands: Humanities Press.
  • Murdoch, D. (1993). “Exclusion and Abstraction in Descartes’ Metaphysics,” The Philosophical Quarterly, 43: 38–57.
  • Naaman-Zauderer, N. (2010), Descartes’ Deontological Turn: Reason, Will, and Virtue in the Later Writings. Cambridge: Cambridge University Press.
  • Newman, L., “Descartes’ Epistemology,” The Stanford Encyclopedia of Philosophy (Winter 2014 Edition), Edward N. Zalta (ed.).
  • Parvizian, S. (2016). “Generosity, the Cogito, and the Fourth Meditation.” Res Philosophica 93 (1): 219–243.
  • Pereboom, D. (1994). “Stoic Psychotherapy in Descartes and Spinoza,” Faith and Philosophy, 11: 592–625.
  • Rodis-Lewis, G. (1957). La morale de Descartes [1st ed.]. Paris: Presses Universitaires de France.
  • Rodis-Lewis, G. (1987). “Le Dernier Fruit de la Métaphysique Cartésienne: la Générosité,” Études Philosophiques, 1: 43–54.
  • Rutherford, D. (2004), “On the Happy Life: Descartes vis-à-vis Seneca,” in S. K. Strange, & J. Zupko (eds.), Stoicism: Traditions and Transformations. Cambridge: Cambridge University Press.
  • Rutherford, D. (2014). “Reading Descartes as a Stoic: Appropriate Actions, Virtue, and the Passions,” Philosophie antique, 14: 129–155.
  • Rysiew, P., “Epistemic Contextualism,” The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.).
  • Schmitter, A. M. (2002), “Descartes and the Primacy of Practice: The Role of the Passions in the Search for Truth,” Philosophical Studies, 108: 99–108.
  • Shapiro, L. (1999), “Cartesian Generosity,” Acta Philosophica Fennica, 64: 249–75.
  • Shapiro, L. (2005), “What Are the Passions Doing in the Meditations?,” in J. Jenkins, J. Whiting, & C. Williams (eds.), Persons and Passions: Essays in Honor of Annette Baier. Notre Dame: University of Notre Dame Press.
  • Shapiro, L. (2008a), “Descartes’s Ethics,” In J. Broughton, & J. Carriero (eds.), A Companion to Descartes. Malden: Blackwell Publishing.
  • Shapiro, L. (2008b), “‘Turn My Will in Completely the Opposite Direction’: Radical Doubt and Descartes’s Account of Free Will,” in P. Hoffman, D. Owen, & G. Yaffe (eds.), Contemporary Perspectives on Early Modern Philosophy. Buffalo: Broadview Press.
  • Shapiro, L. (2011), “Descartes on Human Nature and the Human Good,” in C. Fraenkel, D. Perinetti, & J. E. H. Smith (eds.), The Rationalists: Between Tradition and Innovation. New York: Springer.
  • Shapiro, L. (2013), “Cartesian Selves,” in K. Detlefsen (ed.), Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press.
  • Simmons, A. (1999). ‘Are Cartesian Sensations Representational?’ Noûs 33(3), pp. 347–369.
  • Sosa, E. (2012), “Descartes and Virtue Epistemology,” in K. J. Clark & M. Rea (eds.), Reason, Metaphysics, and Mind: New Essays on the Philosophy of Alvin Plantinga. Oxford: Oxford University Press.
  • Svensson, F. (2010). “The Role of Virtue in Descartes’ Ethical Theory, Or: Was Descartes a Virtue Ethicist?” History of Philosophy Quarterly 27 (3): 215–236.
  • Svensson, F. (2011). “Happiness, Well-being, and Their Relation to Virtue in Descartes’ Ethics.” Theoria 77 (3): 238–260.
  • Svensson, F. (2015). “Non-Eudaimonism, the Sufficiency of Virtue for Happiness, and Two Senses of the Highest Good in Descartes’s Ethics.” British Journal for the History of Philosophy 23 (2): 277–296.
  • Svensson, F. (2019a). “A Cartesian Distinction in Virtue: Moral and Perfect,” in M. Reuter & F. Svensson (eds.), Mind, Body, and Morality: New Perspectives on Descartes and Spinoza. Routledge.
  • Svensson, F. (2019b). “Descartes on the Highest Good.” American Catholic Philosophical Quarterly 93 (4): 701–721.
  • Williston, B. (1997). Descartes on Love and/as Error. Journal of the History of Ideas 58 (3):429–444.
  • Williston, B. (2003). “The Cartesian Sage and the Problem of Evil” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 301–331.

Author Information

Saja Parvizian
Email: sparvizia@coastal.edu
Coastal Carolina University
U. S. A.