Fallacies

A fallacy is a kind of error in reasoning. The list of fallacies below contains 231 names of the most common fallacies, and it provides brief explanations and examples of each of them. Fallacious reasoning should not be persuasive, but it too often is.

The vast majority of the commonly identified fallacies involve arguments, although some involve only explanations, or definitions, or questions, or other products of reasoning. Some researchers, although not most, use the term “fallacy” very broadly to indicate any false belief or cause of a false belief. The long list below includes some fallacies of these sorts if they have commonly-known names, but most are fallacies that involve kinds of errors made while arguing informally in natural language, that is, in everyday discourse.

A charge of fallacious reasoning always needs to be justified. The burden of proof is on your shoulders when you claim that someone’s reasoning is fallacious. Even if you do not explicitly give your reasons, it is your responsibility to be able to give them if challenged.

A piece of reasoning can have more than one fault and thereby commit more than one fallacy. If it is fallacious, this can be because of its form or its content or both. The formal fallacies are fallacious only because of their logical form, their structure. The Slippery Slope Fallacy is an informal fallacy that has the following form: Step 1 often leads to step 2. Step 2 often leads to step 3. Step 3 often leads to…until we reach an obviously unacceptable step, so step 1 is not acceptable. That form occurs in both good arguments and faulty arguments. The quality of an argument of this form depends crucially on the strength of the probabilities in going from one step to the next. The probabilities involve the argument’s content, not merely its logical form.
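The dependence on content can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each step leads to the next independently of the others; the function name chain_probability and the sample probabilities are hypothetical and are not drawn from any particular treatment of the fallacy.

```python
from math import prod

def chain_probability(step_probabilities):
    """Probability of reaching the last step, ASSUMING the links are independent."""
    return prod(step_probabilities)

# Five links that each hold "often" (80% of the time): the chain as a whole is weak.
weak_chain = chain_probability([0.8] * 5)

# Five links that each hold almost always (99% of the time): the chain stays strong.
strong_chain = chain_probability([0.99] * 5)

print(f"{weak_chain:.3f}")    # 0.328
print(f"{strong_chain:.3f}")  # 0.951
```

Both chains have the same logical form. Only the probabilities, which belong to the content, separate the weak argument from the strong one.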

The discussion below that precedes the long alphabetical list of fallacies begins with an account of the ways in which the term “fallacy” is imprecise. Attention then turns to some of the competing and overlapping ways to classify fallacies of argumentation. Researchers in the field of fallacies disagree about which name of a fallacy is more helpful to use, whether some fallacies should be de-emphasized in favor of others, and which is the best taxonomy of the fallacies. Researchers in the field are also deeply divided about how to define the term “fallacy” itself and how to define certain fallacies. There is no agreement on whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics regarding whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.

Introduction
Taxonomy of Fallacies
Pedagogy
What is a Fallacy?
Other Controversies
Partial List of Fallacies
References and Further Reading

1. Introduction

The first known systematic study of fallacies was due to Aristotle in his De Sophisticis Elenchis (Sophistical Refutations), an appendix to his Topics, which is one of his six works on logic. The six are collectively known as the Organon. He listed thirteen types of fallacies. Very few advances were made for many centuries after this. After the Dark Ages, fallacies again were studied systematically in Medieval Europe. This is why so many fallacies have Latin names. The third major period of study of the fallacies began in the later twentieth century due to renewed interest from the disciplines of philosophy, logic, communication studies, rhetoric, psychology, and artificial intelligence.

The more frequent the error within public discussion and debate, the more likely it is to have a name. Nevertheless, there is no specific name for the fallacy of subtracting five from thirteen and concluding that the answer is seven, even though the error is common.

The term “fallacy” is not a precise term. One reason is that it is ambiguous. Depending on the particular theory of fallacies, it might refer either to (a) a kind of error in an argument, (b) a kind of error in reasoning (including arguments, definitions, explanations, questions, and so forth), (c) a false belief, or (d) the cause of any of the previous errors including what are normally referred to as “rhetorical techniques.”

Regarding (d), being ill, being hungry, being stupid, being hypercritical, and being careless are all sources of potential error in reasoning, so they could qualify as fallacies of kind (d), but they are not included in the list below, and most researchers on fallacies normally do not call them fallacies. These sources of errors are more about why people commit a fallacy than about what the fallacy is. On the other hand, wishful thinking, stereotyping, being superstitious, rationalizing, and having a poor sense of proportion also are sources of potential error and are included in the list below, though they would not be included in the lists of some researchers. Thus there is a certain arbitrariness to what appears in lists such as this. What have been left off the list below are the following persuasive techniques commonly used to influence others and to cause errors in reasoning: apple polishing, ridiculing, applying financial pressure, being sarcastic, selecting terms with strong negative or positive associations, using innuendo, weaseling, and using other propaganda techniques. Basing any reasoning primarily on the effectiveness of one or more of these techniques is fallacious.

The fallacy literature has given some attention to the epistemic role of reasoning. Normally, the goal in reasoning is to take the audience from not knowing to knowing, or from not being justified in believing something to being justified in believing it.

Reasoning validly is not a guarantee of avoiding a fallacy since begging the question is valid. Arriving at the truth is also not a guarantee. Truth can be obtained from bad reasoning if the errors cancel out.

In describing the fallacies below, the custom is followed of not distinguishing between a reasoner using a fallacy and the reasoning itself containing the fallacy.

Real arguments are often embedded within a very long discussion. Richard Whately, one of the greatest of the 19th-century researchers into informal logic, wisely said “A very long discussion is one of the most effective veils of Fallacy; …a Fallacy, which when stated barely…would not deceive a child, may deceive half the world if diluted in a quarto volume (an eight-page booklet).”

2. Taxonomy of Fallacies

The importance of understanding the common fallacy labels is that they provide an efficient way to communicate criticisms of someone’s reasoning. However, there are a number of competing and overlapping ways to classify the labels. The taxonomy of the fallacies is in dispute.

Multiple names of fallacies are often grouped together under a common name intended to bring out how the specific fallacies are similar. Here are three examples. (1) Fallacies of relevance include fallacies that occur due to reliance on an irrelevant reason. There are different kinds of these fallacies. Ad Hominem, Appeal to Pity, and Affirming the Consequent are all fallacies of relevance. (2) Accent, Amphiboly and Equivocation are examples of fallacies of ambiguity. (3) The fallacies of illegitimate presumption include Begging the Question, False Dilemma, No True Scotsman, Complex Question and Suppressed Evidence.

The fallacies of argumentation can be classified as either formal or informal. A formal fallacy can be detected by examining the logical form of the reasoning, whereas an informal fallacy usually cannot be detected this way because it depends upon the content of the reasoning and possibly the purpose of the reasoning. So, informal fallacies are errors of reasoning that cannot easily be expressed in our standard system of formal logic, the first-order predicate logic. The long list below contains very few formal fallacies. Fallacious arguments (as well as perfectly correct arguments) can be classified as deductive or inductive, depending upon whether the fallacious argument is most properly assessed by deductive standards or instead by inductive standards. Deductive standards demand deductive validity, but inductive standards require inductive strength such as making the conclusion more likely.

Fallacies of argumentation can be divided into other categories. Some classifications depend upon the psychological factors that lead people to use them. Those fallacies also can be divided into categories according to the epistemological factors that cause the error. For example, arguments depend upon their premises, even if a person has ignored or suppressed one or more of them, and a premise can be justified at one time, given all the available evidence at that time, even if we later learn that the premise was false. Also, even though appealing to a false premise is often fallacious, it is not if we are reasoning about what would have happened even if it did not happen.

3. Pedagogy

It is commonly claimed that giving a fallacy a name and studying it will help the student identify the fallacy in the future and will steer them away from using the fallacy in their own reasoning. As Steven Pinker says in The Stuff of Thought (p. 129),

If a language provides a label for a complex concept, that could make it easier to think about the concept, because the mind can handle it as a single package when juggling a set of ideas, rather than having to keep each of its components in the air separately. It can also give a concept an additional label in long-term memory, making it more easily retrievable than ineffable concepts or those with more roundabout verbal descriptions.

For pedagogical purposes, researchers in the field of fallacies disagree about the following topics: which name of a fallacy is more helpful to students’ understanding; whether some fallacies should be de-emphasized in favor of others; and which is the best taxonomy of the fallacies.

It has been suggested that, from a pedagogical perspective, having a representative set of fallacies pointed out to you in others’ reasoning is much more effective than your taking the trouble to learn the rules of avoiding all fallacies in the first place. But fallacy theory is criticized by some teachers of informal reasoning for its over-emphasis on poor reasoning rather than good reasoning. Do colleges teach Calculus by emphasizing all the ways one can make mathematical mistakes? Besides, the critics add, studying fallacies will make students overly critical. These critics want more emphasis on the forms of good arguments and on the implicit rules that govern proper discussion designed to resolve a difference of opinion.

4. What is a Fallacy?

Researchers disagree about how to define the very term “fallacy.” For example, most researchers say fallacies may be created unintentionally or intentionally, but some researchers say that a supposed fallacy created unintentionally should be called a blunder and not a fallacy.

Could there be a computer program, for instance, that could always successfully distinguish a fallacy from a non-fallacy? A fallacy is a mistake, but not every mistake is a fallacy.

Focusing just on fallacies of argumentation, some researchers define such a fallacy as an argument that is deductively invalid or that has very little inductive strength. Because examples of false dilemma, inconsistent premises, and begging the question are valid arguments in this sense, this definition misses some standard fallacies. Other researchers say a fallacy is a mistake in an argument that arises from something other than merely false premises. But the false dilemma fallacy is due to false premises. Still other researchers define a fallacy as an argument that is not good. Good arguments are then defined as those that are deductively valid or inductively strong, and that contain only true, well-established premises, but are not question-begging. A complaint with this definition is that its requirement of truth would improperly lead to calling too much scientific reasoning fallacious; every time a new scientific discovery caused scientists to label a previously well-established claim as false, all the scientists who used that claim as a premise would become fallacious reasoners. This consequence of the definition is acceptable to some researchers but not to others. Because informal reasoning regularly deals with hypothetical reasoning and with premises for which there is great disagreement about whether they are true or false, many researchers would relax the requirement that every premise must be true or at least known to be true. One widely accepted definition defines a fallacious argument as one that either is deductively invalid or is inductively very weak or contains an unjustified premise or that ignores relevant evidence that is available and that should be known by the arguer. Finally, yet another theory of fallacy says a fallacy is a failure to provide adequate proof for a belief, the failure being disguised to make the proof look adequate.

Other researchers recommend characterizing a fallacy as a violation of the norms of good reasoning, the rules of critical discussion, dispute resolution, and adequate communication. The difficulty with this approach is that there is so much disagreement about how to characterize these norms.

In addition, all the above definitions are often augmented with some remark to the effect that the fallacies need to be convincing or persuasive to too many people. It is notoriously difficult to be very precise about these notions. Some researchers in fallacy theory have therefore recommended dropping the notions altogether; other researchers suggest replacing them with the phrase “can be used to persuade.”

Some researchers complain that all the above definitions of fallacy are too broad and do not distinguish between mere blunders and actual fallacies, the more serious errors.

Researchers in the field are deeply divided, not only about how to define the term “fallacy” and how to define some of the individual fallacies, but also about whether there are necessary and sufficient conditions for distinguishing between fallacious and non-fallacious reasoning generally. Analogously, there is doubt in the field of ethics whether researchers should pursue the goal of providing necessary and sufficient conditions for distinguishing moral actions from immoral ones.

5. Other Controversies

How do we defend the claim that an item of reasoning should be labeled as a particular fallacy? A major goal in the field of informal logic is to provide some criteria for each fallacy. Schwartz presents the challenge this way:

Fallacy labels have their use. But fallacy-label texts tend not to provide useful criteria for applying the labels. Take the so-called ad verecundiam fallacy, the fallacious appeal to authority. Just when is it committed? Some appeals to authority are fallacious; most are not. A fallacious one meets the following condition: The expertise of the putative authority, or the relevance of that expertise to the point at issue, are in question. But the hard work comes in judging and showing that this condition holds, and that is where the fallacy-label-texts leave off. Or rather, when a text goes further, stating clear, precise, broadly applicable criteria for applying fallacy labels, it provides a critical instrument [that is] more fundamental than a taxonomy of fallacies and hence to that extent goes beyond the fallacy-label approach. The further it goes in this direction, the less it need to emphasize or even to use fallacy labels (Schwartz, 232).

The controversy here is the extent to which it is better to teach students what Schwartz calls “the critical instrument” than to teach the fallacy-label approach. Is the fallacy-label approach better for some kinds of fallacies than others? If so, which others?

One controversy involves the relationship between the fields of logic and rhetoric. In the field of rhetoric, the primary goal is to persuade the audience, not guide them to the truth. Philosophers concentrate on convincing the ideally rational reasoner.

Advertising in magazines and on television is designed to achieve visual persuasion. And a hug or the fanning of fumes from freshly baked donuts out onto the sidewalk are occasionally used for visceral persuasion. There is some controversy among researchers in informal logic as to whether the reasoning involved in this nonverbal persuasion can always be assessed properly by the same standards that are used for verbal reasoning.

6. Partial List of Fallacies

Consulting the list below will give a general idea of the kind of error involved in passages to which the fallacy name is applied. However, simply applying the fallacy name to a passage cannot substitute for a detailed examination of the passage and its context or circumstances because there are many instances of reasoning to which a fallacy name might seem to apply, yet, on further examination, it is found that in these circumstances the reasoning is really not fallacious.

Abusive Ad Hominem
Accent
Accentus
Accident
Ad Baculum
Ad Consequentiam
Ad Crumenum
Ad Hoc Rescue
Ad Hominem
Ad Hominem, Circumstantial
Ad Ignorantiam
Ad Misericordiam
Ad Novitatem
Ad Numerum
Ad Populum
Ad Verecundiam
Affirming the Consequent
Against the Person
All-or-Nothing
Ambiguity
Amphiboly
Anecdotal Evidence
Anthropomorphism
Appeal to Authority
Appeal to Consequence
Appeal to Emotions
Appeal to Force
Appeal to Ignorance
Appeal to Money
Appeal to Past Practice
Appeal to Pity
Appeal to Snobbery
Appeal to the Gallery
Appeal to the Masses
Appeal to the Mob
Appeal to the People
Appeal to the Stick
Appeal to Traditional Wisdom
Appeal to Vanity
Appeal to Unqualified Authority
Argument from Ignorance
Argument from Outrage
Argument from Popularity
Argumentum Ad ….
Argumentum Consensus Gentium
Availability Heuristic
Avoiding the Issue
Avoiding the Question
Bad Seed
Bald Man
Bandwagon
Begging the Question
Beside the Point
Biased Generalizing
Biased Sample
Biased Statistics
Bifurcation
Black-or-White
Caricaturization
Changing the Question
Cherry-Picking
Circular Reasoning
Circumstantial Ad Hominem
Clouding the Issue
Common Belief
Common Cause
Common Practice
Complex Question
Composition
Confirmation Bias
Confusing an Explanation with an Excuse
Conjunction
Consensus Gentium
Consequence
Contextomy
Converse Accident
Cover-up
Cum Hoc, Ergo Propter Hoc
Curve Fitting
Definist
Denying the Antecedent
Digression
Disregarding Known Science
Distraction
Division
Domino
Double Standard
Either/Or
Equivocation
Etymological
Every and All
Exaggeration
Excluded Middle
False Analogy
False Balance
False Cause
False Dichotomy
False Dilemma
False Equivalence
Far-Fetched Hypothesis
Faulty Comparison
Faulty Generalization
Faulty Motives
Formal
Four Terms
Gambler’s
Genetic
Group Think
Guilt by Association
Hasty Conclusion
Hasty Generalization
Heap
Hedging
Hooded Man
Hyperbolic Discounting
Hypostatization
Ideology-Driven Argumentation
Ignoratio Elenchi
Ignoring a Common Cause
Ignoring Inconvenient Data
Improper Analogy
Incomplete Evidence
Inconsistency
Inductive Conversion
Insufficient Statistics
Intensional
Invalid Reasoning
Irrelevant Conclusion
Irrelevant Reason
Is-Ought
Jumping to Conclusions
Lack of Proportion
Line-Drawing
Loaded Language
Loaded Question
Logic Chopping
Logical Fallacy
Lying
Maldistributed Middle
Many Questions
Misconditionalization
Misleading Accent
Misleading Vividness
Misplaced Burden of Proof
Misplaced Concreteness
Misrepresentation
Missing the Point
Mob Appeal
Modal
Monte Carlo
Name Calling
Naturalistic
Neglecting a Common Cause
No Middle Ground
No True Scotsman
Non Causa Pro Causa
Non Sequitur
Obscurum per Obscurius
One-Sidedness
Opposition
Outrage, Argument from
Over-Fitting
Overgeneralization
Oversimplification
Past Practice
Pathetic
Peer Pressure
Perfectionist
Persuasive Definition
Petitio Principii
Poisoning the Well
Popularity, Argument from
Post Hoc
Prejudicial Language
Proof Surrogate
Prosecutor’s Fallacy
Prosody
Quantifier Shift
Question Begging
Questionable Analogy
Questionable Cause
Questionable Premise
Quibbling
Quoting out of Context
Rationalization
Red Herring
Refutation by Caricature
Regression
Reification
Reversing Causation
Scapegoating
Scare Tactic
Scope
Secundum Quid
Selective Attention
Self-Fulfilling Prophecy
Self-Selection
Sharpshooter’s
Slanting
Slippery Slope
Small Sample
Smear Tactic
Smokescreen
Sorites
Special Pleading
Specificity
Stacking the Deck
Stereotyping
Straw Man
Style Over Substance
Subjectivist
Superstitious Thinking
Suppressed Evidence
Sweeping Generalization
Syllogistic
Texas Sharpshooter’s
Tokenism
Traditional Wisdom
Tu Quoque
Two Wrongs do not Make a Right
Undistributed Middle
Unfalsifiability
Unrepresentative Sample
Unrepresentative Generalization
Untestability
Vested Interest
Victory by Definition
Willed Ignorance
Wishful Thinking
You Too

Abusive Ad Hominem

See Ad Hominem.

Accent

The Accent Fallacy is a fallacy of ambiguity due to the different ways a word or syllable is emphasized or accented. Also called Accentus, Misleading Accent, and Prosody.

Example:

A member of Congress is asked by a reporter if she is in favor of the President’s new missile defense system, and she responds, “I’m in favor of a missile defense system that effectively defends America.”

With an emphasis on the word “favor,” her response is likely to be for the President’s missile defense system. With an emphasis, instead, on the word “effectively,” her remark is likely to be against the President’s missile defense system. And by using neither emphasis, she can later claim that her response was on either side of the issue. For an example of the Fallacy of Accent involving the accent of a syllable within a single word, consider the word “invalid” in the sentence, “Did you mean the invalid one?” When we accent the first syllable, we are speaking of a sick person, but when we accent the second syllable, we are speaking of an argument failing to meet the deductive standard of being valid. If we supply neither the accent nor any additional information to help the listener disambiguate, we are committing the Fallacy of Accent.

Accentus

See the Fallacy of Accent.

Accident

We often arrive at a generalization but don’t or can’t list all the exceptions. When we then reason with the generalization as if it has no exceptions, our reasoning contains the Fallacy of Accident. This fallacy is sometimes called the “Fallacy of Sweeping Generalization.”

Example:

People should keep their promises, right? I loaned Dwayne my knife, and he said he’d return it. Now he is refusing to give it back, but I need it right now to slash up my neighbors who disrespected me.

People should keep their promises, but there are exceptions to this generalization as in this case of the psychopath who wants Dwayne to keep his promise to return the knife.

Ad Hoc Rescue

Psychologically, it is understandable that you would try to rescue a cherished belief from trouble. When faced with conflicting data, you are likely to mention how the conflict will disappear if some new assumption is taken into account. However, if there is no good reason to accept this saving assumption other than that it works to save your cherished belief, your rescue is an Ad Hoc Rescue.

Example:

Yolanda: If you take four of these tablets of vitamin C every day, you will never get a cold.

Juanita: I tried that last year for several months, and still got a cold.

Yolanda: Did you take the tablets every day?

Juanita: Yes.

Yolanda: Well, I’ll bet you bought some bad tablets.

The burden of proof is definitely on Yolanda’s shoulders to prove that Juanita’s vitamin C tablets were probably “bad”—that is, not really vitamin C. If Yolanda can’t do so, her attempt to rescue her hypothesis (that vitamin C prevents colds) is simply a dogmatic refusal to face up to the possibility of being wrong.

Ad Hominem

Your reasoning contains this fallacy if you make an irrelevant attack on the person arguing and suggest that this attack undermines the argument itself. “Ad Hominem” means “to the person” as in being “directed at the person.” It is a smear tactic.

Example:

What she says about Johannes Kepler’s astronomy of the 1600s must be just so much garbage. Do you realize she’s only fifteen years old?

This attack may undermine the young woman’s credibility as a scientific authority, but it does not undermine her reasoning itself because her age is irrelevant to the quality of her reasoning about Kepler. That reasoning should stand or fall on the scientific evidence, not on the arguer’s age or anything else about her personally.

The major difficulty with labeling a piece of reasoning an Ad Hominem Fallacy is deciding whether the personal attack is relevant or irrelevant. For example, attacks on a person for their immoral sexual conduct are irrelevant to the quality of the person’s reasoning about Kepler’s astronomy, but they are relevant to arguments promoting the person for a leadership position in a church or mosque or city council.

If the fallacious reasoner points out irrelevant circumstances that the arguer is in, such as the arguer’s having a vested interest in people accepting the reasoning, then the Ad Hominem Fallacy also may be called a Circumstantial Ad Hominem. If the fallacious attack points out some despicable trait of the arguer, it also may be called an Abusive Ad Hominem. An Ad Hominem that attacks an arguer by attacking the arguer’s associates is called the Fallacy of Guilt by Association. If the fallacy focuses on a complaint about the origin of the arguer’s views, then it is a kind of Genetic Fallacy. If the fallacy is due to claiming the person does not practice what is preached, it is the Tu Quoque Fallacy. Two Wrongs do not Make a Right is also a type of Ad Hominem Fallacy.

The intentional use of the ad hominem fallacy is a tactic used by all dictators and authoritarian leaders. If you say something critical of them or their regime, their immediate response is to attack you as unreliable, or as being a puppet of the enemy, or as being a traitor.

Ad Hominem, Circumstantial

See Ad Hominem.

Ad Ignorantiam

See Appeal to Ignorance.

Ad Misericordiam

See Appeal to Emotions.

Ad Novitatem

See Bandwagon.

Ad Numerum

See Appeal to the People.

Ad Populum

See Appeal to the People.

Ad Verecundiam

See Appeal to Authority.

Affirming the Consequent

If you have enough evidence to affirm the consequent of a conditional and then suppose that as a result you have sufficient reason for affirming the antecedent, your reasoning contains the Fallacy of Affirming the Consequent. This formal fallacy is often mistaken for Modus Ponens, which is a valid form of reasoning also using a conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent. The following argument affirms the consequent that she does speak Portuguese. Its form is an invalid form.

Example:

If she’s Brazilian, then she speaks Portuguese. Hey, she does speak Portuguese. So, she is Brazilian.

Noticing that she speaks Portuguese suggests that she might be Brazilian, but it is weak evidence by itself, and if the argument is assessed by deductive standards, then it is deductively invalid. That is, if the arguer believes or suggests that her speaking Portuguese definitely establishes that she is Brazilian, then the argumentation contains the Fallacy of Affirming the Consequent.
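The invalidity of this form can be checked mechanically with a truth table. The sketch below is an illustration only. It assumes the conditional is read as the material conditional of propositional logic, and the helper names implies and is_valid are hypothetical.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def is_valid(premises, conclusion):
    """A form is valid iff no truth assignment makes every premise true
    and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False  # counterexample found
    return True

# p: "she is Brazilian"; q: "she speaks Portuguese"
conditional = lambda p, q: implies(p, q)

# If p then q; q; therefore p.
affirming_the_consequent = is_valid([conditional, lambda p, q: q], lambda p, q: p)

# If p then q; p; therefore q.
modus_ponens = is_valid([conditional, lambda p, q: p], lambda p, q: q)

print(affirming_the_consequent)  # False
print(modus_ponens)              # True
```

The counterexample row is the one where p is false and q is true: someone who is not Brazilian but does speak Portuguese.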

Against the Person

See Ad Hominem.

All-or-Nothing

See Black-or-White Fallacy.

Ambiguity

Any fallacy that turns on ambiguity. See the fallacies of Amphiboly, Accent, and Equivocation. Amphiboly is ambiguity of syntax. Equivocation is ambiguity of semantics. Accent is ambiguity of emphasis.

Amphiboly

This is an error due to taking a grammatically ambiguous phrase in two different ways during the reasoning.

Example:

Tests show that the dog is not part wolf, as the owner suspected.

Did the owner suspect the dog was part wolf, or was not part wolf? Who knows? The sentence is ambiguous, and needs to be rewritten to remove the fallacy. Unlike Equivocation, which is due to multiple meanings of a phrase, Amphiboly is due to syntactic ambiguity, that is, ambiguity caused by multiple ways of understanding the grammar of the phrase.

Anecdotal Evidence

This is fallacious generalizing on the basis of some story that provides an inadequate sample. If you discount evidence arrived at by systematic search or by testing in favor of a few firsthand stories, then your reasoning contains the fallacy of overemphasizing anecdotal evidence.

Example:

Yeah, I’ve read the health warnings on those cigarette packs and I know about all that health research, but my brother smokes, and he says he’s never been sick a day in his life, so I know smoking can’t really hurt you.

Anthropomorphism

This is the error of projecting uniquely human qualities onto something that isn’t human. Usually this occurs with projecting the human qualities onto animals, but when it is done to nonliving things, as in calling the storm cruel, the Pathetic Fallacy is created. It is also, but less commonly, called the Disney Fallacy or the Walt Disney Fallacy.

Example:

My dog is wagging his tail and running around me. Therefore, he knows that I love him.

The fallacy would be averted if the speaker had said “My dog is wagging his tail and running around me. Therefore, he is happy to see me.” Animals do not have the ability to ascribe knowledge to other beings such as humans. Your dog knows where it buried its bone, but not that you also know where the bone is.

Appeal to Authority

You appeal to authority if you back up your reasoning by saying that it is supported by what some authority says on the subject. Most reasoning of this kind is not fallacious, and much of our knowledge properly comes from listening to authorities. However, appealing to authority as a reason to believe something is fallacious whenever the authority appealed to is not really an authority in this particular subject, when the authority cannot be trusted to tell the truth, when authorities disagree on this subject (except for the occasional lone wolf), when the reasoner misquotes the authority, and so forth. Although spotting a fallacious appeal to authority often requires some background knowledge about the subject matter and who is claimed to be the authority, in brief it can be said we are reasoning fallaciously if we accept the words of a supposed authority when we should be suspicious of the authority’s words.

Example:

The moon is covered with dust because the president of our neighborhood association said so.

This is a Fallacious Appeal to Authority because, although the president is an authority on many neighborhood matters, you are given no reason to believe the president is an authority on the composition of the moon. It would be better to appeal to some astronomer or geologist. A TV commercial that gives you a testimonial from a famous film star who wears a Wilson watch and that suggests you, too, should wear that brand of watch is using a fallacious appeal to authority. The film star is an authority on how to act, not on which watch is best for you.

Appeal to Consequence

Arguing that a belief is false because it implies something you’d rather not believe. Also called Argumentum Ad Consequentiam.

Example:

That can’t be Senator Smith there in the videotape going into her apartment. If it were, he’d be a liar about not knowing her. He’s not the kind of man who would lie. He’s a member of my congregation.

Smith may or may not be the person in that videotape, but this kind of arguing should not convince us that it’s someone else in the videotape.

Appeal to Emotions

Your reasoning contains the Fallacy of Appeal to Emotions when someone’s appeal to you to accept their claim is accepted merely because the appeal arouses your feelings of anger, fear, grief, love, outrage, pity, pride, sexuality, sympathy, relief, and so forth. Example of appeal to relief from grief:

[The speaker knows he is talking to an aggrieved person whose house is worth much more than $100,000.] You had a great job and didn’t deserve to lose it. I wish I could help somehow. I do have one idea. Now your family needs financial security even more. You need cash. I can help you. Here is a check for $100,000. Just sign this standard sales agreement, and we can skip the realtors and all the headaches they would create at this critical time in your life.

There is nothing wrong with using emotions when you argue, but it’s a mistake to use emotions as the key premises or as tools to downplay relevant information. Regarding the Fallacy of Appeal to Pity, it is proper to pity people who have had misfortunes, but if as the person’s history instructor you accept Max’s claim that he earned an A on the history quiz because he broke his wrist while playing in your college’s last basketball game, then you’ve used the fallacy of appeal to pity.

Appeal to Force

See Scare Tactic.

Appeal to Ignorance

The Fallacy of Appeal to Ignorance comes in two forms: (1) Not knowing that a certain statement is true is taken to be a proof that it is false. (2) Not knowing that a statement is false is taken to be a proof that it is true. The fallacy occurs in cases where absence of evidence is not good enough evidence of absence. The fallacy uses an unjustified attempt to shift the burden of proof. The fallacy is also called “Argument from Ignorance.”

Example:

Nobody has ever proved to me there’s a God, so I know there is no God.

This kind of reasoning is generally fallacious. It would be proper reasoning only if the proof attempts were quite thorough, and it were the case that, if the being or object were to exist, then there would be a discoverable proof of this. Another common example of the fallacy involves ignorance of a future event: You people have been complaining about the danger of Xs ever since they were invented, but there’s never been any big problem with Xs, so there’s nothing to worry about.

Appeal to Money

The Fallacy of Appeal to Money uses the error of supposing that, if something costs a great deal of money, then it must be better, or supposing that if someone has a great deal of money, then they’re a better person in some way unrelated to having a great deal of money. Similarly it’s a mistake to suppose that if something is cheap it must be of inferior quality, or to suppose that if someone is poor financially then they’re poor at something unrelated to having money.

Example:

He’s rich, so he should be the president of our Parents and Teachers Organization.

Appeal to Past Practice

See Appeal to the People.

Appeal to Pity

See Appeal to Emotions.

Appeal to Snobbery

See Appeal to Emotions.

Appeal to the Gallery

See Appeal to the People.

Appeal to the Masses

See Appeal to the People.

Appeal to the Mob

See Appeal to the People.

Appeal to the People

If you suggest too strongly that someone’s claim or argument is correct simply because it’s what most everyone believes, then your reasoning contains the Fallacy of Appeal to the People. Similarly, if you suggest too strongly that someone’s claim or argument is mistaken simply because it’s not what most everyone believes, then your reasoning also uses the fallacy. Agreement with popular opinion is not necessarily a reliable sign of truth, and deviation from popular opinion is not necessarily a reliable sign of error, but if you assume it is and do so with enthusiasm, then you are using this fallacy. It is essentially the same as the fallacies of Ad Numerum, Appeal to the Gallery, Appeal to the Masses, Argument from Popularity, Argumentum ad Populum, Common Practice, Mob Appeal, Past Practice, Peer Pressure, and Traditional Wisdom. The “too strongly” mentioned above is important in the description of the fallacy because what most everyone believes is, for that reason, somewhat likely to be true, all things considered. However, the fallacy occurs when this degree of support is overestimated.

Example:

You should turn to channel 6. It’s the most watched channel this year.

This is fallacious because of its implicitly accepting the questionable premise that the most watched channel this year is, for that reason alone, the best channel for you. If you stress the idea of appealing to a new idea held by the gallery, masses, mob, peers, people, and so forth, then it is a Bandwagon Fallacy.

Appeal to the Stick

See Appeal to Emotions (fear).

Appeal to Unqualified Authority

See Appeal to Authority.

Appeal to Vanity

See Appeal to Emotions.

Argument from Ignorance

See Appeal to Ignorance.

Argument from Outrage

See Appeal to Emotions.

Argument from Popularity

See Appeal to the People.

Argumentum Ad ….

See Ad …. without the word “Argumentum.”

Argumentum Consensus Gentium

See Appeal to Traditional Wisdom.

Availability Heuristic

We have an unfortunate instinct to base an important decision on an easily recalled, dramatic example, even though we know the example is atypical. It is a specific version of the fallacy of Confirmation Bias.

Example:

I just saw a video of a woman dying by fire in a car crash because she was unable to unbuckle her seat belt as the flames increased in intensity. So, I am deciding today no longer to wear a seat belt when I drive.

This reasoning commits the Fallacy of the Availability Heuristic because the reasoner would realize, if he would stop and think for a moment, that many more lives are saved by wearing seat belts than by not wearing them, and that the woman unable to unbuckle her seat belt in the car crash is an atypical case. The name of this fallacy is not very memorable, but it is in common use.

Avoiding the Issue

A reasoner who is supposed to address an issue but instead goes off on a tangent is properly accused of using the Fallacy of Avoiding the Issue. Also called missing the point, straying off the subject, digressing, and not sticking to the issue.

Example:

A city official is charged with corruption for awarding contracts to his wife’s consulting firm. In speaking to a reporter about why he is innocent, the city official talks only about his wife’s conservative wardrobe, the family’s lovable dog, and his own accomplishments in supporting Little League baseball.

However, the fallacy isn’t used by a reasoner who says that some other issue must first be settled and then continues by talking about this other issue, provided the reasoner is correct in claiming this dependence of one issue upon the other.

Avoiding the Question

The Fallacy of Avoiding the Question is a type of Fallacy of Avoiding the Issue that occurs when the issue is how to answer some question. The fallacy occurs when someone’s answer doesn’t really respond to the question asked. The fallacy is also called “Changing the Question.”

Example:

Question: Would the Oakland Athletics be in first place if they were to win tomorrow’s game?

Answer: What makes you think they’ll ever win tomorrow’s game?

Bad Seed

Attempting to undermine someone’s reasoning by pointing out their “bad” family history, when it is an irrelevant point. See Genetic Fallacy.

Bald Man

See Line-Drawing.

Bandwagon

If you suggest that someone’s claim is correct simply because it’s what most everyone is coming to believe, then you’re using the Bandwagon Fallacy. Get up here with us on the wagon where the band is playing, and go where we go, and don’t think too much about the reasons. The Latin term for this Fallacy of Appeal to Novelty is Argumentum ad Novitatem.

Example:

[Advertisement] More and more people are buying sports utility vehicles. It is time you bought one, too.

Like its close cousin, the Fallacy of Appeal to the People, the Bandwagon Fallacy needs to be carefully distinguished from properly defending a claim by pointing out that many people have studied the claim and have come to a reasoned conclusion that it is correct. What most everyone believes is likely to be true, all things considered, and if one defends a claim on those grounds, this is not a fallacious inference. What is fallacious is to be swept up by the excitement of a new idea or new fad and to unquestioningly give it too high a degree of your belief solely on the grounds of its new popularity, perhaps thinking simply that ‘new is better.’ The key ingredient that is missing from a bandwagon fallacy is knowledge that an item is popular because of its high quality.

Begging the Question

A form of circular reasoning in which a conclusion is derived from premises that presuppose the conclusion. Normally, the point of good reasoning is to start out at one place and end up somewhere new, namely having reached the goal of increasing the degree of reasonable belief in the conclusion. The point is to make progress, but in cases of begging the question there is no progress, and the arguer is essentially arguing by repeating the point.

Example:

“Women have rights,” said the Bullfighters Association president. “But women shouldn’t fight bulls because a bullfighter is and should be a man.”

The president is saying basically that women shouldn’t fight bulls because women shouldn’t fight bulls. This reasoning isn’t making any progress.

Insofar as the conclusion of a deductively valid argument is “contained” in the premises from which it is deduced, this containing might seem to be a case of presupposing, and thus any deductively valid argument might seem to be begging the question. It is still an open question among logicians as to why some deductively valid arguments are considered to be begging the question and others are not. Some logicians suggest that, in informal reasoning with a deductively valid argument, if the conclusion is psychologically new insofar as the premises are concerned, then the argument isn’t an example of the fallacy. Other logicians suggest that we need to look instead to surrounding circumstances, not to the psychology of the reasoner, in order to assess the quality of the argument. For example, we need to look to the reasons that the reasoner used to accept the premises. Was the premise justified on the basis of accepting the conclusion? A third group of logicians say that, in deciding whether the fallacy is present, more evidence is needed. We must determine whether any premise that is key to deducing the conclusion is adopted rather blindly or instead is a reasonable assumption made by someone accepting their burden of proof. The premise would here be termed reasonable if the arguer could defend it independently of accepting the conclusion that is at issue.

Beside the Point

Arguing for a conclusion that is not relevant to the current issue. Also called Irrelevant Conclusion. It is a form of the Red Herring Fallacy.

Biased Generalizing

Generalizing from a biased sample. Using an unrepresentative sample and overestimating the strength of an argument based on that sample.

See Unrepresentative Sample.

Biased Sample

See Unrepresentative Sample.

Biased Statistics

See Unrepresentative Sample.

Bifurcation

See Black-or-White.

Black-or-White

The Black-or-White fallacy or Black-White fallacy is a False Dilemma Fallacy that limits you unfairly to only two choices, as if you were made to choose between black and white.

Example:

Well, it’s time for a decision. Will you contribute $20 to our environmental fund, or are you on the side of environmental destruction?

A proper challenge to this fallacy could be to say, “I do want to prevent the destruction of our environment, but I don’t want to give $20 to your fund. You are placing me between a rock and a hard place.” The key to diagnosing the Black-or-White Fallacy is to determine whether the limited menu is fair or unfair. Simply saying, “Will you contribute $20 or won’t you?” is not unfair. The fallacy shows up in psychology when a person is too apt to treat people simply as friend or enemy, or smart or an idiot. The black-or-white fallacy is often committed intentionally in jokes such as: “My toaster has two settings—burnt and off.” In thinking about this kind of fallacy it is helpful to remember that everything is either black or not black, but not everything is either black or white.

Caricaturization

Attacking a person’s argument by presenting a caricaturization is a form of the Straw Man Fallacy and the Ad Hominem Fallacy. A critical thinker should attack the real man and his argument, not a caricaturization of the man or the argument. Ditto for women, of course. The fallacy is a form of the Straw Man Fallacy because ideally an argument should not be assessed by a technique that unfairly misrepresents it. The Caricaturization Fallacy is the same as the Fallacy of Refutation by Caricature.

Changing the Question

This is another name for the Fallacy of Avoiding the Question.

Cherry-Picking

Cherry-Picking the Evidence is another name for the Fallacy of Suppressed Evidence.

Circular Reasoning

The Fallacy of Circular Reasoning occurs when the reasoner begins with what he or she is trying to end up with.

Here is Steven Pinker’s example:

Definition: endless loop, n. See loop, endless.

Definition: loop, endless, n. See endless loop.

The most well known examples of circular reasoning are cases of the Fallacy of Begging the Question. Here the circle is as short as possible. However, if the circle is very much larger, including a wide variety of claims and a large set of related concepts, then the circular reasoning can be informative and so is not considered to be fallacious. For example, a dictionary contains a large circle of definitions that use words which are defined in terms of other words that are also defined in the dictionary. Because the dictionary is so informative, it is not considered as a whole to be fallacious. However, a small circle of definitions is considered to be fallacious.

In properly-constructed recursive definitions, defining a term by using that same term is not fallacious. For example, here is an appropriate recursive definition of the term “a stack of coins.” Basis step: Two coins, with one on top of the other, is a stack of coins. Recursion step: If p is a stack of coins, then adding a coin on top of p produces a stack of coins. For a deeper discussion of circular reasoning see Infinitism in Epistemology.
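The recursive definition of “a stack of coins” can be sketched in code to show why it is not viciously circular: every application of the recursion step shrinks the problem until the basis step settles it. This is only an illustrative sketch; the function name and the representation of a stack as a bottom-to-top list of strings are assumptions made for the example.

```python
def is_stack_of_coins(items):
    """Return True if `items` (listed bottom to top) is a stack of coins.

    Hypothetical representation: each item is a string, and a coin is
    the string "coin".
    """
    if any(item != "coin" for item in items):
        return False
    # Basis step: two coins, with one on top of the other, is a stack.
    if len(items) == 2:
        return True
    # Recursion step: if everything below the top coin is a stack,
    # then adding the top coin produces a stack.
    if len(items) > 2:
        return is_stack_of_coins(items[:-1])
    # Fewer than two coins never counts as a stack under this definition.
    return False


print(is_stack_of_coins(["coin"] * 5))
print(is_stack_of_coins(["coin"]))
```

The call on five coins reduces to four, then three, then two, where the basis step answers the question. A circular definition, by contrast, would never reach a basis step.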

Circumstantial Ad Hominem

See Ad Hominem, Circumstantial.

Clouding the Issue

See Smokescreen.

Common Belief

See Appeal to the People and Traditional Wisdom.

Common Cause

This fallacy occurs during causal reasoning when a causal connection between two kinds of events is claimed when evidence is available indicating that both are the effect of a common cause.

Example:

Noting that the auto accident rate rises and falls with the rate of use of windshield wipers, one concludes that the use of wipers is somehow causing auto accidents.

However, it’s the rain that’s the common cause of both.

Common Practice

See Appeal to the People and Traditional Wisdom.

Complex Question

You use this fallacy when you frame a question so that some controversial presupposition is made by the wording of the question.

Example:

[Reporter’s question] Mr. President: Are you going to continue your policy of wasting taxpayers’ money on missile defense?

The question unfairly presumes the controversial claim that the policy really is a waste of money. The Fallacy of Complex Question is a form of Begging the Question.

Composition

The Composition Fallacy occurs when someone mistakenly assumes that a characteristic of some or all the individuals in a group is also a characteristic of the group itself, the group “composed” of those members. It is the converse of the Division Fallacy.

Example:

Each human cell is very lightweight, so a human being composed of cells is also very lightweight.

Confirmation Bias

The tendency to look for evidence in favor of one’s controversial hypothesis and not to look for disconfirming evidence, or to pay insufficient attention to it. This is the most common kind of Fallacy of Selective Attention, and it is the foundation of many conspiracy theories.

Example:

She loves me, and there are so many ways that she has shown it. When we signed the divorce papers in her lawyer’s office, she wore my favorite color. When she slapped me at the bar and called me a “handsome pig,” she used the word “handsome” when she didn’t have to. When I called her and she said never to call her again, she first asked me how I was doing and whether my life had changed. When I suggested that we should have children in order to keep our marriage together, she laughed. If she can laugh with me, if she wants to know how I am doing and whether my life has changed, and if she calls me “handsome” and wears my favorite color on special occasions, then I know she really loves me.

Using the Fallacy of Confirmation Bias is usually a sign that one has adopted some belief dogmatically and isn’t willing to disconfirm the belief, or is too willing to interpret ambiguous evidence so that it conforms to what one already believes. Confirmation bias often reveals itself in the fact that people of opposing views can each find support for those views in the same piece of evidence.

Conjunction

Mistakenly supposing that event E is less likely than the conjunction of events E and F. Here is an example from the psychologists Daniel Kahneman and Amos Tversky.

Example:

Suppose you know that Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice. Then you are asked to choose which is more likely: (A) Linda is a bank teller or (B) Linda is a bank teller and active in the feminist movement. If you choose (B) you commit the Conjunction Fallacy.
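The reason (B) cannot be more likely than (A) is that everyone who is both a bank teller and a feminist is also a bank teller, so the conjunction can never have the higher probability. A minimal sketch makes this concrete by counting. The eight-person population below is entirely hypothetical and was invented only for illustration.

```python
from fractions import Fraction

# Hypothetical population of 8 people who fit Linda's description.
# Each record is (is_bank_teller, is_active_feminist).
population = [
    (True, True), (True, False), (False, True), (False, True),
    (False, True), (False, False), (False, True), (False, True),
]


def probability(event):
    """Fraction of the population for which `event` holds."""
    matching = sum(1 for person in population if event(person))
    return Fraction(matching, len(population))


p_teller = probability(lambda person: person[0])
p_teller_and_feminist = probability(lambda person: person[0] and person[1])

print(p_teller, p_teller_and_feminist)
```

Even though most people in this made-up population are feminists, the conjunction is no more probable than “bank teller” alone, and changing the records cannot reverse that ordering.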

Confusing an Explanation with an Excuse

Treating someone’s explanation of a fact as if it were a justification of the fact. Explaining a crime should not be confused with excusing the crime, but it too often is.

Example:

Speaker: The German atrocities committed against the French and Belgians during World War I were in part due to the anger of German soldiers who learned that French and Belgian soldiers were ambushing German soldiers, shooting them in the back, or even poisoning, blinding and castrating them.

Respondent: I don’t understand how you can be so insensitive as to condone those German atrocities.

Consensus Gentium

Fallacy of Argumentum Consensus Gentium (argument from the consensus of the nations). See Traditional Wisdom.

Consequence

See Appeal to Consequence.

Contextomy

See Quoting out of Context.

Converse Accident

If we reason by paying too much attention to exceptions to the rule, and generalize on the exceptions, our reasoning contains this fallacy. This fallacy is the converse of the Accident Fallacy. It is a kind of Hasty Generalization, by generalizing too quickly from a peculiar case.

Example:

I’ve heard that turtles live longer than tarantulas, but the one turtle I bought lived only two days. I bought it at Dowden’s Pet Store. So, I think that turtles bought from pet stores do not live longer than tarantulas.

The original generalization is “Turtles live longer than tarantulas.” There are exceptions, such as the turtle bought from the pet store. Rather than seeing this for what it is, namely an exception, the reasoner places too much trust in this exception and generalizes on it to produce the faulty generalization that turtles bought from pet stores do not live longer than tarantulas.

Cover-up

See Suppressed Evidence.

Cum Hoc, Ergo Propter Hoc

Latin for “with this, therefore because of this.” This is a False Cause Fallacy that doesn’t depend on time order (as does the post hoc fallacy), but on any other chance correlation of the supposed cause being in the presence of the supposed effect.

Example:

Loud musicians live near our low-yield cornfields. So, loud musicians must be causing the low yield.

Curve Fitting

Curve fitting is the process of constructing a curve that has the best fit to a series of data points. The curve is a graph of some mathematical function. The function or functional relationship might be between variable x and variable y, where x is the time of day and y is the temperature of the ocean. When you collect data about some relationship, you inevitably collect information that is affected by noise or statistical fluctuation. If you create a function between x and y that is too sensitive to your data, you will be overemphasizing the noise and producing a function that has less predictive value than need be. If you create your function by interpolating, that is, by drawing straight line segments between all the adjacent data points, or if you create a polynomial function that exactly fits every data point, it is likely that your function will be worse than if you’d produced a function with a smoother curve. Your original error of too closely fitting the data-points is called the Fallacy of Curve Fitting or the Fallacy of Overfitting.

Example:

You want to know the temperature of the ocean today, so you measure it at 8:00 A.M. with one thermometer and get the temperature of 60.1 degrees. Then you measure the ocean at 8:05 A.M. with a different thermometer and get the temperature of 60.2 degrees; then at 8:10 A.M. and get 59.1 degrees perhaps with the first thermometer, and so on. If you fit your curve exactly to your data points, then you falsely imply that the ocean’s temperature is shifting all around every five minutes. However, the temperature is probably constant, and the problem is that your prediction is too sensitive to your data, so your curve fits the data points too closely.
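The ocean example can be sketched numerically by comparing a curve that passes through every reading with a single constant temperature. The first three readings below come from the example; the later readings and all of the held-out check readings are hypothetical values made up for illustration, so the numbers show the idea rather than establish a result.

```python
# Times are minutes after 8:00 A.M. The first three training readings come
# from the example; every other reading here is hypothetical.
train_times = [0, 5, 10, 15, 20]
train_temps = [60.1, 60.2, 59.1, 60.3, 59.8]
test_times = [2.5, 7.5, 12.5, 17.5]      # hypothetical held-out readings
test_temps = [60.0, 59.9, 60.1, 59.8]


def interpolate(t):
    """Exact-fit curve: straight segments joining adjacent training points."""
    for i in range(len(train_times) - 1):
        t0, t1 = train_times[i], train_times[i + 1]
        if t0 <= t <= t1:
            y0, y1 = train_temps[i], train_temps[i + 1]
            return y0 + (y1 - y0) * (t - t0) / (t1 - t0)
    raise ValueError("time outside the measured range")


mean_temp = sum(train_temps) / len(train_temps)


def constant(t):
    """Smoother curve: one constant temperature, the mean training reading."""
    return mean_temp


def mean_squared_error(model, times, temps):
    """Average squared gap between the model's prediction and the reading."""
    return sum((model(t) - y) ** 2 for t, y in zip(times, temps)) / len(times)


for name, model in [("interpolated", interpolate), ("constant", constant)]:
    train_error = mean_squared_error(model, train_times, train_temps)
    test_error = mean_squared_error(model, test_times, test_temps)
    print(name, train_error, test_error)
```

With these made-up readings, the interpolated curve matches the training data essentially perfectly but does worse than the constant on the held-out readings. That pattern, a near-perfect fit paired with poorer prediction, is the signature of overfitting.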

Definist

The Definist Fallacy occurs when someone unfairly defines a term so that a controversial position is made easier to defend. Same as the Persuasive Definition.

Example:

During a controversy about the truth or falsity of atheism, the fallacious reasoner says, “Let’s define ‘atheist’ as someone who doesn’t yet realize that God exists.”

Denying the Antecedent

You are using this fallacy if you deny the antecedent of a conditional and then suppose that doing so is a sufficient reason for denying the consequent. This formal fallacy is often mistaken for Modus Tollens, a valid form of argument using the conditional. A conditional is an if-then statement; the if-part is the antecedent, and the then-part is the consequent.

Example:

If she were Brazilian, then she would know that Brazil’s official language is Portuguese. She isn’t Brazilian; she’s from London. So, she surely doesn’t know this about Brazil’s language.
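Because this is a formal fallacy, its invalidity can be checked mechanically with a truth table. The sketch below, whose helper names are assumptions made for the example, tests Denying the Antecedent alongside Modus Tollens by looking for an assignment of truth values that makes every premise true and the conclusion false.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion):
    """Valid means no truth assignment makes all premises true and the conclusion false."""
    for p, q in product([True, False], repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


def if_p_then_q(p, q):
    return implies(p, q)


# Denying the Antecedent: If P then Q. Not P. So, not Q.
denying_the_antecedent = is_valid(
    [if_p_then_q, lambda p, q: not p], lambda p, q: not q
)

# Modus Tollens: If P then Q. Not Q. So, not P.
modus_tollens = is_valid(
    [if_p_then_q, lambda p, q: not q], lambda p, q: not p
)

print(denying_the_antecedent, modus_tollens)
```

The counterexample the check finds for Denying the Antecedent is the case where P is false and Q is true. In the example above, that is the woman from London who is not Brazilian but knows Brazil’s official language anyway.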

Disregarding Known Science

This fallacy is committed when a person makes a claim that knowingly or unknowingly disregards well known science, science that weighs against the claim. They should know better. This fallacy is a form of the Fallacy of Suppressed Evidence.

Example:

John claims in his grant application that he will be studying the causal effectiveness of bone color on the ability of leg bones to support indigenous New Zealand mammals. He disregards well known scientific knowledge that color is not what causes any bones to work the way they do by saying that this knowledge has never been tested in New Zealand.

Digression

See Avoiding the Issue.

Distraction

See Smokescreen.

Division

Merely because a group as a whole has a characteristic, it often doesn’t follow that individuals in the group have that characteristic. If you suppose that it does follow, when it doesn’t, your reasoning contains the Fallacy of Division. It is the converse of the Composition Fallacy.

Example:

Joshua’s soccer team is the best in the division because it had an undefeated season and won the division title, so their goalie must be the best in the division.

Aristotle gave this example of division: The number 5 is 2 and 3. But 2 is even and 3 is odd, so 5 is even and odd.

Domino

See Slippery Slope.

Double Standard

There are many situations in which you should judge two things or people by the same standard. If in one of those situations you use different standards for the two, your reasoning contains the Fallacy of Using a Double Standard.

Example:

I know we will hire any man who gets over a 70 percent on the screening test for hiring Post Office employees, but women should have to get an 80 to be hired because they often have to take care of their children.

This example is a fallacy if it can be presumed that men and women should have to meet the same standard for becoming a Post Office employee.

Either/Or

See Black-or-White.

Equivocation

Equivocation is the illegitimate switching of the meaning of a term that occurs twice during the reasoning; it is the use of one word taken in two ways. The fallacy is a kind of Fallacy of Ambiguity.

Example:

Brad is a nobody, but since nobody is perfect, Brad must be perfect, too.

The term “nobody” changes its meaning without warning in the passage. Equivocation can sometimes be very difficult to detect, as in this argument from Walter Burleigh:

If I call you a swine, then I call you an animal.
If I call you an animal, then I’m speaking the truth.
Therefore, if I call you a swine, then I’m speaking the truth.

Etymological

The Etymological Fallacy occurs whenever someone falsely assumes that the meaning of a word can be discovered from its etymology or origins.

Example:

The word “vise” comes from the Latin “that which winds,” so it means anything that winds. Since a hurricane winds around its own eye, it is a vise.

Every and All

The Fallacy of Every and All turns on errors due to the order or scope of the quantifiers “every” and “all” and “any.” This is a version of the Scope Fallacy.

Example:

Every action of ours has some final end. So, there is some common final end to all our actions.

In proposing this fallacious argument, Aristotle believed the common end is the supreme good, so he had a rather optimistic outlook on the direction of history.
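The scope error can be made explicit by evaluating both quantifier orders on a small model. The actions and ends below are hypothetical, chosen only so that the premise comes out true and the conclusion false.

```python
# Hypothetical toy model: each action aims at exactly one final end.
aims_at = {"studying": "knowledge", "jogging": "health", "saving": "security"}
actions = set(aims_at)
ends = set(aims_at.values())

# Premise, "every ... some": for every action there is some end it aims at.
every_action_has_an_end = all(
    any(aims_at[action] == end for end in ends) for action in actions
)

# Conclusion, "some ... every": there is some one end that every action aims at.
some_end_for_all_actions = any(
    all(aims_at[action] == end for action in actions) for end in ends
)

print(every_action_has_an_end, some_end_for_all_actions)
```

Swapping the order of `all` and `any` changes the claim being made. The premise holds in this model while the conclusion fails, so the inference from one to the other is not valid.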

Exaggeration

When we overstate or overemphasize a point that is a crucial step in a piece of reasoning, then we are guilty of the Fallacy of Exaggeration. This is a kind of error called Lack of Proportion.

Example:

She’s practically admitted that she intentionally yelled at that student while on the playground in the fourth grade. That’s verbal assault. Then she said nothing when the teacher asked, “Who did that?” That’s lying, plain and simple. Do you want to elect as secretary of this club someone who is a known liar prone to assault? Doing so would be a disgrace to our Collie Club.

When we exaggerate in order to make a joke, though, we do not use the fallacy because we do not intend to be taken literally.

Excluded Middle

See False Dilemma or Black-or-White.

False Analogy

The problem is that the items in the analogy are too dissimilar. When reasoning by analogy, the fallacy occurs when the analogy is irrelevant or very weak or when there is a more relevant disanalogy. See also Faulty Comparison.

Example:

The book Investing for Dummies really helped me understand my finances better. The book Chess for Dummies was written by the same author, was published by the same press, and costs about the same amount. So, this chess book would probably help me understand my finances, too.

False Balance

A specific form of the False Equivalence Fallacy that occurs in the context of news reporting, in which the reporter misleads the audience by suggesting the evidence on two sides of an issue is equally balanced, when the reporter knows that one of the two sides is an extreme outlier. Reporters regularly commit this fallacy in order to appear “fair and balanced.”

Example:

The news report of yesterday’s city council meeting says, “David Samsung challenged the council by saying the Gracie Mansion is haunted, so it should not be torn down. Councilwoman Miranda Gonzales spoke in favor of dismantling the old mansion saying its land is needed for an expansion of the water treatment facility. Both sides seemed quite fervent in promoting their position.” Then the news report stops there, covering up the facts that the preponderance of scientific evidence implies there is no such thing as being haunted, and that David Samsung is the well known “village idiot” who last month came before the council demanding a tax increase for Santa Claus’ workers at the North Pole.

False Cause

Improperly concluding that one thing is a cause of another. The Fallacy of Non Causa Pro Causa is another name for this fallacy. Its four principal kinds are the Post Hoc Fallacy, the Fallacy of Cum Hoc, Ergo Propter Hoc, the Regression Fallacy, and the Fallacy of Reversing Causation.

Example:

My psychic adviser says to expect bad things when Mars is aligned with Jupiter. Tomorrow Mars will be aligned with Jupiter. So, if a dog were to bite me tomorrow, it would be because of the alignment of Mars with Jupiter.

False Dichotomy

See False Dilemma or Black-or-White.

False Dilemma

A reasoner who unfairly presents too few choices and then implies that a choice must be made among this short menu of choices is using the False Dilemma Fallacy, as does the person who accepts this faulty reasoning.

Example:

A pollster asks you this question about your job: “Would you say your employer is drunk on the job about (a) once a week, (b) twice a week, or (c) more times per week?”

The pollster is committing the fallacy by limiting you to only those choices. What about the choice of “no times per week”? Think of the unpleasant choices as being the horns of a bull that is charging toward you. By demanding other choices beyond those on the unfairly limited menu, you thereby “go between the horns” of the dilemma, and are not gored. The fallacy is called the “False Dichotomy Fallacy” or the “Black-or-White” Fallacy when the unfair menu contains only two choices, and thus two horns.

False Equivalence

The Fallacy of False Equivalence is committed when someone implies falsely (and usually indirectly) that the two sides on some issue have basically equivalent evidence, while knowingly covering up the fact that one side’s evidence is much weaker. A form of the Fallacy of Suppressed Evidence.

Example:

A popular science article suggests there is no consensus about the Earth’s age, by quoting one geologist who says she believes the Earth is billions of years old, and then by quoting Bible expert James Ussher who says he calculated from the Bible that the world began on Friday, October 28, 4,004 B.C.E. The article suppresses the evidence that geologists (who are the relevant experts on this issue) have reached a consensus that the Earth is billions of years old.

Far-Fetched Hypothesis

This is the fallacy of offering a bizarre (far-fetched) hypothesis as the correct explanation without first ruling out more mundane explanations.

Example:

Look at that mutilated cow in the field, and see that flattened grass. Aliens must have landed in a flying saucer and savaged the cow to learn more about the beings on our planet.

Faulty Comparison

If you try to make a point about something by comparison, and if you do so by comparing it with the wrong thing, then your reasoning uses the Fallacy of Faulty Comparison or the Fallacy of Questionable Analogy.

Example:

We gave half the members of the hiking club Durell hiking boots and the other half good-quality tennis shoes. After three months of hiking, you can see for yourself that Durell lasted longer. You, too, should use Durell when you need hiking boots.

Shouldn’t Durell hiking boots be compared with other hiking boots, not with tennis shoes?

Faulty Generalization

A fallacy produced by some error in the process of generalizing. See Hasty Generalization or Unrepresentative Generalization for examples.

Faulty Motives

An irrelevant appeal to the motives of the arguer, and supposing that this revelation of their motives will thereby undermine their reasoning. A kind of Ad Hominem Fallacy.

Example:

The councilman’s argument for the new convention center can’t be any good because he stands to gain if it’s built.

Formal Fallacy

Formal fallacies are all the cases or kinds of reasoning that fail to be deductively valid. Formal fallacies are also called Logical Fallacies or Invalidities. That is, they are deductively invalid arguments that are too often believed to be deductively valid.

Example:

Some cats are tigers. Some tigers are animals. So, some cats are animals.

This might at first seem to be a good argument, but actually it is fallacious because it has the same logical form as the following more obviously invalid argument:

Some women are Americans. Some Americans are men. So, some women are men.

Nearly all the infinity of types of invalid inferences have no specific fallacy names.
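The invalidity of the form "Some A are B. Some B are C. So, some A are C" can be checked mechanically by exhibiting one instance with true premises and a false conclusion. The following is a minimal sketch; the sets and the names in them are hypothetical and chosen only to mirror the women/Americans/men example.

```python
# Hypothetical two-person model; A, B, C stand in for the three terms of
# the argument form "Some A are B. Some B are C. So, some A are C."
A = {"alice"}            # e.g. women
B = {"alice", "bob"}     # e.g. Americans
C = {"bob"}              # e.g. men

def some(xs, ys):
    """'Some xs are ys' is true when the two sets share at least one member."""
    return len(xs & ys) > 0

premise_1 = some(A, B)   # Some A are B
premise_2 = some(B, C)   # Some B are C
conclusion = some(A, C)  # Some A are C

# A form is deductively invalid if some instance of it has true premises
# and a false conclusion.
is_counterexample = premise_1 and premise_2 and not conclusion
print(premise_1, premise_2, conclusion, is_counterexample)
```

Because a single counterexample suffices, the argument about cats and tigers is invalid as well, even though its conclusion happens to be true.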

Four Terms

The Fallacy of Four Terms (quaternio terminorum) occurs when four rather than three categorical terms are used in a standard-form syllogism.

Example:

All rivers have banks. All banks have vaults. So, all rivers have vaults.

The word “banks” occurs as two distinct terms, namely river bank and financial bank, so this example also is an equivocation. Without an equivocation, the four-term fallacy is trivially invalid.

Gambler’s

This fallacy occurs when the gambler falsely assumes that the history of outcomes will affect future outcomes.

Example:

I know this is a fair coin, but it has come up heads five times in a row now, so tails is due on the next toss.

The fallacious move was to conclude that the probability of the next toss coming up tails must be more than a half. The assumption that it’s a fair coin is important because, if the coin comes up heads five times in a row, one would otherwise become suspicious that it’s not a fair coin and therefore properly conclude that the probability is high that heads is more likely on the next toss.
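The point about the fair coin can be illustrated by brute-force counting rather than by simulation. This sketch assumes a fair coin and independent tosses, enumerates every equally likely sequence of six tosses, and asks how often tails follows five heads.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of six tosses of a fair coin.
sequences = list(product("HT", repeat=6))

# Keep only the sequences that start with five heads in a row.
after_five_heads = [s for s in sequences if s[:5] == ("H",) * 5]

# Among those, count how often the sixth toss is tails.
tails_next = [s for s in after_five_heads if s[5] == "T"]

p_tails = Fraction(len(tails_next), len(after_five_heads))
print(p_tails)  # 1/2
```

The streak leaves the probability of tails at exactly one half, so tails is not "due."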

Genetic

A critic uses the Genetic Fallacy if the critic attempts to discredit or support a claim or an argument because of its origin (genesis) when such an appeal to origins is irrelevant.

Example:

Whatever your reasons are for buying that gift, they’ve got to be ridiculous. You said yourself that you got the idea for buying it from last night’s fortune cookie. Cookies can’t think!

Fortune cookies are not reliable sources of information about what gift to buy, but the reasons the person is willing to give are likely to be quite relevant and should be listened to. The speaker is committing the Genetic Fallacy by paying too much attention to the genesis of the idea rather than to the reasons offered for it.

If I learn that your plan for building the shopping center next to the Johnson estate originated with Johnson himself, who is likely to profit from the deal, then my request that the planning commission not accept your proposal without independent verification of its merits wouldn’t be committing the genetic fallacy. Because appeals to origins are sometimes relevant and sometimes irrelevant and sometimes on the borderline, in those latter cases it can be very difficult to decide whether the fallacy has been committed. For example, if Sigmund Freud shows that the genesis of a person’s belief in God is their desire for a strong father figure, then does it follow that their belief in God is misplaced, or is Freud’s reasoning committing the Genetic Fallacy?

Group Think

A reasoner uses the Group Think Fallacy if he or she substitutes pride of membership in the group for reasons to support the group’s policy. If that’s what our group thinks, then that’s good enough for me. It’s what I think, too. “Blind” patriotism is a rather nasty version of the fallacy.

Example:

We K-Mart employees know that K-Mart brand items are better than Wal-Mart brand items because, well, they are from K-Mart, aren’t they?

Guilt by Association

Guilt by Association is a version of the Ad Hominem Fallacy in which a person is said to be guilty of error because of the group he or she associates with. The fallacy occurs when we unfairly try to change the issue to be about the speaker’s circumstances rather than about the speaker’s actual argument. Also called “Ad Hominem, Circumstantial.”

Example:

Secretary of State Dean Acheson is too soft on communism, as you can see by his inviting so many fuzzy-headed liberals to his White House cocktail parties.

Has any evidence been presented here that Acheson’s actions are inappropriate with regard to communism? This sort of reasoning is an example of McCarthyism, the technique of smearing liberal Democrats that was so effectively used by the late Senator Joe McCarthy in the early 1950s. In fact, Acheson was strongly anti-communist and the architect of President Truman’s firm policy of containing Soviet power.

Hasty Conclusion

See Jumping to Conclusions.

Hasty Generalization

A Hasty Generalization is a Fallacy of Jumping to Conclusions in which the conclusion is a generalization. See also Biased Statistics.

Example:

I’ve met two people in Nicaragua so far, and they were both nice to me. So, all people I will meet in Nicaragua will be nice to me.

In any Hasty Generalization the key error is to overestimate the strength of an argument that is based on too small a sample for the implied confidence level or error margin. In this argument about Nicaragua, using the word “all” in the conclusion implies zero error margin. With zero error margin you’d need to sample every single person in Nicaragua, not just two people.

Heap

See Line-Drawing.

Hedging

You are hedging if you refine your claim simply to avoid counterevidence and then act as if your revised claim is the same as the original.

Example:

Samantha: David is a totally selfish person.

Yvonne: I thought he was a boy scout leader. Don’t you have to give a lot of your time for that?

Samantha: Well, David’s totally selfish about what he gives money to. He won’t spend a dime on anyone else.

Yvonne: I saw him bidding on things at the high school auction fundraiser.

Samantha: Well, except for that he’s totally selfish about money.

You do not use the fallacy if you explicitly accept the counterevidence, admit that your original claim is incorrect, and then revise it so that it avoids that counterevidence.

Hooded Man

This is an error in reasoning due to confusing the knowing of a thing with the knowing of it under all its various names or descriptions.

Example:

You claim to know Socrates, but you must be lying. You admitted you didn’t know the hooded man over there in the corner, but the hooded man is Socrates.

Hyperbolic Discounting

The Fallacy of Hyperbolic Discounting occurs when someone too heavily weighs the importance of a present reward over a significantly greater reward in the near future, but only slightly differs in their valuations of those two rewards if they are to be received in the far future. The person’s preferences are biased toward the present.

Example:

When asked to decide between receiving an award of $50 now or $60 tomorrow, the person chooses the $50; however, when asked to decide between receiving $50 in two years or $60 in two years and one day, the person chooses the $60.

If the person is in a situation in which $50 now will solve their problem but $60 tomorrow will not, then there is no fallacy in having a bias toward the present.
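The preference reversal in the example can be reproduced with the standard hyperbolic discounting formula, value = amount / (1 + k × delay). The sketch below is illustrative only: the discount rate k = 0.25 per day is an assumption picked to make the reversal visible, and "two years" is taken as 730 days.

```python
def discounted_value(amount, delay_days, k=0.25):
    """Hyperbolic discounting: value = amount / (1 + k * delay).

    k is an assumed per-day discount rate chosen only for illustration.
    """
    return amount / (1 + k * delay_days)

def preferred(option_a, option_b, k=0.25):
    """Each option is (amount, delay_days); return the one valued higher."""
    return max(option_a, option_b, key=lambda o: discounted_value(o[0], o[1], k))

# $50 now versus $60 tomorrow.
near_choice = preferred((50, 0), (60, 1))

# $50 in two years versus $60 in two years and one day.
far_choice = preferred((50, 730), (60, 731))

print(near_choice, far_choice)
```

With this value of k, the $50 wins when the choice is immediate and the $60 wins when both rewards are far off, which is the pattern described above.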

Hypostatization

The error of inappropriately treating an abstract term as if it were a concrete one. Also known as the Fallacy of Misplaced Concreteness and the Fallacy of Reification.

Example:

Nature decides which organisms live and which die.

Nature isn’t capable of making decisions. The point can be made without reasoning fallaciously by saying: “Which organisms live and which die is determined by natural causes.” Whether a phrase commits the fallacy depends crucially upon whether the use of the inaccurate phrase is inappropriate in the situation. In a poem, it is appropriate and very common to reify nature, hope, fear, forgetfulness, and so forth, that is, to treat them as if they were objects or beings with intentions. In any scientific claim, it is inappropriate.

Ideology-Driven Argumentation

This occurs when an arguer presupposes some aspect of their own ideology that they are unable to defend.

Example:

Senator, if you pass that bill to relax restrictions on gun ownership and allow people to carry concealed handguns, then you are putting your own voters at risk.

The arguer is presupposing a liberal ideology which implies that permitting private citizens to carry concealed handguns increases crime and decreases safety. If the arguer is unable to defend this presumption, then the fallacy is committed regardless of whether the presumption is defensible. If the senator were to accept this liberal ideology, then the senator is likely to accept the arguer’s conclusion, and the argument could be considered to be effective, but still it would be fallacious—such is the difference between rhetoric and logic.

Ignoratio Elenchi

See Irrelevant Conclusion. Also called missing the point.

Ignoring a Common Cause

See Common Cause.

Ignoring Inconvenient Data

See Suppressed Evidence.

Incomplete Evidence

See Suppressed Evidence.

Improper Analogy

Another name for the Fallacy of False Analogy.

Inconsistency

The fallacy occurs when we accept an inconsistent set of claims, that is, when we accept a claim that logically conflicts with other claims we hold.

Example:

I never generalize because everyone who does is a hypocrite.

That last remark implies the speaker does generalize, although the speaker doesn’t notice this inconsistency with what is said.

Inductive Conversion

Improperly reasoning from a claim of the form “All As are Bs” to “All Bs are As” or from one of the form “Many As are Bs” to “Many Bs are As” and so forth.

Example:

Most professional basketball players are tall, so most tall people are professional basketball players.

The term “conversion” is a technical term in formal logic.

Insufficient Statistics

Drawing a statistical conclusion from a set of data that is clearly too small.

Example:

A pollster interviews ten London voters in one building about which candidate for mayor they support, and upon finding that Churchill receives support from six of the ten, declares that Churchill has the majority support of London voters.

This fallacy is a form of the Fallacy of Jumping to Conclusions.
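A rough calculation shows how little ten interviews can establish. The sketch below uses the textbook normal-approximation margin of error at about the 95% level; that approximation is itself unreliable at so small a sample, and it ignores the further problem that all ten voters came from one building.

```python
import math

n = 10            # voters interviewed
supporters = 6    # voters who backed Churchill
p_hat = supporters / n

# Rough 95% margin of error from the normal approximation.
# The approximation is crude at n = 10, which only strengthens
# the point that the sample is too small.
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

low, high = p_hat - margin, p_hat + margin
print(round(low, 2), round(high, 2))
```

The resulting interval runs from roughly 30% to 90% support, so the data are compatible with Churchill being well short of a majority.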

Intensional

The mistake of treating different descriptions or names of the same object as equivalent even in those contexts in which the differences between them matter. Reporting someone’s beliefs or assertions or making claims about necessity or possibility can be such contexts. In these contexts, replacing a description with another that refers to the same object is not valid and may turn a true sentence into a false one.

Example:

Michelle said she wants to meet her new neighbor Stalnaker tonight. But I happen to know Stalnaker is a spy for North Korea, so Michelle said she wants to meet a spy for North Korea tonight.

Michelle said no such thing. The faulty reasoner illegitimately assumed that what is true of a person under one description will remain true when said of that person under a second description even in this context of indirect quotation. What was true of the person when described as “her new neighbor Stalnaker” is that Michelle said she wants to meet him, but it wasn’t legitimate for me to assume this is true of the same person when he is described as “a spy for North Korea.”

Extensional contexts are those in which it is legitimate to substitute equals for equals with no worry. But any context in which this substitution of co-referring terms is illegitimate is called an intensional context. Intensional contexts are produced by quotation, modality, and intentionality (propositional attitudes). Intensionality is failure of extensionality, thus the name “Intensional Fallacy”.

Invalid Reasoning

An invalid inference. An argument can be assessed by deductive standards to see if the conclusion would have to be true if the premises were to be true. If the argument cannot meet this standard, it is invalid. An argument is invalid only if it is not an instance of any valid argument form. The Fallacy of Invalid Reasoning is a formal fallacy.

Example:

If it’s raining, then there are clouds in the sky. It’s not raining. Therefore, there are no clouds in the sky.

This invalid argument is an instance of Denying the Antecedent. Any invalid inference that is also inductively very weak is a Non Sequitur.
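The deductive standard can be applied mechanically to the rain example by checking every combination of truth values. This is a minimal sketch that reads "if ... then" as the material conditional, which is an assumption of classical propositional logic.

```python
from itertools import product

def implies(p, q):
    """Material conditional: 'if p then q'."""
    return (not p) or q

# Denying the Antecedent: "If R then C. Not R. Therefore, not C."
# Collect the rows where both premises are true but the conclusion is false.
counterexamples = [
    (raining, clouds)
    for raining, clouds in product([True, False], repeat=2)
    if implies(raining, clouds) and not raining   # both premises true
    and clouds                                    # conclusion "not clouds" is false
]
print(counterexamples)  # [(False, True)]
```

The one counterexample is the situation in which it is not raining but there are clouds anyway, so the conclusion would not have to be true if the premises were true.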

Irrelevant Conclusion

The conclusion that is drawn is irrelevant to the premises; it misses the point.

Example:

In court, Thompson testifies that the defendant is an honorable person, who wouldn’t harm a flea. The defense attorney uses the fallacy by rising to say that Thompson’s testimony shows once again that his client was not near the murder scene.

The testimony of Thompson may be relevant to a request for leniency, but it is irrelevant to any claim about the defendant not being near the murder scene. Other examples of this fallacy are Ad Hominem, Appeal to Authority, Appeal to Emotions, and Argument from Ignorance.

Irrelevant Reason

This fallacy is a kind of Non Sequitur in which the premises are wholly irrelevant to drawing the conclusion.

Example:

Lao Tze Beer is the top selling beer in Thailand. So, it will be the best beer for Canadians.

Is-Ought

The Is-Ought Fallacy occurs when a conclusion expressing what ought to be so is inferred from premises expressing only what is so, in which it is supposed that no implicit or explicit ought-premises are needed. There is controversy in the philosophical literature regarding whether this type of inference is always fallacious.

Example:

He’s torturing the cat.

So, he shouldn’t do that.

This argument would not use the fallacy if there were an implicit premise indicating that he is a person and that persons should not torture other beings.

Jumping to Conclusions

It is not always a mistake to make a quick decision, but when we draw a conclusion without taking the trouble to acquire enough of the relevant evidence, our reasoning commits the fallacy of jumping to conclusions, provided there was sufficient time to acquire and assess that extra evidence, and provided that the extra effort it takes to get the evidence isn’t prohibitive.

Example:

This car is really cheap. I’ll buy it.

Hold on. Before concluding that you should buy it, ask yourself whether you need to buy another car and, if so, whether you should lease or rent or just borrow a car when you need to travel by car. If you do need to buy a car, you ought to have someone check its operating condition, or else you should make sure you get a guarantee about the car’s being in working order. And, if you stop to think about it, there may be other factors you should consider before making the purchase, such as its age, size, appearance, and mileage.

Lack of Proportion

The Fallacy of Lack of Proportion occurs either by exaggerating or downplaying or simply not noticing a point that is a crucial step in a piece of reasoning. You exaggerate when you make a mountain out of a molehill. You downplay when you suppress relevant evidence. The Genetic Fallacy blows the genesis of an idea out of proportion.

Example:

Did you hear about that tourist being mugged in Russia last week? And then there was the awful train wreck last year just outside Moscow where three of the twenty-five persons killed were tourists. I’ll never visit Russia.

The speaker is blowing these isolated incidents out of proportion. Millions of tourists visit Russia with no problems. Another example occurs when the speaker simply lacks the information needed to give a factor its proper proportion or weight:

I don’t use electric wires in my home because it is well known that the human body can be injured by electric and magnetic fields.

The speaker does not realize all experts agree that electric and magnetic fields caused by home wiring are harmless. However, touching the metal within those wires is very dangerous.

Line-Drawing

If we improperly reject a vague claim because it is not as precise as we’d like, then we are using the line-drawing fallacy. Being vague is not being hopelessly vague. Also called the Bald Man Fallacy, the Fallacy of the Heap and the Sorites Fallacy.

Example:

Dwayne can never grow bald. Dwayne isn’t bald now. Don’t you agree that if he loses one hair, that won’t make him go from not bald to bald? And if he loses one hair after that, then this one loss, too, won’t make him go from not bald to bald. Therefore, no matter how much hair he loses, he can’t become bald.

Loaded Language

Loaded language is emotive terminology that expresses value judgments. When used in what appears to be an objective description, the terminology unfortunately can cause the listener to adopt those values when in fact no good reason has been given for doing so. Also called Prejudicial Language.

Example:

[News broadcast] In today’s top stories, Senator Smith carelessly cast the deciding vote today to pass both the budget bill and the trailer bill to fund yet another excessive watchdog committee over coastal development.

This broadcast is an editorial posing as a news report.

Loaded Question

Asking a question in a way that unfairly presumes the answer. This fallacy occurs commonly in polls, especially push polls, which are polls designed to push information onto the person being polled and not designed to learn the person’s views.

Example:

“If you knew that candidate B was a liar and crook, would you support candidate B or instead candidate A who is neither a liar nor a crook?”

Logic Chopping

Obscuring the issue by using overly technical logic tools, especially the techniques of formal symbolic logic, that focus attention on trivial details. A form of Smokescreen and Quibbling.

Logical

See Formal.

Lying

A fallacy of reasoning that depends on intentionally saying something that is known to be false. If the lying occurs in an argument’s premise, then it is an example of the Fallacy of Questionable Premise.

Example:

Abraham Lincoln, Theodore Roosevelt, and John Kennedy were assassinated.

They were U.S. presidents.

Therefore, at least three U.S. presidents have been assassinated.

Roosevelt was never assassinated.

Maldistributed Middle

See Undistributed Middle.

Many Questions

See Complex Question.

Misconditionalization

See Modal Fallacy.

Misleading Accent

See the Fallacy of Accent.

Misleading Vividness

When the Fallacy of Jumping to Conclusions is due to a special emphasis on an anecdote or other piece of evidence, then the Fallacy of Misleading Vividness has occurred.

Example:

Yes, I read the side of the cigarette pack about smoking being harmful to your health. That’s the Surgeon General’s opinion, him and all his statistics. But let me tell you about my uncle. Uncle Harry has smoked cigarettes for forty years now and he’s never been sick a day in his life. He even won a ski race at Lake Tahoe in his age group last year. You should have seen him zip down the mountain. He smoked a cigarette during the award ceremony, and he had a broad smile on his face. I was really proud. I can still remember the cheering. Cigarette smoking can’t be as harmful as people say.

The vivid anecdote is the story about Uncle Harry. Too much emphasis is placed on it and not enough on the statistics from the Surgeon General.

Misplaced Concreteness

Mistakenly supposing that something is a concrete object with independent existence, when it’s not. Also known as the Fallacy of Reification and the Fallacy of Hypostatization.

Example:

There are two footballs lying on the floor of an otherwise empty room. When asked to count all the objects in the room, John says there are three: the two balls plus the group of two.

John mistakenly supposed a group or set of concrete objects is also a concrete object.

A less metaphysical example would be a situation where John says a criminal was caught by K-9 aid, and thereby supposed that K-9 aid was some sort of concrete object. John could have expressed the same point less misleadingly by saying a K-9 dog aided in catching a criminal.

Misplaced Burden of Proof

Committing the error of trying to get someone else to prove you are wrong, when it is your responsibility to prove you are correct.

Example:

Person A: I saw a green alien from outer space.
Person B: What!? Can you prove it?
Person A: You can’t prove I didn’t.

If someone says, “I saw a green alien from outer space,” you properly should ask for some proof. If the person responds with no more than something like, “Prove I didn’t,” then they are not accepting their burden of proof and are improperly trying to place it on your shoulders.

Misrepresentation

If the misrepresentation occurs on purpose, then it is an example of lying. If the misrepresentation occurs during a debate in which there is misrepresentation of the opponent’s claim, then it would be the cause of a Straw Man Fallacy.

Missing the Point

See Irrelevant Conclusion.

Mob Appeal

See Appeal to the People.

Modal

This is the error of treating modal conditionals as if the modality applies only to the then-part of the conditional when it more properly applies to the entire conditional.

Example:

James has two children. If James has two children, then he necessarily has more than one child. So, it is necessarily true that James has more than one child.

This apparently valid argument is invalid. It is not necessarily true that James has more than one child; it’s merely true that he has more than one child. He could have had no children. It is logically possible that James has no children even though he actually has two. The solution to the fallacy is to see that the premise “If James has two children, then he necessarily has more than one child,” requires the modality “necessarily” to apply logically to the entire conditional “If James has two children, then he has more than one child” even though grammatically it applies only to “he has more than one child.” The Modal Fallacy is the best known of the infinitely many errors involving modal concepts. Modal concepts include necessity, possibility, and so forth.
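The difference in scope can be written symbolically. In this sketch, P abbreviates "James has two children," Q abbreviates "James has more than one child," and the box is the usual necessity operator; the letters are introduced here only for illustration, and the symbols assume the amssymb package.

```latex
\begin{align*}
\text{Intended premise:} \quad & \Box\,(P \rightarrow Q) \\
\text{Misread premise:}  \quad & P \rightarrow \Box\, Q \\
\text{Valid:}            \quad & P,\ \Box\,(P \rightarrow Q) \ \vdash\ Q \\
\text{Invalid:}          \quad & P,\ \Box\,(P \rightarrow Q) \ \nvdash\ \Box\, Q
\end{align*}
```

The conclusion that Q is necessary would follow only from the misread premise, and that premise is false.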

Monte Carlo

See Gambler’s Fallacy.

Name Calling

See Ad Hominem.

Naturalistic

On a broad interpretation of this fallacy, it applies to any attempt to argue from an “is” to an “ought,” that is, from a list of facts to a conclusion about what ought to be done.

Example:

Because women are naturally capable of bearing and nursing children while men are not, women ought to be the primary caregivers of children.

Here is another example. Owners of financially successful companies are more successful than poor people in the competition for wealth, power and social status. Therefore, the poor deserve to be poor. There is considerable disagreement among philosophers regarding what sorts of arguments the term “Naturalistic Fallacy” legitimately applies to.

Neglecting a Common Cause

See Common Cause.

No Middle Ground

See False Dilemma.

No True Scotsman

This error is a kind of Ad Hoc Rescue of one’s generalization in which the reasoner re-characterizes the situation solely in order to escape refutation of the generalization.

Example:

Smith: All Scotsmen are loyal and brave.

Jones: But McDougal over there is a Scotsman, and he was arrested by his commanding officer for running from the enemy.

Smith: Well, if that’s right, it just shows that McDougal wasn’t a TRUE Scotsman.

Non Causa Pro Causa

This label is Latin for mistaking the “non-cause for the cause.” See False Cause.

Non Sequitur

When a conclusion is supported only by extremely weak reasons or by irrelevant reasons, the argument is fallacious and is said to be a Non Sequitur. However, we usually apply the term only when we cannot think of how to label the argument with a more specific fallacy name. Any deductively invalid inference is a non sequitur if it is also very weak when assessed by inductive standards.

Example:

Nuclear disarmament is a risk, but everything in life involves a risk. Every time you drive in a car you are taking a risk. If you’re willing to drive in a car, you should be willing to have disarmament.

The following is not an example: “If she committed the murder, then there’d be his blood stains on her hands. His blood stains are on her hands. So, she committed the murder.” This deductively invalid argument uses the Fallacy of Affirming the Consequent, but it isn’t a non sequitur because it has significant inductive strength.

Obscurum per Obscurius

Explaining something obscure or mysterious by something that is even more obscure or more mysterious.

Example:

Let me explain what a lucky result is. It is a fortuitous collapse of the quantum mechanical wave packet that leads to a surprisingly pleasing result.

One-Sidedness

See the related fallacies of Confirmation Bias, Slanting and Suppressed Evidence.

Opposition

Being opposed to someone’s reasoning because of who they are, usually because of what group they are associated with. See the Fallacy of Guilt by Association.

Over-Fitting

See Curve Fitting.

Overgeneralization

See Sweeping Generalization.

Oversimplification

You oversimplify when you cover up relevant complexities or make a complicated problem appear to be too much simpler than it really is.

Example:

President Bush wants our country to trade with Fidel Castro’s Communist Cuba. I say there should be a trade embargo against Cuba. The issue in our election is Cuban trade, and if you are against it, then you should vote for me for president.

Whom to vote for should be decided by considering quite a number of issues in addition to Cuban trade. When an oversimplification results in falsely implying that a minor causal factor is the major one, then the reasoning also uses the False Cause Fallacy.

Past Practice

See Traditional Wisdom.

Pathetic

The Pathetic Fallacy is a mistaken belief due to attributing peculiarly human qualities to inanimate objects (but not to animals). The fallacy is caused by anthropomorphism.

Example:

Aargh, it won’t start again. This old car always breaks down on days when I have a job interview. It must be afraid that if I get a new job, then I’ll be able to afford a replacement, so it doesn’t want me to get to my interview on time.

Peer Pressure

See Appeal to the People.

Persuasive Definition

Some people try to win their arguments by getting you to accept their faulty definition. If you buy into their definition, they’ve practically persuaded you already. Same as the Definist Fallacy. Poisoning the Well when presenting a definition would be an example of using a persuasive definition.

Example:

Let’s define a Democrat as a leftist who desires to overtax the corporations and abolish freedom in the economic sphere.

Perfectionist

If you remark that a proposal or claim should be rejected solely because it doesn’t solve the problem perfectly, in cases where perfection isn’t really required, then you’ve used the Perfectionist Fallacy.

Example:

You said hiring a house cleaner would solve our cleaning problems because we both have full-time jobs. Now, look what happened. Every week, after cleaning the toaster oven, our house cleaner leaves it unplugged. I should never have listened to you about hiring a house cleaner.

Petitio Principii

See Begging the Question.

Poisoning the Well

Poisoning the well is a preemptive attack on a person in order to discredit their testimony or argument in advance of their giving it. A person who thereby becomes unreceptive to the testimony reasons fallaciously and has become a victim of the poisoner. This is a kind of Ad Hominem, Circumstantial Fallacy.

Example:

[Prosecuting attorney in court] When is the defense attorney planning to call that twice-convicted child molester, David Barnington, to the stand? OK, I’ll rephrase that. When is the defense attorney planning to call David Barnington to the stand?

Post Hoc

Suppose we notice that an event of kind A is followed in time by an event of kind B, and then hastily leap to the conclusion that A caused B. If so, our reasoning contains the Post Hoc Fallacy. Correlations are often good evidence of causal connection, so the fallacy occurs only when the leap to the causal conclusion is done “hastily.” The Latin term for the fallacy is Post Hoc, Ergo Propter Hoc (“After this, therefore because of this”). It is a kind of False Cause Fallacy.

Example:

I have noticed a pattern about all the basketball games I’ve been to this year. Every time I buy a good seat, our team wins. Every time I buy a cheap, bad seat, we lose. My buying a good seat must somehow be causing those wins.

Your background knowledge should tell you that this pattern probably won’t continue in the future; it’s just an accidental correlation that tells you nothing about the cause of your team’s wins.

Prejudicial Language

See Loaded Language.

Proof Surrogate

Substituting a distracting comment for a real proof.

Example:

I don’t need to tell a smart person like you that you should vote Republican.

This comment is trying to avoid a serious disagreement about whether one should vote Republican.

Prosecutor’s Fallacy

This is the mistake of over-emphasizing the strength of a piece of evidence while paying insufficient attention to the context.

Example:

Suppose a prosecutor is trying to gain a conviction and points to the evidence that at the scene of the burglary the police found a strand of the burglar’s hair. A forensic test showed that the burglar’s hair matches the suspect’s own hair. The forensic scientist testified that the chance of a randomly selected person producing such a match is only one in two thousand. The prosecutor concludes that the suspect has only a one in two thousand chance of being innocent. On the basis of only this evidence, the prosecutor asks the jury for a conviction.

That is fallacious reasoning, and if you are on the jury you should not be convinced. Here’s why. The prosecutor paid insufficient attention to the pool of potential suspects. Suppose that pool has six million people who could have committed the crime, all other things being equal. If the forensic lab had tested all those people, they’d find that about one in every two thousand of them would have a hair match, but that is three thousand people. The suspect is just one of the three thousand, so the suspect is very probably innocent unless the prosecutor can provide more evidence. The prosecutor over-emphasized the strength of a piece of evidence by focusing on one suspect while paying insufficient attention to the context which suggests a pool of many more suspects.
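The arithmetic behind this reply can be laid out explicitly. The sketch below follows the simplification used above: every person in the pool is equally likely to be the burglar before the hair evidence is considered, and the hair match is the only evidence.

```python
from fractions import Fraction

pool_size = 6_000_000              # people who could have committed the crime
match_rate = Fraction(1, 2000)     # chance a random person's hair matches

# Expected number of people in the pool whose hair would match.
expected_matches = pool_size * match_rate

# With no other evidence, the suspect is just one of those matches.
p_guilty = 1 / expected_matches
p_innocent = 1 - p_guilty

print(expected_matches, p_guilty, p_innocent)
```

On these assumptions the chance that the suspect is innocent is 2999 in 3000, not the one in two thousand that the prosecutor claimed.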

Prosody

See the Fallacy of Accent.

Quantifier Shift

Confusing the phrase “For all x there is some y” with “There is some (one) y such that for all x.”

Example:

Everybody loves someone, so there is someone whom everybody loves.

The error is also made if you reason this way: “Everything has a cause, so there’s one cause of everything.”
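A small model shows that the first phrase can be true while the second is false. The three people and the "loves" relation below are hypothetical, chosen so that each person loves exactly one other.

```python
# Hypothetical three-person world: each person loves exactly one other.
people = ["Ann", "Bob", "Cy"]
loves = {("Ann", "Bob"), ("Bob", "Cy"), ("Cy", "Ann")}

# "For all x there is some y": everybody loves someone.
everybody_loves_someone = all(
    any((x, y) in loves for y in people) for x in people
)

# "There is some y such that for all x": someone is loved by everybody.
someone_loved_by_all = any(
    all((x, y) in loves for x in people) for y in people
)

print(everybody_loves_someone, someone_loved_by_all)  # True False
```

Since the premise holds and the conclusion fails in this model, shifting the quantifiers is not a valid step.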

Question Begging

See Begging the Question.

Questionable Analogy

See False Analogy.

Questionable Cause

See False Cause.

Questionable Premise

If you have sufficient background information to know that a premise is questionable or unlikely to be acceptable, then you use this fallacy if you accept an argument based on that premise. This broad category of fallacies of argumentation includes Appeal to Authority, False Dilemma, Inconsistency, Lying, Stacking the Deck, Straw Man, Suppressed Evidence, and many others.

Quibbling

We quibble when we complain about a minor point and falsely believe that this complaint somehow undermines the main point. To avoid this error, the logical reasoner will not make a mountain out of a molehill nor take people too literally. Logic Chopping is a kind of quibbling.

Example:

I’ve found typographical errors in your poem, so the poem is neither inspired nor perceptive.

Quoting out of Context

If you quote someone, but select the quotation so that essential context is not available and therefore the person’s views are distorted, then you’ve quoted “out of context.” Quoting out of context in an argument creates a Straw Man Fallacy. The fallacy is also called “contextomy.”

Example:

Smith: I’ve been reading about a peculiar game in this article about vegetarianism. When we play this game, we lean out from a fourth-story window and drop down strings containing “Free food” signs on the end in order to hook unsuspecting passers-by. It’s really outrageous, isn’t it? Yet isn’t that precisely what sports fishermen do for entertainment from their fishing boats? The article says it’s time we put an end to sport fishing.

Jones: Let me quote Smith for you. He says “We…hook unsuspecting passers-by.” What sort of moral monster is this man Smith?

Jones’s selective quotation is fallacious because it makes Smith appear to advocate this immoral activity when the context makes it clear that he doesn’t.

Rationalization

We rationalize when we inauthentically offer reasons to support our claim. We are rationalizing when we give someone a reason to justify our action even though we know this reason is not really our own reason for our action, usually because the offered reason will sound better to the audience than our actual reason.

Example:

“I bought the matzo bread from Kroger’s Supermarket because it is the cheapest brand and I wanted to save money,” says Alex [who knows he bought the bread from Kroger’s Supermarket only because his girlfriend works there].

Red Herring

A red herring is a smelly fish that would distract even a bloodhound. It is also a digression that leads the reasoner off the track of considering only relevant information.

Example:

Will the new tax in Senate Bill 47 unfairly hurt business? I notice that the main provision of the bill is that the tax is higher for large employers (fifty or more employees) as opposed to small employers (six to forty-nine employees). To decide on the fairness of the bill, we must first determine whether employees who work for large employers have better working conditions than employees who work for small employers. I am ready to volunteer for a new committee to study this question. How do you suppose the committee should go about collecting the data we need?

Bringing up the issue of working conditions and the committee is the red herring diverting us from the main issue of whether Senate Bill 47 unfairly hurts business. An intentional false lead in a criminal investigation is another example of a red herring.

Refutation by Caricature

See the Fallacy of Caricaturization.

Regression

This fallacy occurs when regression to the mean is mistaken for a sign of a causal connection. Also called the Regressive Fallacy. It is a kind of False Cause Fallacy.

Example:

You are investigating the average heights of groups of people living in the United States. You sample some people living in Columbus, Ohio and determine their average height. You have the numerical figure for the mean height of people living in the U.S., and you notice that members of your sample from Columbus have an average height that differs from this mean. Your second sample of the same size is from people living in Dayton, Ohio. When you find that this group’s average height is closer to the U.S. mean height [as it is very likely to be due to common statistical regression to the mean], you falsely conclude that there must be something causing people living in Dayton to be more like the average U.S. resident than people living in Columbus.

There is most probably nothing causing people from Dayton to be more like the average resident of the U.S.; rather, the sample averages are simply regressing to the mean.

Reification

Considering a word to be referring to an object, when the meaning of the word can be accounted for more mundanely without assuming the object exists. Also known as the Fallacy of Misplaced Concreteness and as Hypostatization.

Example:

The 19th century composer Tchaikovsky described the introduction to his Fifth Symphony as “a complete resignation before fate.”

He is treating “fate” as if it is naming some object, when it would be less misleading, but also less poetic, to say the introduction suggests that listeners will resign themselves to accepting whatever events happen to them. The Fallacy occurs also when someone says, “I succumbed to nostalgia.” Without committing the fallacy, one can make the same point by saying, “My mental state caused actions that would best be described as my reflecting an unusual desire to return to some past period of my life.” Another common way the Fallacy is used is when someone says that if you understand what “Sherlock Holmes” means, then Sherlock Holmes exists in your understanding. The larger point being made in this last example is that nouns can be meaningful without referring to an object, yet those who use the Fallacy of Reification do not understand this point.

Reversing Causation

Drawing an improper conclusion about causation due to a causal assumption that reverses cause and effect. A kind of False Cause Fallacy.

Example:

All the corporate officers of Miami Electronics and Power have big boats. If you’re ever going to become an officer of MEP, you’d better get a bigger boat.

The false assumption here is that having a big boat helps cause you to be an officer in MEP, whereas the reverse is true. Being an officer causes you to have the high income that enables you to purchase a big boat.

Scapegoating

If you unfairly blame an unpopular person or group of people for a problem, then you are scapegoating. This is a kind of Fallacy of Appeal to Emotions.

Example:

Augurs were official diviners of ancient Rome. During the pre-Christian period, when Christians were unpopular, an augur would make a prediction for the emperor about, say, whether a military attack would have a successful outcome. If the prediction failed to come true, the augur would not admit failure but instead would blame nearby Christians for their evil influence on his divining powers. The elimination of these Christians, the augur would claim, could restore his divining powers and help the emperor. By using this reasoning tactic, the augur was scapegoating the Christians.

Scare Tactic

If you suppose that terrorizing your opponent is giving him a reason for believing that you are correct, then you are using a scare tactic and reasoning fallaciously.

Example:

David: My father owns the department store that gives your newspaper fifteen percent of all its advertising revenue, so I’m sure you won’t want to publish any story of my arrest for spray painting the college.

Newspaper editor: Yes, David, I see your point. The story really isn’t newsworthy.

David has given the editor a financial reason not to publish, but he has not given a relevant reason why the story is not newsworthy. David’s tactics are scaring the editor, but it’s the editor who uses the Scare Tactic Fallacy, not David. David has merely used a scare tactic. This fallacy’s name emphasizes the cause of the fallacy rather than the error itself. See also the related Fallacy of Appeal to Emotions.

Scope

The Scope Fallacy is caused by improperly changing or misrepresenting the scope of a phrase.

Example:

Every concerned citizen who believes that someone living in the US is a terrorist should make a report to the authorities. But Shelley told me herself that she believes there are terrorists living in the US, yet she hasn’t made any reports. So, she must not be a concerned citizen.

The first sentence has ambiguous scope. It was probably originally meant in this sense: Every concerned citizen who believes (of someone that this person is living in the US and is a terrorist) should make a report to the authorities. But the speaker is clearly taking the sentence in its other, less plausible sense: Every concerned citizen who believes (that there is someone or other living in the US who is a terrorist) should make a report to the authorities. Scope fallacies usually are Amphibolies.

Secundum Quid

See Accident and Converse Accident, two versions of the fallacy.

Selective Attention

Improperly focusing attention on certain things and ignoring others.

Example:

Father: Justine, how was your school day today? Another C on the history test like last time?

Justine: Dad, I got an A- on my history test today. Isn’t that great? Only one student got an A.

Father: I see you weren’t the one with the A. And what about the math quiz?

Justine: I think I did OK, better than last time.

Father: If you really did well, you’d be sure. What I’m sure of is that today was a pretty bad day for you.

The pessimist who pays attention to all the bad news and ignores the good news thereby uses the Fallacy of Selective Attention. The remedy for this fallacy is to pay attention to all the relevant evidence. The most common examples of selective attention are the fallacy of Suppressed Evidence and the fallacy of Confirmation Bias. See also the Sharpshooter’s Fallacy.

Self-Fulfilling Prophecy

The fallacy occurs when the act of prophesying will itself produce the effect that is prophesied, but the reasoner doesn’t recognize this and believes the prophecy is a significant insight.

Example:

A group of students are selected to be interviewed individually by the teacher. Each selected student is told that the teacher has predicted they will do significantly better in their future school work. Actually, though, the teacher has no special information about the students and has picked the group at random. If the students believe this prediction about themselves, then, given human psychology, it is likely that they will do better merely because of the teacher’s making the prediction.

The prediction will fulfill itself, so to speak, and the students’ reasoning contains the fallacy.

This fallacy can be dangerous in an atmosphere of potential war between nations when the leader of a nation predicts that their nation will go to war against their enemy. This prediction could very well precipitate an enemy attack because the enemy calculates that if war is inevitable then it is to their military advantage not to get caught by surprise.

Self-Selection

A Biased Generalization in which the bias is due to self-selection for membership in the sample used to make the generalization.

Example:

The radio announcer at a student radio station in New York asks listeners to call in and say whether they favor Jones or Smith for president. 80% of the callers favor Jones, so the announcer declares that Americans prefer Jones to Smith.

The problem here is that the callers selected themselves for membership in the sample, but clearly the sample is unlikely to be representative of Americans.

Sharpshooter’s

The Sharpshooter’s Fallacy gets its name from someone shooting a rifle at the side of the barn and then going over and drawing a target and bull’s-eye concentrically around the bullet hole. The fallacy is caused by overemphasizing random results or making selective use of coincidence. See the Fallacy of Selective Attention.

Example:

Psychic Sarah makes twenty-six predictions about what will happen next year. When one, but only one, of the predictions comes true, she says, “Aha! I can see into the future.”
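A rough calculation suggests why one hit out of twenty-six is unimpressive. The sketch below assumes, purely for illustration, that each prediction has a 5 percent chance of coming true by luck and that the predictions are independent of one another. Neither figure comes from the example, and the function name is hypothetical.

```python
def chance_of_at_least_one_hit(predictions: int, hit_chance: float) -> float:
    """Probability that at least one of several independent guesses comes true."""
    return 1 - (1 - hit_chance) ** predictions


# Assumed for illustration: 26 predictions, each with a 5% chance of a lucky hit.
lucky = chance_of_at_least_one_hit(26, 0.05)

print(round(lucky, 2))  # roughly 0.74
```

Under these assumptions a guesser with no insight into the future would score at least one hit about three times out of four, so a single success is weak evidence of psychic ability.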

Slanting

This error occurs when the issue is not treated fairly because of misrepresenting the evidence by, say, suppressing part of it, or misconstruing some of it, or simply lying. See the following related fallacies: Confirmation Bias, Lying, Misrepresentation, Questionable Premise, Quoting out of Context, Straw Man, Suppressed Evidence.

Slippery Slope

Suppose someone claims that a first step (in a chain of causes and effects, or a chain of reasoning) will probably lead to a second step that in turn will probably lead to another step and so on until a final step ends in trouble. If the likelihood of the trouble occurring is exaggerated, the Slippery Slope Fallacy is present.

Example:

Mom: Those look like bags under your eyes. Are you getting enough sleep?

Jeff: I had a test and stayed up late studying.

Mom: You didn’t take any drugs, did you?

Jeff: Just caffeine in my coffee, like I always do.

Mom: Jeff! You know what happens when people take drugs! Pretty soon the caffeine won’t be strong enough. Then you will take something stronger, maybe someone’s diet pill. Then, something even stronger. Eventually, you will be doing cocaine. Then you will be a crack addict! So, don’t drink that coffee.

The form of a Slippery Slope Fallacy looks like this:

A often leads to B.

B often leads to C.

C often leads to D.

…

Z leads to HELL.

We don’t want to go to HELL.

So, don’t take that first step A.

The key claim in the fallacy is that taking the first step will lead to the final, unacceptable step. Arguments of this form may or may not be fallacious depending on the probabilities involved in each step. The analyst asks how likely it is that taking the first step will lead to the final step. For example, if A leads to B with a probability of 80 percent, and B leads to C with a probability of 80 percent, and C leads to D with a probability of 80 percent, is it likely that A will eventually lead to D? No, not at all; there is about a 50% chance. The proper analysis of a slippery slope argument depends on sensitivity to such probabilistic calculations. Regarding terminology, if the chain of reasoning A, B, C, D, …, Z is about causes, then the fallacy is called the Domino Fallacy.
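The probabilistic calculation mentioned above can be sketched as follows. The sketch assumes that each step's probability is the chance of that step given the previous one, and that the chain is the only route to the final step. The 25-step chain at 80 percent per step is an added illustration for a chain running from A to Z, and the function name is hypothetical.

```python
from fractions import Fraction
from math import prod


def chain_probability(step_probabilities):
    """Chance of reaching the last step: the product of the per-step chances."""
    return prod(step_probabilities, start=Fraction(1))


eighty_percent = Fraction(4, 5)

# A leads to B, B leads to C, C leads to D: three steps at 80% each.
a_to_d = chain_probability([eighty_percent] * 3)
print(float(a_to_d))  # 0.512

# Illustrative longer chain from A to Z: 25 steps at 80% each.
a_to_z = chain_probability([eighty_percent] * 25)
print(float(a_to_z) < 0.01)  # True
```

Even when every individual step is fairly likely, the chance of reaching the end of the chain shrinks quickly as steps are added.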

Small Sample

This is the fallacy of using too small a sample. If the sample is too small to provide a representative sample of the population, and if we have the background information to know that there is this problem with sample size, yet we still accept the generalization upon the sample results, then we use the fallacy. This fallacy is the Fallacy of Hasty Generalization, but it emphasizes statistical sampling techniques.

Example:

I’ve eaten in restaurants twice in my life, and both times I’ve gotten sick. I’ve learned one thing from these experiences: restaurants make me sick.

How big a sample do you need to avoid the fallacy? Relying on background knowledge about a population’s lack of diversity can reduce the sample size needed for the generalization. With a completely homogeneous population, a sample of one is large enough to be representative of the population; if we’ve seen one electron, we’ve seen them all. However, eating in one restaurant is not like eating in any restaurant, so far as getting sick is concerned. We cannot place a specific number on sample size below which the fallacy is produced unless we know about homogeneity of the population and the margin of error and the confidence level.
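The dependence of sample size on margin of error and confidence level can be illustrated with the standard normal-approximation formula for a sampled proportion. The formula is a textbook approximation rather than something stated in this article. It assumes simple random sampling, and it is unreliable for very small samples such as the two restaurant visits above. The function name and the sample sizes are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size: int, confidence: float = 0.95,
                    proportion: float = 0.5) -> float:
    """Approximate margin of error for a proportion from a simple random sample.

    proportion=0.5 is the worst case, the most diverse two-valued population.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sqrt(proportion * (1 - proportion) / sample_size)


for n in (25, 400, 1000):
    print(n, round(margin_of_error(n), 3))
```

At a 95 percent confidence level a sample of 25 gives a margin of error near 20 percentage points, while a sample of 400 brings it down to about 5. A less diverse population (a proportion close to 0 or 1) needs a smaller sample for the same margin.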

Smear Tactic

A smear tactic is an unfair characterization either of the opponent or the opponent’s position or argument. Smearing the opponent causes an Ad Hominem Fallacy. Smearing the opponent’s argument causes a Straw Man Fallacy.

Smokescreen

This fallacy occurs by offering too many details in order either to obscure the point or to cover-up counter-evidence. In the latter case it would be an example of the Fallacy of Suppressed Evidence. If you produce a smokescreen by bringing up an irrelevant issue, then you produce a Red Herring Fallacy. Sometimes called Clouding the Issue.

Example:

Senator, wait before you vote on Senate Bill 88. Do you realize that Delaware passed a bill on the same subject in 1932, but it was ruled unconstitutional for these twenty reasons? Let me list them here…. Also, before you vote on SB 88 you need to know that …. And so on.

There is no recipe to follow in distinguishing smokescreens from reasonable appeals to caution and care.

Sorites

See Line-Drawing.

Special Pleading

Special pleading is a form of inconsistency in which the reasoner doesn’t apply his or her principles consistently. It is the fallacy of applying a general principle to various situations but not applying it to a special situation that interests the arguer even though the general principle properly applies to that special situation, too.

Example:

Everyone has a duty to help the police do their job, no matter who the suspect is. That is why we must support investigations into corruption in the police department. No person is above the law. Of course, if the police come knocking on my door to ask about my neighbors and the robberies in our building, I know nothing. I’m not about to rat on anybody.

In our example, the principle of helping the police is applied to investigations of police officers but not to one’s neighbors.

Specificity

Drawing an overly specific conclusion from the evidence. A kind of jumping to conclusions.

Example:

The trigonometry calculation came out to 5,005.6833 feet, so that’s how wide the cloud is up there.

Stacking the Deck

See Suppressed Evidence and Slanting.

Stereotyping

Using stereotypes as if they are accurate generalizations for the whole group is an error in reasoning. Stereotypes are general beliefs we use to categorize people, objects, and events; but these beliefs are overstatements that shouldn’t be taken literally. For example, consider the stereotype “She’s Mexican, so she’s going to be late.” This conveys a mistaken impression of all Mexicans. On the other hand, even though most Mexicans are punctual, a German is more apt to be punctual than a Mexican, and this fact is said to be the “kernel of truth” in the stereotype. The danger in our using stereotypes is that speakers or listeners will not realize that even the best stereotypes are accurate only when taken probabilistically. As a consequence, the use of stereotypes can breed racism, sexism, and other forms of bigotry.

Example:

German people aren’t good at dancing our sambas. She’s German. So, she’s not going to be any good at dancing our sambas.

This argument is deductively valid, but it’s unsound because it rests on a false, stereotypical premise. The grain of truth in the stereotype is that the average German doesn’t dance sambas as well as the average South American, but to overgeneralize and presume that ALL Germans are poor samba dancers compared to South Americans is a mistake called “stereotyping.”

Straw Man

Your reasoning contains the Straw Man Fallacy whenever you attribute an easily refuted position to your opponent, one that the opponent would not endorse, and then proceed to attack the easily refuted position (the straw man) believing you have thereby undermined the real man, the opponent’s actual position. If the unfair and inaccurate representation is on purpose, then the Straw Man Fallacy is caused by lying.

Example (a debate before the city council):

Opponent: Because of the killing and suffering of Indians that followed Columbus’s discovery of America, the City of Berkeley should declare that Columbus Day will no longer be observed in our city.

Speaker: This is ridiculous, fellow members of the city council. It’s not true that everybody who ever came to America from another country somehow oppressed the Indians. I say we should continue to observe Columbus Day, and vote down this resolution that will make the City of Berkeley the laughing stock of the nation.

The Opponent is likely to respond with “Wait! That’s not what I said.” The Speaker has twisted what his Opponent said. The Opponent never said nor even indirectly suggested that everybody who ever came to America from another country somehow oppressed the Indians.

Style Over Substance

Unfortunately the style with which an argument is presented is sometimes taken as adding to the substance or strength of the argument.

Example:

You’ve just been told by the salesperson that the new Maytag is an excellent washing machine because it has a double washing cycle. If you notice that the salesperson smiled at you and was well dressed, this does not add to the quality of the salesperson’s argument, but unfortunately it does for those who are influenced by style over substance, as most of us are.

Subjectivist

The Subjectivist Fallacy occurs when it is mistakenly supposed that a good reason to reject a claim is that truth on the matter is relative to the person or group.

Example:

Justine has just given Jake her reasons for believing that the Devil is an imaginary evil person. Jake, not wanting to accept her conclusion, responds with, “That’s perhaps true for you, but it’s not true for me.”

Superstitious Thinking

Reasoning deserves to be called superstitious if it is based on reasons that are well known to be unacceptable, usually due to unreasonable fear of the unknown, trust in magic, or an obviously false idea of what can cause what. A belief produced by superstitious reasoning is called a superstition. The fallacy is an instance of the False Cause Fallacy.

Example:

I never walk under ladders; it’s bad luck.

It may be a good idea not to walk under ladders, but a proper reason to believe this is that workers on ladders occasionally drop things, and that ladders might have dripping wet paint that could damage your clothes. An improper reason for not walking under ladders is that it is bad luck to do so.

Suppressed Evidence

Intentionally failing to use information suspected of being relevant and significant is committing the fallacy of suppressed evidence. This fallacy usually occurs when the information counts against one’s own conclusion. Perhaps the arguer is not mentioning that experts have recently objected to one of his premises. The fallacy is a kind of Fallacy of Selective Attention.

Example:

Buying the Cray Mac 11 computer for our company was the right thing to do. It meets our company’s needs; it runs the programs we want it to run; it will be delivered quickly; and it costs much less than what we had budgeted.

This appears to be a good argument, but you’d change your assessment of the argument if you learned the speaker has intentionally suppressed the relevant evidence that the company’s Cray Mac 11 was purchased from his brother-in-law at a 30 percent higher price than it could have been purchased elsewhere, and if you learned that a recent unbiased analysis of ten comparable computers placed the Cray Mac 11 near the bottom of the list.

If the relevant information is not intentionally suppressed but rather inadvertently overlooked, the fallacy of suppressed evidence also is said to occur, although the fallacy’s name is misleading in this case. The fallacy is also called the Fallacy of Incomplete Evidence and Cherry-Picking the Evidence. See also Slanting.

Sweeping Generalization

See Fallacy of Accident.

Syllogistic

Syllogistic fallacies are kinds of invalid categorical syllogisms. This list contains the Fallacy of Undistributed Middle and the Fallacy of Four Terms, and a few others, though there are a great many such formal fallacies.

Tokenism

If you interpret a merely token gesture as an adequate substitute for the real thing, you’ve been taken in by tokenism.

Example:

How can you call our organization racist? After all, our receptionist is African American.

If you accept this line of reasoning, you have been taken in by tokenism.

Traditional Wisdom

If you say or imply that a practice must be OK today simply because it has been the apparently wise practice in the past, then your reasoning contains the fallacy of traditional wisdom. Procedures that are being practiced and that have a tradition of being practiced might or might not be able to be given a good justification, but merely saying that they have been practiced in the past is not always good enough, in which case the fallacy is present. Also called Argumentum Consensus Gentium when the traditional wisdom is that of nations.

Example:

Of course we should buy IBM’s computer whenever we need new computers. We have been buying IBM as far back as anyone can remember.

The “of course” is the problem. The traditional wisdom of IBM being the right buy is some reason to buy IBM next time, but it’s not a good enough reason in a climate of changing products, so the “of course” indicates that the Fallacy of Traditional Wisdom has occurred. The fallacy is essentially the same as the fallacies of Appeal to the Common Practice, Gallery, Masses, Mob, Past Practice, People, Peers, and Popularity.

Tu Quoque

The Fallacy of Tu Quoque occurs in our reasoning if we conclude that someone’s argument not to perform some act must be faulty because the arguer himself or herself has performed it. Similarly, when we point out that the arguer doesn’t practice what he or she preaches, and then suppose that there must be an error in the preaching for only this reason, then we are reasoning fallaciously and creating a Tu Quoque. This is a kind of Ad Hominem Circumstantial Fallacy.

Example:

Look who’s talking. You say I shouldn’t become an alcoholic because it will hurt me and my family, yet you yourself are an alcoholic, so your argument can’t be worth listening to.

Discovering that a speaker is a hypocrite is a reason to be suspicious of the speaker’s reasoning, but it is not a sufficient reason to discount it.

Two Wrongs do not Make a Right

When you defend your wrong action as being right because someone previously has acted wrongly, you are using the fallacy called “Two Wrongs do not Make a Right.” This is a special kind of Ad Hominem Fallacy.

Example:

Oops, no paper this morning. Somebody in our apartment building probably stole my newspaper. So, that makes it OK for me to steal one from my neighbor’s doormat while nobody else is out here in the hallway.

Undistributed Middle

In syllogistic logic, failing to distribute the middle term over at least one of the other terms is the fallacy of undistributed middle. Also called the Fallacy of Maldistributed Middle.

Example:

All collies are animals.

All dogs are animals.

Therefore, all collies are dogs.

The middle term (“animals”) is in the predicate of both universal affirmative premises and therefore is undistributed. This formal fallacy has the logical form: All C are A. All D are A. Therefore, all C are D.
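One way to see that this logical form is invalid is to find an instance of it with true premises and a false conclusion. The sketch below reads "All X are Y" as "X is a subset of Y" and substitutes cats for collies. The individual animals named are invented for the illustration, and the function name is hypothetical.

```python
def all_are(xs: set, ys: set) -> bool:
    """'All X are Y', read as: every member of X is a member of Y."""
    return xs <= ys


# Invented individuals for illustration.
dogs = {"Lassie", "Rex"}
cats = {"Tom"}
animals = dogs | cats

# Same form as the example: All C are A. All D are A. Therefore, all C are D.
premises_true = all_are(cats, animals) and all_are(dogs, animals)
conclusion_true = all_are(cats, dogs)

print(premises_true, conclusion_true)  # True False
```

Since the form allows true premises with a false conclusion, no argument is valid merely in virtue of having this form, even when its conclusion happens to be true, as with the collies.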

Unfalsifiability

This error in explanation occurs when the explanation contains a claim that is not falsifiable, because there is no way to check on the claim. That is, there would be no way to show the claim to be false if it were false.

Example:

He lied because he’s possessed by demons.

This could be the correct explanation of his lying, but there’s no way to check on whether it’s correct. You can check whether he’s twitching and moaning, but this won’t be evidence about whether a supernatural force is controlling his body. The claim that he’s possessed can’t be verified if it’s true, and it can’t be falsified if it’s false. So, the claim is too odd to be relied upon for an explanation of his lying. Relying on the claim is an instance of fallacious reasoning.

Unrepresentative Generalization

If the plants on my plate are not representative of all plants, then the following generalization should not be trusted.

Example:

Each plant on my plate is edible.

So, all plants are edible.

The set of plants on my plate is called “the sample” in the technical vocabulary of statistics, and the set of all plants is called “the target population.” If you are going to generalize on a sample, then you want your sample to be representative of the target population, that is, to be like it in the relevant respects. This fallacy is the same as the Fallacy of Unrepresentative Sample.

Unrepresentative Sample

If the means of collecting the sample from the population are likely to produce a sample that is unrepresentative of the population, then a generalization upon the sample data is an inference using the fallacy of unrepresentative sample. A kind of Hasty Generalization. When some of the statistical evidence is expected to be relevant to the results but is hidden or overlooked, the fallacy is called Suppressed Evidence. There are many ways to bias a sample. Knowingly selecting atypical members of the population produces a biased sample.

Example:

The two men in the matching green suits that I met at the Star Trek Convention in Las Vegas had a terrible fear of cats. I remember their saying they were from France. I’ve never met anyone else from France, so I suppose everyone there has a terrible fear of cats.

Most people’s background information is sufficient to tell them that people at this sort of convention are unlikely to be representative, that is, are likely to be atypical members of the rest of society. Having a small sample does not by itself cause the sample to be biased. Small samples are OK if there is a corresponding large margin of error or low confidence level.

Large samples can be unrepresentative, too.

Example:

We’ve polled over 400,000 Southern Baptists and asked them whether the best religion in the world is Southern Baptist. We have over 99% agreement, which proves our point about which religion is best.

Getting a larger sample size does not overcome sampling bias.

Untestability

See Unfalsifiability.

Vested Interest

The Vested Interest Fallacy occurs when a person argues that someone’s claim is incorrect or their recommended action is not worthy of being followed because the person is motivated by their interest in gaining something by it, with the implication that were it not for this vested interest then the person wouldn’t make the claim or recommend the action. Because this reasoning attacks the reasoner rather than the reasoning itself, it is a kind of Ad Hominem fallacy.

Example:

According to Samantha we all should vote for Anderson for Congress. Yet she’s a lobbyist in the pay of Anderson and will get a nice job in the capitol if he’s elected, so that convinces me that she is giving bad advice.

This is fallacious reasoning by the speaker because whether Samantha is giving good advice about Anderson ought to depend on Anderson’s qualifications, not on whether Samantha will or won’t get a nice job if he’s elected.

Victory by Definition

Same as the fallacy of Persuasive Definition.

Weak Analogy

See False Analogy.

Willed Ignorance

I’ve got my mind made up, so don’t confuse me with the facts. This is usually a case of the Traditional Wisdom Fallacy.

Example:

Of course she’s made a mistake. We’ve always had meat and potatoes for dinner, and our ancestors have always had meat and potatoes for dinner, and so nobody knows what they’re talking about when they start saying meat and potatoes are bad for us.

Wishful Thinking

A reasoner who suggests that a claim is true, or false, merely because he or she strongly hopes it is, is using the fallacy of wishful thinking. Wishing something is true is not a relevant reason for claiming that it is actually true.

Example:

There’s got to be an error here in the history book. It says Thomas Jefferson had slaves. I don’t believe it. He was our best president, and a good president would never do such a thing. That would be awful.

You-Too

This is an informal name for the Tu Quoque fallacy.

7. References and Further Reading

Eemeren, Frans H. van, R. F. Grootendorst, F. S. Henkemans, J. A. Blair, R. H. Johnson, E. C. W. Krabbe, C. W. Plantin, D. N. Walton, C. A. Willard, J. A. Woods, and D. F. Zarefsky, 1996. Fundamentals of Argumentation Theory: A Handbook of Historical Backgrounds and Contemporary Developments. Mahwah, New Jersey, Lawrence Erlbaum Associates, Publishers.
Fearnside, W. Ward and William B. Holther, 1959. Fallacy: The Counterfeit of Argument. Prentice-Hall, Inc. Englewood Cliffs, New Jersey.
Fischer, David Hackett, 1970. Historians’ Fallacies: Toward a Logic of Historical Thought. New York, Harper & Row.
- This book contains additional fallacies to those in this article, but they are much less common, and many have obscure names.
Groarke, Leo and C. Tindale, 2003. Good Reasoning Matters! 3rd edition, Toronto, Oxford University Press.
Hamblin, Charles L., 1970. Fallacies. London, Methuen.
Hansen, Hans V. and R. C. Pinto, 1995. Fallacies: Classical and Contemporary Readings. University Park, Pennsylvania State University Press.
Huff, Darrell, 1954. How to Lie with Statistics. New York, W. W. Norton.
Levi, D. S., 1994. “Begging What is at Issue in the Argument,” Argumentation, 8, 265-282.
Schwartz, Thomas, 1981. “Logic as a Liberal Art,” Teaching Philosophy 4, 231-247.
Walton, Douglas N., 1989. Informal Logic: A Handbook for Critical Argumentation. Cambridge, Cambridge University Press.
Walton, Douglas N., 1995. A Pragmatic Theory of Fallacy. Tuscaloosa, University of Alabama Press.
Walton, Douglas N., 1997. Appeal to Expert Opinion: Arguments from Authority. University Park, Pennsylvania State University Press.
Whately, Richard, 1836. Elements of Logic. New York, Jackson.
Woods, John and D. N. Walton, 1989. Fallacies: Selected Papers 1972-1982. Dordrecht, Holland, Foris.

Research on the fallacies of informal logic is regularly published in the following journals: Argumentation, Argumentation and Advocacy, Informal Logic, Philosophy and Rhetoric, and Teaching Philosophy.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

The Ethics and Epistemology of Trust

Trust is a topic of long-standing philosophical interest because it is indispensable to the success of almost every kind of coordinated human activity, from politics and business to sport and scientific research. Even more, trust is necessary for the successful dissemination of knowledge, and, by extension, for nearly any form of practical deliberation and planning that requires us to make use of more information than we are able to gather individually and verify ourselves. In short, without trust, we could achieve few of our goals and would know very little. Despite trust’s fundamental importance in human life, there is substantial philosophical disagreement about what trust is, and further, how trusting is normatively constrained and best theorized about in relation to other things we value. Consequently, contemporary philosophical literature on trust features a range of different theoretical options for making sense of trust, and these options differ in how they (among other things) take trust to relate to such things as reliance, optimism, belief, obligations, monitoring, expectations, competence, trustworthiness, assurance, and doubt. With the aim of exploring these myriad issues in an organized way, this article is divided into three sections, each of which offers an overview of key (and sometimes interconnected) ethical and epistemological themes in the philosophy of trust: (1) The Nature of Trust; (2) The Normativity of Trust; and (3) The Value of Trust.
Table of Contents

The Nature of Trust
The Normativity of Trust
The Value of Trust
References and Further Reading

1. The Nature of Trust

What is trust? To a very first approximation, trust is an attitude or a hybrid of attitudes (for instance, optimism, hope, belief, and so forth) toward a trustee, that involves some (non-negligible) vulnerability to being betrayed on the truster’s side. This general remark, of course, does not take us very far. For example, we may ask: what kind of attitude (or hybrid of attitudes) is trust exactly? Suppose that (as some philosophers of trust maintain) trust requires an attitude of optimism. Even if that is right, getting a grip on trust requires a further conception of what the truster, qua truster, must be optimistic about. One standard answer here proceeds as follows: trust (at least, in the paradigmatic case of interpersonal trust) involves some form of optimism that the trustee will take care of things as we have entrusted them. In the special case of trusting the testimony of another—a topic at the centre of the epistemology of trust—this will involve at least some form of optimism that the speaker is living up to her expectations as a testifier; for instance, that the speaker knows what she says or, more weakly, is telling the truth.

Even at this level of specificity, though, the nature of trust remains fairly elusive. Does trusting involve (for example) merely optimism that the trustee will take care of things as entrusted, or does it also involve optimism that the trustee will do so compresently (that is, concurrently) with certain beliefs, non-doxastic attitudes, emotions or motivations on the part of the trustee, such as goodwill (Baier 1986; Jones 1996)? Moreover, and apart from such positive characterizations of trust, does trust also have a negative condition to the effect that one fails to genuinely trust another if one monitors the trustee past some threshold of vigilance (or otherwise reflects critically on the trust relationship so as to attempt to minimize risk)?

These are among the questions that occupy philosophers working on the nature of trust. This section explores four subthemes aimed at clarifying trust’s nature: (a) the distinction between trust and reliance; (b) two-place versus three-place trust; (c) doxastic versus non-doxastic conditions on trust; and (d) deception detection and monitoring.

a. Reliance vs. Interpersonal Trust

Reliance is ubiquitous. You rely on the weather not to turn 20 degrees colder without warning, leaving you shivering; you rely on your chair not to give out, causing you to tumble to the floor. In these cases, are you trusting the weather and trusting your chair, respectively? Many philosophers working on trust believe the correct answer here is “no”. This is so even though, in each case, you are depending on these things in a way that leaves you potentially vulnerable.

The idea that trust is a kind of dependence that does not reduce to mere reliance (of the sort that might be apposite to things like chairs and the weather) is widely accepted. According to Annette Baier (1986: 244) the crux of the difference is that trust involves relying on another not just to take care of things any old way (for instance, out of fear, begrudgingly, accidentally, and so forth) but rather that they do so out of goodwill toward the truster; relatedly, a salient kind of vulnerability one subjects oneself to in trusting is vulnerability to the limits of that goodwill. On this way of thinking, then, you are not trusting someone if you (for instance) rely on that person to act in a characteristically self-centred way, even if you depend on them to do so, and even if you fully expect them to do so.

Katherine Hawley (2014, 2019) rejects the idea that what distinguishes trust from mere reliance has anything to do with the trustee’s motives or goodwill. Instead, on her account, the crucial difference is that in cases of trust, but not of mere reliance, a commitment on the part of the trustee must be in place. Consider a situation in which you reliably bring too much lunch to work, because you are a bad judge of quantities, and I get to eat your leftovers. My attitude to you in this situation is one of reliance, but not trust; in Hawley’s view, that is because you have made no commitment to provide me with lunch:

However, if we adapt the case so as to suggest commitment, it starts to look more like a matter of trust. Suppose we enjoy eating together regularly, you describe your plans for the next day, I say how much I’m looking forward to it, and so on. To the extent that this involves a commitment on your part, it seems reasonable for me to feel betrayed and expect apologies if one day you fail to bring lunch and I go hungry (Hawley 2014: 10).

If it is right that trust differs in important ways from mere reliance, then a consequence is that while reliance is something we can have toward people (when we merely depend on them) as well as toward objects (for instance, when we depend on the weather and chairs), not just anything can be genuinely trusted. Karen Jones (1996) captures this point, one that circumscribes people as the fitting objects of genuine trust, as follows:

One can only trust things that have wills, since only things with wills can have goodwills—although having a will is to be given a generous interpretation so as to include, for example, firms and government bodies. Machinery can be relied on, but only agents, natural or artificial, can be trusted (1996: 14).

If, as the foregoing suggests, trust relationships are best understood as a special subset of reliance relationships, should we also expect the appropriate attitudes toward misplaced trust to be a subset of a more general attitude-type we might have in response to misplaced reliance?

Katherine Hawley (2014) thinks so. As she puts it, misplaced trust warrants a feeling of betrayal. But the same is not so for misplaced (mere) reliance. Suppose, to draw from an example she offers (2014: 2) that a shelf you rely on to support a vase gives out; it would be inappropriate, Hawley maintains, to feel betrayed, even if a more general attitude of (mere) disappointment befits such misplaced reliance. Misplaced trust, by contrast, befits a feeling of betrayal.

In contrast with the above thinking, according to which disanalogies between trust and mere reliance are taken to support distinguishing trust from reliance, some philosophers have taken a more permissive approach to trust, by distinguishing between two senses of trust that differ with respect to the similarity of each to mere reliance.

Paul Faulkner (2011: 246; compare McMyler 2011), for example, distinguishes between what he calls predictive and affective trust. Predictive trust involves merely reliance in conjunction with a belief that the trustee will take care of things (namely, a prediction). Misplaced predictions warrant disappointment, not betrayal, and so predictive trust (like mere reliance) cannot be betrayed. Affective trust, by contrast, is a thick, interpersonal normative notion, and, according to Faulkner, it involves, along with reliance, a kind of normative expectation to the effect that the trustee (i) ought to prove dependable; and that they (ii) will prove dependable for that reason. On this view, it is affective trust that is uniquely subject to betrayal, even though predictive trust, which is a genuine variety of trust, is not.

b. Two-place vs. Three-place Trust

The distinction between two-place and three-place trust, first drawn by Horsburgh (1960), is meant to capture a simple idea: sometimes when we trust someone, we trust them to do some particular thing (see also Holton 1994; Hardin 1992). For example, you might trust your neighbour to water your plant while you are away on holiday but not to look after your daughter. This is three-place trust, with an infinitival component (schematically: A trusts B to X). Not all trusting fits this schema. You might also simply trust your neighbour generally (schematically: A trusts B), without having any particular task in mind. Three- and two-place trust thus differ in that the object of trust is specified in the former case but not in the latter.

While there is nothing philosophically contentious about drawing this distinction, the relationship between two- and three-place trust becomes contested when one of these kinds of trust is taken to be, in some sense, more fundamental than the other. To be clear, it is uncontentious that philosophers, as Faulkner (2015: 242) notes, tend to “focus” on three-place trust. What is contentious is whether any—and if so, which—of these notions is theoretically more basic.

The overwhelming view in the literature maintains that three-place trust is the fundamental notion and that two-place (as well as one-place) trust are derivative upon three-place trust (Baier 1986; Holton 1994; Jones 1996; Faulkner 2007; Hieronymi 2008; Hawley 2014; compare Faulkner 2015). This view can be called three-place fundamentalism.

According to Baier, for instance, trust is centrally concerned with “one person trusting another with some valued thing” (1986: 236) and for Hawley, trust is “primarily a three-place relation, involving two people and a task” (2014: 2). We might think of two-place trust (X trusts Y) as derived from three-place trust (X trusts Y to phi) in a way that is broadly analogous to how one might extract a diachronic view of someone on the basis of discrete interactions, as opposed to starting with any such diachronic view. On this way of thinking, three-place trust leads to two-place trust over time, and the latter is established on the basis of the former.

Resistance to three-place fundamentalism has been advanced by Faulkner (2015) and Domenicucci and Holton (2017). Faulkner takes as a starting point Baier’s constraint that any plausible account of trust should accommodate infant trust, and thus, “that it not make essential to trusting the use of concepts or abilities which a child cannot be reasonably believed to possess” (Baier 1986: 244). As Faulkner (2015: 5) maintains, however, an infant, in trusting its mother “need not have any further thought; the trust is no more than a confidence or faith – a trust, as we say – in his mother”. And so, Faulkner reasons, if we take Baier’s constraint seriously, then we “have to take two-place trust as basic rather than three-place trust.”

A second strand of arguments against three-place fundamentalism is owed to Domenicucci and Holton (2017). According to them, the kind of derivation of two-place trust from three-place trust that is put forward by three-place fundamentalists is implausible for other similar kinds of attitudes like love and friendship:

No one—or at least, hardly anyone—thinks that we should understand what it is for Antony to love Cleopatra in terms of the three place relation ‘Antony loves Cleopatra for her __’, or in terms of any other three-place relation. Likewise hardly anyone thinks that we should understand the two place relation of friendship in terms of some underlying three-place relation […]. To this extent at least, we suggest that trust might be like love and friendship (2017: 149-50).

In response to this kind of argument by association, a proponent of three-place fundamentalism might either deny that these three- to two-place derivations are really problematic in the case of love or friendship, or instead grant that they are and maintain that trust is disanalogous.

In order to get a better sense of whether two-place trust might be unproblematically derived from three-place trust, regardless of whether the same holds mutatis mutandis for love and friendship, it will be helpful to look at a concrete attempt to do so. For example, according to Hawley (2014), three-place trust should be analyzed as: X relies on Y to phi because X believes Y has a commitment to phi. Two-place trust is then defined simply as “reliance on someone to fulfil whatever commitments she may have” (2014: 16). If something like Hawley’s reduction is unproblematic, then, as one line of response might go, this trumps whatever concerns one might have about the prospects of making analogous moves in the love and friendship cases.

c. Trust and Belief: Doxastic, Non-doxastic and Performance-theoretic Accounts

Where does belief fit in to an account of trust? In particular, what beliefs (if any) must a truster have about whether the trustee will prove trustworthy? Proponents of doxastic accounts (Adler 1994; Hieronymi 2008; Keren 2014; McMyler 2011) hold that trust involves a belief on the part of the truster. On the simpler, straightforward incarnation of this view, when A trusts B to do X, A believes that B will do X. Other theorists propose more sophisticated belief-based accounts: on Hawley’s (2019) account, for instance, to trust someone to do something is to believe that she has a commitment to doing it, and to rely upon her to meet that commitment. Conversely, to distrust someone to do something is to believe that she has a commitment to doing it, and yet not rely upon her to meet that commitment.
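The structure of Hawley's belief-based account can be laid out as a small classification, offered here only as an illustrative sketch. The function name `hawley_attitude` and the labels it returns are hypothetical conveniences, and the treatment of the case where no commitment is believed to exist follows the leftover-lunch example discussed earlier (reliance without trust).

```python
def hawley_attitude(believes_commitment: bool, relies: bool) -> str:
    """Classify an attitude toward someone doing X, on a simplified
    reading of Hawley's account (labels are illustrative assumptions)."""
    if believes_commitment and relies:
        # Belief in a commitment plus reliance on its being met
        return "trust"
    if believes_commitment and not relies:
        # Belief in a commitment, but no reliance on its being met
        return "distrust"
    if relies:
        # Reliance with no believed commitment, as in the leftover-lunch case
        return "mere reliance"
    return "neither trust nor distrust"


# The colleague who always brings too much lunch has made no commitment:
print(hawley_attitude(believes_commitment=False, relies=True))
```

On this rendering, trust and distrust share the belief in a commitment and differ only in whether the truster relies on it being met, which is why neither attitude is available toward someone who is taken to have made no commitment at all.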

Non-doxastic accounts (Jones 1996; McLeod 2002; Faulkner 2007, 2011; Baker 1987) have a negative and a positive thesis. The negative thesis is just the denial of the belief requirement on trust that proponents of doxastic accounts accept (namely, a denial that trusting someone to do something entails the corresponding belief that they will do that thing). This negative thesis, to note, is perfectly compatible with the idea that trust oftentimes involves such a belief. What is maintained is that it is not essential. The positive thesis embraced by non-doxastic accounts involves a characterization of some further non-doxastic attitude the truster, qua truster, must have with respect to the trustee’s proving trustworthy.

An example of such a further (non-doxastic) attitude, on non-doxastic accounts, is optimism. For example, on Jones’ (1996) view, you trust your neighbour to bring back the garden tools you loaned her only if you are optimistic that she will bring them back, and regardless of whether you believe she will. It should be pointed out that oftentimes, optimism will lead to the acquisition of a corresponding belief. Importantly for Jones, the kind of optimism that characterizes trust is not itself to be explained in terms of belief but rather in terms of affective attitudes entirely. Such a commitment is more generally shared by non-doxastic views which take trust to involve affective attitudes that might be apt to prompt corresponding beliefs.

Quite a few important debates about trust turn on the matter of whether a doxastic account or a non-doxastic account is correct. For example, discussions of the rationality of trust will look one way if trust essentially involves belief and another way if it does not (Jones 1996; Keren 2014). Relatedly, what one says about trust and belief will bear importantly on how one thinks about the relationship between trust and monitoring, as well as the distinction between paradigmatic trust and therapeutic trust (the kind of trust one engages in in order to build trustworthiness; see Horsburgh 1960; Hieronymi 2008; Frost-Arnold 2014).

A notable advantage of the doxastic account is that it simplifies the epistemology of trust—and in particular, how trust can provide reasons for belief. Suppose, for instance, that the doxastic account is correct, and so your trusting your colleague’s word that they will return your laptop involves believing that they will return your laptop. This belief, some think, conjoined with the fact that your colleague tells you they will return your laptop, gives you a reason to believe that they will return your laptop. As Faulkner (2017: 113) puts it, on the doxastic account, “[t]rust gives a reason for belief because belief can provide reason for belief”. Non-doxastic accounts, by contrast, require further explanation as to why trusting someone would ever give you a reason to believe what they say.

Another advantage of doxastic accounts is that they are well-positioned to distinguish trusting someone to do something and mere optimistic wishing. Suppose, for instance, you loan £100 to a loved one with a terrible track record for repaying debts. Such a person may have lost your trust years ago, and yet you may remain optimistic and wishful that they will be trustworthy on this occasion. What distinguishes this attitude from genuine trust on the doxastic account is simply that you lack any belief that your loved one will prove trustworthy. Explaining this difference is more difficult on non-doxastic accounts. This is especially the case on non-doxastic accounts according to which trust not only does not involve belief but positively precludes it, by essentially involving a kind of “leap of faith” (Möllering 2006) that differs in important ways from belief.

Nonetheless, non-doxastic accounts have been emboldened in light of various serious objections that have been raised to doxastic accounts. One often raised objection of this kind highlights a key disanalogy with respect to how trust and belief interact with evidence, respectively (Faulkner 2007):

[Trust] need not be based on evidence and can demonstrate a wilful insensitivity to the evidence. Indeed there is a tension between acting on trust and acting on evidence that is illustrated in the idea that one does not actually trust someone to do something if one only believes they will do it when one has evidence that they will (2007: 876).

As Baker (1987) unpacks this idea, trusting can require ignoring counterevidence—as one might do when one trusts a friend despite evidence of guilt—whereas believing does not.

A particular type of example that puts pressure on doxastic accounts’ ability to accommodate disanalogies with belief concerns therapeutic trust. In cases of therapeutic trust, the purpose of trusting is to promote trustworthiness, and the trust is thereby not predicated on a prior belief in the trustee’s trustworthiness. Take a case in which one trusts a teenager with an important task, in the hope that being trusted will lead them to become more trustworthy in the future. In this kind of case, we are plausibly trusting, but not on the basis of prior evidence or belief we have that the trustee will succeed on this occasion. To the contrary: we trust with the aim of establishing trustworthiness (Frost-Arnold 2014; Faulkner 2011). To the extent that such a description of this kind of case is right, therapeutic trust offers a counterexample to the doxastic account, as it involves trust in the absence of belief.

A third kind of account, the performance-theoretic account of trust (Carter 2020a, 2020c), makes no essential commitment as to whether trusting involves belief. On the performance-theoretic account, what is essential to the attitude of trusting is how it is normatively constrained. An attitude is a trust attitude (toward a trustee, T, and a task, X) just in case the attitude is successful if and only if T takes care of X as entrusted. Just as there is a sense in which, for example, your archery shot is not successful if it misses the target (see, for example, Sosa 2010a, 2015; Carter 2020b), your trusting someone to keep a secret misses its mark, and so fails to be successful trust, if the trustee spills the beans. With reference to this criterion of successful (and unsuccessful) trust, the performance-theoretic account aims to explain what good and bad trusting involves (see §2.a), and also why some varieties of trust are more valuable than others (see §3).

d. Deception Detection and Monitoring

Given that trusting inherently involves the incurring of some level of risk to the truster, it is natural to think that trust would in some way be improved by the truster doing what she can to minimize such risk, for example, by monitoring the trustee with an eye to pre-empting any potential betrayal or at least mitigating the expected disvalue of potential betrayal.

This prima facie plausible suggestion, however, raises some perplexities. As Annette Baier (1986) puts it: “Trust is a fragile plant […] which may not endure inspection of its roots, even when they were, before inspection, quite healthy” (1986: 260). There is something intuitive about this point. If, for instance, A trusts B to drive the car straight home after work—but then proceeds to surreptitiously drive behind B the entire way in order to make sure that B really does drive straight home, it seems that A in doing so is no longer trusting B. The trust, it seems, dissolves through the process of such monitoring.

Extrapolating from such cases, it seems that trust inherently involves not only subjecting oneself to some risk, but also remaining subjected to such risk—or, at least—behaving in ways that are compatible with one’s viewing oneself as (remaining to be) subjected to such risk.

The above idea of course needs sharpening. For example, trusting is plausibly not destroyed by negligible monitoring. The crux of the idea seems to be, as Faulkner (2011, §5) puts it, that “too much reflection” on the trust relation, perhaps in conjunction with making attempts to minimize risks that trust will be betrayed, can undermine trust. Specifying what “too much reflection” or monitoring involves, however, and how reflecting relates to monitoring to begin with, remains a difficult task.

One form of monitoring—construed loosely—that is plausibly compatible with trusting is contingency planning (Carter 2020c). For example, suppose you trust your teenager to drive your car to work and back in order that they may undertake a summer job. A prudent mitigation against the additional risk incurred (for instance, that the car will be wrecked in the process) will be to buy some additional insurance upon entrusting the teenager with the car. The purchasing of this insurance, however, does not itself undermine the trusting relationship, even though it involves a kind of risk mitigating behaviour.

One explanation here turns on the distinction between (i) mitigating against the risk that trust will be betrayed; and (ii) mitigating against the extent or severity of the harm or damage incurred if trust is betrayed. Contingency planning involves type-(ii) mitigation, whereas, for example, trailing behind the teenager with your own car, which is plausibly incompatible with trusting, is of type-(i).

2. The Normativity of Trust

Norms of trust arise between the two parties of reciprocal trust: a norm to be trusting in response to the invitation to trust, and to be trustworthy in response to the other’s trusting reliance (Fricker 2018). The former normativity lies “on the truster’s side”, and the latter on the trustee’s side. In this section, we discuss norms on trusting by looking at these two kinds of norms—that govern the truster and the trustee, respectively—in turn.

This section first discusses general norms on trusting on the truster’s side, and then engages—in some detail—with the complex issue of the norms governing trust in another’s words specifically. Second, it discusses the normativity of trust on the trustee’s side and the nature of trustworthiness.

a. Entitlement to Trust

If, as doxastic accounts maintain, trust is a species of belief (Hieronymi 2008), then the rational norms governing belief also govern trust, such that (for example) it will be irrational to trust someone whom you have strong evidence to be unreliable, and the norm violation here is the same kind of norm violation in play in a case where one simply believes, against the evidence, that an individual is trustworthy. Thus: to the extent that one is rationally entitled to believe the trustee is trustworthy, with respect to F, one thereby has an entitlement (on these kinds of views) to trust the trustee to F.

The norms that govern trust on the truster’s side will look different on non-doxastic accounts. For example, on a proposal like Frost-Arnold’s (2014), according to which trust is characterized as a kind of non-doxastic acceptance rather than as belief, the rationality governing trusting will be the rationality of acceptance, where rational acceptance can in principle come apart from rational belief. For one thing, whereas the rationality of belief is exclusively epistemically constrained, the rationality of acceptance need not be. In cases of therapeutic trust, for example, it might be practically rational (namely, rational with reference to the adopted end of building a trusting relationship) to accept that the trustee will F, and thus, to use the proposition that they will F as a premise in practical deliberation (see Bratman 1992; Cohen 1989)—that is, to act as if it is true that they will F. Of course, acting as if a proposition is true neither implies nor is implied by believing that it is true.

On performance-theoretic accounts, trusting is subject, on the truster’s side, to three kinds of evaluative norms, which correspond with three kinds of positive evaluative assessments: success, competence, and aptness. Whereas trusting is successful if and only if the trustee takes care of things as entrusted, trusting is competent if and only if one’s trusting issues from a reliable disposition (namely, a competence) to trust successfully when appropriately situated (for discussion, see Carter 2020a).

Just as successful trust might be incompetent, as when one trusts someone with a well-known track record of unreliability who happens to prove trustworthy on this particular occasion, likewise, trust might fail to be successful despite being competent, as when one trusts an ordinarily reliable individual who, due to fluke luck, fails to take care of things as entrusted on this particular occasion. Even if trust is both successful and competent, however, there remains a sense in which it could fall short of the third kind of evaluative standard, namely aptness. Aptness demands success because of competence, and not merely success and competence (see Sosa 2010a, 2015; Carter 2020a, 2020b). Trust is apt, accordingly, if and only if one trusts successfully such that the successful trust manifests one’s trusting competence.
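The way the three evaluative standards come apart can be made concrete with a minimal sketch. The class `TrustEpisode`, its field names, and the helper `is_apt` are hypothetical labels introduced purely for illustration; in particular, representing "success manifests competence" as a simple yes-or-no flag is a simplifying assumption, not part of the performance-theoretic account itself.

```python
from dataclasses import dataclass


@dataclass
class TrustEpisode:
    successful: bool                 # trustee took care of things as entrusted
    competent: bool                  # trust issued from a reliable trusting disposition
    success_from_competence: bool    # the success manifests that competence


def is_apt(episode: TrustEpisode) -> bool:
    """Apt trust requires success *because of* competence,
    not merely success together with competence."""
    return (episode.successful
            and episode.competent
            and episode.success_from_competence)


# Trusting a known-unreliable person who happens to come through:
lucky_success = TrustEpisode(successful=True, competent=False,
                             success_from_competence=False)
# Trusting a reliable person who fails through fluke luck:
unlucky_failure = TrustEpisode(successful=False, competent=True,
                               success_from_competence=False)
# Successful and competent, but the success is owed to luck:
coincidence = TrustEpisode(successful=True, competent=True,
                           success_from_competence=False)
# Success that manifests the truster's competence:
apt_trust = TrustEpisode(successful=True, competent=True,
                         success_from_competence=True)

for case in (lucky_success, unlucky_failure, coincidence, apt_trust):
    print(case, "apt:", is_apt(case))
```

Only the last case counts as apt. The third case is the instructive one: both success and competence are present, yet aptness is missing because the one does not explain the other.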

b. Trust in Words

Why not lie? (Or, more generally, why not promise to take care of things, and then renege on that promise whenever it is convenient to do so?) According to a fairly popular answer (Faulkner 2011; Simion 2020b), deception is bad not only for the deceived but also for the deceiver (see also Kant). If one cultivates a reputation as being untrustworthy, then this comes with practical costs in one’s community; the untrustworthy person, recognized as such, is outcast, and de facto forgoes the (otherwise possible) social benefits of trusting.

Things are more complicated, however, in one-off trust-exchanges—where the risk of the disvalue of cultivating an untrustworthy reputation is minimal. The question can be reposed within the one-off context: why not lie and deceive, when it is convenient to do so, in one-off exchanges? In one-off interactions where we (i) do not know others’ motivations but (ii) do appreciate that there is a general motivation to be unreliable (for example, to reap gains of betrayal), it is surprising that we find as much trustworthy behaviour as we do. Why do people not betray to a greater extent than they do in such circumstances, given that betrayal seems prima facie the most rational decision-theoretic move?

According to Faulkner, when we communicate with another as to the facts, we face a situation akin to a prisoner’s dilemma (2011: 6). In a prisoner’s dilemma, our aggregate well-being will be maximized if we both cooperate. However, given the logic of the situation, it looks like the rational thing to do for each of us is to defect. We are then faced with a problem: how to ensure the cooperative outcome?
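The logic of a prisoner's dilemma can be checked with a small worked example. The payoff numbers below are illustrative assumptions chosen only to have the standard prisoner's dilemma ordering; they are not drawn from Faulkner, and the names `PAYOFFS`, `best_reply`, and `total_welfare` are hypothetical.

```python
# Hypothetical payoffs as (row player, column player); higher is better.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
MOVES = ("cooperate", "defect")


def best_reply(other_move: str) -> str:
    """Return the row player's payoff-maximizing move against other_move."""
    return max(MOVES, key=lambda move: PAYOFFS[(move, other_move)][0])


def total_welfare(profile: tuple) -> int:
    """Sum of both players' payoffs for a pair of moves."""
    return sum(PAYOFFS[profile])


# Whatever the other player does, defecting pays more for each individual...
replies = {other: best_reply(other) for other in MOVES}
# ...yet aggregate well-being is highest when both cooperate.
best_joint = max(PAYOFFS, key=total_welfare)

print(replies)
print(best_joint, total_welfare(best_joint))
```

With these numbers, defection is the best reply to either move, while mutual cooperation yields the largest combined payoff. That gap between individual and aggregate rationality is what generates the problem of how to secure the cooperative outcome.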

Similarly, Faulkner argues, speakers and audiences have different interests in communication. The audience is interested in learning the truth. In contrast, engaging in conversations is to the advantage of speakers because it is a means of influencing others: through an audience’s acceptance of what we say, we can get an audience to think, feel, and act in specific ways. So, according to Faulkner, our interest, qua speakers, lies in being believed, because we have a more basic interest in influencing others. The commitment to telling the truth would not be best for the speaker. The best outcome for a speaker would be to receive an audience’s trust and yet have the liberty to tell the truth or not (2011: 5-6).

There are four main reactions to this problem in the literature in the epistemology of testimony. According to Reductionism (Adler 1994; Audi 1997, 2004, 2006; Faulkner 2011; Fricker 1994, 1995, 2017, 2018; Hume 1739; Lipton 1998; Lyons 1997), in virtue of this lack of alignment of hearer and speaker interests, one needs positive, independent reasons to trust their speaker: since communication is like a prisoner’s dilemma, the hearer needs a reason for thinking or presuming that the speaker has chosen the cooperative, helpful outcome. Anti-Reductionism (Burge 1993, 1997; Coady 1973, 1992; Goldberg 2006, 2010; Goldman 1999; Graham 2010, 2012a, 2015; Greco 2015, 2019; Green 2008; Reid 1764; Simion 2020b; Simion and Kelp 2018) rejects this claim. According to these philosophers, we have a default (absent defeaters) entitlement to believe what we are being told. In turn, this default entitlement is derivable on a priori grounds from the nature of reason (Burge 1993, 1997), sourced in social norms of truth-telling (Graham 2012b), social roles (Greco 2015), the reliance on other people’s justification-conferring processes (Goldberg 2010), or from the knowledge norm of assertion (Simion 2020b). Other than these two main views, we also encounter hybrid views (Lackey 2003, 2008; Pritchard 2004) that try to impose weaker conditions on testimonial justification than Reductionism, while, at the same time, not being as liberal about it as Anti-Reductionism. Last but not least, a fourth reaction to Faulkner’s problem of cooperation for testimonial exchanges is scepticism (Graham 2012a; Simion 2020b); on this view, the problem does not get off the ground to begin with.

According to Faulkner himself, trust lies at the heart of the solution to his problem of cooperation, that is, it gives speakers reasons to tell the truth (2011, Ch. 1; 2017). Faulkner thinks that the problem is resolved “once one recognizes how trust itself can give reasons for cooperating” (2017: 9). When the hearer H believes that the speaker S can see that H is relying on S for information about whether p, and in addition H trusts S for that information, then H will make a number of presumptions: 1. H believes that S recognizes H’s trusting dependence on S proving informative; 2. H presumes that if S recognizes H’s trusting dependence, then S will recognize that H normatively expects S to prove informative; 3. H presumes that if S recognizes H’s expectation that S should prove informative, then, other things being equal, S will prove informative for this reason; 4. Hence, taking the attitude of trust involves presuming that the trusted will prove trustworthy (2011: 130). The hearer’s presumption that the speaker will prove informative rationalizes the hearer’s uptake of the speaker testimony.

Furthermore, Faulkner claims, H’s trust gives S “a reason to be trustworthy”, such that S is, as a result, more likely to be trustworthy: it raises the objective probability that S will prove informative in utterance. In this fashion, “acts of trust can create as well as sustain trusting relations” (2011: 156-7). As Graham (2012a) puts it, “the hearer’s trust—the hearer’s normative expectation, which rationalizes uptake—then ‘engages,’ so to speak, the speaker’s internalization of the norm, which thereby motivates the speaker to choose the informative outcome.” Speakers who have internalized these norms will then often enough choose the informative outcome when they see that audiences need information; they will be “motivated to conform” because they have “internalized the norm” and so “intrinsically value” compliance (2011: 186). As such, the de facto reliability of testimony is explained by the fact that the trust placed in speakers by hearers triggers, on the speakers’ side, the internalization of social norms of trust, which, in turn, makes speakers objectively likely to put hearers’ informational interests before their own.

According to Peter Graham (2012a), however, Faulkner’s own solution threatens to dissolve the problem of cooperation rather than solve it. Recall how the problem was set up: the thought was that speakers only care about being believed, whether they are speaking the truth or not, which is why the hearer needs some reason for thinking the speaker is telling them the truth. But if speakers have internalized social norms of trustworthiness, it is not true that speakers are just as apt to prove uninformative as informative. It is not true that they are only interested in being believed. Rather, they are out to inform, to prove helpful; due to having internalized the relevant trustworthiness norms, speakers are committed to informative outcomes (Graham 2012a).

Another version of scepticism about the problem of cooperation is voiced in Simion’s “Testimonial Contractarianism” (2020b). Recall that, according to Faulkner, in testimonial exchanges, the default position for speakers involves no commitment to telling the truth. If that is the case, he argues, the default position for hearers involves no entitlement to believe. Here is the argument unpacked:

(P1) Hearers are interested in truth; speakers are interested in being believed.

(P2) The default position for speakers is seeing to their own interests rather than to the interests of the hearers.

(P3) Therefore, it is not the case that the default position for speakers is telling the truth (from 1 and 2).

(P4) The default position for hearers is trust only if the default position for speakers is telling the truth.

(C) Therefore, it is not the case that the default position for hearers is trust (from 3 and 4).

There is one important worry for this argument: on the reconstruction above, the conclusion does not follow. In particular, the problem is with premise (3), which is not supported by (1) and (2) (Simion 2020b). That is because being interested in being believed does not exclude also being interested in telling the truth. Speakers might still, by default, also be interested in telling the truth on independent grounds, that is, independently of their concern (or, rather, lack thereof) with hearers’ interests. Indeed, the sources of entitlement proposed by the Anti-Reductionist (for instance, the existence of social norms of truth-telling, the knowledge norm of assertion, and so forth) may well themselves constitute reasons for the speaker to tell the truth, absent overriding incentives to do otherwise. If that is the case, telling the truth will be the default for speakers and, therefore, trust will be the default for hearers. What the defender of the Problem of Cooperation needs, then, for validity, is to replace (P1) with the stronger (P1*): Hearers are interested in truth; speakers are only interested in being believed. However, it is not clear that (P1*) spells out the correct utility profile of the case: are all speakers really such that they only care about being believed? This seems like a fairly heavy empirical assumption that is in need of further defence.
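The validity point can be checked mechanically. The following is only a minimal sketch under a propositional formalization that is assumed here for illustration and is not drawn from Simion or Faulkner: the four atoms and the biconditional reading of (P2) are hypothetical choices, and the intermediate step (3) is folded into the search for countermodels. The sketch enumerates every truth-value assignment and collects those in which the premises hold while the conclusion (that trust is not the default for hearers) fails.

```python
from itertools import product

# Hypothetical atoms (labels are assumptions, not the source's notation):
#   believed:      speakers are interested in being believed
#   truth:         speakers are also interested in telling the truth
#   default_truth: the default position for speakers is telling the truth
#   default_trust: the default position for hearers is trust
ATOMS = ("believed", "truth", "default_truth", "default_trust")


def premises_hold(v, strong):
    # (P1): speakers are interested in being believed.
    # (P1*): ... and only in that, so not in telling the truth.
    p1 = v["believed"] and (not v["truth"] if strong else True)
    # (P2), as read here: speakers' default follows their own interests,
    # so truth-telling is the default exactly when they are interested in it.
    p2 = v["default_truth"] == v["truth"]
    # (P4): hearers' default is trust only if speakers' default is truth-telling.
    p4 = (not v["default_trust"]) or v["default_truth"]
    return p1 and p2 and p4


def countermodels(strong):
    """Return the valuations where the premises hold but the conclusion fails."""
    found = []
    for values in product([True, False], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, values))
        conclusion = not v["default_trust"]
        if premises_hold(v, strong) and not conclusion:
            found.append(v)
    return found


print(countermodels(strong=False))  # premises with (P1)
print(countermodels(strong=True))   # premises with (P1*)
```

On this reading, the only countermodel to the argument with (P1) is the valuation in which speakers are interested both in being believed and in telling the truth, which is the possibility the objection highlights. With (P1*) no countermodel remains, so the argument becomes valid at the cost of the stronger empirical premise.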

c. Obligation to Trust

A final normative species that merits discussion on the truster’s side is the obligation to trust. Obligations to trust can be generated, trivially, by promise-making (compare Owens 2017) or by other kinds of cooperative agreements (Faulkner 2011, Ch. 1). Of more philosophical interest are cases where obligations to trust are generated without explicit agreements.

One case of particular interest here arises in the literature on testimonial injustice, pioneered by Miranda Fricker (2007). Put roughly, testimonial injustice occurs when a speaker receives an unfair deficit of credibility from a hearer due to prejudice on the hearer’s part, resulting in the speaker’s being prevented from sharing what she knows.

An example of testimonial injustice that Fricker uses as a reference point is from Harper Lee’s To Kill a Mockingbird, where Tom Robinson, a black man on trial after being falsely accused of raping a white woman, has his testimony dismissed due to prejudiced preconceptions on the part of the jury, which owe to deep-seated racial stereotypes. In this case, the jury makes a deflated credibility judgement of Robinson, and as a result, he is unable to convey to them the knowledge that he has of the true events which occurred.

On one way of thinking about norms of trusting on the truster’s side, the members of the jury have mere entitlements to trust Robinson’s testimony though no obligation to do so; thus, their distrust of Robinson is not norm-violating. This gloss of the situation, on Fricker’s view, is incomplete; it fails to take into account the sense in which Robinson is wronged in his capacity as a knower as a result of this distrust. An appreciation of this wrong, according to Fricker, should lead us to think of the relevant norm on the hearer’s side as generating an obligation rather than a mere permission to believe; as such, on this view, distrust that arises from affording a speaker a prejudiced credibility deficit is not merely an instance of foregoing trusting when one is entitled to trust, but failing to trust when one should. For additional work discussing the relationship between trust and testimonial injustice see, for example, Origgi (2012); Medina (2011); Wanderer (2017); Carter and Meehan (2020).

Fricker’s ground-breaking proposal concerns cases when one is harmed in their capacity as a knower via distrust sourced in prejudice. That being said, several philosophers believe that the phenomenon generalizes beyond cases of distinctively prejudicial distrust; that is, that it lies in the nature and normativity of telling that we have a defeasible obligation to trust testifiers, and that failure to do so is impermissible, whether it is sourced in prejudice or not. Indeed, G. E. M. Anscombe (1979) and J. L. Austin (1946) famously believed that you can insult someone by refusing their testimony.

We can distinguish between three general accounts of what it is that hearers owe to speakers and why: presumption-based accounts, purport-based accounts, and function-based accounts. The key questions for all accounts are whether they successfully deliver an obligation to trust, what rationale they provide, and whether their rationale is ultimately satisfactory.

While there are differences in the details, the core idea behind presumption-based views (Gibbard 1990, Hinchman 2005, Moran 2006, Ridge 2014) is that when a speaker S tells a hearer H that p, say, S incurs certain responsibilities for the truth of p. Crucially, H, in virtue of recognising what S is doing, thereby acquires a reason for presuming S to be trustworthy in their assertion that p. But since anyone who is to be presumed trustworthy in asserting that p ought to be believed, we get exactly what we were looking for: an obligation to trust speakers alongside an answer to the rationale question.

Of course, the question remains whether the rationale provided is ultimately convincing. Sandy Goldberg (2020) argues that the answer is no. To see what he takes to be the most important reason for this, one should first look at a distinction Goldberg introduces between a practical entitlement to hold someone responsible and an epistemic entitlement to believe that they are responsible. Crucially, one can have the former without the latter. For instance, if your teenager tells you that they will be home by midnight and they are not, you will have a practical entitlement to hold them responsible even if you do not have an epistemic entitlement to believe that they are responsible. Importantly, to establish a presumption of trustworthiness, you need to make a case for an epistemic entitlement to believe. According to Goldberg, however, presumption-based accounts only deliver an entitlement to hold speakers responsible for their assertions, not an entitlement to believe that they are responsible. That is to say, when S tells H that p and thereby incurs certain responsibilities for the truth of p and when H recognises that this is what S is doing, H comes by an entitlement to hold S responsible for the truth of p. Crucially, to get to the presumption of trustworthiness we need more than this, as the case of the teenager clearly indicates. But presumption-based accounts do not offer more (Goldberg 2020, Ch. 4).

Another problem for these views is sourced in the fact that extant presumption-based accounts are distinctively personal: all accounts share the idea that in telling an addressee that p, speakers perform a further operation on them and that it is this further operation that generates the obligation on the addressee’s side. In virtue of this, presumption-based accounts deliver too limited a presumption of trustworthiness. To see this, we should go back to Fricker’s cases of epistemic injustice: it looks as though not believing what a testifier says in virtue of prejudice is equally bad, whether one is the addressee of the instance of telling in question or merely overhears it (Goldberg 2020).

Goldberg’s own alternative proposal is purport-based: according to him, assertion has a job description, which is to present a content as true in such a way that, were the audience to accept it on the basis of accepting the speaker’s speech contribution, the resulting belief would be a candidate for knowledge (Goldberg 2020, Ch. 5). Since assertion has this job description, when speakers make assertions, they purport to achieve exactly what the job description says. Moreover, it is common knowledge that this is what speakers purport to do. But since assertion will achieve its job description only if the speaker meets certain epistemic standards and since this is also common knowledge, the audience will recognise that the performed speech act achieves its aim only if the relevant epistemic standards are met. Finally, this exerts normative pressure on hearers. To be more precise, hearers owe it to speakers to recognize them as agents who purport to be in compliance with the epistemic standards at work and to treat them accordingly.

According to Goldberg, our obligation toward speakers is weaker than presumption-based accounts would have it: in the typical case of testimony, what we owe to the speakers is not to outright believe them, but rather to properly assess their speech act epistemically. The reason for this, Goldberg argues, is that we do not have access to their evidence, or their deliberations; given that this is so, the best we can do is to adjust our doxastic reaction to “a proper (epistemic) assessment of the speaker’s epistemic authority, since in doing so they are adjusting their doxastic reaction to a proper (epistemic) assessment of the act in which she conveyed having such authority” (Goldberg 2020, Ch. 5).

As a first observation, note that Goldberg’s purport-based account deals better with cases of testimonial injustice than presumption-based accounts. After all, since the normative pressure is generated by the fact that it is common knowledge that in asserting speakers represent themselves as meeting the relevant epistemic standards, the normative pressure is on anyone who happens to listen in on the conversation, not just on the direct addressees of the speech act.

With this point in play, let us return to Goldberg’s argument that there is no obligation to believe. According to Goldberg, this is because hearers do not have access to speakers’ reasons and their deliberations. One question is why exactly this should matter. After all, one might argue, the fact that the speaker asserted that p provides hearers with sufficient reason to believe that p (absent defeat, of course). That the assertion does not also give hearers access to the speakers’ own reasons and deliberations does nothing to detract from this, unless one endorses dramatically strong versions of reductionism about testimony (which Goldberg himself would not want to endorse). If so, the fact that assertions do not afford hearers access to speakers’ reasons and deliberations provides little reason to believe that there is no obligation to believe on the part of the hearer (Kelp & Simion 2020a).

An alternative way to ground an obligation to trust testimony (Kelp & Simion 2020a) relies on the plausible idea that the speech act of assertion has the epistemic function to generate true belief (Graham 2010), or knowledge (Kelp 2018; Kelp & Simion 2020a; Simion 2020a). According to this view, belief-responses on behalf of hearers contribute to the explanation of the continuous existence of the practice of asserting: were hearers to stop believing what they are being told, speakers would lose incentive to assert, and the practice would soon disappear. Since this is so, and since hearers are plausibly criticism-averse, it makes sense to have a norm that imposes an obligation on the hearers to believe what they are being told (absent defeat). In this way, in virtue of their criticism-aversion, hearers will reliably obey the norm (that is, will reliably form the corresponding beliefs), which, in turn, will keep the practice of assertion going (Kelp & Simion 2020a, Ch. 6).

One potential worry for this view is that it does not deliver the “normative oomph” that we want from a satisfactory account of the hearer’s obligation to trust: think of paradigm cases of epistemic injustice again. The hearers in these cases seem to fail in substantive moral and epistemic ways. However, on the function-based view, their failure is restricted to breaking a norm internal to the practice of assertion. Since norms internal to practices need not deliver substantive oughts outside of the practice itself (think, for instance, of rules of games), the function-based view still owes us an account of the normative strength of the “ought to believe” that drops out of its picture.

d. Trustworthiness

As the previous sections of this article show, trust can be a two-place or a three-place relation. In the former case, it is a relation between a trustor and a trustee, as in “Ann trusts George”. Two-place trust seems to be a fairly demanding affair: when we say that Ann trusts George simpliciter, we seem to attribute a fairly robust attitude to Ann, one whereby she trusts him in (at least) several respects. In contrast, three-place trust is a less involved affair: when we say that Ann trusts George to do the dishes, we need not say much about their relationship otherwise.

This contrast is preserved when we switch from focusing on the trustor’s trust to the trustee’s trustworthiness. That is, one can be trustworthy simpliciter (corresponding to a two-place trust relation), but one can also be trustworthy with regard to a particular matter, that is, exhibit two-place trustworthiness (Jones 1996), which corresponds to three-place trust. For instance, a surgeon might well be extremely trustworthy when it comes to performing surgery well, but not in any other respects.

Some philosophers working on trustworthiness focus more on two-place trust. As such, since the two-place trust relation is intuitively a more robust one, they put forward accounts of trustworthiness that are generally quite demanding, in that they require the trustee to be reliably making good on their commitments, but also to do so out of the right motive.

The classic account of this kind is Annette Baier’s (1986) goodwill-based account; in a similar vein, others combine reliance on goodwill with certain expectations (Jones 1996) including in one case a normative expectation of goodwill (Cogley 2012). According to this kind of view, the trustworthy person fulfils their commitments in virtue of their goodwill toward the trustor. This view, according to Baier, makes sense of the intuition that there is a difference between trustworthiness and mere reliability, one that corresponds to the difference between trust and mere reliance.

The most widespread worry about these accounts of trustworthiness is that they are too strong: we can trust other people without presuming that they have goodwill. Indeed, our everyday trust in strangers falls into this category. If so, the argument goes, this seems to suggest that whether people are making good on their commitments out of goodwill is largely inconsequential: “[w]e are often content to trust without knowing much about the psychology of the one-trusted, supposing merely that they have psychological traits sufficient to get the job done” (Blackburn 1998).

Another worry for these accounts is that, while plausible as accounts of trustworthiness simpliciter, they give counterintuitive results in cases of two-place trustworthiness: indeed, whether George is trustworthy when it comes to washing the dishes or not seems not to depend on his goodwill, nor on other such noble motives. The goodwill view is too strong.

Furthermore, it looks as though there is a reason to believe the goodwill view is, at the same time, too weak. To see this, consider the case of a convicted felon and his mother: it looks as though they can have a goodwill-based relationship, and thus the felon can be trustworthy within the scope thereof, while, at the same time, not being someone whom we would describe as trustworthy (Potter 2002: 8).

If all of this is true, it begins to look as though the presence of goodwill is independent of the presence of trustworthiness. This observation motivates accounts of trustworthiness that rely on less highbrow motives underlying the trustee’s reliability. One such account is the social contract view of trustworthiness. According to this view, the motives underlying people’s making good on their commitments are sourced in social norms and the unfortunate consequences to one’s reputation and general wellbeing of breaking them (Hardin 2002: 53; see also O’Neill 2002; Dasgupta 2000). Self-interest determines trustworthiness on these accounts.

It is easy to see that social contract views do well in accounting for trustworthiness in three-place trust relations. On this view, George is trustworthy when it comes to washing the dishes because he makes good on his commitments in virtue of social norms making it such that it is in his best interest to do so. The main worry for these views, however, is that they will be too permissive, and thus have difficulties in distinguishing between trustworthiness proper and mere reliability. Relatedly, the worry goes, these views seem less well equipped to deal with trustworthiness simpliciter, that is, the kind of trustworthiness that corresponds to a two-place trust relation. For instance, on a social contract view, it would seem that a sexist employer who treats female employees well only because he believes that he would face legal sanctions if he did not, will come out as trustworthy (Potter 2002: 5). This is intuitively an unfortunate result.

One thought that gets prompted by the case of the sexist employer is that trustworthiness is a character trait that virtuous people possess; after all, this seems to be something that the sexist employer is missing. On Nancy Potter’s view, trustworthiness is a disposition to respond to trust in appropriate ways, given “who one is in relation [to]” and given other virtues that one possesses or ought to possess (for example, justice, compassion) (2002: 25). According to Potter, a trustworthy person is “one who can be counted on, as a matter of the sort of person he or she is, to take care of those things that others entrust to one.”

When it comes to demandingness, the virtue-based view seems to lie somewhere in-between the goodwill view, on one hand, and the social contract view, on the other. It seems more permissive than the former in that it can account for the trustworthiness of strangers insofar as they display the virtue at stake. It seems more demanding than the latter in that it purports to account for the intuition that mere reliability is not enough for trustworthiness: rather, what is required is reliability sourced in good character.

An important criticism of virtue-based views comes from Jones (2012). According to her, trustworthiness does not fit the normative profile of virtue in the following way: if trustworthiness were a virtue, then being untrustworthy would be a vice. However, according to Jones, that cannot be right: after all, we are often required to be untrustworthy in one respect or another (for instance, because of conflicting normative constraints), but it cannot be that being vicious is ever required.

Another problem for Potter’s specific view is its apparent uninformativeness. First, defining the trustworthy person as “a person who can be counted on as a matter of the sort of person he or she is” threatens vicious circularity: after all, it defines the trustworthy as those that can be trusted. Relatedly, the account turns out to be too vague to give definite predictions in a series of cases. Take again the case of the sexist employer: why is it that he cannot be “counted on, as a matter of the sort of person he is, to take care of those things that others entrust to one” in his relationship with his female employees? After all, in virtue of the sort of person he is (that is, the sort of person who cares about not suffering the social consequences of mistreating his employees), he can be counted on to treat them well. If that is so, Potter’s view will not do much better than social contract views when it comes to distinguishing trustworthiness from mere reliability.

Several philosophers propose less demanding accounts of trustworthiness. Katherine Hawley’s (2019) view falls squarely within this camp. According to her, trustworthiness is a matter of avoiding unfulfilled commitments, which requires both caution in incurring new commitments and diligence in fulfilling existing commitments. Crucially, on this view, one can be trustworthy regardless of one’s motives for fulfilling one’s commitments. Hawley’s is a negative account of trustworthiness, which means that one can be trustworthy while avoiding commitments as far as possible. Untrustworthiness can arise from insincerity or bad intentions, but it can also arise from enthusiasm and becoming over-committed. A trustworthy person must not allow her commitments to outstrip her competence.

One natural question that arises for this view is: what about commitments that we do not take on but should? Am I a trustworthy friend if I never take on any commitments toward my friends? According to Hawley, in practice, through friendship, work and other social engagements we take on meta-commitments, that is, commitments to incur future commitments. These can make it a matter of trustworthiness to take on certain new commitments.

Another view in a similar, externalist vein, is developed by Kelp and Simion (2020b). According to them, trustworthiness is a disposition to fulfil one’s obligations. What drives the view is the thought that one can fail to fulfil one’s commitments in virtue of being in a bad environment—an environment that “masks” the normative disposition in question—while, at the same time, remaining a trustworthy person. Again, on this view as well, whether the disposition in question is there in virtue of good will or not is inconsequential. That being said, their view can accommodate the thought that people who comply with a particular norm for the wrong reason are less trustworthy than their good-willing counterparts. To see how, take the sexist employer again: insofar as it is plausible that there are norms against sexism, as well as norms against mistreating one’s female employees, the sexist employer fulfils the obligations generated by the latter but not by the former. In this, he is trustworthy when it comes to treating his employees well, but not trustworthy when it comes to treating them well for the right reason.

Another advantage of this view is that it explains the intuitive difference in robustness between two-place trustworthiness and trustworthiness simpliciter. According to this account, one is trustworthy simpliciter when one meets a contextually-variant threshold of two-place trustworthiness for contextually-salient obligations. For instance, a philosophy professor is trustworthy simpliciter in the philosophy department just in case she has a disposition to meet enough of her contextually salient obligations: do her research and teaching, not be late for meetings, answer emails promptly, help students with their essays and so forth. Plausibly, some of these contextually salient obligations will include doing these things for the right reasons. If so, the view is able to account for the fact that trustworthiness simpliciter is more demanding than two-place trustworthiness.
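The threshold idea can be illustrated with a toy model. This is only a sketch under loud assumptions: the source does not say how the threshold is measured, so the simple proportion used below, the obligation labels, the threshold values, and the function name `trustworthy_simpliciter` are all hypothetical and chosen for illustration.

```python
def trustworthy_simpliciter(dispositions, salient_obligations, threshold):
    """Toy reading of the threshold view (an assumption, not the authors' formula).

    dispositions maps an obligation to True when the agent is disposed to
    fulfil it. The agent counts as trustworthy simpliciter in a context when
    the share of contextually salient obligations she is disposed to meet
    reaches the context's threshold.
    """
    if not salient_obligations:
        raise ValueError("a context needs at least one salient obligation")
    met = sum(1 for o in salient_obligations if dispositions.get(o, False))
    return met / len(salient_obligations) >= threshold


# Hypothetical professor, loosely following the example in the text.
professor = {
    "research": True,
    "teaching": True,
    "punctual at meetings": True,
    "answers emails promptly": False,
    "helps students with essays": True,
}
department_obligations = list(professor)

# The same dispositions can pass in one context and fail in a stricter one.
lenient = trustworthy_simpliciter(professor, department_obligations, 0.75)
strict = trustworthy_simpliciter(professor, department_obligations, 0.9)
print(lenient, strict)
```

The design choice worth noting is that both the list of salient obligations and the threshold are parameters rather than constants, which mirrors the claim that each varies with context. A fuller model might weight obligations differently or include obligations to act for the right reasons, neither of which is attempted here.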

3. The Value of Trust

Trust is valuable. Without it, we face not only cooperation problems, but we also incur substantial risks to our well-being—namely, those ubiquitous risks to life that characterize—at the limit case—the Hobbesian (1651/1970) “state of nature”. Accordingly, one very general argument for the value of trust appeals to the disutility of its absence (see also Alfano 2020).

Moreover, apart from merely serving as an enabling condition for other valuable things (like the possibility of large-scale collective projects for societal benefit), trust is also instrumentally valuable for both the truster and the trustee as a way of resolving particular (including one-off) cooperation problems in such a way as to facilitate mutual profit (see §2). Furthermore, trust is instrumentally valuable as a way of building trusting relationships (Solomon and Flores 2003). For example, trust can effectively be used, as when one trusts a teenager with a car to help cultivate a trust relationship, in order to make more likely the attainment of the benefits of trust (for both the truster and the trustee) further down the road (Horsburgh 1960; Jones 2004; Frost-Arnold 2014; see also the discussion of therapeutic trust above).

Apart from trust’s uncontroversial instrumental value (for helpful discussion, see O’Neill 2002), some philosophers believe that trust has final value. Something X is instrumentally valuable, with respect to an end, Y, insofar as it is valuable as a means to Y; instrumental value can be contrasted with final value. Something is finally valuable iff it is valuable for its own sake. A paradigmatic example of something instrumentally valuable is money, which we value because of its usefulness in getting other things; an example of something (arguably) finally valuable is happiness.

One way to defend the view that trust can be finally valuable, and not merely instrumentally valuable, is to supplement the performance-theoretic view of trust (see §1.c and §2.a) with some additional (albeit somewhat contentious) axiological premises as follows:

(P1) Apt trust is successful trust that is because of trust-relevant competence. (from the performance-theoretic view of trust)

(P2) Something is an achievement if and only if it is a success because of competence. (Premise)

(C1) So, apt trust is an achievement. (from P1 and P2)

(P3) Achievements are finally valuable. (Premise)

(C2) So, apt trust has final value. (from C1 and P3)

Premise (2) of the argument is mostly uncontentious, and is taken for granted widely in contemporary virtue epistemology (for instance, Greco 2009, 2010; Haddock, Millar, and Pritchard 2009; Sosa 2010b) and elsewhere (Feinberg 1970; Bradford 2013, 2015).

Premise (3), however, is where the action lies. Even if apt trust is an achievement, given that it involves a kind of success because of ability (that is, trust-relevant competences), we would need some positive reason to connect the “success because of ability” structure with final value if we are to accept (P3).

A strong line here defends (3) by maintaining that all achievements (including evil achievements and “trivial” achievements) are finally valuable, because successes because of ability (no matter what the success, no matter what the ability used) have a value that is not reducible to just the value of the success.

This kind of argument faces some well-worn objections (for some helpful discussions, see Kelp and Simion 2016; Dutant 2013; Goldman and Olsson 2009; Sylvan 2017). A more nuanced line of argument for C2 will weaken (3) so that it says, instead, that (3*) some achievements are finally valuable. But with this weaker premise in play, (3*) and (C1) no longer entail C2; what would be needed—and this remains an open problem for work on the axiology of trust—is a further premise to the effect that the kind of achievement that features in apt trust, specifically, is among the finally valuable rather than non-finally valuable achievements. And a defence of such a further premise, of course, will turn on further considerations about (among other things) the value of successful and competent trust, perhaps also in the context of wider communities of trust.
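The logical gap left by the weaker premise can be shown with a small model search. The following is a minimal sketch, assuming a toy domain of just two items (apt trust and one hypothetical other thing), each of which may or may not be an achievement and may or may not be finally valuable. The function names are illustrative only.

```python
from itertools import product


def all_achievements_valuable(items):
    # Strong premise (3): every achievement is finally valuable.
    return all(valuable for achievement, valuable in items if achievement)


def some_achievement_valuable(items):
    # Weakened premise (3*): at least one achievement is finally valuable.
    return any(valuable for achievement, valuable in items if achievement)


def entails_c2(premise_3):
    """Check whether (C1) plus the given premise guarantees (C2) in every toy model."""
    for trust_ach, trust_val, other_ach, other_val in product([True, False], repeat=4):
        items = [(trust_ach, trust_val), (other_ach, other_val)]
        c1 = trust_ach  # apt trust is an achievement
        c2 = trust_val  # apt trust has final value
        if c1 and premise_3(items) and not c2:
            return False  # a countermodel: premises true, conclusion false
    return True


print(entails_c2(all_achievements_valuable))
print(entails_c2(some_achievement_valuable))
```

With the strong premise no countermodel exists in this domain, whereas with the weakened premise one does: apt trust is an achievement without final value while the other item is a finally valuable achievement. This is why a further premise locating apt trust among the finally valuable achievements is needed.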

4. References and Further Reading

Adler, Jonathan E. 1994. ‘Testimony, Trust, Knowing’. The Journal of Philosophy 91 (5): 264–275.
Alfano, Mark. 2020. ‘The Topology of Communities of Trust’. Russian Sociological Review 15 (4): 30–56. https://doi.org/10.17323/1728-192X-2016-4-30-56.
Anscombe, G. E. M. 1979. ‘What Is It to Believe Someone?’ In Rationality and Religious Belief, edited by C. F. Delaney, 141–151. South Bend: University of Notre Dame Press.
Audi, Robert. 1997. ‘The Place of Testimony in the Fabric of Knowledge and Justification’. American Philosophical Quarterly 34 (4): 405–422.
Audi, Robert. 2004. ‘The a Priori Authority of Testimony’. Philosophical Issues 14: 18–34.
Audi, Robert. 2006. ‘Testimony, Credulity, and Veracity’. In The Epistemology of Testimony, edited by Jennifer Lackey and Ernest Sosa, 25–49. Oxford University Press.
Austin, J. L. 1946. ‘Other Minds’. Proceedings of the Aristotelian Society Supplement 20: 148–187.
Baier, Annette. 1986. ‘Trust and Antitrust’. Ethics 96 (2): 231–260. https://doi.org/10.1086/292745.
Baker, Judith. 1987. ‘Trust and Rationality’. Pacific Philosophical Quarterly 68 (1): 1–13. https://doi.org/10.1111/j.1468-0114.1987.tb00280.x.
Blackburn, Simon. 1998. Ruling Passions: A Theory of Practical Reasoning. Oxford University Press UK.
Bond Jr, Charles F., and Bella M. DePaulo. 2006. ‘Accuracy of Deception Judgments’. Personality and Social Psychology Review 10 (3): 214–234.
Bradford, Gwen. 2013. ‘The Value of Achievements’. Pacific Philosophical Quarterly 94 (2): 204–224.
Bradford, Gwen. 2015. Achievement. Oxford University Press.
Bratman, M. 1992. ‘Practical Reasoning and Acceptance in a Context’. Mind 101 (401): 1–16.
Burge, Tyler. 1993. ‘Content Preservation’. Philosophical Review 102 (4): 457–488.
Burge, Tyler. 1997. ‘Interlocution, Perception, and Memory’. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition 86 (1): 21–47.
Carter, J. Adam. 2020a. ‘On Behalf of a Bi-Level Account of Trust’. Philosophical Studies 177: 2299–2322.
Carter, J. Adam. 2020b. ‘De Minimis Normativism: A New Theory of Full Aptness’. Philosophical Quarterly.
Carter, J. Adam. 2020c. ‘Therapeutic Trust’. Manuscript.
Carter, J. Adam, and Daniella Meehan. 2020. ‘Trust, Distrust, and Epistemic Injustice’. Educational Philosophy and Theory.
Coady, C. A. J. 1973. ‘Testimony and Observation’. American Philosophical Quarterly 10 (2): 149–55.
Coady, C. A. J. 1992. Testimony: A Philosophical Study. Oxford University Press.
Cogley, Zac. 2012. ‘Trust and the Trickster Problem’. Analytic Philosophy 53 (1): 30–47. https://doi.org/10.1111/j.2153-960X.2012.00546.x.
Cohen, L. Jonathan. 1989. ‘Belief and Acceptance’. Mind 98 (391): 367–389.
Dasgupta, Partha. 2000. ‘Trust as a Commodity’. Trust: Making and Breaking Cooperative Relations 4: 49–72.
deTurck, Mark A., Janet J. Harszlak, Darlene J. Bodhorn, and Lynne A. Texter. 1990. ‘The Effects of Training Social Perceivers to Detect Deception from Behavioral Cues’. Communication Quarterly 38 (2): 189–199.
Domenicucci, Jacopo, and Richard Holton. 2017. ‘Trust as a Two-Place Relation’. The Philosophy of Trust, 149–160.
Dutant, Julien. 2013. ‘In Defence of Swamping’. Thought: A Journal of Philosophy 2 (4): 357–366.
Faulkner, Paul. 2007. ‘A Genealogy of Trust’. Episteme 4 (3): 305–321. https://doi.org/10.3366/E174236000700010X.
Faulkner, Paul. 2011. Knowledge on Trust. Oxford: Oxford University Press.
Faulkner, Paul. 2015. ‘The Attitude of Trust Is Basic’. Analysis 75 (3): 424–429.
Faulkner, Paul. 2017. ‘The Problem of Trust’. The Philosophy of Trust, 109–28.
Feinberg, Joel. 1970. Doing and Deserving; Essays in the Theory of Responsibility. Princeton: Princeton University Press.
Fricker, Elizabeth. 1994. ‘Against Gullibility’. In Knowing from Words, 125–161. Springer.
Fricker, Elizabeth. 1995. ‘Critical Notice’. Mind 104 (414): 393–411.
Fricker, Elizabeth. 2017. ‘Inference to the Best Explanation and the Receipt of Testimony: Testimonial Reductionism Vindicated’. Best Explanations: New Essays on Inference to the Best Explanation, 262–94.
Fricker, Elizabeth. 2018. Trust and Testimonial Justification.
Fricker, Miranda. 2007. Epistemic Injustice: Power and the Ethics of Knowing. Oxford University Press.
Frost-Arnold, Karen. 2014. ‘The Cognitive Attitude of Rational Trust’. Synthese 191 (9): 1957–1974.
Gibbard, Allan. 1990. Wise Choices, Apt Feelings: A Theory of Normative Judgment. Cambridge, MA: Harvard University Press.
Goldberg, Sanford C. 2006. ‘Reductionism and the Distinctiveness of Testimonial Knowledge’. The Epistemology of Testimony, 127–44.
Goldberg, Sanford C. 2010. Relying on Others: An Essay in Epistemology. Oxford University Press.
Goldberg, Sanford C. 2020. Conversational Pressure. Oxford University Press.
Goldman, Alvin I. 1999. Knowledge in a Social World. Oxford University Press.
Goldman, Alvin, and Erik J. Olsson. 2009. ‘Reliabilism and the Value of Knowledge’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 19–41. Oxford University Press.
Graham, Peter J. 2010. ‘Testimonial Entitlement and the Function of Comprehension’. In Social Epistemology, edited by Duncan Pritchard, Alan Millar, and Adrian Haddock, 148–74. Oxford University Press.
Graham, Peter J. 2012a. ‘Testimony, Trust, and Social Norms’. Abstracta 6 (3): 92–116.
Graham, Peter J. 2012b. ‘Epistemic Entitlement’. Noûs 46 (3): 449–82. https://doi.org/10.1111/j.1468-0068.2010.00815.x.
Graham, Peter J. 2015. ‘Epistemic Normativity and Social Norms’. In Epistemic Evaluation: Purposeful Epistemology, edited by David Henderson and John Greco, 247–273. Oxford University Press.
Greco, John. 2009. ‘The Value Problem’. In Epistemic Value, edited by Adrian Haddock, Alan Millar, and Duncan Pritchard, 313–22. Oxford: Oxford University Press.
Greco, John. 2010. Achieving Knowledge: A Virtue-Theoretic Account of Epistemic Normativity. Cambridge University Press.
Greco, John. 2015. ‘Testimonial Knowledge’. Epistemic Evaluation: Purposeful Epistemology, 274–290.
Greco, John. 2019. ‘The Transmission of Knowledge and Garbage’. Synthese 197: 1–12.
Green, Christopher R. 2008. ‘Epistemology of Testimony’. Internet Encyclopedia of Philosophy, 1–42. https://iep.utm.edu/ep-testi/
Haddock, Adrian, Alan Millar, and Duncan Pritchard, eds. 2009. Epistemic Value. Oxford University Press.
Hardin, Russell. 1992. ‘The Street-Level Epistemology of Trust’. Analyse & Kritik 14 (2): 152–176.
Hardin, Russell. 2002. Trust and Trustworthiness. Russell Sage Foundation.
Hawley, Katherine. 2014. ‘Trust, Distrust and Commitment’. Noûs 48 (1): 1–20.
Hawley, Katherine. 2019. How to Be Trustworthy. Oxford University Press, USA.
Hieronymi, Pamela. 2008. ‘The Reasons of Trust’. Australasian Journal of Philosophy 86 (2): 213–36. https://doi.org/10.1080/00048400801886496.
Hinchman, Edward. 2005. ‘Telling as Inviting to Trust’. Philosophy and Phenomenological Research 70: 562–87.
Hobbes, Thomas. 1970. ‘Leviathan (1651)’. Glasgow.
Holton, Richard. 1994. ‘Deciding to Trust, Coming to Believe’. Australasian Journal of Philosophy 72 (1): 63–76. https://doi.org/10.1080/00048409412345881.
Horsburgh, H. J. N. 1960. ‘The Ethics of Trust’. The Philosophical Quarterly (1950-) 10 (41): 343–54. https://doi.org/10.2307/2216409.
Hume, David. 2000 (1739). A Treatise of Human Nature. Oxford University Press.
Jones, Karen. 1996. ‘Trust as an Affective Attitude’. Ethics 107 (1): 4–25.
Jones, Karen. 2004. ‘Trust and Terror’. In Moral Psychology: Feminist Ethics and Social Theory, edited by Peggy DesAutels and Margaret Urban Walker, 3–18. Rowman & Littlefield.
Jones, Karen. 2012. ‘Trustworthiness’. Ethics 123 (1): 61–85.
Kelp, Christoph. 2018. ‘Assertion: A Function First Account’. Noûs 52: 411–42.
Kelp, Christoph, and Mona Simion. 2016. ‘The Tertiary Value Problem and the Superiority of Knowledge’. American Philosophical Quarterly 53 (4): 397–411.
Kelp, Christoph, and Mona Simion. 2020a. Knowledge Sharing: A Functionalist Account of Assertion. Manuscript.
Kelp, Christoph, and Mona Simion. 2020b. ‘What Is Trustworthiness?’ Manuscript.
Keren, Arnon. 2014. ‘Trust and Belief: A Preemptive Reasons Account’. Synthese 191 (12): 2593–2615.
Kraut, Robert. 1980. ‘Humans as Lie Detectors’. Journal of Communication 30 (4): 209–218.
Lackey, Jennifer. 2003. ‘A Minimal Expression of Non–Reductionism in the Epistemology of Testimony’. Noûs 37 (4): 706–723.
Lackey, Jennifer. 2008. Learning from Words: Testimony as a Source of Knowledge. Oxford University Press.
Lipton, Peter. 1998. ‘The Epistemology of Testimony’. Studies in History and Philosophy of Science Part A 29 (1): 1–31.
Lyons, Jack. 1997. ‘Testimony, Induction and Folk Psychology’. Australasian Journal of Philosophy 75 (2): 163–178.
McLeod, Carolyn. 2002. Self-Trust and Reproductive Autonomy. MIT Press.
McMyler, Benjamin. 2011. Testimony, Trust, and Authority. Oxford University Press USA.
Medina, José. 2011. ‘The Relevance of Credibility Excess in a Proportional View of Epistemic Injustice: Differential Epistemic Authority and the Social Imaginary’. Social Epistemology 25 (1): 15–35.
Medina, José. 2013. The Epistemology of Resistance: Gender and Racial Oppression, Epistemic Injustice, and the Social Imagination. Oxford University Press.
Möllering, Guido. 2006. Trust: Reason, Routine, Reflexivity. Elsevier.
Moran, Richard. 2006. ‘Getting Told and Being Believed’. In Jennifer Lackey and Ernest Sosa (eds.), The Epistemology of Testimony. Oxford: Oxford University Press.
O’Neill, Onora. 2002. Autonomy and Trust in Bioethics. Cambridge University Press.
Origgi, Gloria. 2012. ‘Epistemic Injustice and Epistemic Trust’. Social Epistemology 26 (2): 221–235.
Owens, David. 2017. ‘Trusting a Promise and Other Things’. In Paul Faulkner and Thomas Simpson (eds.), New Philosophical Perspectives on Trust, 214–29. Oxford University Press.
Piovarchy, Adam. 2020. ‘Responsibility for Testimonial Injustice’. Philosophical Studies, 1–19. https://doi.org/10.1007/s11098-020-01447-6
Pohlhaus Jr, Gaile. 2014. ‘Discerning the Primary Epistemic Harm in Cases of Testimonial Injustice’. Social Epistemology 28 (2): 99–114.
Potter, Nancy Nyquist. 2002. How Can I Be Trusted? A Virtue Theory of Trustworthiness. Rowman & Littlefield.
Pritchard, Duncan. 2004. ‘The Epistemology of Testimony’. Philosophical Issues 14: 326–348.
Rabinowicz, Wlodek, and Toni Ronnow-Rasmussen. 2000. ‘II-A Distinction in Value: Intrinsic and For Its Own Sake’. Proceedings of the Aristotelian Society 100 (1): 33–51.
Reid, Thomas. 1764. ‘An Inquiry into the Human Mind on the Principles of Common Sense’. In The Works of Thomas Reid, edited by Sir William Hamilton. Maclachlan & Stewart.
Ridge, Michael. 2014. Impassioned Belief. Oxford: Oxford University Press.
Simion, Mona. 2020a. Shifty Speech and Independent Thought: Epistemic Normativity in Context. Oxford: Oxford University Press.
Simion, Mona. 2020b. ‘Testimonial Contractarianism: A Knowledge-First Social Epistemology’. Noûs 1-26. https://doi.org/10.1111/nous.12337
Simion, Mona, and Christoph Kelp. 2018. ‘How to Be an Anti-Reductionist’. Synthese. https://doi.org/10.1007/s11229-018-1722-y.
Solomon, Robert C., and Fernando Flores. 2003. Building Trust: In Business, Politics, Relationships, and Life. Oxford University Press USA.
Sosa, Ernest. 2010a. ‘How Competence Matters in Epistemology’. Philosophical Perspectives 24 (1): 465–475.
Sosa, Ernest. 2010b. ‘Value Matters in Epistemology’. The Journal of Philosophy 107 (4): 167–190.
Sosa, Ernest. 2015. Judgment and Agency. Oxford: Oxford University Press.
Sylvan, Kurt. 2017. ‘Veritism Unswamped’. Mind 127 (506): 381–435.
Wanderer, Jeremy. 2017. ‘Varieties of Testimonial Injustice’. In Ian James Kidd, José Medina, and Gaile Pohlhaus Jr. (eds.), The Routledge Handbook of Epistemic Injustice, 27–40. Routledge.
Williamson, Timothy. 2000. Knowledge and Its Limits. Oxford University Press.

Author Information

J. Adam Carter
Email: adam.carter@glasgow.ac.uk
University of Glasgow
United Kingdom

and

Mona Simion
Email: mona.simion@glasgow.ac.uk
University of Glasgow
United Kingdom

Tyler Burge (1946—)

Tyler Burge is an American philosopher who has done influential work in several areas of philosophy. These include philosophy of language, logic, philosophy of mind, epistemology, philosophy of science (primarily philosophy of psychology), and history of philosophy (focusing especially on Frege, but also on the classical rationalists—Descartes, Leibniz, and Kant). Burge has also done some work in psychology itself.

Burge is best known for his extended elaboration and defense of the thesis of anti-individualism. This is the thesis that most representational mental states depend for their natures upon phenomena that are not determined by the individual’s own body and other characteristics. In other words, what it means to represent a subject matter—whether in perception, language, or thought—is not fully determined by individualistic characteristics of the brain, body, or person involved. One of the most famous illustrations of this point is Burge’s argument that psychologically representing a kind such as water requires the fulfillment of certain non-individualistic conditions; such as having been in causal contact with instances of the kind, having acquired the representational content through communication with others, having theorized about it, and so forth. A consequence of Burge’s anti-individualism, in this case, is that two thinkers who are physically indiscernible (who are, for example, neurologically indistinguishable in a certain sense) can differ in that one of them, but not the other, has thoughts containing the concept “water”.

When Burge first proposed the thesis of anti-individualism, it was common for philosophers to reject it for one reason or another. It is a measure of Burge’s influence, and the power of his arguments, that the early 21st century saw few philosophers deny the truth of the view.

Nevertheless, there is much more to Burge’s philosophical contributions than anti-individualism. Most of Burge’s more influential theses and arguments are briefly described in this article. An attempt is made to convey how the seemingly disparate topics addressed in Burge’s corpus are unified by certain central commitments and interests. Foremost among these is Burge’s long-standing interest in understanding the differences between the minds of human beings, on one hand, and the minds of other animals, on the other. This interest colors and informs Burge’s work on language, mind, and epistemology in particular.

Life and Influence
Language and Logic
Anti-Individualism
De Re Representation
Mind and Body
Justification and Entitlement
Interlocution
Self-Knowledge
Memory and Reasoning
Reflection
Perception
History of Philosophy
Psychology
References and Further Reading
1. Primary Literature
  1. Books
  2. Articles
2. Secondary Literature

1. Life and Influence

Charles Tyler Burge graduated from Wesleyan University in 1967. He obtained his Ph.D. at Princeton University in 1971, his dissertation being directed by Donald Davidson. He is married with two sons. Burge’s wife, Dorli Burge, was a clinical psychologist prior to her retirement. Burge’s elder son, Johannes, is Assistant Professor in Vision Science at the University of Pennsylvania. His younger son, Daniel, completed a Ph.D. in 20th century American History at Boston University.

Burge is a fan of sport and enjoys traveling and hiking. He also reads widely outside of philosophy (particularly literature, history, history of science, history of mathematics, psychology, biology, music, and art history). Three of Burge’s interests are classical music, fine food, and fine wine.

A list of Burge’s main philosophical contributions would include the following seven areas. First, in his dissertation and the 1970s more generally, Burge focused attention upon the central significance of context-dependent referential and representational devices, including many uses of proper names, as well as what he came to call “applications” in language and thought. This was during a philosophical era in which it was widely believed that such devices were reducible to context-independent representational elements such as linguistic descriptions and concepts in thought. Burge also appealed to demonstrative- or indexical-like elements in perhaps unexpected areas, such as in his treatment of the semantical paradox. A concern with referential representation, which Burge does not believe to be confined solely to empirical cases, has been as close to his central philosophical interest as any topic. Much the same could be said about Burge’s long-standing interest in predication and attribution. (See sections 2 and 4.) Burge’s work on context-dependent aspects of representation is indebted to Keith Donnellan and Saul Kripke.

Second, while broadly understood anti-individualism has been a dominant view in the history of philosophy, Burge was the first philosopher to articulate the doctrine, to argue for it, and to mine it for many of its implications. Anti-individualism is the view that the natures of most representational states and events are partly dependent on relations to matters beyond the individuals with representational abilities. In the 20th century, anti-individualism went from being, for a decade or more after Burge discussed its several forms or aspects, a minority view to a view that is rarely even questioned in serious philosophical work today. Furthermore, the discussion of anti-individualism engendered by Burge’s work breathed new life into at least two somewhat languishing areas of philosophy: the problem of mental causation and the nature of authoritative self-knowledge, which have since become widely discussed and recognized as central topics in philosophy of mind and epistemology, respectively. (See sections 3, 5 and 8.)

Third, Burge’s work on interlocution (commonly called “testimony”) has been widely discussed. He was among the first to defend a non-reductionist view of interlocution, one which remains among the best-articulated and supported accounts of our basic epistemic warrant for relying upon the words of others (see section 7); and Burge later extended this work to provide new ways of thinking about the problem of other minds, on one hand, and the epistemology of computer-aided mathematical proofs, on the other.

Fourth, beginning with his work on self-knowledge and interlocution, Burge began a rationalist initiative in epistemology that has been influential, in addition to areas already mentioned, in discussions of memory, the first-person concept, reflection and understanding, and other abilities, such as certain forms of reasoning, that seem to be peculiar to human beings. Central to Burge’s limited form of rationalism is his powerful case against the once-common view that both analytic and a priori truths are in some way insubstantial or vacuous; as well as his rejection of the closely related view that apriority is to be reduced to analyticity. (See sections 6-10.)

Fifth, from the mid- to late 1980s up to the early 21st century, Burge developed a detailed understanding of the nature of perception. Integral to this understanding has been the extent to which Burge has immersed himself in the psychology of perception as well as developmental psychology and ethology. Some of Burge’s work on perception is as much a contribution to psychology as to philosophy; one of the articles he has published on the topic covers a prominent and hotly contested question in psychology: whether infants and non-human animals attribute psychological states to agents with whom they interact. Parallel with these developments has been Burge’s articulation of a novel account of perceptual epistemic warrant. (See sections 6, 11 and 13.)

Sixth, throughout his career Burge has resisted the tendency of philosophers of mind, especially in the United States, to accept some form of materialism. While it may not have been a central focus of his published work, Burge has over time formulated and defended a version of dualism about the relation between the mind and the body. Burge’s view holds that minds, mental states, and mental events are not identical with bodies, physical states, or physical events. It is important to note, however, that Burge’s dualism is not a substance dualism such as the view commonly attributed to Descartes. It is instead a “modest dualism” motivated by the view that counting mental events as physical events does no scientific or other good conceptual work; the same holds for mental properties and minds themselves. This is one example of Burge’s more general resistance to forms of reductionism in philosophy. (See section 5.)

The seventh respect in which Burge’s work has been influential is not confined to a certain body of work or a defended thesis. It lies in providing an antidote to the pervasive tendency, in several areas of philosophy, toward hyper-intellectualization. The earliest paper in which Burge discusses hyper-intellectualization is his short criticism of David Lewis’s account of convention (1975). The tendency toward hyper-intellectualization is exhibited in individualism about linguistic, mental, or perceptual representational content—the idea being that the individual herself must somehow be infallible concerning the proper application conditions of terms, concepts, and even perceptual attributives. It is at the center of the syndrome of views, called Compensatory Individual Representationalism, that Burge criticizes at some length. These views insist that objective empirical representation requires that the individual must in some way herself represent necessary conditions for objective representation. Hyper-intellectualization motivates various forms of epistemic internalism, according to which epistemic warrant requires that the individual be able in some way to prove that her beliefs are warranted, or at least to have good, articulable grounds for believing that they are. Finally, hyper-intellectualization permeates even action theory, which tends to model necessary conditions for animal action upon the intentional actions of mature human language-users. Burge has resisted all of these hyper-intellectualizing tendencies within philosophy, and to a lesser extent in psychology. (See sections 3, 6, 7 and 11.)

If there is a single, overriding objective running throughout Burge’s long and productive career, it is to understand wherein human beings are similar to, and different from, other animals in representational and cognitive respects. As he put the point early on, in the context of a discussion of anti-individualism:

I think that ultimately the greatest interest of the various arguments lies not in defeating individualism, but in opening routes for exploring the concepts of objectivity and the mental, and more especially those aspects of the mental that are distinctive of persons. (1986b, 194 fn. 1)

This large program has involved not only investigating the psychological powers that seem to be unique to human beings—such as a priori warranted cognition and reflection, and authoritative self-knowledge and self-understanding—but also competencies that we share with a wide variety of non-human animals, principally memory, action, and perception. (See sections 3, 4, 7, and 8-11.)

2. Language and Logic

Burge’s early work in philosophy of language and logic centered on semantics and logical form. The work on semantics constitutes the beginning of Burge’s lifelong goal of understanding reference and representation—beginning in language and proceeding to thought and perception. This work includes the logical form of de re thought (1977); the semantics of proper names (1973); demonstrative and indexical constructions (1974a); and also mass and singular terms (1972; 1974b). While the work on context-dependent reference was the dominant special case of Burge’s thought and writing on semantics and logical form, it also includes Burge’s work on paradoxes, especially the strengthened liar (semantic) paradox and the epistemic paradox.

Significant later work on logic and language prominently includes articles on logic and analyticity, and on predication and truth (2003a; 2007b).

3. Anti-Individualism

Anti-individualism is the view that the natures of most thoughts, and perceptual states and events, are partly determined by matters beyond the natures of individual thinkers and perceivers. By the “nature” of these mental states we understand that without which they would not be the mental states they are. So the representational contents of thoughts and perceptual states, for example, are essential to their natures. If “they” had different contents, they would be different thoughts or states. As Burge emphasizes, anti-individualism has been the dominant view in the history of philosophy. It was present in Aristotle, arguably in Descartes, and in many other major philosophers in the Western canon. When Burge, partly building upon slightly earlier work by Hilary Putnam, came explicitly to formulate and defend the view, it became controversial. There are several reasons for this. One is that materialistic views in philosophy of mind seemed incompatible with the implications of anti-individualism. Another is a tendency, which began to be dislodged only after the mid-20th century, to place very high emphasis upon phenomenology and introspective “flashes” of insight when it came to discussions of the natures of representational mental states and events. There are rear-guard defenses of the cognitive relevance of phenomenology that still have currency today. But anti-individualism appears to have become widely, if sometimes reluctantly, accepted.

As noted, anti-individualism is the view that most psychological representational states and events depend for their natures upon relations to subject matters beyond the representing individuals or their psychological sub-systems. This view was first defended in Burge’s seminal article, “Individualism and the Mental” (1979a). Some of Burge’s arguments for anti-individualism employ the Twin-Earth thought-experiment methodology originally set out by Putnam. Burge went beyond Putnam, among other ways, by arguing that many mental states themselves, rather than merely associated linguistic meanings, depend for their intentional natures on relations to a subject matter. Burge has also argued at length against Putnam’s view (which Putnam has since given up) that meanings and thought contents involving natural kinds are indexical in character.

There are five distinct arguments for anti-individualism in Burge’s work. The order in which they were published is as follows. First, Burge argued that many representational mental states depend for their natures upon relations to a social environment (1979a). Second, he argued that psychologically representing natural kinds such as water and aluminum depends upon relations to entities in the environment (1982). Third, Burge argued that having thoughts containing concepts corresponding to artefactual kinds such as sofas is compatible with radical, non-standard theorizing about the kinds (1986a). Fourth, Burge constructed a thought experiment that appears to show that even the contents of perception may depend for their natures on relations to entities purportedly perceived (1986b; 1986c). Fifth, Burge has provided an argument for a version of empirical anti-individualism that he regards as both necessarily true and a priori: “empirical representational states as of the environment constitutively depend partly on entering into environment-individual causal relations” (2010, 69). This fifth argument has superseded the fourth as the main ground of perceptual anti-individualism. It could also be said that it provides the strongest ground for anti-individualism in general, at least for empirical cases, since propositional attitudes containing concepts such as “arthritis”, “water”, and “sofa” are all parasitic, albeit in complex and not fully understood ways, upon basic perceptual categories covered by the fifth argument. It should also be noted that, while it is a priori that perceptual systems and states are partly individuated by relations to an environment, it is an empirical fact that there are perceptual states and events.

Rather than discussing each of these arguments in detail, the remainder of the section focuses on one of Burge’s schematic representations of the common structure of several of the arguments. The thought experiments in question involve three steps. In the first, one judges that someone could have thoughts about “a given kind or property as such, even though that person is not omniscient about its nature” (2013a, 548). For example, one can think thoughts about electrons without being infallible about the natures of electrons. This lack of omniscience can take the form of incomplete understanding, as in the case of the concept of arthritis. It can stem from an inability to distinguish the kind water from a look-alike liquid in a world that contains no water, or theorizing about water. Or it can issue from non-standard theorizing about entities, say sofas, despite fully grasping the concept of sofa.

In the second step, one imagines a situation like the one just considered, except that the person’s mistaken beliefs are in fact true. That is to say, one considers a situation in which the kind or property differs from its counterpart in the first situation, but in ways such that the individual cannot discriminate the kind or property in the first situation from its counterpart in the second. Thus, in this step of the argument, the thoughts one would normally express with the words used by the subject in the first step are in fact true.

In the third step “one judges that in the second environment, the individual could not have thoughts about arthritis … [or] sofas, as such” (2013a, 549). The reason, of course, is that the relevant entities in the second step are not the same as those in the first step. There are additional qualifications that must be made, such as that it must be presupposed that, while the second step contains no arthritis, water, or sofas, no alternative way of acquiring the concepts of arthritis, water, or sofa is available or utilized. Burge continues:

The conclusion is that what thoughts an individual can have—indeed, the nature of the individual’s thoughts—depends partly on relations that the individual bears to the relevant environments. For we can imagine the individual’s make-up invariant between the actual and counterfactual situations in all other ways pertinent to his psychology. What explains the possibility of thinking the thoughts in the first environment and the impossibility of thinking them in the second is a network of relations that the individual bears to his physical or social surroundings. (2013a, 549)

In other words, the person is able to use the concepts of arthritis, water, and sofa in the first step of the argument for the same reasons that all of us can think with these concepts. Even if the person were indiscernible in individualistic respects, however, changes in the environment could preclude the person from thinking with these concepts. If this is correct, then it cannot be the case that the thoughts that one can think are fully determined by individualistic factors. That is to say, two possible situations in which a person is indistinguishable with respect to individualistic factors can differ as regards the thoughts that the person thinks.

What this schematic formulation of the first three thought experiments for anti-individualism emphasizes is arguably the same as the reason that it has come to be so widely accepted. As Burge had earlier put the point: the schematic representation of the arguments “exploits the lack of omniscience that is the inevitable consequence of objective reference to an empirical subject matter” (2007, 22-23). Thus, opposition to anti-individualism, or at least opposition to the three arguments in question, must in some way deny our lack of omniscience about the natures of our thoughts, or the conditions necessary for our thinking them. This denial appears to be unreasonable and without a solid foundation.

4. De Re Representation

To a first approximation, de dicto representation is representation that is entirely conceptualized and does not in any way rely upon non-inferential or demonstrative-like relations for its nature. By contrast, de re representation is both partly nonconceptual and reliant upon demonstrative-like relations (at least in empirical cases) for the determination of its nature. For example, the representational content in “that red sphere” is de re; it depends for its nature on a demonstrative-like relation holding between the representer and the putative subject matter. By contrast, “the shortest spy in all the world in 2019” is de dicto. It is completely conceptualized and is not in any way dependent for its nature on demonstrative-like relations. When Burge first began publishing on the topic, it was very common to hold that de re belief attributions (for example) could be reduced to de dicto ascriptions of belief.

Burge’s early work on de re representation sought to achieve three primary goals (1977). First, he provided a pair of characterizations of the fundamental nature of de re representation in language and in thought: a semantical and an epistemic characterization. The semantical account “maintains that an ascription ascribes a de re attitude by ascribing a relation between what is expressed by an open sentence, understood as having a free variable marking a demonstrative-like application, and a re to which the free variable is referentially related” (2007f, 68). The epistemic account, by contrast, maintains that an attitude is de re if it is not completely conceptualized. The second goal of Burge’s early paper on de re belief was to argue that any individual with de dicto beliefs must also have de re beliefs (1977, section II). Finally, Burge argued that the converse does not hold: it is possible to have de re beliefs but not de dicto beliefs. From the second and third claims it follows, contrary to most work on the topic at the time, that de re representation is in important respects more fundamental than de dicto representation.

Burge’s later work on de re representation includes a presentation of and an argument for five theses concerning de re states and attitudes. The first four theses concern specifically perception and perception-based belief. Thesis one is that all representation involves representation-as (2009a, 249-250). This thesis follows from the view that all perceptual content, and the content of all perception-based belief, involves attribution as well as reference. There is no such thing as “neat” perception. All perception is perspectival and involves attribution of properties (which may or may not correctly characterize the objects of perception, even assuming that perceptual reference succeeds). Thesis two is that all perception and perception-based belief is guided by general attributives (2009a, 252). An attributive is the perceptual analog of a predicate, for example, “red” in the perceptual content “that red sphere”. Perceptual representation must be carried out in such a way that one or more attributives is associated with the perception and guides the ostensible perceptual reference. The third thesis is that successful perceptual reference requires that some perceptual attribution must veridically characterize the entity perceived (2009a, 289-290). A main idea of this thesis is that something must make it the case that perceptual reference has succeeded, in a given instance, rather than failed. What must be so is not only that the right sort of causal relation obtains between the perceiver and the perceptual referent, but that some attributive associated with the object of perception veridically applies to it. Like the second thesis, this one is fully compatible with the fact that perceptual reference can succeed even where many attributions, including those most salient, fail. The difference is that the second thesis concerns only purported perceptual reference, while the third concerns successful reference. 
Successful reference is compatible with the incorrectness of some perceptual attribution, even if an attributive that functions to guide the reference fails to apply to the referent. But the third thesis, to reiterate, does require that some perceptual attributive present in the psychology of the individual correctly applies to the referent.

To summarize: the first thesis says that every representation must have a mode of representation. It is impossible for representation to occur neat. The second thesis holds that even (merely) purported reference requires attribution. And the third thesis states that successful perceptual reference requires that some attributives associated in the psychology of the individual with the reference correctly apply to the referent.

Burge’s final two theses concerning de re representation are more general and abstract. The fourth thesis states that necessary preconditions on perception and perceptual reference provide opportunities for a priori warranted belief and knowledge concerning perception. In Burge’s words: “Some of our perceptually based de re states and attitudes, involving context-based singular representations, can yield apriori warranted beliefs that are not parasitic on purely logical or mathematical truths” (2009a, 298). An example of such knowledge might be the following:

(AC*) If that object [perceptually presented as a trackable, integral body] exists, it is trackable and integral. (compare Burge 2009a, 301)

This thesis arguably follows from the third thesis concerning de re perceptual representation. It follows, to reiterate, because a minimal, necessary condition upon successful perceptual reference is that some attributive associated (by the individual or its perceptual system) with the referent veridically applies to the referent of perception—and the most general, necessarily applicable attributive where perceptual reference is concerned is that the ostensible entity perceived be a trackable, integral body. Finally, the fifth thesis concerning de re representational states and events provides a general characterization of de re representation that does not apply merely to empirical cases:

A mental state or attitude is autonomously (and proleptically) de re with respect to a representational position in its representational content if and only if the representational position contains a representational content that represents (purports to refer) nondescriptively and is backed by an epistemic competence to make non-inferential, immediate, nondiscursive attributions to the re. (2009a, 316)

The use of “autonomously” here is necessary to exclude reliance upon others in perception-based reference. Such reliance can be de re even if the third thesis fails (2009a, 290-291). “Proleptically” is meant to allow for representation that fails to refer. Technically speaking, failed perceptual or perception-based reference is never de re. But it is nevertheless purported de re reference and so is covered by the fifth thesis.

For discussion of non-empirical cases of de re representation, which Burge allowed for even in “Belief De Re”, see Burge (2007f, 69-75) and (2009a, 309-316).

It should be re-emphasized that two of Burge’s primary philosophical interests, throughout his career, have been de re reference and representation (1977; 2007f), on one hand, and the nature of predication, on the other (2007b; 2010a). These topics connect directly with the aforementioned interest in understanding representational and epistemic abilities that seem to be unique to human beings.

5. Mind and Body

Burge’s early work on the mind/body problem centered around sustained criticism of certain ways the problem of mental causation has been used to support versions of materialism (1992; 1993b). Burge’s criticisms of materialism about mind, including the argument against token-identity materialism, date back to “Individualism and the Mental” (1979a, section IV). As noted earlier, Burge’s position on the mind/body problem is a modest form of dualism that is principally motivated by the failure [of reduction of minds, mental states, and mental events, on one hand, to the body or brain, physical states, and physical events, on the other] to provide empirical or conceptual explanatory illumination. He has also done work on consciousness, and provided a pair of new arguments against what he calls “compositional materialism”.

Beginning in the late 1980s, many philosophers expressed doubts concerning the probity of our ordinary conception of mental causation. Discussion of anti-individualism partially provoked this series of discussions. Some argued that, absent some reductive materialist understanding of mental causation, we are faced with the prospect of epiphenomenalism—the view that instances of mental properties do not do any genuine causal work but are mere impotent concomitants of instances of physical properties. Burge argues that the grounds for rejecting epiphenomenalism are far stronger than any of the reasons that have been advanced in favor of the epiphenomenalist threat. He points out that, were there a serious worry about how mental properties can be causally efficacious, the properties of the special sciences such as biology and geology would be under as much threat as those in commonsense psychology or psychological science. Such causal psychological explanation “works very well, within familiar limits, in ordinary life; it is used extensively in psychology and the social sciences; and it is needed in understanding physical science, indeed any sort of rational enterprise” (1993b, 362). Such explanatory success itself shows, other things equal, the “respectability” of the ordinary conception of mental causation: “Our best understanding of causation comes from reflecting on good instances of causal explanation and causal attribution in the context of explanatory theories” (2010b, 471).

Burge has also provided arguments against some forms of materialism. One such argument employs considerations made available by anti-individualism to contend that physical events, as ordinarily individuated, cannot in the general case be identical with mental events (1979a, 141f.; 1993b, 349f.). The thought experiments for anti-individualism suggest that thinkers who are indiscernible in individualistic respects, including the physical events occurring in their bodies, can differ in the thoughts they think when their environments differ. This variation in mental events across individualistically indiscernible thinkers would not be possible, of course, if mental events were identical with physical events. In other words, if mental events were identical with physical events then mere variation in environment could not constitutively affect individuals’ mental events. Needless to say, the falsity of token-identity materialism entails the falsity of a claim of type-identity.

Burge has also provided another line of thought on the mind-body problem that supports his “modest dualism” concerning the relation of the mental to the physical. The most plausible of the various versions of materialism, Burge holds, is compositional materialism: the view that psychologies or minds, like tectonic plates and biological organisms, are composed of physical particles. However, like all forms of materialism, compositional materialism makes strong, empirically specific claims. Burge writes:

The burden on compositional materialism is heavy. It must correlate neural causes and their effects with psychological causes and their effects. And it must illuminate psychological causation, of both physical and psychological effects, in ways familiar from the material sciences (2010b, 479).

He holds that there is no support in science for the compositional materialist’s commitment to the view that mental states and events are identical with composites of physical materials.

The two new arguments against compositional materialism run roughly as follows. The first turns on the difficulty of seeing how “material compositional structures could ground causation by propositional psychological states or events” (2010b, 482). Physical causal structures (broadly construed, to include causation in the non-psychological special sciences) do not appear to have a rational structure. The propositional structures that help to type-individuate certain psychological kinds do have a rational structure. Hence, it is prima facie implausible that psychological causation could be reduced to physical-cum-compositional causal structures. The second argument is similar but does not turn on the notion of causation. Burge argues that:

the physical structure of material composites consists in physical bonds among the parts. According to modern natural science, there is no place in the physical structure of material composites for rational, propositional bonds. The structure of propositional psychological states and events constitutively includes propositional, rational structure. So propositional states and events are not material composites. (2010b, 483)

Burge admits the abstractness of the arguments, and allows that subsequent theoretical developments might show how compositional materialism can overcome them. However, he suggests that the changes would have fundamentally to alter how either material states and events or psychological states and events are conceived.

Finally, Burge has written two articles on consciousness. The first of these defends three points. One is that all kinds of consciousness, including access consciousness, presuppose the presence of phenomenal consciousness. Phenomenal consciousness is the “what it is like” aspect of certain mental states and events. The claim of presupposition is that no individual can be conscious, in any way, unless it has mental states some of which are phenomenally conscious. The second point is that the notion of access consciousness, as understood by Ned Block, for example, needs refinement. As Block understands access consciousness, it concerns mental states that are poised for use in rational activity (1997). Burge argues that this dispositional characterization runs afoul of the general principle that consciousness, of whatever sort, is constitutively an occurrent phenomenon. Burge’s refinement of the notion of access consciousness is called “rational-access consciousness”. The third point is that we should make at least conceptual space for the idea of phenomenal qualities that are not conscious throughout their instantiation in an individual.

Burge’s second paper on consciousness: (a) notes mounting evidence that a person could have phenomenal qualities without the qualities being rationally accessible; (b) explores ways in which a state could be rationally-access conscious despite not being phenomenally conscious; (c) distinguishes phenomenal consciousness from other phenomena, such as attention, thought, and perception; and (d) sets out a unified framework for understanding all aspects of phenomenal consciousness, as a type of phenomenal presentation of qualities to subjects (2007e).

6. Justification and Entitlement

Burge draws a crucial distinction between two forms of epistemic warrant. One is justification. A justified belief is one that is warranted by reason or reasons. By contrast, an epistemic entitlement is an epistemic warrant that does not consist in the possession of reasons. Burge usually defines entitlement negatively, because there is no simple way to express what entitlement consists in that abstracts from the nature of the representational competence in question.

The distinction was first articulated in “Content Preservation” (1993a). Burge there explained that:

(t)he distinction between justification and entitlement is this: Although both have positive force in rationally supporting a propositional attitude or cognitive practice, and in constituting an epistemic right to it, entitlements are epistemic rights or warrants that need not be understood by or even accessible to the subject. We are entitled to rely, other things equal, on perception, memory, deductive and inductive reasoning, and on … the word of others. (230)

What entitlement consists in with respect to each of these cases is different. What they do have in common is the negative characteristics listed. Burge continues:

The unsophisticated are entitled to rely on their perceptual beliefs. Philosophers may articulate these entitlements. But being entitled does not require being able to justify reliance on these resources, or even to conceive such a justification. Justifications … involve reasons that people have and have access to. (1993a, 230)

Throughout his career, Burge has provided explanations for our entitlement to rely upon interlocution, certain types of self-knowledge and self-understanding, memory, reasoning, and perception. The last of these is briefly sketched here, followed by a warning against some common misunderstandings of the distinction between justification and entitlement. The case of perceptual entitlement provides one of the best illustrations of the nature of entitlement in general.

People are entitled to rely upon their perceptual beliefs just in case the beliefs in question: (a) are the product of a natural perceptual competence that is functioning properly; (b) are of types that are reliable, where the requirement of reliability is restricted to a certain type of environment; and (c) have contents that are normally transduced from perceptual states that themselves are reliably veridical (Burge 2003c, sections VI and VIII; 2020, section I). These points are part of a much larger and more complex discussion, of course. The point for now is that each of (a)-(c) is an example of an element of an entitlement. As is the case with all entitlements, individuals who are perceptually entitled to their beliefs do not have to know anything concerning (a)-(c); indeed, they need not even be able to understand the explanation of the entitlement, or the concept “entitlement”. A final key point is that while all entitlements, like all epistemic warrants generally for Burge, must be the product of reliable belief-forming competences, no entitlement consists purely in reliability. In the case of perception, the sort of reliability that is necessary for entitlement is reliability in the kind of environment that contributed to making the individual’s perceptual states and beliefs what they are (2003c, section VI; 2020, section I).

Numerous critics of Burge have misunderstood the nature of entitlement, and/or the distinction between justification and entitlement. Rather than exhaustively cataloging these misinterpretations, the remainder of the section is devoted to articulating the four main sources of misunderstanding. Keeping these in mind should help to prevent further interpretive mistakes. In increasing order of subtlety, the mistakes are the following. The first error is simply to disregard Burge’s insistence that entitlements need not be appealed to by, or even be within the ken of, the entitled individual. The fact that an individual has no knowledge of any warranting conditions, in a given case, is not a reason for doubting that she is entitled to the relevant range of beliefs.

The second error is insisting that entitlement be understood in terms of “epistemic grounds”, or “evidence”. Each of these notions suggests the idea of epistemic materials in some way made use of by the believer. But entitlement is never something that accrues to a belief, or by extension to a believer, because of something that he or she does, or even recognizes. The example of perceptual entitlement, which accrues in virtue of conditions (a)-(c) above, illustrates these points. The individuation conditions of perceptual states or beliefs are in no sense epistemic grounds. The notion of evidence is even less appropriate for describing entitlement. While evidence can be made up of many different sorts of entities, or states of affairs, evidence must be possessed or appreciated by a subject in order for it to provide an epistemic warrant. But in that case, on Burge’s view, the warrant would be a justification rather than an entitlement.

A variant on this second source of misunderstanding is to assume that since justification is an epistemic warrant by reason, and reasons are propositional, all propositional elements of epistemic warrants are justifications (or parts of justifications). Several types of entitlements involve propositionality—examples of which are interlocution, authoritative self-knowledge, and even perception (in the sense that perceptual beliefs to which we are entitled must have a propositional structure appropriately derived from the content of relevant perceptual states). But none is a justification or an element in a justification. Being propositional is necessary, but not sufficient, for an element of an epistemic warrant to be, or to be involved in, a justification (as opposed to an entitlement). Another way to put the point is to explain that being propositional in structure is necessary, but not sufficient, for being a reason.

The third tendency that leads to misunderstandings of Burge’s two notions of epistemic warrant is the assumption that they are mutually exclusive. On this view, a belief warranted by justification (entitlement) cannot also be warranted by entitlement (justification). Not only is this not the case, but in fact all beliefs that are justified are also beliefs to which the relevant believer is entitled. Every belief that a thinker is justified in holding is also a belief that is produced by a relevantly reliable, natural competence. (Though the converse obviously does not hold.) Entitlement is the result of a well-functioning, natural, reliable belief-forming competence. There are two species of justification for Burge. In the first case, one is justified in believing a self-evident content such as “I am thinking”, or “2+2=4”. In effect, these contents are reasons for themselves: believing them is enough, other things equal, for the beliefs to be epistemically warranted and indeed to be knowledge. The second kind of justification involves inference. If a sound inference is made by a subject, the premises support the conclusion, and the believer understands why the inference is truth-preserving (or truth-tending), then the belief is justified. Notice that each of these kinds of justified belief is, for Burge, also the product of a well-functioning, natural, reliable belief-forming competence. The competence in the case of contents that are reasons for themselves is understanding; and the competence in the second case is a complex of understanding the contents, understanding the pattern of reasoning, and actually reasoning from the content of the premises to the content of the conclusion. So all cases of justification are also cases in which the justified believer is entitled to his or her beliefs.

The subtlest mistake often made by commenters concerning Burge’s notions of justification and entitlement is to assume that what Burge says is not true of entitlement is true of his notion of justification. After all, in “Content Preservation”, Burge states that entitlement “need not be understood by or even accessible to the subject” (1993a, 230). And later, in “Perceptual Entitlement”, Burge makes a number of additional negative claims about entitlement. He writes that entitlement “does not require the warranted individual to be capable of understanding the warrant”, and that entitlement is a “warrant that need not be fully conceptually accessible, even on reflection, to the warranted individual” (2003c, 503). Finally, Burge argues that children, for example, are entitled to their perceptual beliefs, rather than being justified, because they lack sophisticated concepts such as epistemic, entails, perceptual state, and so forth (2003c, 521). So we have the following negative specifications concerning entitlement:

(I) It does not require understanding the warrant;

(II) It does not require being able to access the warrant;

and

(III) It does not require the use of sophisticated concepts such as those mentioned above.

The mistake, of course, is to assume that these things that are not required by entitlement are required by justification, as Burge understands justification. This difficulty is a reflection of the fact that Burge, in these passages and others like them, is doing two things at once. He is not only explaining how he thinks of entitlement and justification, but also distinguishing entitlement from extant conceptions of justification. Since his conception of justification differs from most other conceptions, it is a fallacy to infer from (I)-(III), together with relevant context, that they must be abilities or capacities that justification does require.

This is not to say that (I)-(III) are wholly irrelevant to Burge’s notion of justification. For his conception is not completely unlike others’ conceptions. For example, one who believes that 2+2=4 based on his or her understanding of the content does understand the warrant—for the warrant is the content itself. So what (I) denies of entitlement is sometimes true of Burge’s notion of justification. Similarly, a relative neophyte who understands at least basic logic, and makes a sound inference in which the premises support the conclusion, is in one perfectly good sense able to access his or her warrant for believing the conclusion, as in what is denied in (II). The notion of access in question, when Burge invokes the notion in characterizations of epistemic warrant, is conscious access. (See section 5 above.)

But the other two claims are more problematic. Burge’s conception of justification is not as demanding as one which holds that the denials of (II) and (III) correctly characterize what is necessary for justification. Thus, while perceptual entitlement is the primary form of epistemic warrant for those with empirical beliefs, it is not impossible for children or adults to have justifications for their perceptual beliefs. It is only that these will almost always be partial. They will usually be able to access and understand the warrant (the entitlement) only partially. In effect, they are justified only to the extent that they have an understanding of the nature of perceptual entitlement. Fully understanding the warrant, the entitlement, would require concepts such as those mentioned in (III). But even children and many adults, as noted, are capable of approximating understanding of the warrant. Burge gives the example of a person averring a perceptual belief and providing in support of the belief the claim that it looks that way to him or her. This is a kind of justification. But there is no (full) understanding of the warrant, and likely not even possession of all the concepts employed in a discursive representation of the complete warrant. Finally, Burge’s notion of justification, or epistemic support by reason, is even weaker than these remarks suggest. For he holds that some nonhuman animals probably have reasons for some of their perceptual beliefs (and therefore have justifications for them), but these animals can in no sense at all access or understand the warrant. As Burge writes, “My notion of having a reason or justification does not require reflection or understanding. That is a further matter” (2003c, 505 fn. 1). This passage brings out how different Burge’s notion of justification is from many others’ conceptions; and it helps to explain why it is an error to assume that what Burge says is not true of entitlement is true of (his notion of) justification.

7. Interlocution

Burge’s early work on interlocution (or testimony) defended two principal theses. One is the “Acceptance Principle”—the view, roughly speaking, that one is prima facie epistemically warranted in relying upon the word of another. The argument for this principle draws upon three a priori theses: (a) speech and the written word are indications of propositional thought; (b) propositional thought is an indication of a rational source; and (c) rational sources can be relied upon to present truth. The other thesis Burge defended was that it is possible to be purely a priori warranted in believing a proposition on the basis of interlocution (1993a). Burge came to regard this second thesis as a large mistake (2013b, section III), and has since then held that the required initial perceptual uptake of the words in question—utilization of which is made in (a)—makes all interlocutionary knowledge and warranted belief at least minimally empirical in epistemic support. It should be noted, however, that Burge’s view on our most basic interlocutionary warrant remains distinctive in that he regards it as fundamentally non-inferential in character. It is an entitlement—whose nature is structured and supported by the Acceptance Principle, and the argument for it—rather than a justification. Furthermore, none of the critics of Burge’s early view on interlocutionary entitlement identified the specific problem that eventually convinced him that the early view had to be given up.

The specific problem in question was that Burge had initially held that since interlocutionary warrant could persist in certain cases, even as perceptual identification of an utterance failed, the warrant could not be based, even partly, on perception. Burge came to believe that this persistence was possible only because of a massive presumption of reliability where perception was concerned. So the fact that interlocutionary warrant could obtain even where perception failed does not show that the warrant is epistemically independent of perception (2013b, section III).

8. Self-Knowledge

Burge’s views on self-knowledge developed over three periods. The first of these consisted largely in a demonstration that anti-individualism is not, contrary to a common view at the time, inconsistent with or in any tension with our possession of some authoritative self-knowledge (1986d; compare 2013, 8). Burge pointed to certain “basic cases” of self-knowledge—such as those involving the content of “I am now entertaining the thought that water is wet”—which are infallible despite consisting partly in concepts that are anti-individualistically individuated. Using the terms that Burge introduced later, this content is a pure cogito case. It is infallible in the sense that thinking the content makes it true. It is also self-verifying in the sense that thinking the content provides an epistemic warrant, and indeed knowledge, that it is the case. There are also impure cogito cases, an example of which is “I am hereby thinking [in the sense of committing myself to the view] that writing requires concentration”. This self-ascription is not infallible. One can think the content, even taking oneself to endorse the first-order content in question, but one can fail actually to commit oneself to it. But impure cogito cases are still self-verifying. The intentional content in such cases “is such that its normal use requires a performative, reflexive, self-verifying thought” (2003e, 417-418). What Burge calls “basic self-knowledge” in his early work on self-knowledge is comprised of cogito cases, pure and impure. He is explicit, however, that not all authoritative self-knowledge, much less all self-knowledge in general, has these features.

To reiterate, the central point of this early work was simply to demonstrate that there is no incompatibility between our possession of authoritative self-knowledge and anti-individualism. Basic cases of self-knowledge illustrate this. One further way to explain why there is no incompatibility is to note that the conditions that, in accordance with anti-individualism, must be in place for the first-order contents to be thought are necessarily also in place when one self-ascribes such an attitude to oneself (2013, 8).

The second period of Burge’s work on self-knowledge centered around a more complete discussion of the different forms of authoritative self-knowledge, as well as defending the thesis that a significant part of our warrant for non-basic cases of such self-knowledge derives from its indispensable role in critical reasoning (1996). Critical reasoning is meta-representational reasoning that conceptualizes attitudes and reasons as such. The role of (non-basic) authoritative self-knowledge in critical reasoning is part of our entitlement to relevant self-ascriptions of attitudes in general. This second period thus extended Burge’s account of authoritative self-knowledge to non-cogito instances of self-knowledge. It also began the project of explaining wherein we are entitled to authoritative self-knowledge among instances where the self-ascriptions are not self-verifying. Since cogito cases provide reasons for themselves, as it were, basic cases of self-knowledge involve justification. By contrast, non-basic cases of authoritative self-knowledge are warranted by entitlement rather than justification. (See section 6.)

The third period of Burge’s work on self-knowledge consisted in a full discussion of the nature and foundations of authoritative self-knowledge (2011a). Burge argues that authoritative self-knowledge, including a certain sort of self-understanding, is necessary for our role in making attributions concerning, and being subject to, norms of critical reasoning and morality. A key to authoritative self-knowledge, as stressed by Burge from the beginning of his work on the topic, is the absence of the possibility of brute error. Brute error is an error that is not in any way due to malfunctioning or misuse of a representational competence. In perception, for example, one can be led into error despite the fact that one’s perceptual system is working fully reliably; if, say, light is manipulated in certain ways. By contrast, while error is possible in most cases of authoritative self-knowledge, it is possible only when there is misuse or malfunction. Since misuse and malfunction undermine the epistemic warrant, it can be said that instances of authoritative self-knowledge for Burge are “warrant factive”—warrant entails, in such cases, true self-ascriptions of mental states.

The full, unified account of self-knowledge in Burge (2011a) explains each element in our entitlement to self-knowledge and self-understanding. The account is extended to cover, not only basic cases of self-knowledge, but also knowledge of standing mental states; of perceptual states; and of phenomenal states such as pain. The unified treatment explains why its indispensable role in critical reasoning is not all there is to our entitlement to (non-basic cases of) self-knowledge and self-understanding. Burge’s explanation of the impossibility of brute error with respect to authoritative self-knowledge makes essential use of the notion of “preservational psychological powers”, such as purely preservative memory and betokening understanding. Betokening understanding is understanding of particular instances of propositional representational content. The unification culminates in an argument that shows how immunity to brute error follows from the nature of certain representational competencies, along with the nature of epistemic entitlement (2011a, 213f). In yet later work, Burge explained in detail the relation between authoritative self-knowledge and critical reasoning (2013, 23-24).

9. Memory and Reasoning

Two of Burge’s most important philosophical contributions are his identification and elucidation of the notion of purely preservative memory, on one hand, and his discussion of critical reasoning, particularly its relation to self-knowledge and the first-person concept, on the other.

Burge’s discussion of memory and persons distinguishes three different forms of memory: experiential memory; substantive content memory; and purely preservative memory (2003b, 407-408). Experiential memory is memory of something one did, or that happened to one, from one’s own perspective. Substantive content memory is closer to our ordinary notion of simply recalling a fact, or something that happened, without having experienced it personally. Purely preservative memory, by contrast, simply holds a remembered (or seemingly remembered) content, along with the content’s warrant and the associated attitude or state, in place for later use. When I remember blowing out the candles at my 14th birthday party, this is normally experiential memory. Remembering that the United States tried at least a dozen times to assassinate Fidel Castro, in most cases, is an example of substantive content memory. When one conducts inference over time, by contrast, memory functions simply to hold earlier steps along with their respective warrants in place for later use in the reasoning. This sort of memory is purely preservative. Burge argues that no successful reasoning over time is possible without purely preservative memory. Purely preservative memory also plays an important role in Burge’s earlier account of the epistemology of interlocution (1993a; 2013b); and in his most developed account of the epistemology of self-knowledge and self-understanding (2011a).

In “Memory and Persons” he discussed the role of memory in psychological representation as well as the issue of personal identity. Burge argues that memory is “integral to being a person, indeed to having a representational mind” (2003b, 407). He does this by arguing that three common sorts of mental acts, states, and events—those involving intentional agency, perception, and inference—presume or presuppose the retention of de se representational elements in memory. De se states have two functions. First, they mark an origin of representation. In the case of a perceptual state this might be between an animal’s eyes. Second, they are constitutively associated with an animal’s perspectives, needs, and goals. Thus, a dog might not simply represent in perceptual memory the location of a bone—but instead, the location of his or her bone. De se markers are also called by Burge “ego-centric indexes” (2003c; 2019).

Intentional agency requires retention in memory of de se representational elements because intention formation and fulfillment frequently take place over time. If someone else executes the sort of action that one intends for oneself, this would not count as fulfillment of the veridicality condition of one’s intention. Marking one’s own fulfillment (or the lack of it) requires retention in memory of one’s own de se representational elements. Another example is perception. It requires the use of perceptual contents. This use always and constitutively involves possession or acquisition of repeatable perceptual abilities. “Such repeatable abilities include a systematic ability to connect, from moment to moment, successive perceptions to one another and to the standpoint from which they represent” (2003b, 415). The activity necessarily involved in perception, too, involves retention of de se contents in purely preservative memory. Inference, finally, requires this same sort of retention for reasons alluded to above. If reliance on a content used earlier in a piece of reasoning is not ego-centrically indexed to the reasoner, then simple reliance on the content cannot epistemically support one’s conclusion. The warrant would have to be re-acquired whenever use was made of a given step in the process of reasoning, making reasoning over time impossible.

It follows from these arguments that attempts to reduce personal identity to memory-involving stretches of consciousness cannot be successful. Locke is commonly read as attempting to carry out such a reduction. Butler pointed out a definitional circularity: memory cannot be used in defining personal identity because genuine memories presuppose such identity. Philosophers such as Derek Parfit and Sydney Shoemaker utilized a notion of “quasi-memory” (a mental state just like memory but which does not presuppose personal identity) in an attempt to explain personal identity in more fundamental terms. Burge’s argumentation shows that this strategy involves an explanatory circularity. Only a creature with a representational mind could have quasi-memories. However, for reasons set out in the previous two paragraphs, having a representational mind requires de se representational elements that themselves presuppose personal identity over time. Hence, quasi-memory presupposes genuine memory, and cannot therefore be used to define or explain it (2003b, sections VI-XI).

As noted in the previous section, critical reasoning is meta-representational reasoning that characterizes propositional attitudes and reasons as such. One of Burge’s most important discussions of critical reasoning explains how fully understanding such reasoning requires use and understanding of the full, first-person singular concept “I” (1998).

Descartes famously inferred his existence from the fact that he was thinking. He believed that this reasoning was immune to serious skeptical challenges. Some philosophers, most notably Lichtenberg, questioned this. They reasoned that while it might be the case that one can know one is thinking, simply by reflecting on the matter, the ontological move from thinking to a thinker seems dubious at worst, and unsupported at best. Burge argues, using only premises that Lichtenberg was himself doubtless committed to—such as that it is a worthwhile philosophical project to understand reason and reasoning—that the first-person singular concept is not dispensable in the way that Lichtenberg and others have thought. Among other things, Burge’s argument provides a vindication of Descartes’s reasoning about the cogito. The argument shows that Descartes’s inference to his existence as a thinker from the cogito is not rationally unsupported, as Lichtenberg and others had suggested.

All reasons that thinkers have are, in Burge’s terminology, “reasons-to”. That is, they are not merely recognitions of (for example) logical entailments among propositions—they enjoin one to change or maintain one’s system of beliefs or actions. This requires not merely recognition of the relevance of a rational review, but also acting upon it. “In other words, fully understanding the concept of reason involves not merely mastering an evaluative system for appraising attitudes … [but also] mastering and conceptualizing the application of reasons in actual reasoning” (1998, 389). Furthermore, reasons must sometimes exert their force immediately. Their implementational relevance, that is to say, is sometimes not subject to further possible rational considerations. Instead, the reasons carry “a rationally immediate incumbency to shape [attitudes] in accordance with the evaluation” of which the reasons are part (1998, 396). Burge argues that full understanding of reasoning in general, and this rational immediacy in particular, requires understanding and employing the full “I”-concept. If correct, this refutes Lichtenberg’s contention that the “I”-concept is only practically necessary; and it supports Descartes’s view that understanding and thought alone are sufficient to establish one’s existence as a thinker. Only by adverting to the “I” concept can we fully explain the immediate rational relevance that reasons sometimes enjoy in a rational activity.

10. Reflection

Burge has also discussed the epistemology of intellection (that is, reason and understanding) and reflection. He argues that classical rationalists maintained three principles concerning reflection. One is that reflection in an individual is always, at least in principle, sufficient to bring to conscious articulation steps or conclusions of the reflection. Another is that reflection is capable of yielding a priori warranted belief and knowledge of objective subject matters. The final classical principle about reflection is that success in reflection requires skillful reasoning and is frequently difficult—it is not a matter simply of attaining immediate understanding or knowledge from a “flash” of insight (2013a, 535-537).

Burge accepts the second and third principles about reflection but rejects the first. He argues that anti-individualism together with advances in psychology show the first principle to be untenable. Anti-individualism shows that “the representational states one is in are less a matter of cognitive control and internal mastery, even ‘implicit’ cognitive control and mastery, than classical views assumed” (2013a, 538). Advances in psychology cast doubt on the first thesis primarily because it seems that many nonhuman animals, as well as human infants, think thoughts (and thus have concepts) despite lacking the ability to reflect on them; and because it has become increasingly clear that much cognition is modular and therefore inaccessible to conscious reflection, even in normal, mature human beings.

Burge has also carried out extensive work on how reflection can (and sometimes, unaided, cannot) “yield fuller understanding of our own concepts and conceptual abilities” (2007d, 165); on the emergence of logical truth and logical consequence as the key notions in understanding logic and deductive reasoning (which discussion includes an argument that fully understanding reasoning commits one ontologically to an infinite number of mathematical entities) (2003a); and on the nature and different forms of incomplete understanding (2012, section III). Finally, a substantial portion of Burge’s other work makes extensive use of a priori reflection—an excellent example being “Memory and Persons” (see section 9).

11. Perception

Burge’s writing on perception is voluminous. Most historically important is Origins of Objectivity (2010). (This book is not most centrally about perception, as some commentators have suggested, but about what its title indicates: the conditions necessary and sufficient for objective psychological reference. A much more complete treatment of perception is to be found in the successor volume to Origins, Perception: First Form of Mind (2021).) The first part of the present section deals with Burge’s work on the structure and content of perception. The second part briefly describes his 2020 article on perceptual warrant.

Origins is divided into three parts. Part I provides an introduction, a detailed discussion of terminology, and consideration of the bearing of anti-individualism on the rest of the volume’s contents. Part II is a wide-ranging discussion of conceptions of the resources necessary for empirical reference and representation, covering both the analytic and the continental traditions, and spanning the entire 20th century. Part III develops in some detail Burge’s conception of perceptual representation: including biological and methodological backgrounds; the nature of perception as constitutively associated with perceptual constancies; discussion of some of the most basic perceptual representational categories; and a few “glimpses forward”, one of which is mentioned below.

Part I characterizes a view that Burge calls “Compensatory Individual Representationalism” (CIR). With respect to perception, this is the view that the operation of the perceptual system, even when taken in tandem with ordinary relevant causal relations, is insufficient for objective reference to and representation of the empirical world. The individual perceiver must herself compensate for this insufficiency in some way if objective reference is to be possible. This view is then contrasted with Burge’s own view of the origins of objective reference and representation, which is partly grounded in anti-individualism as well as the sciences of perceptual psychology, developmental psychology, and ethology.

Part II of Origins critically discusses all the major versions of CIR. The discussion is comprehensive, including analyses of several highly influential 20th-century philosophers (and some prominent psychologists) who reflected upon the matter in print. There are two families of CIR. The first family holds that a more primitive level of representation is needed, underlying ordinary empirical representation, without which representation of prosaic entities in the environment is not possible. Bertrand Russell is an example of one who held a first-family version of CIR. Representation of the physical world, on his view, was parasitic upon being acquainted with (that is, representing) sense data (2010, 119). Second-family forms of CIR did not require a more primitive level of representation. They did require, however, that certain advanced competencies be in place if objective reference and empirical representation are to be possible. Peter Strawson, for example, held that objective representation requires the use of a comprehensive spatial framework, as well as the use of one’s position in this represented allocentric space (2010, 160).

Both families of CIR share a negative and a positive claim. The negative claim is that the normal functioning of a perceptual system, together with regular causal relations, is insufficient for objective empirical representation. The positive claim is that such representation requires that an individual in some way herself represents necessary conditions upon objective representation. Burge argues that all versions of CIR are without serious argumentative or empirical support. This includes even versions of CIR that are compatible with anti-individualism. Burge extracted the detailed discussion of Quine’s version of the syndrome in an article (2009b).

The central chapter of Part III of Origins, chapter 9, discusses Burge’s conception of the nature of perceptual representation, including what distinguishes perception from other sensory systems. It argues that perception is paradigmatically attributable to individuals; sensory; representational; a form of objectification; and involves perceptual constancies. All perception must occur in the psychology of an individual with perceptual capacities, and in normal cases some individual perceptions must be attributable to the individual (as opposed to its subsystems). Perception is a special sort of sensory system—a system that functions to represent through the sort of objectification that perceptual constancies consist in. Perception is constitutively a representational competence, for Burge. Objectification involves, inter alia, marking an important divide between mere sensory responses, on one hand, and representational capacities that include such responses, but which cannot be explained solely in terms of them, on the other (2010, 396). Finally, perceptual constancies “are capacities to represent environmental attributes, or environmental particulars, as the same, despite radically different proximal stimulations” (2010, 114).

Burge argues that genuine objective perception begins, for human beings, nearly at birth, and is achieved in dozens or hundreds of other animal species, including some arthropods. The final chapter of the book includes “glimpses beyond”. It points, perhaps most importantly, toward Burge’s work—thus far unpublished—explaining the origins of propositional thought, including what constitutively distinguishes propositional representation from perceptual and other forms of representation. (Burge has published, in addition to the discussion in Origins of Objectivity, some preparatory work in this direction (2010a).)

The remainder of this section briefly discusses Burge’s 2020 work on perceptual warrant. This lengthy article is divided into five substantial sections. The first consists in a largely or wholly a priori discussion of the nature of epistemic warrant, including discussion of the distinction between justification and entitlement; and the nature of representational and epistemic functions and goods. Two of the most important theses defended in the first section are the following: (i) the thesis that, setting aside certain probabilistic cases and beliefs about the future, epistemic warrant certifies beliefs as knowledge—that is, if a perceptual belief (say) is warranted, true, and does not suffer from Gettier-like problems, then the belief counts as knowledge; and (ii) the thesis that epistemic warrant cannot “block” knowledge. That is to say, whatever epistemic warrant is, it cannot be such that it prevents a relevantly warranted belief from becoming knowledge. Burge uses these theses to argue for the inadequacy of various attempts at describing the nature of epistemic warrant.

The second section uses the a priori connections between warrant, knowledge, and reliability to argue against certain (internalist) conceptions of empirical warrant. The central move in the argument against epistemic internalism about empirical warrant is the thesis that warrant and knowledge require reliability in normal circumstances, but that nothing in perceptual states or beliefs taken in themselves ensures such reliability. Burge argues for the reliability requirement on epistemic warrant by an appeal to the “no-blockage” thesis—any unreliable way of forming beliefs would block those beliefs from counting as knowledge. So the argument against epistemic internalism has two central steps. First, the “no-blockage” thesis shows that reliability, at least in certain circumstances, is required for an epistemic warrant. And second, nothing that is purely “internal” to a perceiver ensures that her perceptual state-types are reliably veridical; or, therefore, that her perceptual belief-types are reliably true. Hence, internalism cannot be a correct conception of perceptual warrant.

The third section discusses differences between refuting skeptical theses, on one hand, and providing a non-question-begging response to a skeptical challenge, on the other. (In section VI of “Perceptual Entitlement” (2003c), for example, Burge explains perceptual warrant but does not purport to answer skepticism.) Burge argues that many epistemologists have conflated these two projects, with the result (inter alia) that the nature of epistemic warrant has been obscured. The fourth section argues that a common line of reasoning concerning “bootstrapping” is misconceived. Some have held that if, as on Burge’s view, empirical warrants do not require justifying reasons, then there is the unwelcome consequence that we can infer inductively from the most mundane pieces of empirical knowledge, or warranted empirical beliefs, that our perceptual belief-forming processes are reliable. Burge argues that it is not the nature of epistemic warrant that yields this unacceptable conclusion but instead a misunderstanding concerning the nature of adequate inductive inference. Finally, the fifth section argues at length against the view that conceptions of warrant like Burge’s imply unintuitive results in Bayesian confirmation theory (2020).

12. History of Philosophy

Finally, Burge has done sustained and systematic work on Frege. The work tends to be resolutely historical in focus. All but two of his articles on Frege are collected in Truth, Thought, Reason (2005). The others are Burge (2012) and (2013c). The latter article contains Burge’s fullest discussion of the relation between philosophy and history of philosophy.

The substantial introduction to Burge (2005) is by far the best overview of Burge’s work on Frege. The introduction contains not only a discussion of Frege’s views and how his collected essays relate to them, but also Burge’s most complete explanation of wherein his own views differ from Frege’s. The first essay provides a valuable, quite brief introduction to Frege and his work (2005a). The remaining essays are divided into three broad categories. The first discusses Frege’s views on truth, representational structure, and Frege’s philosophical methodology. The second category deals with Frege’s views on sense and cognitive value. Included in this category is the article that Burge believes is his philosophically most important article on Frege (1990). Finally, the third section of Burge’s collection of essays on Frege treats aspects of Frege’s rationalist epistemology. One of the articles on Frege that do not appear in Burge (2005) critically discusses an interpretation of Frege’s notion of sense advanced by Kripke; it also provides an extended discussion of the nature of incomplete understanding (2012). The other paper discusses respects in which Frege has influenced subsequent philosophers and philosophy (2013c).

Burge has also done historical work on Descartes, Leibniz, and Kant. Much of this work remains unpublished, save three articles. One traces the development and use of the notion of apriority through Leibniz, Kant, and Frege (2000). The other two discuss Descartes’s notion of mental representation, especially including evidence for and against the view that Descartes was an anti-individualist about representational states and events (2003d; 2007c).

13. Psychology

Much of Burge’s work on perception is also a contribution to the philosophy of psychology or even to the science of psychology itself (for example, 1991a; 2010; 2014a; 2014b). He was the first to introduce into philosophical discussion David Marr’s groundbreaking work on perception (Burge, 1986c). Burge himself has also published a couple of shorter pieces in psychology (2007g; 2011b).

In addition to this, Burge published a long article in Psychological Review (2018) that is not focused on perception. This article criticizes in detail the view, common among psychologists and some philosophers, that infants and nonhuman animals attribute mental states to others. The key to Burge’s argument is recognizing and developing a non-mentalistic and non-behavioristic explanatory scheme that centers on explaining action and action targets, but which does not commit itself to the view that relevant subjects represent psychological subject matters. The availability of this teleological, conative explanatory scheme shows that it does not follow, other things equal, from the fact that some infants and nonhuman animals represent actions and actors that they attribute mental states to these actors.

14. References and Further Reading

a. Primary Literature

i. Books

(2005). Truth, Thought, Reason: Essays on Gottlob Frege: Philosophical Essays, Volume 1 (Oxford: Oxford University Press).
(2007). Foundations of Mind: Philosophical Essays, Volume 2 (Oxford: Clarendon Press).
(2010). Origins of Objectivity (Oxford: Clarendon Press).
(2013). Cognition Through Understanding: Self-Knowledge, Interlocution, Reasoning, Reflection: Philosophical Essays, Volume 3 (Oxford: Clarendon Press).
(2021). Perception: First Form of Mind (Oxford: Oxford University Press).

ii. Articles

(1972). ‘Truth and Mass Terms’, The Journal of Philosophy 69, 263-282.
(1973). ‘Reference and Proper Names’, The Journal of Philosophy 70, 425-439.
(1974a). ‘Demonstrative Constructions, Reference, and Truth’, The Journal of Philosophy 71, 205-223.
(1974b). ‘Truth and Singular Terms’, Noûs 8, 309-325.
(1975). ‘On Knowledge and Convention’, The Philosophical Review 84, 249-255.
(1977). ‘Belief De Re’, The Journal of Philosophy 74, 338-362. Reprinted in Foundations of Mind.
(1979a). ‘Individualism and the Mental’, Midwest Studies in Philosophy 4, 73-121. Reprinted in Foundations of Mind.
(1979b). ‘Semantical Paradox’, The Journal of Philosophy 76, 169-198.
(1982). ‘Other Bodies’, in A. Woodfield (ed.) Thought and Object (Oxford: Oxford University Press, 1982). Reprinted in Foundations of Mind.
(1984). ‘Epistemic Paradox’, The Journal of Philosophy 81, 5-29.
(1986a). ‘Intellectual Norms and Foundations of Mind’, The Journal of Philosophy 83, 697-720. Reprinted in Foundations of Mind.
(1986b). ‘Cartesian Error and the Objectivity of Perception’, in P. Pettit and J. McDowell (eds.) Subject, Thought, and Context (Oxford: Oxford University Press). Reprinted in Foundations of Mind.
(1986c). ‘Individualism and Psychology’, The Philosophical Review 95, 3-45. Reprinted in Foundations of Mind.
(1986d). ‘Individualism and Self-Knowledge’, The Journal of Philosophy 85, 649-663. Reprinted in Cognition Through Understanding.
(1990). ‘Frege on Sense and Linguistic Meaning’, in D. Bell and N. Cooper (eds.) The Analytic Tradition (Oxford: Blackwell). Reprinted in Truth, Thought, Reason.
(1991a). ‘Vision and Intentional Content’, in E. LePore and R. Van Gulick (eds.) John Searle and His Critics (Oxford: Blackwell).
(1991b). ‘Frege’, in H. Burkhardt and B. Smith (eds.) Handbook of Ontology and Metaphysics (Munich: Philosophia Verlag). Reprinted in Truth, Thought, Reason.
(1992). ‘Philosophy of Language and Mind: 1950-1990’, The Philosophical Review 101, 3-51. Expanded version of the portion on mind in Foundations of Mind.
(1993a). ‘Content Preservation’, The Philosophical Review 102, 457-488. Reprinted in Cognition Through Understanding.
(1993b). ‘Mind-Body Causation and Explanatory Practice’, in J. Heil and A. Mele (eds.) Mental Causation (Oxford: Oxford University Press, 1993). Reprinted in Foundations of Mind.
(1996). ‘Our Entitlement to Self-Knowledge’, Proceedings of the Aristotelian Society 96, 91-116. Reprinted in Cognition Through Understanding.
(1997a). ‘Interlocution, Perception, and Memory’, Philosophical Studies 86, 21-47. Reprinted in Cognition Through Understanding.
(1997b). ‘Two Kinds of Consciousness’, in N. Block, O. Flanagan, and G. Güzeldere (eds.) The Nature of Consciousness (Cambridge, MA: MIT Press). Reprinted in Foundations of Mind.
(1998). ‘Reason and the First Person’, in C. Wright, B. Smith, and C. Macdonald (eds.) Knowing Our Own Minds (Oxford: Clarendon Press). Reprinted in Cognition Through Understanding.
(1999). ‘Comprehension and Interpretation’, in L. Hahn (ed.) The Philosophy of Donald Davidson (Chicago, IL: Open Court Press). Reprinted in Cognition Through Understanding.
(2000). ‘Frege on Apriority’, in P. Boghossian and C. Peacocke (eds.) New Essays on the A Priori (Oxford: Oxford University Press). Reprinted in Truth, Thought, Reason.
(2003a). ‘Logic and Analyticity’, Grazer Philosophische Studien 66, 199-249.
(2003b). ‘Memory and Persons’, The Philosophical Review 112, 289-337. Reprinted in Cognition Through Understanding.
(2003c). ‘Perceptual Entitlement’, Philosophy and Phenomenological Research 67, 503-548.
(2003d). ‘Descartes, Bare Concepts, and Anti-individualism’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
(2003e). ‘Mental Agency in Authoritative Self-Knowledge’, in M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press).
(2005a). ‘Frege’, in Truth, Thought, Reason.
(2007a). ‘Disjunctivism and Perceptual Psychology’, Philosophical Topics 33, 1-78.
(2007b). ‘Predication and Truth’, The Journal of Philosophy 104, 580-608.
(2007c). ‘Descartes on Anti-individualism’, in Foundations of Mind.
(2007d). ‘Postscript: “Individualism and the mental”’, in Foundations of Mind.
(2007e). ‘Reflections on Two Kinds of Consciousness’, in Foundations of Mind.
(2007f). ‘Postscript: “Belief De Re”’, in Foundations of Mind.
(2007g). ‘Psychology Supports Independence of Phenomenal Consciousness: Commentary on Ned Block’, Behavioral and Brain Sciences, 30, 500-501.
(2009a). ‘Five Theses on De Re States and Attitudes’, in J. Almog and P. Leonardi (eds.) The Philosophy of David Kaplan (New York: Oxford University Press).
(2009b). ‘Perceptual Objectivity’, The Philosophical Review 118, 285-324.
(2010a). ‘Steps toward Origins of Propositional Thought’, Disputatio 4, 39-67.
(2010b). ‘Modest Dualism’, in R. Koons and G. Bealer (eds.) The Waning of Materialism (New York: Oxford University Press). Reprinted in Cognition Through Understanding.
(2011a). ‘Self and Self-Understanding’: The Dewey Lectures. Presented in 2007. Published in The Journal of Philosophy 108, 287-383. Reprinted in Cognition Through Understanding.
(2011b). ‘Border-Crossings: Perceptual and Post-Perceptual Object Representation’, Behavioral and Brain Sciences 34, 125.
(2012). ‘Living Wages of Sinn’, The Journal of Philosophy 109, 40-84. Reprinted in Cognition Through Understanding.
(2013a). ‘Reflection’, in Cognition Through Understanding.
(2013b). ‘Postscript: Content Preservation’, in Cognition Through Understanding.
(2013c). ‘Frege: Some Forms of Influence’, in M. Beaney (ed.) The Oxford Handbook of the History of Analytic Philosophy (Oxford: Oxford University Press).
(2014a). ‘Adaptation and the Upper Border of Perception: Reply to Block’, Philosophy and Phenomenological Research 89, 573-583.
(2014b). ‘Perceptual Content in Light of Perceptual Consciousness and Biological Constraints: Reply to Rescorla and Peacocke’, Philosophy and Phenomenological Research 88, 485-501.
(2018). ‘Do Infants and Nonhuman Animals Attribute Mental States?’ Psychological Review 125, 409-434.
(2019). ‘Psychological Content and Ego-Centric Indexes’, in A. Pautz and D. Stoljar (eds.) Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness (Oxford: Oxford University Press).
(2020). ‘Entitlement: The Basis for Empirical Warrant’, in N. Pederson and P. Graham (eds.) New Essays on Entitlement (Oxford: Oxford University Press).

b. Secondary Literature

Two volumes of essays have been published on Burge’s work: M. Frápolli and E. Romero (eds.) Meaning, Basic Self-Knowledge, and Mind: Essays on Tyler Burge (Stanford, CA: CSLI Publications, 2003); and M. Hahn and B. Ramberg (eds.) Reflections and Replies: Essays on the Philosophy of Tyler Burge (Cambridge, MA: MIT Press, 2003). The second volume is nearly unique, among Festschriften, in that Burge’s responses make up nearly half of the book’s 470 pages. Further pieces include the following:
An article on Burge in The Oxford Companion to Philosophy, Ted Honderich (ed.) Oxford: Oxford University Press, 1995.
An article on Burge, in Danish Philosophical Encyclopedia. Politikens Forlag, 2010.
Interview with Burge. Conducted by James Garvey, The Philosophers’ Magazine, 2013—a relatively wide-ranging yet short discussion of Burge’s views.
Interview with Burge. Conducted by Carlos Muñoz-Suárez, Europe’s Journal of Psychology, 2014—a discussion focused on anti-individualism and perception.
Article on Burge, in the Cambridge Dictionary of Philosophy, Peter Graham, 2015.
Article on Burge, in the Routledge Encyclopedia of Philosophy, Mikkel Gerken and Katherine Dunlop, 2018—provides a quick overview of some of Burge’s philosophical contributions.
Article on Burge, in Oxford Bibliographies in Philosophy, Brad Majors, 2018—contains brief summaries of most of Burge’s work, together with descriptions of a small portion of the secondary literature.

Author Information

Brad Majors
Email: bradmajors9@gmail.com
Baker University
U. S. A.

Persistence in Time

No person ever steps into the same river twice—or so goes the Heraclitean maxim. Obscure as it is, the maxim is often taken to express two ideas. The first is that everything always changes, and nothing remains perfectly similar to how it was just one instant before. The second is that nothing survives this constant flux of change. Where there appears to be a single river, a single person or, more generally, a single thing, there in fact is a series of different instantaneous objects succeeding one another. No person ever steps into the same river twice, for it is not the same river, and not the same person.

Is the Heraclitean maxim correct? Is it true that nothing survives change, and that nothing persists through time? These ancient questions are still at the center of contemporary metaphysics. This article surveys the main contemporary theories of persistence through time, such as three-dimensionalism, four-dimensionalism and the stage view (§ 1), and reviews the main objections proposed against them (§ 2, 3, 4).

Theories of persistence are an integral part of the more general field of the metaphysics of time. Familiarity with other debates in the metaphysics of time, universals, and mereology is here presupposed and can be acquired by studying the articles ‘Time’, ‘Universals’, ‘Properties’, and ‘Material Constitution’ in this encyclopedia.

Theories of Persistence
Arguments against Endurantism
Arguments against Perdurantism
Arguments against Stage View
What Is Not Covered in this Article
References and Further Reading

1. Theories of Persistence

This section presents contemporary theories of persistence from their most basic (§ 1a) to their most advanced forms (§ 1b and § 1c). It then discusses some ways of making sense of temporal parts (§ 1d), the relation between theories of persistence and theories of time (§ 1e), and the topic of the persistence of events (§ 1f).

a. The Basics

While the Heraclitean maxim denies that anything survives change and persists through time, we normally assume that some things do survive change and do persist through time. This bottle of sparkling water, for example, was here five minutes ago, and still is, despite now being half empty. This notepad, for another example, will still exist tonight, even if I tear off some of its pages in the meantime. In other words, we normally assume some things to persist through time. But before wondering whether our assumptions are right or wrong, we should wonder: what is it for something to persist? Here is an influential definition, first introduced by David Lewis (1986, 202):

Persistence

Something persists through time if and only if it exists at various times.

So, the bottle persists through time, if it does at all, because it exists at various times (such as now as well as five minutes ago), and the notepad persists through time because it exists at various times (such as now as well as later tonight).

Lewis’ definition makes use of the notion of existence at a time. The notion is technical, but its intended meaning should be clear enough. The following intuitive gloss might help clarify it. Something exists at, and only at, those times at which it is, in some sense, present, or to be found. So, Socrates existed in 400 B.C.E. but not in 1905, while I exist in 2019, at all instants that make up 2019, but at no time before the date of my birth (on temporal existence: Sider 2001: 58-59).

Persistence through time is sometimes also called ‘diachronic identity’ (literally, ‘identity across time’). The reason for this name is simple enough. If this notepad exists now and will also exist afterwards, then there is a sense in which the notepad which exists now and the notepad that will exist later on are the same and identical. In which sense are they identical? What is the kind of identity here involved?

It is useful to introduce here a fundamental distinction between numerical and qualitative identity. On the one hand, numerical identity is the binary relation that anything bears to itself, and to itself alone (Noonan and Curtis 2018). For example, I, like everything else, am numerically identical to myself and to nothing else. Superman, for another example, is numerically identical to Clark Kent, and Augustus is numerically identical to the first Roman emperor. This relation is called ‘numerical identity’ because it bears directly on the number of entities that exist. If Superman is numerically identical to Clark Kent, then they are one entity, and not two. And if Superman is numerically different from Batman, then they are two entities, and not one. On the other hand, qualitative identity is nothing other than perfect similarity (Noonan and Curtis 2018). If two water molecules could have exactly the same mass, electrical charge, spatial configuration, and so on, so as to be perfectly similar, then they would be qualitatively identical. (It is controversial whether two entities can ever be perfectly similar; more on this later. Still, it is not difficult to find cases of perfect similarity. For example, an entity at a time is perfectly similar to itself at the same time.)

Having distinguished qualitative and numerical identity, what is, again, the sense of identity that is involved in diachronic identity? It is numerical identity. For recall: the question was whether, say, a river is a single—thus one—entity existing at different times, or rather a series of—thus many—instantaneous entities existing one after another.

Here is a second outstanding question that concerns persistence. Suppose that the Heraclitean maxim is wrong, and things persist through time. Do all things that persist through time persist in the same way? Or are there different ways of persisting through time? The consensus is that there are in fact several ways of persisting through time. In order to appreciate this fact, it is useful to contrast two kinds of entities that are supposed to persist, in one sense or another, through time: events and material objects. On the one hand, consider events. An event is here taken to be anything that is said to occur, happen, or take place (Cresswell 1986, Hacker 1982). Examples include a football match, a war, the spinning of a sphere, the collision of two electrons, the life of a person. Changes, processes, and prolonged states, if any, are notable examples of events. On the other hand, a material object can be thought of as the subject of those events, such as the football players, the soldiers, the sphere, the electrons and the person who lives. (For more on events see: What is an Event?)

Both material objects and events, or at least some of them, seem to persist through time. We have already discussed some examples involving objects, and it is equally easy to find examples of persisting events—basically, any temporally extended event would do. However, even if both objects and events seem to persist through time, they seem to do that in two different ways. An event persists through time by having different parts at different times. For example, a football match has two halves. These halves are parts of the match. But clearly enough they are not spatial parts of the match: they are not spread across different places, but across different times. That is why such parts are called ‘temporal parts’. The way of persisting of an event, by having different temporal parts at different times, is called ‘perdurance’ (Lewis 1986: 202).

Perdurance

Something perdures if and only if it persists by having different temporal parts at different times.

Throughout this article, ‘part’ means ‘proper part’, unless otherwise specified.

On the other hand, an object seems to persist in a different way. If an object persists through time, what is present of an object at different times is not a part of it, but rather the object itself, in its wholeness or entirety. This way of persisting, whereby something persists by being wholly present at different times, is called ‘endurance’ (Lewis 1986: 202). (‘Wholly present’ here clearly contrasts with the ‘partial’ presence of an event at different times—more on this later.)

Endurance

Something endures if and only if it persists by being wholly present at different times.

That being said, the contemporary debate on persistence focuses on material objects. In which way do they persist, if at all? A first theory, which takes the intuitions presented so far at face value, says that objects do indeed persist by being wholly present at different times, and so endure. (Endurantists include Baker (1997, 2000); Burke (1992, 1994); Chisholm (1976); Doepke (1982); Gallois (1998); Geach (1972a); Haslanger (1989); Hinchliff (1996); Johnston (1987); Lombard (1994); Lowe (1987, 1988, 1995); Mellor (1981, 1998); Merricks (1994, 1995); Oderberg (1993); Rea (1995, 1997, 1998); Simons (1987); Thomson (1983, 1998); van Inwagen (1981, 1990a, 1990b); Wiggins (1968, 1980); Zimmerman (1996).)

Endurantism

Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.

Endurantism is usually taken to be closer to common sense and favored by our intuitions. However, as we will see later, endurantism does not come without problems. Due to those problems, and inspired by the spatiotemporal worldview suggested by modern physics, contemporary philosophers have also taken seriously the idea that objects are four-dimensional entities spread out in both space and time. On this idea, objects divide into parts just as their spatiotemporal location does, and thus persist through time by having different temporal parts at different times, just as events do. This view is called perdurantism. (Perdurantists include Armstrong (1980); Balashov (2000); Broad (1923); Carnap (1967); Goodman (1951); Hawley (1999); Heller (1984, 1990); Le Poidevin (1991); Lewis (1986, 1988); McTaggart (1921, 1927); Quine (1953, 1960, 1970, 1981); Russell (1914, 1927); Smart (1963, 1972); Whitehead (1920).)

Perdurantism

Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.

Perdurantism is also known as ‘four-dimensionalism’—for perdurantism has it that objects are extended in four dimensions (this contrasts with endurantism, according to which objects are extended at most in the three spatial dimensions, and hence is also called ‘three-dimensionalism’).

Under perdurantism, what exists of me at each moment of my persistence is, strictly speaking, a temporal part of me. And each of my temporal parts is numerically different from all others.

One might be tempted to think that, as a consequence, perdurantism denies that I persist through time. This would be a mistake. While my instantaneous temporal parts do not persist—they exist at one time only—I am not any of those parts. I, as a whole person, am the temporally extended collection, or mereological sum, of all those parts. Hence, I, as a whole person, exist at different times, and thus persist. Compare this with the spatial case. I occupy an extended region of space by having different spatial parts at different places. But I am not numerically identical to those parts. I, as a whole, exist at different places in the sense that in those different places there is a part of me. That is why perdurance implies persistence through time.

We started this article with the question of whether objects persist through time. We have so far presented two theories, and both of them affirm that objects do persist through time. It is now time to introduce a third theory of persistence, which denies this claim and holds that, in place of seemingly persisting objects, there really is a series of instantaneous stages. This theory is called the ‘stage view’, or ‘exdurantism’. (Stage viewers include Hawley (2001), Sider (1996, 2001), Varzi (2003).)

Stage view

Ordinary material objects do not persist through time; in place of a single persisting object there really is a series of instantaneous stages, each numerically different from the others.

The stage view is often confused with perdurantism. The reason is that many contemporary stage viewers believe in a mereological doctrine called ‘universalism’, or ‘unrestricted fusion’. According to mereological universalism, given a series of entities, no matter how scattered and unrelated, there is an object composed of those entities (see Compositional Universalism). If we combine the stage view with universalism, we get an ontology in which the stages compose four-dimensional objects which are just like the four-dimensional objects of the perdurantist.

However, the two views are clearly distinct. Here are a few crucial differences. (i) There is, first, a semantic difference: under perdurantism, singular terms referring to ordinary objects, such as “Socrates”, usually refer to persisting, four-dimensional objects, whereas under the stage view, singular terms referring to ordinary objects refer to one instantaneous stage (which particular stage is referred to is determined by the context). So, while under the stage view there might be four-dimensional objects, so-called ordinary objects (such as Socrates) are not identified with them, but rather with the stages (Sider 2001, Varzi 2003). (It should be pointed out that significant work is here done by the somewhat elusive notion of ‘ordinary object’; see Brewer and Cumpa 2019.) (ii) A second crucial difference has to do with the metaphysical commitment to four-dimensional entities. While perdurantism is by definition committed to four-dimensional entities, the stage view is by definition only committed to the existence of instantaneous stages. If the stage viewer also believes in four-dimensional collections of those stages (and she might well not), such a commitment is not an essential part of her theory of persistence. (iii) A third interesting difference has to do with the metaphysical commitment to the instantaneous stages. While this commitment is built into the stage view, it is not built into four-dimensionalism (Varzi 2003). A four-dimensionalist might believe her temporal parts to be always temporally extended and deny the existence of instantaneous temporal parts (for example, because she believes that time is gunky). Incidentally, it is worth noting that, from a historical point of view, the guiding intuition of the stage view, namely that objects do not persist through time or change, emerged much earlier than the guiding intuition of four-dimensionalism. While the former can be traced back to, if not Heraclitus, at least the Academic skeptics (Sedley 1982), the latter, as far as we know, emerged no earlier than the end of the nineteenth century (Sider 2001).

b. Locative Theories of Persistence

Here are, again, the definitions of endurantism and perdurantism that we introduced above:

Endurantism

Ordinary material objects persist by being wholly present at different times; they are three-dimensional entities.

Perdurantism

Ordinary material objects persist by having different temporal parts at different times; they are four-dimensional entities.

Notice that these definitions seem to mix together two aspects of persisting objects (Gilmore 2008). First, there is the mereological aspect. There, the question is whether persisting objects have temporal parts or not. Second, there is an aspect that concerns the shape and size of persisting objects. There, the question is whether persisting objects have a four-dimensional shape, and are temporally extended, or have a three-dimensional shape, and are not extended in time. How can we make sense of these two aspects? What is it for something to be three- or four-dimensional? And how can we make sense of what a temporal part really is? While the latter question is tackled in § 1d, we shall now focus on the former question concerning shape and extension.

So, what is it for something to be three- or four-dimensional? An illuminating approach to this question—an approach that everyone who wants to work on persistence must be familiar with—comes from location theory (Casati and Varzi 1999, Parsons 2007). We shall thus focus on location first, and then come back to persistence.

Location is here taken to be a binary relation between an entity and a region of a dimension (be it space, time, or spacetime) where the entity is in some sense to be found (Casati and Varzi 1999). Location is ambiguous. There is a weak sense, in which you are located at any region that is not completely free of you. In that sense, for example, reaching an arm inside a room would be enough to be weakly located in that room. But there is also a more exact sense, in which you are located at that region of space that has your shape and size and that is as distant from everything else as you are: roughly, the region that is determined by your boundaries (Gilmore 2006, Parsons 2007). We shall here follow standard practice and call these modes of location ‘weak location’ and ‘exact location’, respectively.

The intuitive gloss related to exact location suggests that it is interestingly linked to shape, and thus offers us a way of making more precise sense of what it is for something to be three- or four-dimensional. To be four-dimensional simply is to be exactly located at a four-dimensional spacetime region, while to be three-dimensional is to be exactly located at spacetime regions that are at most three-dimensional. The same gloss helps us make sense of what it is for something to be extended or unextended in time. To be extended in time is for something to be exactly located at a temporally extended spacetime region, while for something to be temporally unextended is for it to be exactly located at temporally unextended spacetime regions only (Gilmore 2006).

At this point, it might be useful to sum up the two aspects mixed together in the definitions of endurantism and perdurantism offered above. We should distinguish: (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only.

Mereological endurantism

Ordinary persisting objects do not have temporal parts.

Mereological perdurantism

Ordinary persisting objects have temporal parts.

Locative three-dimensionalism

Ordinary persisting objects are exactly located at temporally unextended regions only.

Locative four-dimensionalism

Ordinary persisting objects are exactly located at the temporally extended region of their persistence only.

Let us explore locative three-dimensionalism further. In particular, we explore here two consequences of the view. First, locative three-dimensionalism has it that objects persist, thus covering a temporally extended region. But they persist by being exactly located at temporally unextended regions. This requires the persisting object to be located at more than one unextended region; more precisely, at all those unextended regions that collectively make up the spacetime region covered during its persistence. Hence, locative three-dimensionalism implies multi-location, that is, the fact that a single entity has more than one exact location (Gilmore 2007). This contrasts with the unique, four-dimensional, spatiotemporal location of an object under locative four-dimensionalism. (Two remarks are in order. First, there is logical space for other locative views as well, but we shall not consider them here. Second, these definitions make use of the notion of persistence, which can now be defined in locative terms as well. Here is a simple way of doing this. Let us define the path of an entity as the mereological sum of its exact locations (Gilmore 2006). An entity persists if its path is temporally extended.)
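The locative notions just introduced can be illustrated with a small toy model. The following is only a sketch under simplifying assumptions that are not part of the theories themselves: spacetime is modeled as a grid of (position, time) points, a region is a set of such points, and the mereological sum of regions is modeled as their set-theoretic union. All names in the code are hypothetical and chosen for illustration.

```python
from typing import FrozenSet, Set, Tuple

# Toy assumption: a spacetime point is a (position, time) pair on a discrete grid.
Point = Tuple[int, int]
Region = FrozenSet[Point]


def is_temporally_extended(region: Region) -> bool:
    """A region counts as temporally extended if it contains points at two or more times."""
    return len({t for (_, t) in region}) > 1


def path(exact_locations: Set[Region]) -> Region:
    """The path of an entity: the sum (here modeled as the union) of its exact locations."""
    return frozenset().union(*exact_locations)


def persists(exact_locations: Set[Region]) -> bool:
    """An entity persists if its path is temporally extended."""
    return is_temporally_extended(path(exact_locations))


def is_multi_located(exact_locations: Set[Region]) -> bool:
    """An entity is multi-located if it has more than one exact location."""
    return len(exact_locations) > 1


# A bottle occupying positions 0 and 1 at each of the times 0, 1 and 2.
instantaneous_slices = {frozenset({(0, t), (1, t)}) for t in range(3)}

# Locative three-dimensionalism: one exact location per instant.
three_d_bottle = instantaneous_slices

# Locative four-dimensionalism: a single, temporally extended exact location.
four_d_bottle = {path(instantaneous_slices)}

print(persists(three_d_bottle), is_multi_located(three_d_bottle))
print(persists(four_d_bottle), is_multi_located(four_d_bottle))
```

In this model both bottles persist and have the same path, but only the three-dimensional bottle is multi-located, which mirrors the contrast drawn in the text.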

A second interesting consequence of the view is that, under plausible assumptions, persisting objects will not have temporal parts, for what exists of an entity at a time is the entity itself, exactly located at that time, and not a temporal part thereof. So, under plausible assumptions, locative three-dimensionalism implies mereological endurantism: if something is three-dimensional it does not have temporal parts.

Interestingly, however, being multi-located at instants is not the only way to persist without temporal parts. In principle, something might be exactly located at a four-dimensional, temporally extended spacetime region without dividing into temporal parts. This is the case if the persisting, four-dimensional object is also an extended simple, that is, an entity that is exactly located at an extended region, but is also mereologically simple, in that it lacks any parts (for more on the definition, possibility and actuality of extended simples, see Hudson 2006, Markosian 1998, McDaniel 2003, 2007a, 2007b, Simons 2004). Lacking any parts at all, the persisting object will also lack any temporal parts, thus being mereologically enduring. We shall call this combination of mereological endurantism and locative four-dimensionalism ‘simplism’ (Costa 2017, Parsons 2000, 2007).

Simplism

Ordinary persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence only.

To sum up, making use of some conceptual tools borrowed from location theory allowed us to make better sense of perdurantism and its claim that persisting objects are four-dimensional, temporally extended entities. Moreover, it allowed us to distinguish two forms of endurantism, namely locative three-dimensionalism according to which persisting objects are exactly located at instantaneous, three-dimensional regions of spacetime, and thus lack temporal parts, and simplism, according to which persisting objects are four-dimensional, temporally extended, mereological simples, and thus lack temporal parts.

c. Non-Locative Theories of Persistence

The previous section described two radically different ways of capturing endurantism. Interestingly enough, both of them seem to be committed to controversial claims, such as the actuality of multi-location or of extended simples. Of course, any objection against the actuality of multi-location and of extended simples counts de facto also as an objection against either form of endurantism. We cover some of these objections below. For the time being, suffice it to say that both forms of endurantism are controversial.

Some scholars have taken this result as evidence that endurantism is hopeless (Hofweber and Velleman 2011). But others have taken it as a reason to look for other ways of making sense of endurantism (Fine 2006, Hawthorne 2008, Hofweber and Velleman 2011, Costa 2017, Simons 2000a). So far, we have worked under the standard assumption that it is useful and correct to try to make sense of endurantism in locative terms, that is, under the assumption that the relation between objects and times is the one described in location theory. Some scholars take this assumption to be fundamentally misguided.

Why do they believe this assumption to be fundamentally misguided? One reason might come from intuitions embedded in natural language. Fine (2006), for instance, provides linguistic data in support of the idea that objects and events are in time in fundamentally different ways, which he calls ‘existence’ and ‘extension/location’, respectively (he also offers linguistic data in support of the idea that objects and events are in space in the same way in which events are in time). Moreover, he suggests that these two radically different forms of presence might come with different mereological requirements: if something is extended/located at a region, it divides into parts throughout that region, while if something exists at an extended region, it does not divide into parts throughout that region. Since objects are taken to exist at times instead of being extended/located at times, they will not divide into temporal parts.

Another source of evidence from natural language comes from the attribution of temporal relations (van Fraassen 1970). The intuitive gloss for exact location requires any temporally located entity to enter into temporal relations. However, it is awkward to attribute temporal relations to objects (consider “Alexander is 15 years after Socrates”) and we would naturally lean towards reinterpreting such attributions as attributions of temporal relations to events (“Alexander’s birth is 15 years after Socrates’ death”). These linguistic data might suggest two intuitions. The first one is that the relation between objects and times should not be the location of location theory. The second one is that the way in which objects are in time is derivative with respect to their events: for an object to exist at a time is for it to be the subject of an event located at that time. Under such a view, the possibility of endurantism coincides with the possibility for a single object to participate in numerically different events (Costa 2017, Simons 2000a).

A different non-locative approach consists in trying to make sense of the endurantism/perdurantism distinction in terms of what is intrinsic to a time (Hawthorne 2006, Hofweber and Velleman 2011). According to this approach, something is wholly present at a time if it is intrinsic to how things are at that time that that very object exists at it (Hawthorne 2006) or if the identity of that object is intrinsic to that time (Hofweber and Velleman 2011). These definitions of ‘wholly present’ are then plugged into the classic definition of endurance: something endures if it is wholly present at each time of its persistence.

Apart from being grounded in natural language and intuitions, such views have been motivated by the controversial character of their alternatives. Since both locative forms of endurantism are controversial, these non-locative views should be taken seriously.

d. What is a Temporal Part?

A notion that plays a fundamental role in the definition of perdurantism is the notion of a temporal part. Endurantists have sometimes complained that the notion is substantially unintelligible (van Inwagen 1981, Lowe 1987, Simons 1987). Hence, it is in the interest of perdurantists to try to clarify it (as it is in the interest of those endurantists who believe that events perdure).

What is a temporal part, such as my present temporal part, supposed to be? First of all, it should be clear that a temporal part is not simply a part that is in time. A spatial part of me, such as my left hand, is certainly not outside time, but it is not a temporal part of mine. It is not, because it is not, in a sense, big enough: a temporal part of mine at a given time must be as big as I am at that time. So, one might be tempted to define a temporal part as a part that is as big as the whole is at the time at which the part is supposed to exist. Moreover, the notion of ‘being as big as’ might be spelled out in terms of spatial location. However, this definition would not do if there are perduring entities that are not in space (such as, for example, a Cartesian mind, or a mental state conceived of as a non-spatial event) or if there are parts of objects that are as big as the object is at a time without being temporal parts of it, such as, for example, the shape trope of my body conceived of as something spatially located and as a part of me (Sider 2001). (For tropes and for located properties, see: The Ontological Basis of Properties.)

Sider (2001) offers a standard definition of a temporal part. It reads:

Temporal part

x is a temporal part of y at t if and only if (i) x is a part of y at t; (ii) x exists at, and only at, t; and (iii) x overlaps at t everything that is part of y at t.

Let us have a look at each clause in turn. The first one simply says that temporal parts must be parts. The second one ensures that the temporal part exists at the relevant time only. The third one ensures that it includes all of y that exists at that time. (The reader might have noticed that Sider is here using the temporary, three-place notion of parthood (‘x is part of y at t’) instead of the familiar, binary, timeless notion (‘x is part of y’). Here, by ‘timeless’ we simply mean that the notion is not relativized to a time, and not that what exemplifies the notion is in any sense timeless, or outside time. The use of the temporary notion is conceived as a friendly gesture towards the endurantist, who usually relativizes the exemplification of properties to times; more on this in § 2a. However, temporal parts might be defined by means of the binary, timeless notion as well. One just needs to replace in the previous definition every instance of the temporary notion with the binary one, and to replace the third clause with (iii*) x overlaps every part of y that exists at t. A second note concerns the fact that Sider’s definition is supposed to work for instantaneous temporal parts. A crucial question then is how, and whether, this definition could be adapted to a metaphysics in which time is gunky (see Kleinschmidt 2017).)
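As a rough formal sketch, the three clauses and the timeless variant can be written in first-order notation. The predicate letters are notational assumptions introduced here and are not Sider’s own symbols: E(x, t) stands for ‘x exists at t’, P(x, y, t) for ‘x is part of y at t’, P(x, y) for timeless parthood, and O for overlap, understood as sharing a part (with parthood taken in the inclusive sense for this purpose). Reading the definition as a biconditional is likewise an assumption of the sketch.

```latex
% Requires the amsmath package.
\begin{align*}
O(x,z,t) \;&:\leftrightarrow\; \exists w\,\bigl(P(w,x,t) \land P(w,z,t)\bigr)\\
\mathrm{TP}(x,y,t) \;&:\leftrightarrow\; P(x,y,t)
   \;\land\; \forall t'\,\bigl(E(x,t') \leftrightarrow t' = t\bigr)
   \;\land\; \forall z\,\bigl(P(z,y,t) \rightarrow O(x,z,t)\bigr)\\[6pt]
O(x,z) \;&:\leftrightarrow\; \exists w\,\bigl(P(w,x) \land P(w,z)\bigr)\\
\mathrm{TP}^{*}(x,y,t) \;&:\leftrightarrow\; P(x,y)
   \;\land\; \forall t'\,\bigl(E(x,t') \leftrightarrow t' = t\bigr)
   \;\land\; \forall z\,\bigl((P(z,y) \land E(z,t)) \rightarrow O(x,z)\bigr)
\end{align*}
```

The first pair of lines renders clauses (i) to (iii) with the temporary, three-place notion of parthood; the second pair renders the timeless variant, in which the last conjunct corresponds to clause (iii*).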

e. Theories of Persistence and Theories of Time

One of the central debates of contemporary metaphysics is the debate as to whether only the present exists, or rather past, present and future all equally exist (Sider 2001). The former view is called ‘presentism’, whereas the latter is called ‘eternalism’ (for more on presentism and eternalism, as well as further alternatives, see: Presentism, the Growing-Past, Eternalism, and the Block-Universe). What are the logical relations between endurantism/perdurantism and presentism/eternalism?

While the combinations of endurantism and presentism, and of perdurantism and eternalism, have usually been accepted as possible (for example Tallant 2018), it was long supposed that endurantism and eternalism were incompatible with each other. The reasons for this supposed incompatibility are difficult to track down. In brief, here are two possible reasons. In part, this supposed incompatibility has to do with the so-called problem of temporary intrinsics. In part, it has to do with the idea that eternalism, when combined with spacetime unitism, yields a view in which persisting objects cover a four-dimensional region of spacetime, and thus are four-dimensional and divide into temporal parts (Quine 1960, Russell 1927). Such reasons are now usually discarded. We focus on temporary intrinsics later in § 2a. As for the second reason, we have already explained that there are at least two ways in which an object might cover a four-dimensional region of spacetime without dividing into temporal parts: by being four-dimensional and lacking temporal parts (simplism), or even without being four-dimensional itself (locative three-dimensionalism). Apart from these locative options, we have also remarked that there are non-locative theories of persistence, and that such theories require the rejection of spacetime unitism. If unitism is successfully rejected, then the problem, if there is one at all, seems not to present itself in the first place.

Can one be a perdurantist and also a presentist? A few publications have been devoted to this question, though no conclusive answer has been reached (Benovsky 2009, Brogaard 2000, Lombard 1999, Merricks 1995). On the one hand, one might believe that nothing can be composed of temporal parts if all except one of those parts (namely the past and future ones) do not exist. On the other hand, it has been suggested that one might solve the problem by means of a careful use of tense operators: while past temporal parts are not presently part of our ontological catalogue, they were, and maybe their past existence is enough to entitle them to be parts of a perduring whole.

f. The Persistence of Events

Although contemporary metaphysicians focus mainly on the persistence of objects, there are also parallel debates concerning the persistence of other kinds of entities, such as tropes, facts, dimensions, and, in particular, events (Galton 2006, Stout 2016). Events are traditionally taken to perdure, for it is intuitively the case that an event, such as a football match, divides into temporal parts, such as its two halves. This claim is also accepted by several endurantists, who believe that while objects endure, events perdure. Such a view traces back at least to medieval scholasticism (Costa 2017a). But, once again, the traditional view does not come without dissenters. Contemporary scholars have defended the idea that events or, more precisely, processes endure (Galton 2006, Galton and Mizoguchi 2009, Stout 2016). One reason to believe that at least some entities that are said to be happening endure comes from the fact that we attribute change to them, and that, allegedly, genuine change requires endurance of its subject (Galton and Mizoguchi 2009: 78-81). For example, the very same process of walking might have different speeds at different times. But for change to occur, the numerically same subject, and not temporal parts thereof, must have incompatible properties at different times (heterogeneity of parts is not enough for change to occur). Hence, changing processes must endure. Defenders of enduring processes usually believe that alongside enduring processes there are also perduring events, and sometimes claim that enduring processes are picked out by descriptions that make use of imperfective verbs (such as the walking that is/was/will be happening) while perduring events are picked out by descriptions that make use of perfective verbs (such as the walking that happened/will happen) (Stout 1997: 19). To learn more about the question of whether change requires the endurance of its subject, see the No Change objection against perdurantism, discussed below in § 3b. To learn more about the alleged distinction between processes and events and the related use of (im)perfective verbs, see (Steward 2013, Stout 1997).

2. Arguments against Endurantism

Endurantism has it that objects persist by being wholly present at each instant of their persistence. Thus conceived, objects persist without having temporal parts. Endurantism is usually recognized as the theory of persistence that is closest to common sense and intuition, and thus has sometimes been described as the default view, that is, the view to be held until or unless it is convincingly shown to be hopelessly problematic. So, is endurantism hopelessly problematic?

a. The Argument from Change, a.k.a. from Temporary Intrinsics

A first serious objection against endurantism, which traces back to ancient philosophy (Sedley 1982), comes from change. In its simplest form, the objection runs as follows. Change seems to require difference: if something has changed, it is different from how it was. But if it is different, it cannot be identical, on pain of contradiction. Now, endurantism requires a changing thing to be identical across change; hence, the objection goes, endurantism is false. In this simple form, the objection has a simple answer that relies on the distinction between qualitative and numerical identity outlined in § 1a. The kind of difference required by change is qualitative difference (not being perfectly identical), and not numerical difference (being two instead of one). Hence, in a change, you might be the same as before (numerical identity) as well as different from before (qualitative difference) without this being contradictory.

This basic argument from change can evolve into two slightly more sophisticated forms. The first form aims to show that even if this analysis of change as numerical identity and qualitative difference is offered, change still results in a contradiction. For change requires a single object (Socrates, say) to have incompatible properties, such as being healthy and being sick. But of course, exemplification of incompatible properties leads to a contradiction. For whoever is sick is not healthy, and hence the numerically same Socrates must be both healthy and not healthy (Sider 2001).

The second slightly more sophisticated form aims to show that change is incompatible with Leibniz’s law, also called the Indiscernibility of Identicals. Leibniz’s law says that numerically identical entities must share all properties. But change thus described is incompatible with Leibniz’s law, for it requires the numerically same entity (such as Socrates at one time and Socrates at another time) not to share all properties: while Socrates at one time is sick, at a later time he is not (Merricks 1994, Sider 2001).

One way to block these two more sophisticated forms consists in rejecting the two guiding principles they rely on. But while this could perhaps more lightheartedly be done with Leibniz’s law, rejecting the Law of Non-contradiction, though not impossible (see Paraconsistent Logic), is certainly not an obviously promising move.

A second way to block these two more sophisticated forms consists in bringing time into the picture. A genuine contradiction and a genuine violation of Leibniz’s law would only result from the possession of incompatible properties at the same time. But the incompatible properties of a change are had at the two ends of the change, and hence at two different times.

While this move certainly sounds promising, it is not obvious how time really comes into the picture. Here are two outstanding questions. The first one has to do with the Law of Non-contradiction and Leibniz’s law. When we first introduced them, we did not mention time at all. And in contemporary logic and metaphysics, the two laws are expressed in formulas in which time seems to play no role:

Law of Non-Contradiction (LNC)

¬ (p ∧ ¬p)

Leibniz’s law (LL)

x = y → ∀P (Px ↔ Py)

Do such principles require a modification in light of the claim that incompatible properties are had at different times?

The second outstanding question has to do with the claim that a changing object has incompatible properties at different times. This seems to require objects to exemplify properties at times. But how is this temporary, or temporally relative, notion of exemplification to be understood (for example, Socrates is sick at time t), especially as opposed to the timeless notion of exemplification (for example, Socrates is sick) (Lewis 1986)? (Once again, here, by “timeless” we simply mean that the notion is not relativized to a time, and not that what exemplifies the notion is in any sense timeless, or outside time.)

Let us begin with the latter question. What is it for an object to have a property at a time? What is it for, say, Socrates to be sick at time t? To have a look at the other side of the barricade, perdurantism and the stage view seem to have very simple answers to this question. Under the stage view, temporary exemplification is to be analyzed as timeless exemplification by an instantaneous stage: Socrates is sick at t if and only if the instantaneous stage we call Socrates that exists at t is sick (Hawley 2001, Sider 1996, Varzi 2003). Under perdurantism, temporary exemplification is to be analyzed as timeless exemplification by a temporal part: Socrates is sick at t if and only if the temporal part of Socrates that exists at t is sick (Lewis 1986: 203-204, Russell 1914, Sider 2001: 56). So, under perdurantism and the stage view, temporary exemplification is analyzed as timeless exemplification, and therefore there is no need to adapt LNC or LL in any way: the original timeless reading will do.
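The two analyses can be set side by side in schematic form. This is only a sketch: the constant s (for Socrates) and the predicates SickAt, Sick, Stage, and TempPart are labels introduced here for illustration, and are not notation taken from the cited literature.

```latex
% Assumed notation (illustrative only; requires amsmath):
%   \mathrm{Stage}(x, s, t)    : x is the instantaneous stage called Socrates that exists at t
%   \mathrm{TempPart}(x, s, t) : x is the temporal part of Socrates that exists at t
%   \mathrm{Sick}(x)           : x is sick (timeless exemplification)
%   \mathrm{SickAt}(s, t)      : Socrates is sick at t (temporary exemplification)
\begin{align*}
\text{Stage view:} \quad & \mathrm{SickAt}(s, t) \leftrightarrow \exists x\, \bigl(\mathrm{Stage}(x, s, t) \wedge \mathrm{Sick}(x)\bigr) \\
\text{Perdurantism:} \quad & \mathrm{SickAt}(s, t) \leftrightarrow \exists x\, \bigl(\mathrm{TempPart}(x, s, t) \wedge \mathrm{Sick}(x)\bigr)
\end{align*}
```

In both cases the right-hand side contains only the timeless predicate, and what is sick at one time and healthy at another are different stages or different temporal parts. That is why the timeless readings of LNC and LL can be left as they stand.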

How would an endurantist make sense of temporary exemplification (of, say, Socrates being sick at time t)? We shall here consider a few options. First, notice that if presentism is true, the endurantist too might analyze it in terms of timeless exemplification (Merricks 1995). If t were present, then “Socrates is sick at t” would simply reduce to “Socrates is sick”, full stop. If t were past/future, then “Socrates is healthy at t” would reduce to “Socrates is healthy” under the scope of an appropriate tense operator, such as: “it was 5 years ago the case that: Socrates is healthy” (for tense operators, see: The Syntax of Tempo-Modal Logic). Moreover, since we cannot infer from “it was 5 years ago the case that: Socrates is healthy” that “Socrates is healthy”, no contradiction or violation of LL follows. However, this solution requires the endurantist to buy presentism.
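The presentist treatment can be sketched with a metric tense operator. The symbol P with subscript 5, read as “it was 5 years ago the case that”, and the predicate letters are assumptions made here for illustration. We also assume, with the text, that being sick excludes being healthy.

```latex
% Illustrative sketch; requires amsmath and amssymb (for \nvdash).
% \mathsf{P}_{5}\,\varphi : it was 5 years ago the case that \varphi (assumed notation)
\begin{align*}
& \mathrm{Sick}(s) \wedge \mathsf{P}_{5}\,\mathrm{Healthy}(s)
  && \text{(what the change amounts to)} \\
& \mathsf{P}_{5}\,\varphi \nvdash \varphi
  && \text{(the tense operator cannot be dropped)} \\
& \mathrm{Sick}(s) \wedge \mathrm{Healthy}(s) \text{ is not derivable}
  && \text{(so no instance of } p \wedge \neg p \text{ follows)}
\end{align*}
```

The only untensed claim about Socrates is that he is sick. His health occurs solely within the scope of the past-tense operator, so no two incompatible properties are ever ascribed to him outright.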

Second, an endurantist might interpret “Socrates is sick at t” as involving a binary relation (the relation of “being sick at”) linking Socrates and time t (Van Inwagen 1990a, Mellor 1981). This solution does not require us to make any change to the timeless formulations of LNC and LL (it just follows that the relevant instances of LNC and LL will involve relations rather than properties). And, of course, no violation of LNC or LL would follow, insofar as Socrates’ being sick and healthy would be two incompatible relations involving different relata (compare: no contradiction follows from the fact that I love Sam and I do not love Maria). However, this requires a good deal of metaphysical revisionism. To put it in Lewis’ words, if we know what health is, we know it is a monadic property and not a relation, and we know it is intrinsic and not extrinsic (Lewis 1986: 204) (for intrinsic properties, see: Intrinsic and Extrinsic Properties).
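On the relational reading, the relevant instances of the two laws can be written out as follows. The relation symbol SickAt and the variables are labels assumed here for illustration only, and “not sick at” stands in for “healthy at”.

```latex
% Illustrative sketch; requires amsmath. t and t' are distinct times.
\begin{align*}
& \mathrm{SickAt}(s, t) \wedge \neg\,\mathrm{SickAt}(s, t')
  && \text{(the change, with } t \neq t'\text{)} \\
& \neg\bigl(\mathrm{SickAt}(s, t) \wedge \neg\,\mathrm{SickAt}(s, t)\bigr)
  && \text{(instance of LNC: same time in both conjuncts)} \\
& s = s \rightarrow \forall R\, \forall u\, \bigl(R(s, u) \leftrightarrow R(s, u)\bigr)
  && \text{(instance of LL in relational form)}
\end{align*}
```

The first line is consistent with the other two because its conjuncts concern different times, just as loving Sam and not loving Maria concern different people.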

Third, an endurantist might interpret “at t” as an adverbial modifier: when Socrates is sick at t, he exemplifies the property in a certain way, namely t-ly (Johnston 1987, Haslanger 1989, Lowe 1988). If this view of temporary exemplification is accepted, we should also consider more carefully how the original formulations of LNC and LL should be adapted, for the exemplification they involve is temporally unmodified. The task might be more complicated than one might expect (Hawley 2001, 21f). In any case, under certain assumptions, this adverbialist solution makes it the case that change implies no violation of LNC or LL: Socrates is sick and healthy, but in two different ways—t-ly and t’-ly (compare: the fact that I am actually sitting and possibly standing does not imply a contradiction). But, once again, this involves a certain amount of revisionism. For while adverbial modifiers correspond to different ways of exemplifying an attribute, temporal modifiers seem not to correspond to different ways of exemplifying an attribute: for example, standing on Monday and standing on Tuesday seem not to be two different ways of standing.

There are other strategies that the endurantist might use to make sense of temporary exemplification. This is not the place to go through all of them. However, it is worth noting that even if all of them require a bit of revisionism, the endurantist might actually argue that the kind of revisionism they involve is less nefarious than the revisionism required to reject endurantism itself (Sider 2001, 98).

b. The Argument from Coincidence

A second objection against endurantism comes from cases in which material objects seem to mereologically coincide (that is, share all parts) and locatively coincide (that is, share the same location) without being numerically identical. If there are such cases, the objection goes, endurantists have a hard time making sense of them, while their alleged problematicity simply disappears if perdurantism or the stage view is assumed (Sider 2001).

What is so bad about mereological and locative coincidence? To start with locative coincidence, it just seems wrong that two numerically different material objects could fit exactly into a single region of space: instead of occupying the same place, they would just bump into each other. It might be the case that some particular kinds of microphysical particles, such as bosons, allow for this kind of co-location (Hawthorne and Uzquiano, 2011). It might also be the case that in some other possible world, with a different set of laws of nature, objects would not bump into each other, but rather pass through each other unaffected, and thus allow for co-location (Sider 2001). However, the ordinary middle-sized objects that populate our everyday life simply do not: they cannot share the same exact location.

Let us now turn to mereological coincidence. What is so bad about it? Suppose x and y share all parts at the same time. If they do, they will surely also happen to be spatially co-located. But if that is the case and they are numerically different, what could account for their numerical difference? What makes them different one from the other, if they have the same parts and the same location? Moreover, contemporary standard mereology—that is, classical extensional mereology—implies that no two objects can share all parts, a principle called ‘extensionality’ (Simons 1987; Varzi 2016).

Let us now consider two possible examples of mereological and locative coincidence. The first one is the case of a statue of Socrates and the lump of clay it is made of. As long as the statue exists, the statue and the lump of clay coincide both mereologically and locatively: they are exactly located at the same spatial region and they share all parts. And yet, there are reasons to believe they are numerically different. For instance, they have different properties. They indeed have different temporal properties: the clay, but not the statue, has the property of existing at times before the statue was created. And they seem to have different modal properties as well: only the clay, and not the statue, can continue to exist even if the clay gets substantially reshaped into, say, a statue of Plato. Since the statue and the lump of clay have different properties, we must conclude that they are numerically different, in virtue of Leibniz’s law.

A second case of coincidence without identity involves Tibbles the cat. Like any other cat, Tibbles has a long furry tail. The tail is part of Tibbles just as much as the rest of Tibbles (call it Tib) is. Tib is a proper part of Tibbles, and hence they are numerically different. But suppose that Tibbles loses her tail. It seems that both Tibbles and Tib would survive the accident. After all, cats do not die when losing their tails; and nothing actually happened to Tib when Tibbles lost her tail, so there is no reason to believe that Tib stopped existing. However, after the accident, Tibbles and Tib end up sharing the same exact location and end up sharing all parts. Hence, the case of Tibbles and Tib is yet another case of coincidence without identity.

Is it really the case that the statue is not the lump of clay, and Tibbles is not Tib? These claims might be resisted. For example, if identity is temporary—if x might be identical with y at one time and different at another—then one might say that even if before the accident Tibbles and Tib were different, after the accident they are identical (Gallois 1998, Geach 1980, Griffin 1977). However, this move does not come for free. Serious arguments have been offered to the effect that identity is not a temporary relation (Sider 2001: 165ff, Varzi 2003: 395).

A different option consists in saying that the statue is nothing else than the lump of clay as long as it possesses the property of being arranged statue-of-Socrates-wise (just like Socrates the philosopher is nothing else than Socrates who possesses the property of being a philosopher, and certainly not a second person on top of Socrates). In that case, the statue and the lump of clay would not be numerically different (Heller 1990). However, unlike in the case of Socrates becoming a philosopher, it seems that when we create a statue, we have not merely changed something that existed before. Rather, it seems that we have created something that did not exist before.

How do perdurantism and the stage view solve the problem of coincidence? Let us start with perdurantism. According to perdurantism, the statue and the piece of clay are four-dimensional objects composed of temporal parts. During the existence of the statue, they might well mereologically and locatively coincide. But since the lump of clay existed before, and will exist after, the statue, the lump has some temporal parts that the statue does not have. Hence, mereologically speaking they do not overall coincide (in fact, from the perdurantist, four-dimensional, perspective, the 4D statue is a part of the 4D lump of clay). Moreover, from a locative point of view, since the lump exists at times at which the statue does not, their spatiotemporal location is not the same. For sure, their spatial location might sometimes be the same; but this is as it should be: if you consider the exact spatial location of your hand, at that location, you and your hand coincide locatively. The same holds for Tibbles and Tib, for they do not mereologically coincide. Tibbles’ tail is a four-dimensional object that only Tibbles, and not Tib, contains as a part (Varzi 2003: 398). On the other hand, the stage viewer, who identifies ordinary objects with stages, will claim that after the creation of the statue, the statue and the piece of clay are numerically identical. Then, she will benefit from the flexibility of the temporal counterpart relation to make sense of the alleged different properties of the statue and the clay. The present clay will outlast the statue not because it will persist for a longer time—the statue is an instantaneous object, it does not persist—but because it has a clay-counterpart at times which are later than the times at which it has its last statue counterpart. The stage viewer will probably adopt a similar answer in the modal case as well. 
To illustrate, the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay has a clay-counterpart but not a statue-counterpart (Sider 2001: 194).

What can an endurantist say in cases of coincidence without identity? A first option could be to just bite the bullet: the statue and the piece of clay are indeed numerically different and indeed mereologically and locatively coincident. However, the endurantist will not want to just accept without qualification that different objects can thus coincide. Of course, she will agree, in normal circumstances different objects cannot thus coincide. She will then try to tell apart in a principled way the special cases that allow for coincidence from the normal cases which do not. One popular attempt to trace this difference in a principled way has to do with the notion of constitution. There is a sense, the idea goes, in which the clay constitutes the statue, and in which after the accident Tib constitutes Tibbles. These selected cases in which constitution is in play warrant the possibility—if not the necessity—of mereological and locative coincidence. This endurantist solution to the problem of coincidence is sometimes called the ‘standard account’ (Burke 1992, Lowe 1995). Of course, the standard account does not come for free. It requires one to adopt a theory of mereology different from classical extensional mereology, and a theory of location that allows for co-location, and this might seem to be a drawback in itself. Moreover, a proponent of such a view still has to tell a story on what she takes constitution to be. A much-discussed option is to make sense of constitution in terms of mutual parthood: the statue is part of the clay, and the clay is part of the statue (we are here using the technical notion of proper or improper part, which has numerical identity as a limit case; see Mereological Technicalities). Apart from requiring a substantial revision of even the most endurantist-friendly theories of mereology, appealing to mutual parthood is not yet enough to make sense of constitution. 
Mutual parthood is symmetrical, while friends of constitution take constitution to be asymmetrical: the statue is constituted by the clay, but not vice versa (Sider 2001: 155-156). Contemporary neo-Aristotelianism might come to the rescue in answering this question (Fine 1999; Koslicki 2008): constitution might be defined in terms of grounding (for example, one might say that the existence or nature of the clay grounds the existence or nature of the statue) or in hylomorphic terms (the statue is a compound of matter and form, and the clay is its matter).

Further endurantist solutions, to mention a few, include taking identity to be temporary (Gallois 1998, Geach 1980), embracing mereological essentialism (namely the view that changing parts results in the end of persistence; this would help with the case of Tibbles, but not with the case of the clay, which does not necessarily change its parts when arranged into a statue; see Burke 1994, Chisholm 1973, 1975, van Cleve 1986, Sider 2001, Wiggins 1979), or mereological nihilism (namely the view that there are only mereologically atomic, that is, partless, objects, so that most if not all of the entities involved in the cases are not part of one’s ontological catalogue; see van Inwagen 1981, 1990a, Rosen and Dorr 2002, Sider 2013).

Apart from trying to respond to the objection, an endurantist could also send the ball back to the opposite camp and argue that the solution proposed by the perdurantist does not apply in all cases. In the original cases, coincidence was only temporary: there were times at which the two objects did not coincide, either because one did not yet exist (the statue) or because one had a part that the other did not have (Tibbles and her tail). But what about cases in which coincidence is permanent? Consider for example the case in which an artist creates both the statue and the lump of clay at the same time and later on destroys them at the same time. In such a case, the perdurantist’s solution seems to be precluded, for the statue and the piece of clay will share all their temporal parts, so they will end up mereologically and spatiotemporally coinciding (Gibbard 1975, Hawley 2001, Mackie 2008, Noonan 1999). When confronted with such a case, a perdurantist might be forced to accept one of the endurantist’s solutions, and thus will no longer be entitled to declare her position better off than endurantism. Notice, though, that the perdurantist might actually reply that permanent coincidence does indeed result in numerical identity. After all, if coincidence is permanent, we have lost one of the two reasons to believe that the statue and the piece of clay are numerically different, namely that they existed at different times. Moreover, as regards the difference in modal properties, the perdurantist might just accept the aforementioned solution: the claim that the clay, and not the statue, can survive reshaping translates into the claim that in a possible world in which the clay is reshaped, the actual clay, numerically identical to the statue, has a clay-counterpart but not a statue-counterpart (Hawley 2001).
Finally, notice that the problem of permanent coincidence is no problem at all for the stage viewer, who did not appeal to a difference in temporal parts between the statue and the piece of clay to explain coincidence away (Sider 2001).

c. The Argument from Vagueness

A third objection against endurantism comes from the phenomenon of temporal vagueness. Suppose a table is gradually mereologically decomposed: slowly, from top to bottom, one by one, each of the atoms composing it is taken away until, finally, nothing of the table remains. At the end of the process, the table does not exist anymore. So, it must have ceased to exist at some time. But which time? Even if we might have a rough idea of when it happened, it is much more difficult to tell the precise moment in which the table ceased to exist. Recall that we are removing from the table one atom after the other. The removal of which atom is responsible for the disappearance of the table? And how far away must the atom be to count as removed? It seems really hard to give a precise answer to these questions. The case of the disappearance of the table seems somehow to be vague or indeterminate.

How should we make sense of these ubiquitous cases of temporal vagueness or indeterminacy? One option could be to say that the kind of indeterminacy here involved is merely epistemic. This amounts to saying that there is a clear-cut instant at which the table stops existing, and that our inability to determine which one is due to our ignorance of the facts. There is a definite atom which, once removed, is responsible for the disappearance of the table. Our puzzlement comes simply from the fact that we do not know which one it is. Though some scholars are happy to defend this epistemic option, others find it odd to insist that there must be a precise atom the removal of which results in the disappearance of the table, and a precise distance of the atom from the rest of the table that makes it count as removed. Why is it really that atom as opposed to, say, the immediately previous one? What is so special about that atom that makes the table stop existing? And what is so special about the given distance that makes it enough for the atom to count as removed? After all, if you look at what remains of the table after the removal of that atom, you would probably be unable to tell any significant difference from what was there before the removal.

A second option could be to say that the kind of indeterminacy here involved does not have to do with our epistemic profile, but rather with the world itself. The reason why it is so difficult to identify a sharp cut-off point at which the table stops existing is that there is no fact of the matter about what this point is. While at some earlier and later times the table definitely does or does not exist, there are some times at which it simply is indeterminate whether the table still exists. Philosophers have always had a hard time in trying to understand ontic or worldly indeterminacy. For a long time, the standard option has been simply to reject this option as impossible (Dummett 1974; Russell 1923; Sider 2001).

However, if the indeterminacy here involved is neither epistemic nor ontic, what is it? Interestingly enough, perdurantism offers a clear way out of this dilemma. The perdurantist will believe that there is a series of four-dimensional entities involved in the case of the disappearing table. A first four-dimensional entity includes temporal parts up to the point at which the first atom is removed, a second four-dimensional entity includes temporal parts up to the point at which the second atom is removed, and so on until we get to a four-dimensional entity that includes temporal parts up to the point at which only one atom of the table remains. Given this metaphysical picture, the question of the instant at which the table stops existing translates into the question of which of those four-dimensional entities is picked out by the term “table”. While a perdurantist might still say that the kind of indeterminacy here involved is epistemic or ontic, she could also say that it has to do neither with our epistemic limitations nor with the world itself. Rather, she could say that the problem arises because the term “table” is vague. Although the term is used in everyday circumstances, we simply have not made a decision as to how it should work in special circumstances such as the one that we are discussing here. That is where our puzzlement comes from. This kind of indeterminacy results from a mismatch between our language and the world and is therefore semantic in nature.

The endurantist might accept the alleged oddity that comes with interpreting these cases of indeterminacy as either epistemic or ontic and try to live with it. While endurantists have traditionally had a preference for the epistemic option, renewed interest in ontic indeterminacy (due for example to attempts to take canonical interpretations of quantum mechanics at face value) might make the second option a live one as well (Williams and Barnes 2011, Wilson and Calosi 2018). It has also been remarked that the endurantist might in principle mimic the perdurantist solution, along the following lines. The endurantist might posit in place of a single enduring table a series of coinciding enduring objects, each of which ceases to exist slightly later than the previous one. Such objects will have temporal boundaries that coincide with those of the nested four-dimensional entities of the perdurantist solution, but unlike them will endure instead of perduring. Having this series of enduring objects in place, the question of the instant at which the table stops existing might translate into the question of which of those enduring entities is picked out by the term “table”. Thus, for the endurantist too, this kind of indeterminacy will turn out to be semantic (Haslanger 1994). What can be said about this mimicking strategy? At first, one might be baffled by the sheer number of enduring, coinciding, and table-like entities that the solution requires. However, an endurantist might respond that the number of entities is no greater than the one required by the perdurantist solution. In any case, while in the case of the perdurantist the positing of this series of entities is part of the view itself, in the case of the endurantist it seems to be a mere strategy to solve the problem of vagueness, and thus it would not be surprising if perdurantists considered it ad hoc.

d. The Unintelligibility Objection

Endurantism has it that persisting objects are wholly present at each time of their persistence. But what is it for something to be wholly present at a time? If no account of this crucial notion is given, endurantism itself remains not properly defined. Moreover, if no account of the notion is possible at all (that is, if we cannot make sense of whole presence), then endurantism itself will turn out to be an unintelligible doctrine. And admittedly endurantists have no easy time in spelling out what whole presence really amounts to (Sider 2001).

Hence, again, what is it for x to be wholly present at time t? It might mean that:

(1) at time t, x has all of its parts.

But what does it mean to say that x has all of its parts? Are we talking about all the parts that x has at t? Or rather about all the parts that x had, has, and will ever have? In both cases, the endurantist is in trouble. In the former case, (1) becomes

(2) at time t, x has all the parts that it has at t.

However, this hardly identifies the endurantist solution alone. The perdurantist too will believe that at any given time, a four-dimensional entity has all the parts it has at that time. Given that the endurantist intended her view to be different from the perdurantist one, this was not what the endurantist had in mind when saying that persisting objects are wholly present at different times. In the latter case, (1) becomes:

(3) at time t, x has all the parts that it had, has or will ever have.

However, recall that according to endurantism persisting objects are supposed to be wholly present at each time of their persistence. If whole presence is defined as in (3), this will imply that objects never gain or lose parts. This again seems to mischaracterize endurantism, which was supposed to be compatible with mereological change.

We should point out that in interpreting (1) as (2) or (3) we have switched from an apparently timeless notion of parthood (x is part of y) to a temporary one (x is part of y at time t). The move is a straightforward one for an endurantist to make. Usually, endurantists want their properties or relations—or at least the contingent ones—to be exemplified temporarily. However, at least some endurantists, those who are also presentists, might resist this switch and stick to the timeless notion of parthood. They might simply say that x is wholly present just in case it has all the parts it has, full stop (Merricks 1999). Whether or not this solution works in a presentist setting, it can hardly be applied in a non-presentist one.

Another option might be to argue that to be wholly present simply means to lack any proper temporal parts. This move sounds promising. However, it is not totally uncontroversial, for it has been argued that in special cases an endurantist might want her enduring objects to have proper temporal parts. Suppose, for instance, that an artist creates a bronze statue of Socrates by mixing copper and tin into the mold and then, unsatisfied with the result, destroys the statue by separating tin and copper again, so that the statue and the bronze will begin and cease to exist at the same times. Suppose, further, that the bronze and the statue are numerically different from each other (for reasons why they should be, see § 2b). The bronze might be taken to be a part of the statue (a proper part, insofar as it is different from the whole), but it will mereologically coincide with it during its existence. In this somewhat tortuous scenario, even if the bronze and the statue might be conceived as enduring, the bronze will count as a temporal part of the statue at the interval of their persistence. To see this, recall the definition of temporal parts given before:

Temporal part

x is a temporal part of y at t if (i) x is a part of y at t; (ii) x exists at, and only at, t; (iii) x overlaps at t everything that is part of y at t.

Indeed, (i) the piece of bronze is a part of the statue that (ii) exists only at the interval for the persistence of the statue, and that (iii) overlaps everything that is part of the statue and exists at that time.
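The way the bronze satisfies the three clauses can be checked mechanically in a toy model. The sketch below is only an illustration under stated assumptions: time is a handful of discrete instants, an object is represented by the set of atoms composing it at each instant, and parthood at a time is inclusion of atoms. The names `Obj`, `part_of`, `overlaps`, `is_temporal_part`, and the example objects are all hypothetical. The model also does not distinguish proper from improper parthood.

```python
from typing import Dict, FrozenSet, Iterable, List

# Toy model (an assumption for illustration): an object is a map from
# instants to the set of atoms that compose it at that instant.
Obj = Dict[int, FrozenSet[str]]


def atoms_at(x: Obj, t: int) -> FrozenSet[str]:
    """Atoms composing x at t (empty if x does not exist at t)."""
    return x.get(t, frozenset())


def exists_at(x: Obj, t: int) -> bool:
    return bool(atoms_at(x, t))


def part_of(x: Obj, y: Obj, t: int) -> bool:
    """x is part of y at t: x exists at t and its atoms are among y's at t."""
    return exists_at(x, t) and atoms_at(x, t) <= atoms_at(y, t)


def overlaps(x: Obj, z: Obj, t: int) -> bool:
    """x and z overlap at t: they share at least one atom at t."""
    return bool(atoms_at(x, t) & atoms_at(z, t))


def is_temporal_part(x: Obj, y: Obj, interval: Iterable[int],
                     domain: List[Obj]) -> bool:
    times = set(interval)
    # (i) x is a part of y at every instant of the interval
    cond_i = all(part_of(x, y, t) for t in times)
    # (ii) x exists at, and only at, the instants of the interval
    cond_ii = {t for t in x if exists_at(x, t)} == times
    # (iii) x overlaps, at each instant, everything that is part of y then
    cond_iii = all(overlaps(x, z, t)
                   for t in times for z in domain if part_of(z, y, t))
    return cond_i and cond_ii and cond_iii


whole = frozenset({"a1", "a2", "a3"})
statue: Obj = {t: whole for t in (1, 2, 3)}
bronze: Obj = {t: whole for t in (1, 2, 3)}           # coincides with the statue
head: Obj = {t: frozenset({"a1"}) for t in (1, 2, 3)}
copper: Obj = {t: frozenset({"a2"}) for t in (0, 1, 2, 3, 4)}  # outlasts the statue
domain = [statue, bronze, head, copper]

print(is_temporal_part(bronze, statue, (1, 2, 3), domain))  # True
print(is_temporal_part(copper, statue, (1, 2, 3), domain))  # False
```

The copper fails clause (ii) because it exists before and after the statue, and clause (iii) because it does not overlap the head. The bronze passes all three clauses only because it begins and ceases to exist exactly when the statue does.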

What lesson should we learn from this particular case? According to Sider (2001), a defender of the unintelligibility charge against endurantism, the conclusion to be drawn is that an endurantist might want her enduring objects to have, at least sometimes, proper temporal parts, and that consequently endurantism cannot simply be the doctrine that objects persist without having proper temporal parts. In principle, one might be tempted to draw a different lesson, that is, that Sider’s definition of temporal parts is unsuccessful and that the notion of a temporal part should be defined in a different way.

In any case, it should be noted that so far, we have tried to characterize the notion of whole presence in mereological terms. However, the reader shall recall that in § 1b we distinguished two aspects which are mixed together in the canonical definition of endurantism offered above. Once again, we should distinguish (i) the mereological question of whether persisting objects have temporal parts, and (ii) the locative question of whether objects are exactly located at temporally extended, four-dimensional spacetime regions or rather at temporally unextended, three-dimensional regions only. So far, in trying to define whole presence in mereological terms, we have assumed that the notion pertained to the mereological question, rather than the locative one. On the other hand, if whole presence is to be characterized in locative terms, the task does not seem to be too difficult (Gilmore 2008, Parsons 2007, Sattig 2006). For example, under the view that we called locative three-dimensionalism, whole presence simply translates as exact location: a persisting object is wholly present at each instant of its persistence in the sense that it is exactly located at each instantaneous time or spacetime region of its persistence.

e. Arguments against Specific Versions of Endurantism

In § 1b and § 1c, we characterized several different versions of locative and non-locative endurantism. Each of them helped characterize better what the endurantist might have had in mind. However, each of them is subject to specific objections, which we here review summarily.

First, we have defined locative three-dimensionalism, according to which persisting objects are exactly located at temporally unextended regions only. This form of endurantism is committed to the possibility of multi-location, that is, to the possibility of a single entity having more than one exact location. Multi-location has been put to work in several contexts, in helping to make sense not only of endurantism, but also of Aristotelian universals and property exemplification, to mention only a few cases. Still, several scholars take multi-location to be problematic, either because it implies contradictions (Ehring 1997a), or because it is at odds with the very notion of an exact location (Parsons 2007), or because it creates specific problems when applied to the case of persistence (Barker and Dowe 2005). Moreover, locative three-dimensionalism is prima facie committed to the existence of instants of time, which cannot be the case if time is gunky (see Leonard 2018).

Second, we have defined simplism, according to which persisting objects are mereologically simple and exactly located at the temporally extended region of their persistence. Simplism is committed to the possibility of extended simples, that is, the possibility that something without any proper parts can be located at an extended region. Extended simples have enjoyed a fair share of popularity and have been argued to be a possibility which flows from recombinatorial considerations (McDaniel 2007b, Saucedo 2011, Sider 2007), from quantum mechanics (Barnes and Williams 2011) and from string theory (McDaniel 2007a). Still, some scholars look at extended simples with a distrustful stare, because they think that dividing into parts is part of the nature of extension (Hofweber and Velleman 2011), because extended simples are excluded by our best theories of location (Casati and Varzi 1999), or because specific reasons given in favour of the possibility of extended simples are unsuccessful.

Third, we have introduced non-locative versions of endurantism. These versions usually assume that there are two radically different ways of being in a dimension, that objects are in space in a radically different way from that in which they are in time, and that these two different ways explain why objects divide into spatial but not into temporal parts. Such views are immune to the specific problems of locative three-dimensionalism and of simplism. Still, they have been argued to come with specific drawbacks of their own. In particular, they seem to be at odds with spacetime unitism (see § 1e). Indeed, under spacetime unitism, regions of time and regions of space are simply spatiotemporal regions of some sort. So, it seems that if anything holds a relation to a region of space, it cannot fail to hold the same relation to some region of time as well (Hofweber and Lange 2017).

3. Arguments against Perdurantism

Perdurantism has become a popular option. However, it does not come without its own drawbacks. This section briefly reviews arguments to the effect that it offends against our intuitions (§ 3a), it makes change impossible (§ 3b), it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 3c), it is ontologically inflationary (§ 3d), it involves a category mistake (§ 3e), it does not make sense (§ 3f), and it has a problem with counting (§ 3g).

a. The Argument from Intuition

Endurantists and their foes alike often agree that endurantism is closer to common sense beliefs, or more intuitive, than perdurantism. Moreover, some philosophers believe that common sense beliefs or intuition should be taken seriously when doing philosophy. This often translates into the idea that such intuitions or beliefs should be preserved as much as possible, that is, until eventually proven false or at least significantly problematic (Sider 2001). Presumably, this is also why endurantism is sometimes considered the champion view, and why the burden of proof in the persistence debate is taken to lie with the perdurantist (Rea 1998). Now, has endurantism been proven false or significantly problematic? The previous section reviewed several arguments to this effect and registered that several endurantists remain unconvinced. They would therefore conclude that perdurantism is unmotivated and, since it is the challenger view, should be rejected.

We shall not here tackle the question of whether endurantism has been proven false (see § 2 for this). Rather, we focus on other possible ways in which the perdurantist might respond to this specific challenge.

First of all, though, we should wonder: why is endurantism supposed to be more intuitive than perdurantism? What aspects of perdurantism are supposed to be that counter-intuitive? Perdurantism implies that when seeing a tree or talking with a friend, what you have in front of you is not a whole tree or a whole person, but rather only parts of them. It also implies that objects are extended in time just like they are extended in space and a bit like an event is supposed to be. These mereological and locative consequences of perdurantism are supposed to be counter-intuitive: intuitively, we would say that what we have in front of us in the cases described are a whole tree and a whole person, and that we are not extended in time like we are in space, or like events are supposed to be.

Clearly enough, one option for the perdurantist is simply to reject the idea that in philosophy intuitions or common sense should have the weight the endurantist is here proposing. What an endurantist calls “intuitions” a perdurantist might insist are nothing more than unwarranted biases. However, we do not discuss this option here. Tackling the general question of the role of intuition in philosophy goes beyond the scope of this article (for an introduction to the topic, see Intuition).

A second option consists in pointing out that while perdurantism does indeed have counter-intuitive consequences, endurantism is not immune to counter-intuitiveness either. For example, we have already mentioned that several popular versions of endurantism are committed to claims (such as the claim that things can have more than one exact location, or that extended simples are possible; see § 2e) which might arguably be taken to be counter-intuitive.

A third option consists in pointing out that even if intuition should play a role in philosophy, the kind of evidence that it offers might be biased, for it might be based on our misleading vantage point on reality. In particular, it might be argued that our endurantist intuitions are based on the fact that human beings commonly experience reality one time after another. However, if spacetime unitism and eternalism are true, a more veridical perspective would be one that would allow us to perceive the whole of spacetime in a single bird’s-eye view. Were we able to see the whole of spacetime in a single bird’s-eye view, our intuitions might be different, and we might rather be led to believe persisting objects to be spatiotemporally extended, and to see their instantaneous “sections”, with which human beings are usually acquainted, as parts of them. In that case, our usual condition would be reminiscent of that of the inhabitants of Flatland, who perceive the passage of a three-dimensional sphere through their plane of perception as the sudden expansion and contraction of a two-dimensional circle. Once again, here we shall not tackle the question of whether eternalism and spacetime unitism are true (for an introduction to the topic, see Gilmore, Costa, Calosi 2016).

b. The No-Change Objection

A second objection traditionally marshalled against perdurantism is that it makes change impossible (Geach 1972, Lombard 1986, Mellor 1998, Oderberg 2004, Sider 2001, Simons 1987; 2000a). But change quite obviously occurs everywhere and everywhen. Hence, perdurantism is false.

Why would perdurantism make change impossible? Change requires difference and identity. In order for a change to occur, the argument goes, something must be different, that is, must have incompatible properties, but must also be identical, that is, must be one and the same thing. The identity condition is important, for we would not normally call a change a situation in which two numerically different things have incompatible properties. For example, we would not call a change a situation in which an apple is red and a chair is blue. However, the perdurantist account of change (§ 2b) seems committed to invariably violating the identity condition. Under perdurantism, when a change occurs, it is not the numerically same thing which has the incompatible properties. Rather, the incompatible properties are had by numerically different temporal parts of said thing. For example, if a hot poker becomes cold, it is not the persisting poker itself which is hot and cold. Rather, two numerically different temporal parts of it are hot and cold.

Is it really the case that perdurantism violates the identity condition? For sure, under perdurantism, the incompatible properties are had by numerically different temporal parts: an earlier part of the poker is hot, a later one is cold. However, can we not say that the persisting thing has them too: the perduring poker itself is hot and cold? After all, we call a thing red even if not all, but only some, of its parts are red. It is crucial here to stop and ask what we might mean by saying that the perduring poker itself is hot and cold. One straightforward option would be to say that the poker itself literally is hot and cold, just like its different temporal parts are. However, this is implausible. After all, one of the main motivations for being a perdurantist consists in saying that it is impossible for the numerically same poker to be hot and cold, for it would violate Leibniz’s Law or even the Law of Non-contradiction (§ 2b). Hence, when a perdurantist says that the perduring poker itself is hot and cold she must mean something different. Presumably, she means that the poker is hot insofar as it has hot parts and is cold insofar as it has cold parts. However, if this is what the perdurantist really means, she would presumably be violating the difference condition. For change requires the same subject to have incompatible properties, whereas having hot parts and having cold parts are not incompatible properties.

A second and popular move consists in rejecting the identity condition. Change does not require one and the same thing to have incompatible properties. At least in some cases, different things would do too (Sider 2001). However, foes of perdurantism would insist that it is not possible to give up the identity condition so lightly. They would insist, for example, that having parts with incompatible properties is insufficient for change. For example, a single poker would not change for the simple fact of having hot parts and cold parts: mereological heterogeneity is not change. Perdurantists might concede that mereological heterogeneity is not always change, but specify that under certain circumstances, it is. In particular, mereological heterogeneity is change in cases where incompatible properties are had by different temporal parts of a single thing.

Some endurantists remain unconvinced by this proposed amendment to the identity condition. They would say, for example, that since temporal parts are numerically different from each other, under perdurantism there is no change, but only replacement. At this point, perdurantists have at least two options. The first one is simply to disagree: change is a particular kind of replacement. The second one consists in giving up on change: if change really requires the original identity condition, then let it be: philosophy has taught us that where we believed there to be change, there really only is replacement (Simons 2000b; Lombard 1994).

c. The Crazy Metaphysic Objection

A third objection against perdurantism is that it is a “crazy metaphysic”, for it involves systematic and yet mysterious cases of coming into existence. The objection refers here to the fact that, under perdurantism, new temporal parts of a single thing come into (and go out of) existence continuously. As Thomson famously puts it:

[perdurantism] seems to me a crazy metaphysic (…). [It] yields that if I have had exactly one bit of chalk in my hand for the last hour, then there is something in my hand which is white, roughly cylindrical in shape, and dusty, something which also has a weight, something which is chalk, which was not in my hand three minutes ago, and indeed, such that no part of it was in my hand three minutes ago. As I hold the bit of chalk in my hand, new stuff, new chalk keeps constantly coming into existence ex nihilo. That strikes me as obviously false (Thomson 1983, 213).

Under perdurantism, these cases of coming into being really are systematic. But what does it mean to say that they are crazy or mysterious? It might mean that they do not make sense (for this option, see the unintelligibility objection in § 3f). But there is another option which is worth exploring. According to this option, mystery has to do with the absence of an indispensable explanation. If perdurantism is true, the objection goes, there are systematic cases of coming into existence. These cases cry out for an explanation: how is it that these new things come into existence? Where do they come from? However, perdurantism seems to be unable to offer an explanation for these cases. Under perdurantism, the systematic coming into being of ever new temporal parts is a brute fact, of which there is no explanation.

First, we should ask: is perdurantism really unable to offer an explanation for these cases of coming into existence? Thomson seems to be persuaded that it cannot. If perdurantism is true, these temporal parts do not come from a source which might explain their appearance. In her words, they come into existence ex nihilo. But is this really the case? What does it mean that a new temporal part of a thing comes into existence ex nihilo, from nothing? Does it mean that nothing existed before the temporal part? Certainly not: other temporal parts of the thing existed before the appearance of that particular temporal part. Does it mean that the coming into existence of the temporal part is an event which has no cause? Again, this seems to be implausible. If perdurantists take causation seriously (for if they do not, the objection would not apply in the first place; see Russell 1913), some perdurantists would say that there is a causal connection between temporal parts of a single thing: the later ones are caused to exist by the previous ones (Heller 1990, Oderberg 1993, Sider 2001). Endurantists might disagree here. For example, they might believe that later temporal parts cannot be caused to exist by the previous ones, for (immediate) causation requires simultaneity of cause and effect (see Huemer and Kovitz 2003, Kant 1965).

We have discussed how a perdurantist might try to offer an explanation for the continuous coming into existence of new temporal parts. But is it really the case that the coming into existence of new temporal parts requires an explanation? In that connection, perdurantists usually follow two lines of reasoning. First, they argue that the succession of new temporal parts as we move through time is analogous to the succession of new spatial parts as we move through space. And since we do not think there is anything mysterious in the latter case, we should have the same attitude in the former one as well (Heller 1990, Varzi 2003). This argument from analogy gains plausibility especially under a unitist view of spacetime. However, one might argue that the analogy fails, for example because causation unfolds diachronically over time and not synchronically through space, so we have a reason not to expect there to be an explanation in the spatial case and to require an explanation in the temporal one instead. The second line of reasoning takes the form of a tu quoque. Thomson believes that the continuous coming into existence of new temporal parts requires an explanation. But is the continuous existence of an enduring object not equally mysterious? How is it that an enduring object continues to exist instead of ceasing to exist? If the endurantist’s continuous existence is no mystery, then neither is the continuous coming into existence of new temporal parts proposed by the perdurantist (Sider 2001, Varzi 2003).

d. The Objection from Ontological Commitment

One criterion that has sometimes been employed in order to evaluate metaphysical doctrines is Ockham’s razor, according to which a theory should refrain from making commitments if such commitments are not necessary to its theoretical success. One particular kind of commitment is ontological commitment, that is, the commitment of a theory to the existence of entities or kinds thereof. According to Ockham’s razor, this commitment is to be avoided if possible, and any theory which is less ontologically committed is, ceteris paribus, preferable to one which has more ontological commitment (see The Razor).

Now, it might be noted that perdurantism is committed to the existence of a higher number of entities than both endurantism and the stage view. Perdurantism is more ontologically committed than endurantism, for on top of a single persisting thing, it is committed to the existence of a series of numerically different temporal parts thereof. Perdurantism is also more ontologically committed than the stage view. Indeed, unlike perdurantism, the stage view is not necessarily committed to the existence of the perduring mereological sums of the instantaneous stages. If perdurantism is indeed more ontologically committed than endurantism and the stage view, the question is whether this commitment is really necessary. This question is of course discussed in § 2 and § 4. However, more generally, the perdurantist might wish to reject Ockham’s razor (for what reasons do we have to believe that the world is not more complex than our simplest theories?) or to ride the wave of contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009). Yet another response on behalf of the perdurantist is based on the distinction between quantitative and qualitative parsimony (Lewis 1973; 1986). A metaphysical system is more quantitatively parsimonious the fewer entities it acknowledges, while it is more qualitatively parsimonious the fewer ontological categories it introduces. Offending against quantitative parsimony is often considered to be less problematic, if at all, than offending against qualitative parsimony. And indeed, one might say, perdurantism offends against quantitative, but not qualitative, parsimony, for each temporal part of a material object is itself a material object.

e. The Category Mistake Argument

Perdurantism has it that persisting objects have temporal parts. This makes objects similar to events, for events too are usually taken to have temporal parts. Because of this similarity, perdurantists have sometimes presented as a consequence of their view that objects and events are entities of the same kind, and the difference between events and objects is, at best, one of degree of stability (Broad 1923, Quine 1970). In the words of Nelson Goodman (1951, 357): “a thing is a monotonous event, an event is an unstable thing”.

Are events and objects entities of the same kind? Critics of perdurantism have sometimes argued that they are not, and that conflating objects and events would result in a serious category mistake. Perdurantism, which is committed to this mistake, would therefore need to be rejected. This is the category mistake argument against four-dimensionalism (Hacker 1982, Mellor 1981, Strawson 1959, Wiggins 1980).

What reasons are there to believe that events and objects belong to different ontological categories? For example, it has been pointed out that while objects are said to exist, events are said to happen, or take place (Cresswell 1986, Hacker 1982). This linguistic difference is sometimes said to be a reflection of an ontological one, that is, that objects and events enjoy different modes of being. Moreover, while objects exist at times and are at places, events are supposed to be at places and times. Once again, this linguistic difference is supposed to reflect an ontological one, that is, that objects and events relate to space and time in radically different ways (Fine 2006). Furthermore, objects do not usually allow for co-location, at least not to the extent to which events do (Casati and Varzi 1999, Hacker 1982). Finally, it is sometimes said that the spatial boundaries of events are usually vaguer than those of objects (what are the spatial boundaries of a football match?), whereas the temporal boundaries of events are usually less vague than those of objects (Varzi 2014).

A first way to resist this argument is to insist that conflating objects and events is no category mistake. Putative differences between objects and events will then either be considered irrelevant when it comes to metaphysics—for example because they are merely linguistic differences which do not reflect any underlying significant difference in reality—or in any case not enough to imply that objects and events belong to different ontological categories. After all, presumably, not all differences between kinds of entities are supposed to make them entities of a different kind (Sider 2001).

On the other hand, if a perdurantist is persuaded that conflating objects and events would be a category mistake, she could simply reject the claim that perdurantism implies that objects are events or vice versa. Perdurantism is the claim that objects have one feature that is usually, though not universally, attributed to events, that is, having temporal parts. And sharing some features is not a sufficient condition for belonging to the same ontological category. After all, entities of other kinds, such as time intervals or spacetime regions, are usually taken to have temporal parts without being events.

f. The Unintelligibility Objection

Some endurantists believe that perdurantism is not (only) false, but utterly unintelligible. According to this possible objection, perdurantism is a “mirage based on confusion” (Sider 2001, 54), a doctrine which makes “no sense” (Simons 1987, 175) or which is, at best, “scarcely intelligible” (Lowe 1987, 152). In the trenchant words of Peter van Inwagen:

I simply do not understand what [temporal parts of ordinary objects] are supposed to be, and I do not think this is my fault. I think that no one understands what they are supposed to be, though of course plenty of philosophers think they do. (van Inwagen 1981, 133)

In response to this objection, David Lewis (1986) famously stated that if one is unable to understand a view, one should not debate about it. Colorful as it is, Lewis’ stance misfires. The point of the objection is not that the objector has not understood perdurantism, but rather that perdurantism itself is unintelligible. Lewis’ point would apply in a case where the objector was simply admitting her epistemic limitations. But the objector is not making a point about herself. Rather, she is making a point about the view itself, saying that it does not make sense. (Is it possible for something to be false and also not to make sense? Several scholars have indeed endorsed the view that some claims, such as contradictions or category mistakes, are false and do not make sense. But this view might be attacked.)

What is it precisely that is supposed not to make sense in perdurantism? Is it the notion of a temporal part itself? This is hardly the crux of the problem, since many endurantists claim that the notion itself, when applied to events, makes perfect sense (Lowe 1987). The unintelligibility of the view should rather come from some other aspect of the view. But if so, wherefrom? One option consists in saying that the unintelligibility comes from the fact that perdurantism is committed to a category mistake, and category mistakes, or at least some of them, are unintelligible (for a discussion see § 3e). A second option might have to do with mereology. Indeed, Sider (2001), who takes the objection seriously, considers that the problem might lie in the fact that the notion of a temporal part is usually defined in terms of the timeless notion of parthood (x is part of y), whereas endurantists tend to use the temporary notion of parthood (x is part of y at t). Sider suggests that maybe the sense of unintelligibility comes from the fact that perdurantists tend to use a mereological notion that endurantists take to be unintelligible, or to yield unintelligible claims when applied to everyday material objects. If Sider’s diagnosis is correct, then his definition of temporal parts in terms of temporary parthood discussed before (§ 1d) seems to take care of it.

g. The Objection from Counting

The objection from counting is traditionally presented as an objection against perdurantism and in favor of the stage view. The semantic difference between the two views is of particular importance here. Recall that the two views disagree about the reference of expressions referring to ordinary objects. Under perdurantism, expressions referring to ordinary objects, such as “Socrates”, refer to persisting, four-dimensional objects, whereas under the stage view, expressions referring to ordinary objects refer to one instantaneous stage (which particular stage is referred to is determined by the context).

Let us consider again the case of the statue and the piece of clay (§ 2b). Under perdurantism, both of them are four-dimensional entities, and their apparent coincidence boils down to their sharing some temporal parts. In particular, at any time in which the statue exists, there is an instantaneous statue-shaped entity that is both a temporal part of the statue and a temporal part of the piece of clay. Now suppose that at that particular time someone asks the question: how many statue-shaped objects are there? Intuitively, we would like to answer that there is only one. And this is the answer given by the stage view. For the stage view takes ordinary expressions such as “statue-shaped object” to refer to instantaneous stages, and there is only one of them that exists at that time. On the other hand, perdurantism counts by four-dimensional entities. And since that particular instantaneous stage is a temporal part of two ordinary objects, the statue and the piece of clay, perdurantism implies that there are in fact two statue-shaped objects there at that time. Hence, perdurantism, unlike the stage view, yields unwelcome results as regards the number of entities involved in such cases. This is the argument from counting against perdurantism (Sider 2001).

A possible answer consists in saying that in that particular context the predicate “statue-shaped object” does indeed refer to two four-dimensional entities, the statue and the piece of clay, but that we count them as one because they are, in a sense, identical at the time of the counting (Lewis 1976). In saying so, we are using an apparently time-relative notion of identity (x is identical to y at t) instead of the usual timeless one (x is identical to y). What does that mean? A four-dimensionalist would define the time-relative notion in terms of the timeless one: x is identical to y at t if the temporal part of x at t is identical to the temporal part of y at t. Stage theorists will probably remain unconvinced by this move for, they would insist, counting can only be done by identity. In Sider’s words: “I doubt that this procedure of associating numbers with objects is really counting. Part of the meaning of ‘counting’ is that counting is by identity; ‘how many objects’ means ‘how many numerically distinct objects’ (…). Moreover, the intuition that [there is just one statue-shaped object at that time] arguably remains even if one stipulates that counting is to be identity” (Sider 2001, 189).
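The three ways of counting at issue can be compared in a small sketch. It rests on assumptions made only for illustration: a four-dimensional object is modelled as a map from instants to labels of its instantaneous temporal parts, and the names `Worm`, `identical_at`, `count_by_identity`, `count_by_identity_at`, and `count_stages`, as well as the stage labels, are hypothetical.

```python
from typing import Dict, List

# Toy model (an assumption): a four-dimensional object is a map from
# instants to the label of its instantaneous temporal part (stage).
Worm = Dict[int, str]

# The piece of clay exists before and after the statue; they share the
# stages at instants 2 and 3.
statue: Worm = {2: "s2", 3: "s3"}
clay: Worm = {1: "s1", 2: "s2", 3: "s3", 4: "s4"}


def identical_at(x: Worm, y: Worm, t: int) -> bool:
    """Time-relative identity: the temporal parts of x and y at t are identical."""
    return t in x and t in y and x[t] == y[t]


def count_by_identity(worms: List[Worm], t: int) -> int:
    """Count the four-dimensional objects present at t by timeless identity."""
    distinct: List[Worm] = []
    for w in (w for w in worms if t in w):
        if all(w != d for d in distinct):
            distinct.append(w)
    return len(distinct)


def count_by_identity_at(worms: List[Worm], t: int) -> int:
    """Count the objects present at t, treating identity-at-t as sameness."""
    representatives: List[Worm] = []
    for w in (w for w in worms if t in w):
        if not any(identical_at(w, r, t) for r in representatives):
            representatives.append(w)
    return len(representatives)


def count_stages(worms: List[Worm], t: int) -> int:
    """Stage-view count: the number of distinct instantaneous stages at t."""
    return len({w[t] for w in worms if t in w})


print(count_by_identity([statue, clay], 2))     # 2
print(count_by_identity_at([statue, clay], 2))  # 1
print(count_stages([statue, clay], 2))          # 1
```

In this model, counting by identity-at-t and counting stages agree on one, while counting four-dimensional objects by timeless identity gives two. The sketch cannot settle whether the second procedure deserves to be called counting, and that is the point of disagreement Sider presses.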

4. Arguments against Stage View

This section reviews arguments against the stage view, to the effect that it goes against our intuitions (§ 4a), it makes change impossible (§ 4b), it is committed to mysterious and yet systematic cases of coming into existence ex nihilo (§ 4c), it is ontologically inflationary (§ 4d), it is incompatible with temporal gunk (§ 4e), it is incompatible with our mental life (§ 4f) and it has problems with counting (§ 4g).

a. The Argument from Intuition

In § 3a we discussed the argument from intuition against perdurantism. A similar argument has been proposed against the stage view as well. While the details of the present argument are somewhat different from the previous one, its general structure remains the same. The general idea is that closeness to intuitions or common sense constitutes a theoretical advantage that a view might have. And, the objector says, both endurantism and perdurantism are closer to intuitions than the stage view.

Why is the stage view supposed to be especially counter-intuitive? Presumably, the aspect of the stage view which offends the most against our intuitions is the fact that it denies persistence. Indeed, while endurantism and perdurantism agree that some ordinary objects persist, either by enduring or by perduring, the stage view denies that ordinary objects persist. In place of a single persisting object, the stage view posits a series of numerically different instantaneous stages.

In order to tackle this objection, the stage viewer might decide to deploy some of the generic strategies outlined in § 3a. First, the stage viewer might insist that intuitions are no more than biases and thus deny that intuitions place any disadvantage on the stage view. Second, the stage viewer might believe that the disadvantage exists, but is nevertheless outweighed, either by the fact that other views are intrinsically counter-intuitive too (see again § 3a), or by the fact that the other views have been proven false or at least significantly problematic.

Here, however, we focus on a further and more specific strategy available to the stage viewer. The strategy consists in arguing that the intuition that is supposed to disfavor the stage view does not really disfavor it. It is indeed true, the stage viewer would say, that we commonly have beliefs such as “I was once a child”. The critic of the stage viewer takes them to imply the persistence of the self, for how could I have been a child without existing in the past? But this, the stage viewer says, is a mistake. In fact, we could make sense of beliefs such as “I was once a child” just as well by means of the counterpart relation: “I was once a child” is true if a past counterpart of mine is a child. In other words, those beliefs are undetermined with respect to the question of whether things exist at more than one time (Sider 2001).
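The counterpart-theoretic truth condition just described can be written out schematically. This is only a sketch: the symbols $s$ (the stage uttering the sentence), $t$ (the time of utterance), $C$ (the temporal counterpart relation) and $\mathrm{Child}$ are notation introduced here for illustration and are not taken from Sider (2001).

```latex
\text{``I was once a child'' is true as uttered by } s \text{ at } t
\iff
\exists t' \, \exists s' \, \bigl( t' < t \;\wedge\; s' \text{ exists at } t'
  \;\wedge\; C(s', s) \;\wedge\; \mathrm{Child}(s') \bigr)
```

On this reading the right-hand side quantifies over a stage $s'$ that may be numerically distinct from $s$, so nothing in it requires $s$ itself to exist at $t'$.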

A possible reply is that the strategy might not be applied to all putative cases of commonsensical beliefs involving the past. Consider for example a tenseless statement of cross-time identity such as “I am identical to a young child”, in which I affirm my identity with my previous self. This statement cannot be taken care of in terms of counterparts. The stage viewer’s rejoinder might here be that these beliefs are perhaps too technical to be common sense or that, in any case, what really matters is that the stage viewer is able to make sense of and to validate cognate statements that are framed in terms which are much more mundane, such as “I was once a child” (Sider 2001).

b. The No-Change Objection

In § 3b we discussed the no-change objection against perdurantism. The objection was that change requires the numerical identity of the subject of change before and after the process of change. There, we discussed the option of amending this identity requirement. Change does not require that the subject before and the subject after the change be identical. They just need to be temporal parts of a single thing. The stage viewer might adopt this strategy to suit her needs. Change does not require that the subject before and the subject after the change be identical. They just need to be related by the counterpart relation. Some endurantists remain unconvinced by the perdurantist amendment. We might reasonably expect them to be unconvinced by the amendment proposed by the stage viewer too. Since the relevant stages are numerically different from each other, under the stage view there is no change, but only replacement. The stage viewer’s rejoinder might be either to insist that change is a particular kind of replacement or to give up on change and insist that there is nothing bad in saying that where we believed there to be change, there really is replacement.

c. The Crazy Metaphysic Objection

Section 3c reviewed an argument against perdurantism to the effect that it involved systematic and yet mysterious cases of coming into existence. The stage view is subject to a similar objection. Just as perdurantism requires the systematic coming into existence of new temporal parts, so the stage view requires the systematic coming into existence of new instantaneous stages. And if perdurantism did not have a plausible explanation for this systematic coming into existence, neither does the stage view.

However, it should also be noted that the stage viewer can apply the very same strategies proposed there on behalf of the perdurantist. The stage viewer might insist that there indeed is an explanation for the coming into existence of new stages. Their coming into existence is caused by the previous stages (Varzi 2003). Or she might argue that the systematic coming into existence is not mysterious after all, for it is no more mysterious than the succession of spatial parts through space, and no more mysterious than the continuous existence of an enduring object through time (Sider 2001; Varzi 2003).

d. The Objection from Ontological Commitment

Section 3d reviewed an argument from ontological commitment against perdurantism. Its guiding principle was that unnecessary ontological commitments should be avoided and, therefore, any theory which is less ontologically committed is, ceteris paribus, preferable to one which carries more ontological commitments.

This kind of argument seems to disfavor the stage view with respect to endurantism: instead of a single enduring thing, the stage view posits a myriad of numerically different instantaneous stages. However, it does not disfavor the stage view with respect to perdurantism. Often the ontological commitments of the stage view and of perdurantism are perfectly aligned, since many stage viewers, because of their commitment to mereological universalism, believe in the existence of four-dimensional aggregates on top of instantaneous stages (see § 1a). Still, the stage view is not committed to four-dimensional aggregates by definition. So, depending on further metaphysical parameters, it might turn out that a stage viewer’s overall metaphysical view ends up being less ontologically committed than perdurantism.

In order to block this kind of argument, the stage viewer might adopt the usual strategies already described on behalf of the perdurantist. In particular, she might argue that the further ontological commitments of the stage view are fully justified because of the failures of endurantism (§ 2), or she might argue that a philosopher should not be scared to make all the ontological commitments that she sees fit, for what reasons do we have to believe that the world is not more complex than our simplest theories? Finally, she could ride the wave of contemporary metaphysicians who simply downplay the importance of ontological commitment and suggest that the fundamental question of metaphysics is not what there is, but rather what is fundamental, or what grounds what (Schaffer 2009).

e. The Objection from Temporal Gunk

When introducing the stage view, we pointed out that unlike perdurantism and endurantism, its very definition commits it to the existence of instantaneous entities. This might be a drawback of the stage view, in case time turns out to be gunky, that is, if it turns out that every temporal region can be divided into smaller temporal regions, and thus, temporal instants turn out not to exist (Arntzenius 2011, Whitehead 1927, Zimmerman 2006).

We do not here focus on the question of whether there exist temporal instants at all. Instead, we shall briefly remark that, as it stands, the argument implicitly assumes that if there are no instants, there cannot be instantaneous entities. This assumption might be taken to follow from a series of principles of location, most notably the principle of exactness, according to which anything that is in some sense in a dimension must also have an exact location in that dimension. Now, located entities share shape and size with their exact locations. Hence, if exactness is true, instantaneous entities do indeed require the existence of instants in order to exist. However, exactness has come under attack on different grounds, one of which concerns precisely the possibility of pointy entities in gunky dimensions (Gilmore 2018, Parsons 2001). Hence, it seems in principle possible for a stage viewer to uphold her view even if she takes time to be gunky.

f. The Objection from Mental Events

There is an objection often proposed against the stage view which concerns in particular the persistence of subjects of mental states. The stage view implies that ordinary objects, persons included, do not persist through time. However, some mental processes and states seem impossible for instantaneous entities to perform or possess. For example, we say that people reflect on some ideas, make decisions, ponder means, act, fall in love, change their mind. All those mental events take time, and thus cannot possibly be possessed by instantaneous stages (Brink 1997, Hawley 2001, Sider 2001, Varzi 2003).

The stage viewer will typically reply that acting, reflecting, pondering, making decisions and so on do not require a persisting subject. For example, she might insist that, say, acting is something that can be possessed by a stage in virtue of its instantaneous state as well as in virtue of its relations to its previous stages, provided that the previous stages possess their appropriate mental properties (Hawley 2001, Sider 2001, Varzi 2003). Alternatively, the stage viewer might insist that there are indeed extended mental events such as acting or pondering, but that such mental events do not have one single subject, but rather a series of subjects which succeed one another. For each of them, to be acting is to be the subject of an instantaneous temporal part of a temporally extended event of acting. In any case, the stage viewer will concede that her view, unlike endurantism and perdurantism, is incompatible with the idea that such mental events are temporally extended and are possessed by a single subject.

g. The Objection from Counting

Section 3g discussed an objection against perdurantism to the effect that it delivered the wrong counting results in cases of mereological coincidence. To the question “how many statue-shaped objects are there?”, asked at a time at which the piece of clay and the statue mereologically coincide, the perdurantist has to answer that there are two, whereas the stage viewer can give the intuitive answer that there is only one. However, while considerations about counting in cases of coincidence seem to favor the stage view, similar considerations in cases which are far more common seem to disfavor the stage view over its rivals, and endurantism in particular. Suppose Sam remains alone in a room for an hour. How many people have been in that room during that hour? Intuitively, we would like to answer that there has been only one. And this is the answer given by endurantism. For endurantism takes Sam to be one single persisting object that exists through the hour. On the other hand, the stage view takes ordinary expressions such as “person” to refer to instantaneous stages, and there is such an instantaneous stage of Sam for each instant making up the hour. Hence, the stage view, unlike endurantism, yields unwelcome results as regards the number of entities involved in such cases (Sider 2001). (How does perdurantism fare with this objection? It depends on whether it counts temporal parts of persons as persons. If it does (and it usually does, see § 2b), perdurantism is subject to the same objection.)

Suppose that the stage viewer is concerned with this problem and takes intuitions about counting seriously (and she arguably should, if she endorses the argument from counting in favor of her view presented in § 3g). In that case, the stage viewer has at least three options. The first one consists in saying that in that particular context the predicate “person” does indeed apply to several instantaneous stages, but that we count them as one because they are counterparts of each other. This option is subject to a rejoinder which was already employed in § 3g against the Lewisian solution to the problem of counting in favor of the stage view. Indeed, the present option suggests that sometimes we count by counterparthood and not by identity. This offends against the view that “part of the meaning of ‘counting’ is that counting is by identity” (Sider 2001, 189). A second option is available to the stage viewer who believes in the existence of four-dimensional sums of instantaneous stages. This stage viewer might claim that in the present context the predicate “person” applies to one single four-dimensional object instead of the instantaneous stages. In so doing, the stage viewer is adopting an unorthodox view which mixes the stage view and perdurantism, in which reference of ordinary terms such as “person” is flexible: sometimes they pick out instantaneous stages (as in the stage view), sometimes they pick out four-dimensional sums thereof (as in perdurantism). A third and final option consists in taking domains of counting to be restricted to entities existing at the time of utterance, or restricted in some other suitable way (Viebahn 2013).
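The objection and the three options can be compared in a toy model. The following sketch rests on several assumptions that are not in the text: the hour is discretized into sixty one-minute stages (the argument itself concerns every instant of the hour), counterparthood is represented by a shared `chain` tag, and all function names are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """An instantaneous person-stage; `chain` is an assumed counterpart tag."""
    chain: str
    time: int


def counterparts(a, b):
    """Assumed toy counterpart relation: stages belonging to the same chain."""
    return a.chain == b.chain


def count_by_identity(stages):
    """Unrestricted stage-view count: numerically distinct stages."""
    return len(set(stages))


def count_by_counterparthood(stages):
    """Option 1: count classes of stages that are counterparts of each other."""
    representatives = []
    for s in stages:
        if not any(counterparts(s, r) for r in representatives):
            representatives.append(s)
    return len(representatives)


def count_by_sums(stages):
    """Option 2: let "person" pick out four-dimensional sums of stages."""
    sums = {}
    for s in stages:
        sums.setdefault(s.chain, set()).add(s)
    return len({frozenset(parts) for parts in sums.values()})


def count_at_time(stages, t):
    """Option 3: restrict the counting domain to stages existing at time t."""
    return len({s for s in stages if s.time == t})


# Sam alone in the room for an hour, modelled as sixty stages.
hour = [Stage("Sam", minute) for minute in range(60)]
```

Counting the stages in `hour` by identity gives sixty, which is the unwelcome result, whereas each of the three options returns one. The model leaves the philosophical question open: whether option 1 is really counting, whether option 2 is still the stage view, and how the restriction in option 3 is to be motivated.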

5. What Is Not Covered in this Article

This section lists several aspects and issues concerning the metaphysics of persistence that are not covered in this article. Each of them is complemented with some references so as to guide readers in their exploration.

When it comes to the characterization of the views and of the debate, it is worth noting that some philosophers have tried to characterize the endurantism/perdurantism dispute in terms of explanation (Donnelly 2016; Wassermann 2016), while others have argued that the dispute is not substantial, but rather merely verbal (Benovsky 2016; Hirsch 2007; McCall and Lowe 2003; Miller 2005). It is also worth noting that apart from a few introductory words, not much has been covered about the history of the metaphysics of persistence (Carter 2011; Costa 2017; 2019; Cross 1999; Helm 1979).

When it comes to arguments for and against the various metaphysics of persistence, a couple of traditional arguments against perdurantism have not been covered in § 3, namely the modal argument (Heller 1990; Jubien 1993; van Inwagen 1990a; Shoemaker 1988; Sider 2001) and the rotating disk argument (Sider 2001). Moreover, it is important to note that several arguments have been drawn from physics for and against theories of persistence presented in this article, among which figure several arguments against endurantism, namely the shrinking chair argument (Balashov 2014; Gibson and Pooley 2006; Gilmore 2006; Sattig 2006), the explanatory argument (Balashov 1999; Gibson and Pooley 2006; Gilmore 2008; Miller 2004; Sattig 2006), the location argument (Gibson and Pooley 2006; Gilmore 2006; Rea 1998; Smart 1972), the superluminal objects argument (Balashov 2003; Gilmore 2006; Hudson 2005; Torre 2015), the invariance argument (Balashov 2014; Calosi 2015; Davidson 2014) as well as an argument from quantum mechanics against perdurantism (Pashby 2013; 2016).

6. References and Further Reading

Armstrong, D. M., 1980, “Identity Through Time”, in Peter van Inwagen (ed.), Time and Cause. Dordrecht: D. Reidel, 67–78.
Arntzenius, F., 2011, “Gunk, Topology, and Measure” The Western Ontario Series in Philosophy of Science, 75: 327–343.
Arntzenius, F., 2011, “The CPT theorem”, in The Oxford Handbook of Philosophy of Time, ed. Craig Callender, Oxford: Oxford University Press, 634–646.
Baker, L. R., 1997, “Why Constitution is not Identity”, Journal of Philosophy, 94: 599–621.
Baker, L. R., 2000, Persons and Bodies, Cambridge: Cambridge University Press.
Balashov, Y. 1999, “Relativistic Objects”, Noûs 33(4), 644-662.
Balashov, Y. 2000, “Enduring and Perduring Objects in Minkowski Space-Time”, Philosophical Studies, 99, pp. 129–166.
Balashov, Y., 2003, “Temporal Parts and Superluminal Motion”, Philosophical Papers 32, 1-13.
Balashov, Y., 2014, “Relativistic Parts and Places: A Note on Corner Slices and Shrinking Chairs”, in Calosi, C. and Graziani, P. (eds.), Mereology and the Sciences, Springer, 35-51.
Barker, S. and P. Dowe, 2003, “Paradoxes of Multi-Location”, Analysis, 63: 106–114.
Barker, S. and P. Dowe, 2005, “Endurance is Paradoxical”, Analysis, 65: 69–74.
Barnes, E. J. and J. R. G. Williams, 2011, “A Theory of Metaphysical Indeterminacy”, Oxford Studies in Metaphysics, vol. 6.
Benovsky, J., 2009, “Presentism and persistence”, Pacific Philosophical Quarterly, 90 (3):291-309.
Benovsky, J., 2016, Meta-metaphysics. On Metaphysical Equivalence, Primitiveness and Theory Choice, Springer.
Braddon-Mitchell, D. and K. Miller, 2006, “The Physics of Extended Simples”, Analysis, 66: 222–226.
Brewer, B. and Cumpa, J. (eds.), 2019, The Nature of Ordinary Objects, Cambridge: Cambridge University Press.
Brink, D. O., 1997, “Rational Egoism and the Separateness of Persons”, in J. Dancy (ed.) Reading Parfit, Oxford: Blackwell: 96–134.
Broad, C. D., 1923, Scientific Thought, London: Routledge and Kegan Paul.
Brogaard, B., 2000, “Presentist Four-Dimensionalism”, Monist, 83: 341–56.
Burke, M., 1992, “Copper statues and pieces of copper: a challenge to the standard account”, Analysis, 52: 12-17.
Burke, M., 1994, “Preserving the Principle of One Object to a Place: A Novel Account of the Relations among Objects, Sorts, Sortals and Persistence Conditions”, Philosophy and Phenomenological Research, 54: 591–624.
Calosi, C., 2015, “The Relativistic Invariance of 4D shapes”, Studies in History and Philosophy of Science 50, 1-4.
Carnap, R., 1967, The Logical Structure of the World, (trans.) George, R. A., Berkeley: University of California Press.
Carter, J., 2011, “St. Augustine on Time, Time Numbers, and Enduring Objects” Vivarium 49: 301–323.
Casati, R. and Varzi, A., 1999, Parts and Places, Cambridge, MA: MIT Press.
Casati, R. and Varzi, A., 2014, “Events”, The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), Edward N. Zalta (ed.).
Chisholm, R. M., 1973, “Parts as Essential to their Wholes”, Review of Metaphysics, 26: 581–603.
Chisholm, R. M., 1975, “Mereological Essentialism: Some Further Considerations”, The Review of Metaphysics, 28 (3):477-484.
Chisholm, R. M., 1976, Person and Object, La Salle (IL): Open Court.
Cleve, J., 1986, “Mereological Essentialism, Mereological Conjunctivism, and Identity Through Time”, Midwest Studies in Philosophy, 11 (1):141-156.
Costa, D., 2017, “The Transcendentist Theory of Persistence”, Journal of Philosophy, 114 (2):57-75.
Costa, D. 2017a, “The Limit Decision Problem and Four-dimensionalism”, Vivarium 55, 199-216.
Costa, D. 2019, “Was Bonaventure a Four-dimensionalist?”, British Journal for the History of Philosophy 28(2), 393-404.
Cresswell, M. J., 1986, “Why Objects Exist but Events Occur”, Studia Logica, 45: 371–375; reprinted in Events, pp. 449–453.
Crisp, T. M., and D. P. Smith, 2005, “‘Wholly Present’ Defined”, Philosophy and Phenomenological Research 71(2): 318-344.
Cross, R., 1999, “Four-dimensionalism and Identity Across Time: Henry of Ghent vs. Bonaventure”, Journal of the History of Philosophy 37: 393–414.
Davidson, M., 2014, “Special Relativity and the Intrinsicality of Shape”, Analysis, 74, 57-58.
Donnelly, M., 2016, “Three-Dimensionalism”, in Davis, M. (ed.), Oxford Handbook of Philosophy Online, Oxford University Press.
Dummett, M., 1975, “Wang’s Paradox”, Synthese, 30: 301–24.
Ehring, D., 1997, Causation and Persistence, New York: Oxford University Press.
Ehring, D., 2002, “Spatial Relations between Universals” Australasian Journal of Philosophy, 80(1): 17–23.
Fine, K., 1999, “Things and their Parts”, Midwest Studies in Philosophy 23 (1), 61-74.
Fine, K., 2006, “In Defense of Three-Dimensionalism”, The Journal of Philosophy, 103 (12): 699–714.
Gallois, A., 1998, Occasions of Identity, Oxford: Clarendon Press.
Galton, A. P., 2006, “Processes as continuants”. In J. Pustejovsky & P. Revesz (eds), 13th International Symposium on Temporal Representation and Reasoning (TIME 2006: 187). Los Alamitos, CA: IEEE Computer Society.
Galton, A. P., and Mizoguchi, R., 2009, “The water falls but the waterfall does not fall: New perspectives on objects, processes and events”, Applied Ontology, 4 (2): 71-107.
Geach, P. T., 1972, Logic Matters, Oxford: Blackwell.
Geach, P. T., 1980, Reference and Generality, Ithaca, NY: Cornell University Press.
Gibbard, A., 1975, “Contingent Identity”, Journal of Philosophical Logic, 4(2):187-221.
Gibson, I. and Pooley, O. 2006. “Relativistic Persistence”, Philosophical Perspectives 20 (1), 157-198.
Gilmore, C., 2006, “Where in the Relativistic World Are We?”, Philosophical Perspectives, 20: 199–236.
Gilmore, C., 2007, “Time Travel, Coinciding Objects, and Persistence,” in Dean Zimmerman, ed., Oxford Studies in Metaphysics, vol. 3, New York: Oxford University Press, pp. 177–98.
Gilmore, C., 2008, “Persistence and Location in Relativistic Spacetime”, Philosophy Compass, 3.6: 1224–1254
Gilmore, C., Costa, D., and Calosi, C., 2016, “Relativity and Three Four-Dimensionalisms”. Philosophy Compass 11, no. 2: 102–120.
Goodman, N., 1951, The Structure of Appearance, Cambridge (MA): Harvard University Press.
Griffin, N., 1977, Relative Identity, New York: Oxford University Press.
Hacker, P. M. S., 1982, “Events, Ontology and Grammar”, Philosophy, 57:477–486; reprinted in Events, pp. 79–88.
Haslanger, S., 1989, “Endurance and Temporary Intrinsics”, Analysis, 49: 119–25.
Haslanger, S., 1994, “Humean Supervenience and Enduring Things”, Australasian Journal of Philosophy 72, 339-59.
Haslanger, S., 2003, “Persistence Through Time”, in Loux, M. and Zimmerman, D. (eds.) The Oxford Handbook of Metaphysics, Oxford: Oxford University Press
Hawley, K., 1999, “Persistence and Non-Supervenient Relations”, Mind, 108: 53–67.
Hawley, K., 2001, How Things Persist, Oxford: Oxford University Press.
Hawthorne, J. and G. Uzquiano, 2011, “How Many Angels Can Dance on the Point of a Needle? Transcendental Theology Meets Modal Metaphysics”, Mind, 120: 53–81.
Heller, M., 1984, “Temporal Parts of Four Dimensional Objects”, Philosophical Studies, 46: 323-34.
Heller, M., 1990, The Ontology of Physical Objects, Cambridge: Cambridge University Press.
Helm, P., 1979, “Jonathan Edwards and the Doctrine of Temporal Parts”, Archiv für Geschichte der Philosophie, 61: 37–51.
Hinchliff, M., 1996, “The Puzzle of Change”, Philosophical Perspectives, 10: 119-136.
Hirsch, E., 2007, “Physical-object ontology, verbal disputes, and common sense”, Philosophy and Phenomenological Research 70(1), 67-97.
Hofweber, T. and D. Velleman, 2011, “How to Endure”, Philosophical Quarterly, 61: 37–57.
Hofweber, T. and Lange, M., 2017, “Fine’s fragmentalist interpretation of special relativity”, Noûs, 51(4): 871–883.
Hudson, H., 2000, “Universalism, Four-Dimensionalism and Vagueness”, Philosophy and Phenomenological Research, 60: 547–60.
Hudson, H., 2005, The Metaphysics of Hyperspace, Oxford: Oxford University Press.
Hudson, H., 2006, “Simple Statues”, Philo 9: 40-46.
Huemer, M. and Kovitz, B., 2003, “Causation as simultaneous and continuous”, The Philosophical Quarterly, 53: 556–565.
Johnston, M., 1987, “Is There a Problem about Persistence?”, Proceedings of the Aristotelian Society, 61: 107-135.
Jubien, M., 1993, Ontology, Modality, and the Fallacy of Reference, Cambridge: Cambridge University Press.
Kant, I., 1965, Critique of Pure Reason, orig. 1781, trans. N. Kemp Smith. New York: Macmillan Press.
Kleinschmidt, S., 2017, “Refining Four-Dimensionalism”, Synthese, 194: 4623-4640.
Koslicki, K., 2008, The Structure of Objects, Oxford: Oxford University Press.
Leonard, M., 2018, “Enduring Through Gunk”, Erkenntnis, 83: 753-771.
Le Poidevin, R., 1991, Change, Cause and Contradiction, Basingstoke: Macmillan.
Lewis, D. K., 1976, “Survival and Identity”, in Amelie Rorty (ed.), The Identities of Persons, Berkeley, CA: University of California Press, 117–40. Reprinted with significant postscripts in Lewis’s Philosophical Papers vol. I, Oxford: Oxford University Press.
Lewis, D. K., 1986, On the Plurality of Worlds, Oxford: Blackwell.
Lewis, D. K., 1988, “Re-arrangement of Particles: Reply to Lowe”, Analysis, 48: 65–72.
Lombard, L. B., 1986, Events: A Metaphysical Study, London: Routledge.
Lombard, L. B., 1994, “The Doctrine of Temporal Parts and the ‘No-Change’ Objection”, Philosophy and Phenomenological Research, 54.2: 365–72.
Lombard, L. B., 1999, “On the alleged incompatibility of presentism and temporal parts”, Philosophia, 27 (1-2): 253-260.
Lowe, E. J., 1987, “Lewis on Perdurance versus Endurance”, Analysis, 47: 152–4.
Lowe, E. J., 1988, “The Problems of Intrinsic Change: Rejoinder to Lewis”, Analysis, 48: 72-7.
Lowe, E. J., 1995, “Coinciding objects: in defence of the ‘standard account’”, Analysis, 55(3), 171–178.
Mackie, P., 2008, “Coincidence and Identity”, Royal Institute of Philosophy Supplement, 62: 151-176.
Markosian, N., 1998, “Brutal Composition”, Philosophical Studies 92(3): 211-249.
Markosian, N., 2004, “Simples, Stuff and Simple People”, The Monist, 87: 405-428.
McCall, S. and Lowe, J., 2006, “The 3D/4D controversy: a storm in a teacup”, Noûs 40(3), 570-578.
McDaniel, K., 2003, “Against MaxCon Simples”, Australasian Journal of Philosophy, 81: 265-275.
McDaniel, K., 2007a, “Brutal Simples”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, 3: 233–265.
McDaniel, K., 2007b, “Extended Simples”, Philosophical Studies, 133: 131–141.
McTaggart, J. M. E., 1921, The Nature of Existence, I, Cambridge: Cambridge University Press.
McTaggart, J. M. E., 1927, The Nature of Existence, II, Cambridge: Cambridge University Press.
Mellor, D. H., 1981, Real Time, Cambridge: Cambridge University Press.
Mellor, D. H., 1998, Real Time II, London: Routledge.
Merricks, T., 1994, “Endurance and Indiscernibility”. Journal of Philosophy, 91: 165–84.
Merricks, T., 1995, “On the incompatibility of enduring and perduring entities”, Mind, 104 (415): 521-531.
Merricks, T., 1999, “Persistence, Parts, and Presentism”, Noûs 33, 421-438.
Miller, K., 2004, “Enduring Special Relativity”, Southern Journal of Philosophy 42, 349-70.
Miller, K., 2005, “The Metaphysical Equivalence of Three and Four Dimensionalism”, Erkenntnis 62 (1), 91-117.
Noonan, H. and Curtis, B., 2018, “Identity”, The Stanford Encyclopedia of Philosophy (Summer 2018 Edition), Edward N. Zalta (ed.).
Noonan, H., 1999, “Identity, Constitution and Microphysical Supervenience”, Proceedings of the Aristotelian Society, 99: 273-288.
Oderberg, D., 1993, The Metaphysics of Identity over Time. London/New York: Macmillan/St Martin’s Press.
Oderberg, D., 2004, “Temporal Parts and the Possibility of Change”, Philosophy and Phenomenological Research, 69.3: 686–703.
Parsons, J., 2000, “Must a Four-Dimensionalist Believe in Temporal Parts?” The Monist, 83(3): 399–418.
Parsons, J., 2007, “Theories of Location”, in D. Zimmerman (ed.), Oxford Studies in Metaphysics, pp. 201-32.
Pashby, T., 2013, “Do Quantum Objects Have Temporal Parts?”, Philosophy of Science 80(5), 1137-47.
Pashby, T., 2016, “How Do Things Persist? Location in Physics and the Metaphysics of Persistence”, Dialectica 70(3), 269-309.
Quine, W. V. O., 1953, “Identity, Ostension and Hypostasis”, in his From a Logical Point of View, Cambridge, MA: Harvard University Press, 65–79.
Quine, W. V. O., 1960, Word and Object, Cambridge, Mass.: MIT Press.
Quine, W. V. O., 1970, Philosophy of Logic, Englewood Cliffs (NJ): Prentice-Hall.
Quine, W. V. O., 1981, Theories and Things, Cambridge, MA: Harvard University Press.
Rea, M., (ed.), 1997, Material Constitution, Lanham, MD: Rowman & Littlefield.
Rea, M., 1995, “The Problem of Material Constitution”, Philosophical Review, 104: 525–52.
Rea, M., 1998, “Temporal Parts Unmotivated”, Philosophical Review, 107: 225–60.
Rosen, G. and Dorr, C., 2002, “Composition as Fiction”, in Gale, R., (ed.), The Blackwell Guide to Metaphysics, Oxford: Blackwell, pp. 151-174.
Russell, B., 1914. Our Knowledge of the External World, London: Allen & Unwin Ltd.
Russell, B., 1923, “Vagueness”, Australasian Journal of Philosophy and Psychology, 1: 84–92.
Russell, B., 1927, The Analysis of Matter, New York: Harcourt, Brace & Company.
Sattig, T., 2006, The Language and Reality of Time, Oxford: Oxford University Press.
Saucedo, R., 2011, “Parthood and Location”, in K. Bennett and D. Zimmerman (eds.), Oxford Studies in Metaphysics, 6: 223–284.
Schaffer, J., 2009, “On What Grounds What”, in Chalmers, D., D. Manley, and R. Wasserman (eds.), Metametaphysics, pp. 347–383.
Sedley, D., 1982, “The Stoic Criterion of Identity”, Phronesis, 27 (3): 255-275.
Shoemaker, S., 1988, “On What There Are”, Philosophical Topics, 26: 201-23.
Sider, T., 1996, “All the World’s a Stage”, Australasian Journal of Philosophy, 74: 433–53.
Sider, T., 2001, Four-Dimensionalism, Oxford: Oxford University Press.
Sider, T., 2007, “Parthood”, The Philosophical Review, 116: 51–91
Sider, T., 2013, “Against Parthood”, in Bennett, K. and Zimmerman, D.W., (ed.), Oxford Studies in Metaphysics, vol. 8, Oxford: Oxford University Press, pp. 237-93.
Simons, P., 1987, Parts: A Study in Ontology, Oxford: Clarendon Press.
Simons, P., 2000a, “How to Exist at a Time When You Have No Temporal Parts,” The Monist, 83 (3): 419–36.
Simons, P., 2000b, “Continuants and Occurrents”, Proceedings of the Aristotelian Society, Supplementary Volume 74: 59–75.
Simons, P., 2004, “Extended Simples: A Third Way Between Atoms and Gunk”, The Monist, 87: 371-84.
Smart, J. J. C., 1963, Philosophy and Scientific Realism, London: Routledge & Kegan Paul.
Smart, J. J. C., 1972, “Space-Time and Individuals”, in Richard Rudner and Israel Scheffler, eds., Logic and Art: Essays in Honor of Nelson Goodman, New York: Macmillan Publishing Company, pp. 3–20.
Steem, K. I., 2010, “Threedimentionalist Semantic Solution to the Problem of Vagueness”, Philosophical Studies 150 (1): 79-96.
Steward, H., 2013, “Processes, Continuants, and Individuals”, Mind, 122 (487): 781-812.
Stout, R., 1997, “Processes”, Philosophy, 72: 19–27.
Stout, R., 2016, “The Category of Occurrent Continuants”, Mind, 125 (497): 41-62.
Strawson, P. F., 1959, Individuals: An Essay in Descriptive Metaphysics, London: Methuen.
Thomson, J. J., 1983, “Parthood and Identity Across Time”, Journal of Philosophy, 80: 201–20.
Thomson, J. J., 1998, “The Statue and the Clay”, Noûs, 32: 148–73.
Torre, S., 2015, “Restricted Diachronic Composition and Special Relativity”, British Journal for the Philosophy of Science, 66(2), 235-55.
van Fraassen, B. C., 1970, An Introduction to the Philosophy of Time and Space, Columbia University Press.
van Inwagen, P., 1981, “The Doctrine of Arbitrary Undetached Parts”, Pacific Philosophical Quarterly, 62: 123–137.
van Inwagen, P., 1988, “How to Reason about Vague Objects”, Philosophical Topics, 16: 255–84.
van Inwagen, P., 1990a, “Four-Dimensional objects”, Noûs, 24 (2): 245-255.
van Inwagen, P., 1990b, Material Beings, Ithaca, NY: Cornell University Press.
van Inwagen, P., 2000, “Temporal parts and identity through time”, Monist, 83 (3): 437-459.
Varzi, A., 2003, “Naming the Stages”, Dialectica, 57: 387–412.
Varzi, A., 2007, “Promiscuous Endurantism and Diachronic Vagueness”, American Philosophical Quarterly, 44: 181-189.
Varzi, A., 2016, “Mereology”, The Stanford Encyclopedia of Philosophy (Spring 2019 Edition), Edward N. Zalta (ed.).
Viebahn, E., 2013, “Counting Stages”, Australasian Journal of Philosophy, 91: 311-324.
Wassermann, R., 2016, “Theories of Persistence”, Philosophical Studies 173, 243-50.
Whitehead, A. N., 1920, The Concept of Nature, Cambridge: Cambridge University Press.
Wiggins, D., 1968, “On Being in the Same Place at the Same Time”, Philosophical Review, 77: 90–5.
Wiggins, D., 1979, “Mereological Essentialism”, Grazer Philosophische Studien, 7: 297-315.
Wiggins, D., 1980, Sameness and Substance, Oxford: Basil Blackwell.
Wilson, J. M., 2013, “A determinable-based account of metaphysical indeterminacy”, Inquiry, 56: 359-385.
Wilson, J., Calosi, C., 2019, “Quantum metaphysical indeterminacy”, Philosophical Studies, 176: 2599–2627.
Zimmerman, D., 1996, “Could Extended Objects Be Made Out of Simple Parts? An Argument for ‘Atomless Gunk’”, Philosophy and Phenomenological Research, 5 (1): 1–29.
Zimmerman, D., 2006, Oxford Studies in Metaphysics, Volume 2, New York: Oxford University Press.

Author Information

Damiano Costa
Email: damiano.costa@usi.ch
University of Italian Switzerland (Università della Svizzera Italiana, University of Lugano)
Switzerland

Associationism in the Philosophy of Mind

Association dominated theorizing about the mind in the English-speaking world from the early eighteenth century through the mid-twentieth and remained an important concept into the twenty-first. This endurance across centuries and intellectual traditions means that it has manifested in many different ways in different views of mind. The basic idea, though, has been constant: Some psychological states come together more easily than others, and one factor in explaining this connection is prior pairing.

Authors sometimes trace the idea back to Aristotle’s brief discussion of memory and recollection. Association got its name, “the association of ideas,” in 1700, in John Locke’s Essay Concerning Human Understanding. British empiricists following Locke picked up the concept and built it into a general explanation of thought. In the resulting associationist tradition, association was a relation between imagistic “ideas” in the trains of conscious thought. The rise of behaviorism in the early twentieth century brought with it a reformulation of the concept. Behaviorists treated association as a link between physical stimuli and motor responses, omitting any intervening “mentalistic” processes. However, they gave association just as central a role as the empiricist associationists had. In later twentieth-century and early twenty-first-century work, association is variously treated as a relation between functionally defined representational mental states such as concepts, between “subrepresentational” states (in connectionism), or between activities in parts of the brain such as neurons, neural circuits, or brain regions. As a relation between representational states, association is viewed as one process among many in the mind; however, as a relation between subrepresentational or neural activities, it is again often viewed as a general explanation of thought.

Given this variety of theoretical contexts, associationism is better viewed as an orientation or research program than as a theory or collection of related theories. Nonetheless, there are several shared themes. First, there is a shared interest in sequences of psychological states. Second, though the laws of association vary considerably, association by contiguity has been a constant. The idea of association by contiguity is that each pairing of psychological states strengthens the association between them, increasing the ease with which the second state follows the first. In its simplest form, this can be thought of as akin to a footpath: Each use beats and strengthens the path. Third, this carries with it a more general emphasis on learning and a tendency to posit minimal innate cognitive structure.

The term “association” can refer to the sequences of thoughts themselves, to some underlying connection or disposition to sequence, or to the principle or learning process by which these connections are formed. This article uses the term to refer to underlying connections unless otherwise specified, as this is the most common use and the one that unites the others.

This article traces these themes as they developed over the years by presenting the views of central historical figures in each era, focusing specifically on their conception of the associative relation and how it operates in the mind.

1. The Empiricist Heyday (1700-1870s)
2. Fractures in Associationism (1870s-1910s)
3. Behaviorism (1910s-1950s)
4. After the Cognitive Revolution (1950s-2000s)
5. Ongoing Philosophical Discussion (2000s-2020s)
a. Dual-Process Theories and Implicit Bias
b. The Association/Cognition Distinction
6. Conclusion
7. References and Further Reading

1. The Empiricist Heyday (1700-1870s)

Associationism as a general philosophy of mind arguably reached its pinnacle in the work of the British Empiricists. These authors were explicit in their view of association as the key explanatory principle of the mind. Associationism also had a massive impact across the intellectual landscape of Britain in this era, influencing, for instance, ethics (through Reverend John Gay, Hume, and John Stuart Mill), literature, and poetry (see Richardson 2001).

Association in this tradition was called upon to solve two problems. The first was to explain the sequence of states in the conscious mind. The thought here is that there are some reliable patterns to the sequences which must be explained. These were explained by the “laws of association.” The basic procedure was, first, to identify sequences or patterns in sequence. Hobbes’s discussion of “mental discourse” demonstrates this interest, inspiring later associationist theories of mind and providing a famous example:

For in a discourse of our present civil war, what could seem more impertinent than to ask (as one did) what was the value of a Roman penny? Yet the coherence to me was manifest enough. For the thought of the war introduced the thought of the delivering up the king to his enemies; the thought of that brought in the thought of the delivering up of Christ; and that again the thought of the 30 pence which was the price of treason; and thence easily followed that malicious question; and all this in a moment of time, for thought is quick. (Leviathan, chapter 3)

Once the sequences have been identified, the next step is to classify them by the relations between their elements. For example, two ideas may be related by having been frequently paired, or may be similar in some way. This section and the next use “suggestion” to refer to particular incidents of sequence and “association” to refer to the underlying disposition. The second problem was to explain the generation of “complex” ideas out of “simple” ideas, the latter often viewed as a kind of psychological atom. The empiricist project requires explaining how all knowledge could be generated from experience, and some authors took the same associative relation to do this work. This was perhaps the most common way of doing so, though it was not universal.

Associationist authors then show how associations of the various sorts that they posit can or cannot explain various phenomena. For example, belief may be treated as simply a strong association. Abilities like memory, imagination, or even sometimes reason can be treated as simply different kinds of associative sequence. As empiricists, most eschew innate knowledge and tend to limit innate mental structure relative to competing traditions, though the claim that the mind is truly a blank slate would oversimplify. Their opponents in the Scottish School, by contrast, treat each of these abilities as manifesting a distinct, innate faculty, and posit innate beliefs in the form of “principles of common sense.”

a. John Locke (1632-1704)

John Locke laid the groundwork for empiricist associationism and coined the term “association of ideas” in a chapter he added to the fourth edition of his Essay Concerning Human Understanding (1700). He sets up the Cartesian notion of innate ideas as a primary opponent and asserts that experience can be the only source of ideas, through two “fountains” (book 2, chapter 1): “sensation,” or experience of the outside world, and “reflection,” or experience of the internal operations of our mind. He distinguishes between “simple” ideas, such as the idea of a particular color, or of solidity, and “complex” ideas, such as the idea of beauty or of an army. Simple ideas are “received” in experience through sensation or reflection. Complex ideas, on the other hand, are created in the mind by combining two or more simple ideas into a compound.

In his chapter on association of ideas (book 2, chapter 33), Locke emphasizes the ways that different ideas come together. As he puts it:

Some of our ideas have a natural correspondence and connexion one with another: it is the office and excellency of our reason to trace these . . . Besides this, there is another connexion of ideas wholly owing to chance or custom. Ideas that in themselves are not all of kin, come to be so united in some men’s minds, that . . . the one no sooner at any time comes into the understanding, but its associate appears with it.

His discussion in this chapter focuses on the connections based on chance or custom and describes them as the root of madness. Associations as described here are formed by prior pairings and strengthened passively as habitual actions or lines of thought are repeated.

Thus, despite the significance of his work in setting the stage for later associationists, Locke does not treat association as explaining the mind in general. He treats it as a failure to reason properly, and his interest in it is not only explanatory but normative. For these reasons, some have questioned whether one ought to treat Locke as an associationist, on the grounds that associationists viewed association as the central explanatory posit in the mind (for example, see Tabb 2019). Where one lands on this question seems to depend on how the term is used. After all, Locke’s description of the formation of complex ideas by combining simple ideas was counted as a kind of association by many later associationists. The key, for Locke, is that association is a passive process, while the mind is more active in other processes. The passive nature of association will return as a criticism of associationism; see also Hoeldtke (1967) for a discussion of the history of this line of thought in British psychiatry.

b. David Hume (1711-1776)

David Hume presented arguably the first attempt to understand thought generally in associative terms. He first lays out these views in A Treatise of Human Nature (1739) and then reiterates them in An Enquiry Concerning Human Understanding (1748). According to Hume, the trains of thought are made up of ideas, which are basically images in the mind. Simple ideas, such as a specific color, taste, or smell, are copies of sensory impressions. Thoughts in general are built from these simple ideas by association.

He begins his discussion of association in the Enquiry:

It is evident that there is a principle of connexion between the different thoughts or ideas of the mind, and that, in their appearance to the memory or imagination, they introduce each other with a certain degree of method and regularity. (Enquiry, section III)

His use of the term is not limited to irrationality and madness, as Locke’s was; he applies it to trains of thought generally. He asks what relations might explain the observed regularities and claims that there are three: resemblance, contiguity in time or place, and cause or effect. He mentions contrast or contrariety as another candidate in a footnote (footnote 4, section III), but rejects it, arguing it is a mixture of causation and resemblance. Association also explains the combination of simple ideas into complex ideas.

Hume’s inclusion of “cause or effect” as one of the primary categories of association might be thought incongruous with his general view on causality. While the best understanding of association by cause or effect has been controversial, Hume treats it as an independent principle of association, and it can be understood as such, and not, for example, as just a strong association by contiguity. He argues that we gain the impression of a causal power by coming to expect, in the imagination, the effect with the presentation of the cause. As a general matter, he suggests that we cannot feel the relations between sequential ideas, but we can uncover them with imagination and reasoning, though these relations may be different from the factors responsible for association.

Just how generally Hume applied his conception of association may also be subject to interpretation. On the one hand, his discussions of induction, probability, and miracles in the Enquiry suggest that he views association, or habit, as the sole basis of our reasoning about the world, and as such, a normatively adequate means for doing so. On the other hand, he arguably posits several other principles of mind throughout his work. For example, he often treats the imagination as a separate capacity, and he discusses several moral sentiments that would seem to require separate principles. He also expresses uncertainty about the completeness of his list of laws of association. Moreover, he characteristically avoids claims about the ultimate foundation of human nature. In the Treatise, he says: “as to its [association’s] causes, they are mostly unknown, and must be resolv’d into original qualities of human nature which I pretend not to explain” (pg. 13). It may be that, despite its centrality in his philosophy, Hume did not view association as a bedrock principle or cause of thought, though that view later became common, due in large part to the work of David Hartley.

c. David Hartley (1705-1757)

Hartley’s Observations on Man (1749) was published just after Hume’s Enquiry, though Hartley claimed to have been thinking about the power of association for about 18 years. Hartley’s discussion of association is more focused and sustained than Hume’s because of his explicitly programmatic goals. Following Newton’s axiomatization of physics, Hartley sought to axiomatize psychology on the twin principles of association and vibration. Vibrations, in Hartley’s system, are the physiological counterpart of associations. As association carries the mind from idea to idea, vibrations in the nerves carry sensations to the brain and through it. He references physical vibrations as causing mental associations (pg. 6), but then expresses dissatisfaction with this framing and uncertainty about the exact association-vibration relation (pp. 33-34).

The idea is that external stimuli act on nerves, inciting infinitesimally small vibrations in invisible particles of the nerve. These vibrations travel up the nerves, and upon reaching the brain, cause our experience of sensations. If a particular frequency or pattern of vibration is repeated, the brain gains the ability to incite new vibrations like them. This is, effectively, storing a copy of the idea for later thought. These “ideas of sensation” are the elements from which all others are built. Ideas become associated when they are presented at the same time or in immediate succession, meaning that the first idea will bring the second to mind, and, correspondingly, their vibrations in the brain will follow in sequence.

Hartley, like Hume, viewed association as both the principle by which ideas came to follow one another and by which simple ideas were combined into complex ideas: A complex idea is the end point of the process of strengthening associations between simple ideas. Unlike Hume, though, Hartley only posited association by contiguity and did not allow for any other laws of association.

He was also, as noted, explicit in his goal of capturing psychology with the principle. He argues that supposed faculties like memory, imagination, and dreaming, as well as emotional capacities like sympathy, are merely applications of the associative principle. He also emphasized associations between sensations, ideas, and motor responses. For instance, the tuning of motor responses by association explains how we get better at skilled activities with practice. He recognizes that the resulting picture is a mechanical picture, but he does not see this as incompatible with free will, appropriately conceived.

Hartley’s most important contribution is the very project of describing an entire psychology in associative terms. This animated the associationist tradition for the next hundred years or so. In setting up his picture, he was also the first to connect association to physiological mechanisms. This became important in the work of the later empiricist associationists, and in reformulations of associative views after the cognitive revolution discussed in section 4.

d. The Scottish School: Reid, Stewart, and Hamilton

The Scottish Common Sense School, led by Thomas Reid (1710-1796) and subsequently Dugald Stewart (1753-1828) and William Hamilton (1788-1856), was the main competition to associationism in Britain. Their views are instructive in articulating the role and limits of the concept, as well as in setting up Brown’s associationism, discussed below. The Scottish School differed from the associationists in two main ways. Firstly, they took humans to be born with innate knowledge, which Reid called “principles of common sense.” Secondly, they argued for a faculty psychology: They took the mind to be endowed with a collection of distinct “powers” or capacities such as memory, imagination, conception, and judgment. The associationists, in contrast, usually treated these as different manifestations of the single principle of association. Nevertheless, the Scottish School did provide a role for associations.

Reid takes the train of conscious thoughts to be an aggregate effect of the perhaps numerous faculties active at any given time (Essays on the Intellectual Powers of Man, Essay IV, chapter IV). He does allow that frequently repeated trains might become habitual. He treats habit, then, as another faculty that makes these sequences easier to repeat. Associations, or dispositions for certain trains to repeat, are an effect of the causally prior faculty of habit.

Stewart reverses the causal order between association and habit (see Mortera 2005). For Stewart, association is a distinct operation of the mind, which produces mental habits. Association plays a more important role in his system than in Reid’s. He does retain other mental faculties, though, which are responsible for at least the first appearance of any particular sequence in thought. The mistake the associationists make, on his view, is in thinking that they have traced all mental phenomena to a single principle (1855, pp. 11-12). He admits it is possible that philosophers may someday discover the ultimate principle of psychology but doubts that the associationists have done so. Stewart is responding specifically to Joseph Priestley, who edited a famous abridged edition of Hartley’s work.

William Hamilton’s contributions to the concept of association are less direct. He provides the first history of the concept of association of ideas in his notes on The Works of Thomas Reid (1872, Supplemental Dissertation D). Hamilton’s own views also inspired later work by John Stuart Mill in his Examination of Sir William Hamilton’s Philosophy (1878).

e. Thomas Brown (1778-1820)

Thomas Brown occupies a unique position in the history of associationism. His main work, Lectures on the Philosophy of the Human Mind (1820), was published after his death at the age of 42. On the one hand, he was a student of the Scottish School, having studied under Dugald Stewart. On the other hand, he was an ardent associationist, reducing all of the supposed faculties to association. Brown explicitly casts his project as one of identifying and classifying the sequences of “feelings” in the mind, which was his general term for mental states, including ideas, emotions, and sensations.

Arguably, his philosophy of mind is more Humean than Hume’s, in that he extends Hume’s arguments against necessary connections between cause and effect in the world to the mind. He argues that an association is not a “link” between ideas that explains their sequence; it is the sequence itself. The idea of an associative link is vacuous and explanatorily circular. Brown actually argues for the term “suggestion” over “association,” though he uses the terms interchangeably when he fears no misinterpretation (Lecture 40). He differentiates two kinds of suggestion: simple suggestion, in which feelings simply follow in sequence, and relative suggestion, in which the relationship between sequential ideas is felt as well. Simple suggestion is responsible for capacities like memory and imagination, while relative suggestion allows capacities like reason and judgment.

Brown also differs from the standard associationist picture in that he, like Reid, embraces innate knowledge, which he calls “intuitive beliefs.” His prime example is belief in personal identity over time. Another is that “like follows like,” which can serve as the basis for the associating principle. He expresses an expectation that all associations will eventually be shown to be instances of association by contiguity, but does not think this has been shown yet. He thus finds it best to “avail ourselves of the most obvious categories” of contiguity, similarity, and contrast (Lecture 35).

Brown introduces several “secondary” laws of association, which help predict which of the available associations is likely to be followed in any given case (Lecture 37). He lists nine, including the liveliness of the feelings associated, the frequency with which they have been paired, recency, and differences arising from emotional context. While the members of subsequent lists changed, the introduction of secondary laws of association may have been Brown’s most enduring legacy.

In common with those associationists above, Brown emphasizes a role for association in the formation of complex ideas out of simple ideas. However, he views ideas as states of the mind itself, not objects in the mind—a mistake he attributes primarily to Locke. As a result, he argues that it is metaphysically impossible that complex ideas are literally built of simple ideas, since the mind can only occupy one state at a time. He does argue that it is useful to think of simple ideas as continuing in a “virtual coexistence” in complex ideas, but the focus here is an historical/etiological story of how complex ideas came to be, rather than a literal decomposition.

Despite his idiosyncratic views, Brown identified his position as associationist, and it was accepted as such by the tradition. Though his work has been largely forgotten, it was very influential in the United Kingdom and United States in the years following its publication. Brown’s place in the associationist tradition strains standard interpretations of the tradition and what, if anything, unites it. After all, he denies the central associationist posit, the associative link, and allows innate knowledge.

f. James Mill (1773-1836) and John Stuart Mill (1806-1876)

James Mill’s view rivals Hartley’s as a candidate prototypical associationist picture of mind. Mill presents his views in his Analysis of the Phenomena of the Human Mind (originally published 1829, cited here from 1869; this edition includes comments from John Stuart Mill and Alexander Bain).

Like Hartley, James Mill argues that contiguity is the only law of association. Specifically, James Mill argues that similarity is just a kind of contiguity. The claim is that we are used to seeing similar objects together, as sheep tend to be found in a flock, and trees in a forest. In his editorial comments in the 1869 edition, John Stuart Mill calls this “perhaps the least successful attempt at a generalisation and simplification of the laws of mental phenomena, to be found in the work” (pg. 111). For his part, James Mill does not attribute much significance to the question, saying: “Whether the reader supposes that resemblance is, or is not, an original principle of association, will not affect our future investigations” (pg. 114).

In discussing the associative relation itself, James Mill distinguishes synchronous and successive association. Some stimuli are experienced simultaneously, as in those emanating from a single object, and others successively, as in a sequence of events. The resulting ideas are associated correspondingly. Synchronous ideas arise together and themselves constitute complex ideas. Thus, a complex idea, in James Mill’s system, is a literal composite of simpler ideas. Successively associated ideas will arise successively. Of successive association, James Mill remarks that it is not a causal relation, though he does not elaborate on what he means by this (pg. 81). He describes three different ways that the strength of an association can manifest: “First, when it is more permanent than another: Secondly, when it is performed with more certainty: Thirdly when it is performed with more facility” (pg. 82). Adapting some of Brown’s secondary laws, he argues that strength is caused by the vividness of the associated feelings and frequency of the association.

James Mill limits his discussion of association to mental phenomena, though he recognizes the significance of physiology for motor movements and reflexes. Within that domain, he reduces the various “active” and “intellectual” powers of the mind to association. For instance, conception, consciousness, and reflection simply refer to the train of conscious ideas itself. Memory and imagination are particular segments of the trains. Motives are associations between actions and the positive or negative sensations which they produce. The will is also reduced to an association between various ideas and muscular movements. Thus, even the active powers are mechanistic. Belief is just a strong association. Ratiocination, as in syllogistic reasoning, simply chains associations. Consider the syllogism: “All men are animals: kings are men: therefore kings are animals” (pg. 424). This produces the compound association “kings – men – animals.” For James Mill, this compound association includes an intermediate that remains in place but is passed over so quickly that it can be imperceptible, so that the association appears to be simply “kings – animals,” in much the same way that complex ideas still include all of the simpler ideas. This sets up a noteworthy disagreement between James and his son, John Stuart Mill.

John Stuart Mill argues, against his father, that complex ideas are new entities, not mere aggregates of simple ideas, and that intermediate ideas can drop out of sequences like that above. In general, John Stuart Mill analogizes the association of ideas to a kind of chemistry, where a new compound has new properties separate from its constituent elements (A System of Logic, chapter IV). In James Mill’s view of association, ideas retain their identity in combination, like bricks in a wall.

John Stuart Mill’s views on association are spread through several texts (see Warren 1928, pp. 95-103, for a summary of his views), and his psychological aspirations are not as imperial or systematic as his father’s. This is evident partly in his lack of a sustained treatment, but also in the phenomena he does not attribute to association. For instance, he does not treat induction as an associative phenomenon, breaking with Hume (see A System of Logic). Similarly, breaking with his father, he does not view belief as simply a strong association, arguing that it must include some other irreducible element (notes in James Mill’s Analysis, pg. 404). When John Stuart Mill does allude to a systematic development of association, he usually defers to our next subject, Alexander Bain.

g. Alexander Bain (1818-1903)

Alexander Bain presents a sophisticated version of empiricist associationism. His main work on the topic comes in The Senses and the Intellect (originally published 1855, cited here from 3rd ed., 1868). Bain’s early work was developed and published with significant help from his close friend and mentor, J. S. Mill, and it went on to become a standard text.

Bain differs most from previous associationists in the role he grants to instincts. By “instincts,” he means reflex actions, basic coordinated movement patterns such as walking and simple vocalization, and the seeds of volition (the potential for spontaneous action). This discussion is unique, first, in that he separates these out from the domain explained by association. He takes instincts to be “primordial,” inborn, and unlearned. Second, he opens his text with a discussion of basic neuroanatomy and function and explains instincts largely by appeal to the structure of the nervous system and the flow of nervous energy. This discussion was aided in part by recent progress in physiology, but also by an avowed interest in bringing physiology and psychology in contact.

Bain, nonetheless, takes association to be the central explanatory principle for phenomena belonging to the intellect. By “intellect,” he has in mind phenomena one might call thought, such as learning, memory, reasoning, judgment, and imagination. When he switches to his discussion of the intellect, his physiological discussions drop out, and his method is entirely introspective. As Robert Young notes: “his work points two ways: forward to an experimental psychophysiology, and backward to the method of introspection” (1970, pg. 133).

Bain never makes any distinction between simple and complex ideas, and he discusses association in successive terms. He also does not restrict association to ideas and argues that the same principles can combine, sequence, and modify patterns of movement, emotions, sensations, and the instincts generally.

He admits three fundamental principles of association: similarity, contiguity, and contrast. Contiguity is the basic principle of memory and learning, while similarity is the basic principle of reasoning, judgment, and imagination. Nonetheless, the three are interdependent in complex ways. For instance, similarity is required for contiguity to be possible: Similarity is required for us to recognize that this sequence is similar enough to a former sequence for them to both strengthen the same association by contiguity. The principle of contrast has a more complex role. On the one hand, it is fundamental to the stream of consciousness in the first place. We would not recognize changes in consciousness as changes without this principle. As such, we cannot be conscious of anything as something without recognizing that there is something else it is not: If red were the only color, we would simply not be conscious of color. Without contrast, the other principles would be impossible. On the other hand, contrast can also drive sequences itself, but only when properly scaffolded by similarity or contiguity. Similarity is necessary for association by contrast because contrast is always within a kind, and similarity is necessary for recognition of that original kind; he notes, “we oppose a long road to a short road, we do not oppose a long road to a loud sound” (1868, pg. 567). In many particular cases, contrast can be driven by contiguity, as contrasting concepts are frequently paired: up and down, pain and pleasure, true and false, and so on. Experiences of contrast themselves, he notes, often arouse emotional responses, as in works of poetry and literature. In other work, however, Bain does not seem to find the question of whether contrast is a separate principle of association to be all that interesting, since transitions based on contrast are very rare, and many instances of contrast-based associations are in fact based in contiguity (1887).

He discusses two other kinds of association: compound association and constructive association (in his first edition, he lists these as additional principles of association, but drops that categorization by the third). Compound association includes the ways associations can interact. For instance, if there are several features present that all remind us of a friend, all of those associative strengths can combine to make it more likely that we think of the friend. Under “constructive association,” he groups active processes of combining ideas, as in imagination, creativity, and the formation of novel sentences.

h. Themes and Lessons

Surveying these views uncovers significant diversity, even among the “pure” associationists found in the empiricist tradition. Most abstractly, the authors differed in their metaphysics. Brown was an avowed dualist. Hartley expressed uncertainty on the mind/brain relation but posited a physiological counterpart to association. Hume and Reid refused to speculate on metaphysics. Precursors include George Berkeley, an idealist, and Thomas Hobbes, a materialist.

The topics of debate within associationism itself included, first, the proposed list of laws of association. While all of the authors mentioned took association by contiguity to be among them, Hume included resemblance and cause or effect, Brown and Bain included similarity and contrast, and Hartley and James Mill included no others. It is common to view associationism as defined by the reliance on association by contiguity. While contiguity was generally posited, this is an oversimplification. It misses not only the diversity in laws posited, but also the attitude authors take towards those laws. Many central associationists, including Hume, Brown, James Mill, and Bain, either describe their classification as provisional or express some willingness to defer. Overall, Stewart’s discussion of the question of how far one traces the causal/explanatory thread captures the general situation. The starting point is observed sequences of conscious thought, and the question is how far one can go in finding the principles that explain those sequences.

Authors also disagreed on whether the process, force, or principle combining simple ideas into complex ideas (simultaneous association) was the same as that producing the sequences of ideas through the mind (successive association). All of the theorists discussed here accept successive association, while simultaneous association is more controversial. Brown disavows simultaneous association, while Bain simply ignores it. Even proponents of simultaneous association disagree on how it operates, as evidenced in John Stuart Mill’s disagreement with his father on “mental chemistry.” Questions like this, about how more complex ideas are formed, remain at issue (for example, see Fodor and Pylyshyn 1988 and Fodor 1998). The formation of abstract ideas was a particularly difficult version of this problem through much of the tradition; it is much easier to see how ideas formed through sensory impressions can refer to concrete objects. Simultaneous association could provide an answer according to which abstract ideas include all of the particulars, while others take abstract ideas to simply include a particular feature, or simply a name for a feature, by, for instance, examining a feeling of similarity between two ideas of particulars.

Finally, there is disagreement about which psychological elements associations are supposed to hold between. Discussion of association often latches onto Locke’s term “association of ideas,” ignoring views that take stimuli and motor movements (most of the authors above, including arguably Locke himself as he describes a visual context improving a dance; Essay, book 2, chapter 33, section 16), reflexes, and instincts (Bain) to be associable in just the same way. Even when discussing association as a relation between ideas, there is disagreement on the nature of ideas and their relationship to mind. For instance, Brown criticizes Locke for treating ideas as independent objects in the mind, rather than states of the mental substance.

The diversity in associationist views suggests that associationism is better viewed as a research program with shared questions and methods, rather than a shared theory or set of theories (Dacey 2015). Such an approach makes better sense of similarities and differences in the views. Hume, Hartley, and James Mill make good prototypes for associationism, but one misses much if one takes any particular author to speak for the tradition as a whole.

2. Fractures in Associationism (1870s-1910s)

In the late nineteenth and early twentieth centuries, the associationist tradition began to fracture. Several factors combined to shape this overall trend. Important changes in the intellectual landscape included the arrival of evolutionary theory, the rise of experimental psychology—bringing with it psychology’s separation from philosophy as a field—and increasing understanding of neurophysiology. At the same time, several criticisms of the pure associationist philosophies became salient. Through this era, the basic conception of association was still largely preserved from the previous one: It is a relation between internal elements of consciousness. By this time, materialism had largely taken over, and most authors here view association as having some neural basis, even if association itself is a psychological relation.

Associationism fractured in this era because the trend was to disavow the general, purely associationist program described in the last section, even if authors still saw association as a central concept. Thus, while associationism lost a shared outlook and purpose, there was still much progress made in testing the possibilities and limits of the concept of association.

a. Herbert Spencer (1820-1903)

Herbert Spencer’s philosophy was framed by a systematic worldview that placed evolutionary progress, as he conceived it, at its core. His psychology was no different. His Principles of Psychology was first published in 1855, four years before On the Origin of Species, but was substantially rewritten by the third edition, which is the focus here (1880, cited here from 1899). By this point, the work had been folded into his System of Synthetic Philosophy, a ten-volume set treating everything from physics to psychology to social policy (Principles of Psychology became volumes 4 and 5). Spencer’s conception of evolution was quite different from later views. Firstly, Spencer believed in the inheritance of acquired traits. Secondly, and partly as a result of this, Spencer viewed evolution as a universal force for progress; species literally get better as they evolve.

The basic units of consciousness for Spencer are nervous shocks, or individual bursts of nervous activity. Thus, the atoms in his picture are much smaller than what we might usually call thoughts or ideas, and all of the psychological activities he describes are assumed to be localizable within the nervous system. Spencer distinguishes between “feelings” proper and relations between feelings. Feelings include what would previously have been called sensations and ideas, as well as emotions. They can exist in the mind independently. Relations are felt, in that they are present in consciousness, but they can only exist between two feelings. For instance, we might feel a relation of relative spatial position between objects in a room as we scan or imagine the scene. Both feelings and relations are associable.

The primary kind of association is that between a particular feeling and members of its same kind. Thus, similarity is the fundamental law of association, both with feelings and relations. A particular experience of red will revive a feeling corresponding to other red feelings. Spencer seems to think that the resulting “assemblages” do not constitute new feelings, effectively siding with James Mill over John Stuart. “Revivability” varies with the vividness of the reviving feeling, the frequency with which feelings have occurred together, and with the general “vigor” of the nervous tissues. This last variable includes the particular claim that a long time spent contemplating one subject will deplete resources in the corresponding bits of brain tissue, making related ideas temporarily less revivable. Relations are generally more revivable, and so more associable, than feelings. Relations can, themselves, aggregate into classes, and revive members of the class. As a result, many relations may arise in mind between two feelings, though some, perhaps most, of these will pass too quickly to be noticed.

Spencer takes the laws of association to simply be manifestations of certain relations between feelings, which are actually associated based on similarity. For instance, he takes association by contiguity to be a relation of “likeness of relation in Time or in Space or in both” (pg. 267), which is just a kind of similarity. He does not seem to see any problem in making this claim, while still asserting frequency of co-occurrence as an independent law of revivability above. Moreover, when two feelings arrive in sequence in the mind, they are always mediated by at least two relations: one of difference, as the feelings must not be identical, and one of coexistence or sequence.

Spencer claims to have squared empiricist and rationalist views of mind using evolution (pg. 465). He combines the law of frequency with his view on the heritability of acquired traits to argue that associations learned by members of one generation can be passed on to the next. The empiricists are right that knowledge comes from learning, but the rationalists are right that we are individually born with certain frameworks of understanding the world. In early animals, simple reflexes were so combined to create more flexible instincts. Some relations in the world, like those of space and time, are so reliably encountered that their inner mental correspondents are fixed through evolutionary history. Thus, human beings are born with certain basic ideas, like those of space and time. The resulting view is one in which thought is structured by association, but associations are accrued across generations (see Warren 1928, pg. 132).

b. Early Experimentalists: Galton, Ebbinghaus, and Wundt

Francis Galton (1822-1911), Darwin’s polymath cousin, published the first experiments on association under the title Psychometric Experiments in 1879. He ran his experiments on himself; the method was to work through a list of 75 words, one by one, and record the thoughts each suggested and the time it took to form each associated thought clearly. He did so four different times, in different contexts at least a month apart. He reports 505 associations over 600 seconds total, for a rate of about 50 associations per minute. Of the 505 ideas formed, 289 were unique, with the rest repetitions. He emphasizes that this demonstrates how habitual associations are. He notes that ideas connected to memories from early in his life were more likely to be repeated across the four presentations of the relevant word. This he takes to show that older associations have achieved greater fixity.
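The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers reported in the paragraph above; the count of 300 word presentations is derived here on the assumption that all 75 words were presented in each of the four sessions, and the variable names are illustrative.

```python
# Figures as reported in the paragraph above.
WORDS = 75
SESSIONS = 4
TOTAL_ASSOCIATIONS = 505
TOTAL_SECONDS = 600
UNIQUE_ASSOCIATIONS = 289


def rate_per_minute(count, seconds):
    """Convert a count observed over `seconds` into a per-minute rate."""
    return count / seconds * 60


# Assumption: every word was presented once in each session.
presentations = WORDS * SESSIONS

# Associations formed per minute of recorded time.
rate = rate_per_minute(TOTAL_ASSOCIATIONS, TOTAL_SECONDS)

# Associations that repeated an idea already recorded for the same word.
repeated = TOTAL_ASSOCIATIONS - UNIQUE_ASSOCIATIONS
repeat_share = repeated / TOTAL_ASSOCIATIONS

print(f"presentations: {presentations}")
print(f"rate per minute: {rate:.1f}")
print(f"repeated associations: {repeated} ({repeat_share:.0%} of the total)")
```

On these figures, somewhat more than two fifths of the recorded associations were repetitions, which is the basis for the remark about how habitual associations are.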

Among his pioneering studies on memory, Hermann Ebbinghaus (1850-1909) tested capacity for learning sequences of nonsense syllables, arguably the first test of the learning of associations (1885). He found, using himself as his subject, that the number of repetitions required to learn a sequence increased with the length of the sequence. He also found that rehearsing a sequence 24 hours before learning it brought savings in learning. The savings increased with increasing number of rehearsal repetitions.
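The notion of savings can be made concrete with a small sketch. The percentage formula below is the conventional way of expressing savings and is not stated in this article; the repetition counts are invented for illustration and are not Ebbinghaus’s data.

```python
def savings_percent(baseline_reps, reps_after_rehearsal):
    """Percentage of learning repetitions saved relative to an unrehearsed baseline."""
    if baseline_reps <= 0:
        raise ValueError("baseline_reps must be positive")
    return 100 * (baseline_reps - reps_after_rehearsal) / baseline_reps


# Hypothetical data: rehearsals on day one -> repetitions needed to learn
# the same sequence 24 hours later. The key 0 is the unrehearsed baseline.
hypothetical_reps_needed = {0: 20, 8: 16, 16: 12, 32: 8}

baseline = hypothetical_reps_needed[0]
savings_by_rehearsals = {
    rehearsals: savings_percent(baseline, needed)
    for rehearsals, needed in hypothetical_reps_needed.items()
}

for rehearsals, saved in savings_by_rehearsals.items():
    print(f"{rehearsals:2d} rehearsals -> {saved:.0f}% savings")
```

With these made-up numbers, savings rise as the number of earlier rehearsals rises, which is the qualitative pattern the paragraph describes.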

Though the first experimental psychology labs were established in Germany, where the concept of association never reached the significance it had in Britain, association remained a target of early experiments, directly or indirectly (see Warren 1928, chapter 8 for a fuller survey; see also sections on Calkins and Thorndike below). These studies established association as a controllable, measurable target for experiment, even among those who did not subscribe to associationism as a general view. This role arguably sustained association as a central concept of psychology into the twenty-first century.

Wilhelm Wundt (1832-1920) provides perhaps the most complete theoretical treatment of association among the early experimentalists (1896, section 16; 1911, chapter 3). While association plays an important role in his system, he objects that associationists leave no place for the will among the passive processes of association. Thus, he distinguishes between a passive process of combination he calls association and an active process of combination he calls “apperception.” These ideas were developed into structuralism in America by Wundt’s student E. B. Titchener.

c. William James (1842-1910)

William James is not generally considered an associationist, and he attacks the associationists at several points in his Principles of Psychology (originally published 1890, cited here from 1950). However, at the close of his chapter on association (chapter XIV), he professes to have maintained the body of the associationist psychology under a different framing. His framing is captured as follows:

Association, so far as the word stands for an effect, is between THINGS THOUGHT OF—it is THINGS, not ideas, which are associated in the mind. We ought to talk of the association of objects, not of the association of ideas. And so far as association stands for a cause, it is between processes in the brain—it is these which, by being associated in certain ways, determine what successive objects shall be thought. (pg. 554)

James notes here an ambiguity in the term “association”: that between association as an observed sequence of states in the conscious mind (an effect) and association as the causal process driving those sequences. His handling of each side of the ambiguity highlights, in turn, his major criticisms of associationist psychologies before him.

His claim that we ought to talk of association of objects rather than association of ideas stems from his criticism of the associationist belief that the stream of consciousness is made up of discrete “ideas.” James shares with the associationists an emphasis on the stream of consciousness: He takes it to be the first introspective phenomenon of analysis for psychology (chapter IX). However, his introspective analysis of the stream of consciousness reveals it to be too complicated to be broken up into ideas. There are two main reasons for this: First, he notes, ideas are standardly treated as particular entities that are repeatedly revived across time: My idea of “blue” is the same entity now as it was 5 years ago. In contrast, James notes that the totality of our conscious state is always varied. Some of these differences come from external conditions, such as the current illumination of a blue object, or different sounds present, temperatures, and so on. Other differences come internally, including particular moods, varying emotional significance to a particular object, and previous thoughts fading away. He even suggests that organic differences in the brain, like blood flow, might influence our experience of some thought at different times.

His second concern is that consciousness does not present breaks, as one would expect when transitioning between discrete ideas. Rather, consciousness is continuous. Thoughts arise and fade, but they overlap, sometimes attitudes persist in the background, and he insists there is always a feeling present, even if some are transient and difficult to name. Thus, he prefers the term “streams of consciousness” to “trains of thought.”

The association of ideas presents a false view because conscious states are not discrete, and they are never revived in exactly the same way. Both mistakes share one major cause: the fact that we name and identify representational states by the objects that they represent. It is the common referent in the world that makes us think that the idea itself is the same each time, ignoring the nuance of particular experiences. Similarly, we focus on these ideas, ignoring the feelings that bridge them and persist through them. Thus, these problems are solved by shifting to association of objects. This, however, is just a description of the stream of consciousness, and cannot explain it.

James believes that looking at association as a brain process can explain the streams of thought while still respecting the nuances of consciousness just discussed. This claim depends on his view of habit, which he treats as a physiological, even generally physical, fact (chapter IV). Actions often repeated become easier. He explains that channels for nerve discharge become worn with use, just as a path is worn with use, or a paper creased in folding.

Thus, brain processes become associated in the sense that processes frequently repeated in sequence will tend to come in sequence. At any given moment, there are many processes operating behind a particular conscious state: Some processes will have to do with a thought we are considering, some with moods, some with emotional states, and some with ongoing perception as we think. Each of these will, in some way, contribute to the set of thoughts and feelings that come next. This, James held, could explain the various, multifaceted sequences of thought. The various feelings present are not literal “parts” of any conscious state, as in the common associationist picture of complex ideas. Even so, different feelings can potentially influence the direction of the stream of consciousness at any given point because each is attended by brain processes which are separable, and which actually direct the stream. This also allows active processes, like attention and interest, to contribute to guiding the stream of consciousness, even if they are, in effect, operating through habit.

A natural question would be how we know which of any candidate set of thoughts will come next. James discusses some factors much like Brown’s “secondary laws” above, including interest, recency, vividness, and emotional congruity. This is the question taken up by Mary Whiton Calkins.

d. Mary Whiton Calkins (1863-1930)

Mary Whiton Calkins was both the first woman president of the American Psychological Association and the first woman president of the American Philosophical Association. She was a student of James, and despite his enthusiastic support, she was refused her PhD from Harvard because of her gender. This did not prevent her from an influential career and many years as a faculty member at Wellesley College. Her description of association in her textbook (1901) largely follows James’s. However, Calkins was much more interested in experimental methods than he was.

She was particularly interested in the question, “What one of the numberless images which might conceivably follow upon the present percept or image will actually be associated with it?” (1896, pg. 32), taking this to be the key to making concrete predictions about the stream of consciousness, and perhaps even to controlling problematic sequences. In so doing, she targets what had elsewhere been called the secondary laws: frequency, vividness, recency, and primacy. In a paired-associate memory task, she finds frequency to be by far the most significant factor. She finds this surprising, as she takes introspection to indicate that recency and vividness are just as important. She sees this result as significant for training and correcting associative sequences.

e. Sigmund Freud (1856-1939)

Sigmund Freud’s relationship to associationism is most evident in two aspects of his work. First, Freud outlined a thoroughly associationist picture of the mind and brain in his early and unpublished Project for a Scientific Psychology (written 1895, published posthumously in 1950). Second is his invention of the method of free association.

In the Project, Freud conceives of the nervous system as a network of discrete, but contacting, neurons, through which flows a nervous energy he calls “Q.” As neurons become “cathected” (filled) with Q, they eventually discharge to the next downstream neurons. The ultimate discharge of Q results in motor movements, which is how we actually release Q energy. In the central neurons, responsible for memory and thought, there is a resistance at the contact barrier. There is no such resistance at the barriers of sensory neurons. Learning occurs because frequent movements of Q through a barrier will lower its resistance. He identifies this as association by contiguity (pg. 319). Thus, the neurophysiological picture is also a psychological picture, and these basic processes are associative.
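The idea that repeated passage of Q lowers a contact barrier’s resistance can be illustrated with a toy model. This is only a sketch under stated assumptions: the numerical units, the rule that only the excess of Q over the resistance passes, and the proportional `facilitation` rate are all hypothetical choices made here and are not taken from the Project.

```python
from dataclasses import dataclass


@dataclass
class ContactBarrier:
    """Toy contact barrier between two central neurons (hypothetical units)."""

    resistance: float
    facilitation: float = 0.1  # assumed fraction of resistance lost per passage
    passages: int = 0

    def transmit(self, q):
        """Return the amount of Q that gets through; each passage lowers resistance."""
        if q <= self.resistance:
            return 0.0  # the barrier holds, and nothing is learned
        passed = q - self.resistance
        self.resistance *= 1.0 - self.facilitation
        self.passages += 1
        return passed


barrier = ContactBarrier(resistance=5.0)

# The same quantity of Q arrives five times in succession.
passed_history = [barrier.transmit(6.0) for _ in range(5)]

print([round(p, 2) for p in passed_history])
print(f"resistance after {barrier.passages} passages: {barrier.resistance:.2f}")
```

In this toy version, the same input passes more easily on each repetition, which is the sense in which frequent co-activation builds an association by contiguity.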

In addition, Freud adds two other systems. First is a class of neurons that respond to the period of activity in other neurons. These are able to track which perceptions are real and which are fantasy or hallucination, because stimuli coming in through the senses have characteristic periods. Second is the ego. In this work, the ego is simply a pattern of Q levels distributed across the entire network. By maintaining this distribution, the ego prevents any one neuron or area from becoming too heavily cathected with Q, which would result in erratic thought and action because of the resulting violent discharge. The role of the ego is thus inhibitory. Together, these additional systems control the underlying associative processes in ways that allow rational thought.

Freud never published this work and abandoned most of the details. Nonetheless, it arguably previews the basic underlying theories of much of his later work (as noted by the editor of the standard edition of Freud [Vol. 1, pp. 290-292] and Kitcher 1992; see Sulloway 1979 for discussion of continental associationist influences on Freud). The thinking would go that breakdowns in rationality, as in dreaming or pathology, come when basic processes like association operate uncontrolled.

Regardless of exactly how it fits in his overall theoretical framework, his invention of the method of free association deserves note as well. Freud began using free association in the 1890s as an alternative to hypnosis. The patient would lie in a relaxed but waking state and simply discuss thoughts as they came freely to mind. The therapist would then analyze the sequence of thoughts and attempt to determine what unconscious thoughts or desires might be directing them. In later versions, patients are asked to keep in mind a starting point of interest or are presented a particular word or image to respond to. Free association was massively influential, and it remains the core psychoanalytic method (and has also been used in mapping semantic networks; see section 4). It also takes associative processes to operate in the unconscious, another view that would be revived later (see section 5).

f. G. F. Stout (1860-1944)

G. F. Stout continues the trend of criticizing associationism while allowing a significant role for association in his Manual of Psychology (1899). A prominent British philosopher and psychologist at the turn of the century, Stout taught, at different times, at Cambridge (including students G. E. Moore and Bertrand Russell), Aberdeen, Oxford, and St. Andrews. He accepts association as a valuable story for the reproduction of particular elements of consciousness, but he argues that there is an independent capacity for generating new elements. He specifically attacks John Stuart Mill and his analogy of mental chemistry (1899, book I, chapter III). According to Stout, Mill was right that complex ideas are not mere aggregates of simple ideas, but failed to recognize that this means that a new idea must be generated: The new idea had aggregates of associated simple ideas as precursors, not as parts—previewing the work of the Gestalt psychologists. He claims that Mill’s attempt to include the simple ideas in complex ideas as in chemical combination is a desperate attempt to save the theory from a fatal flaw.

Stout does grant association a significant part in the reproduction of ideas in the train of thought. There, as well, he provides a novel interpretation (book IV, chapter II). Specifically, he argues that association by contiguity should be rephrased as “contiguity of interest.” This means that only those elements that are interesting—at the time, based on goals, intentions, and other states—will be associated, and uninteresting elements will be dropped. He takes this to be the sole law of association. Apparent associations by similarity are in fact associations by contiguity of interest, because similar objects will have some aspects that are identical, and these aspects drive the suggestion. He also addresses the question of which of several competing associations will actually lead thought. He mentions Brown’s secondary laws as factors, but he takes the most important to be the “total mental state,” or the “general trend of psychical activity,” such that factors like intentions or background desires are usually decisive.

Finally, he argues that the process of ideational construction is active at all times and does not merely generate new ideas. It also modifies ideas as they are revived. Ideas take on new relations to other ideas. They may be seen in a different light, with different aspects emphasized based on differences in context, as well as in mental state and interests. Ideas are, in a real sense, remade as they are revived.

g. Themes and Lessons

The proliferation of interpretations of association through this era demonstrates the decline of the pure empiricist versions of the view. Nonetheless, the empiricist conception remains prominent. Authors who disavow that position still hold views substantially similar to it. Those working to refine the concept are still working from an empiricist starting point: Associations hold between conscious states, and contiguity and similarity remain the most common laws of association. Compared with the associationists described in the previous section, the views in this section are more diverse, but the difference is one of degree rather than of kind.

Nonetheless, these authors do not proclaim their adherence to associationism, and many expressly disavow it. Worries about the theory itself center on its atomism—treating simple ideas as discrete indivisible units that are reified in thought—and its passive, mechanical depiction of mind. More general trends include increasing knowledge in related fields such as evolutionary theory, neurophysiology, and experimental psychology. Evolutionary theory poses a challenge to associationist empiricism, as it allows a mechanism for innate ideas. Neurophysiology and experimental psychology both contributed to the fracturing of associationism, partly because progress on each came at the time from the continent, where there was less interest in a general associationist picture than in the United Kingdom. Nonetheless, each development supported a role for association. At least superficially, the network of neural connections looks a lot like the network of associated ideas. And associations make good experimental targets because they are easy to induce and test.

It does not seem that associationism must stand or fall with any of these challenges or developments singly, as there are views broadly consistent with each in the previous section. Rather, these problems persisted and accumulated at the same time as new ideas from other fields allowed researchers to step out of the old paradigm and cast about for new formulations of the old idea. The general picture, then, is of a concept losing its role as the single core concept of psychology and philosophy of mind, but nonetheless retaining several important roles. The development that finally brought this particular associationist tradition to an end, the rise of behaviorism, returned association to its central position.

3. Behaviorism (1910s-1950s)

Behaviorism arose in America as a reaction to the introspective methods that had dominated psychology to that point. Most of the authors listed above built their systems entirely from introspection. Even the experimentalists mostly recorded introspective reports, often using themselves as the only subject. The behaviorists did not see this as a reliable basis for a scientific psychology. Science, as they saw it, only succeeded when it studied public, observable phenomena that could be recorded, measured, and independently verified. Introspection is a private process, which is not independently verifiable or objectively measurable.

The result of adopting this viewpoint was a complete change in the conceptual basis of psychology, as well as in its methodology and theory. Behaviorists abandoned concepts like “ideas” and “feelings,” and the notion that the stream of consciousness was the primary phenomenon of psychology. Some even denied the phenomenon of consciousness itself. What they did not abandon, however, was the concept of association. In fact, association regained its role as the central concept of psychology, now reimagined as a relation between external stimuli and responses rather than internal conscious states. Even the law of association by contiguity was co-opted.

a. Precursors: Pavlov, Thorndike, and Morgan

Ivan Pavlov’s (1849-1936) famous work provided what would be a core phenomenon and some of the basic language of the behaviorists. Pavlov (1902) was interested in the physiology of the digestive system of dogs and the particular stimuli which elicit salivation. In the course of his studies, he observed that salivation would occur as the attendant who usually fed the animal approached. He noted a difference between “unconditional reflex,” as when salivation occurs due to a taste stimulus, and a “conditional reflex,” as when salivation occurs due to the approaching attendant (1902, pg. 84). Pavlov was able to show that a stimulus as arbitrary as a musical note or a bright color could cause salivation if paired frequently with food. He notes that the effect is only caused when the animal is hungry, and that it seems important that the unconditional reflex is tied to a basic life process. His account of the phenomenon is characteristically physiological:

It would appear as if the salivary centre, when thrown into action by the simple reflex, became a point of attraction for influences from all organs and regions of the body specifically excited by other qualities of the object. (pg. 86)

This phenomenon came to be known as “classical conditioning.” As Pavlov presciently remarks: “An immeasurably wide field for new investigation is opened up before us” (pg. 85). In subsequent work, Pavlov (1927) further explores these processes, including inhibitory processes such as extinction, conditioned inhibition, and delay.

Edward Thorndike (1874-1949) explicitly targeted the processes of association in animals (1898). He laments that existing work tells us that a cat will associate hearing the phrase “kitty kitty” with milk, but does not tell us the actual sequence of associated thoughts, or “what real mental content is present” (pp. 1-2). To test this objectively, he placed animals in a series of puzzle boxes with food visible outside. Most were cats, but he also experimented with dogs and chicks. Escape, and thus food, required unlocking the door using one or more actions such as pulling a string, pressing a lever, or depressing a paddle. If they did not escape within a certain time limit, they would be removed without food.

As Thorndike describes it, animals placed in the box first perform “instinctive” actions like clawing at the bars and attempting to squeeze through the gaps. Eventually, the animal will happen upon the actual mechanism and accidentally manipulate it. Once some action is successful, the animal will associate it with the stimulus of the inside of the box. This association gradually strengthens with repeated presentation, as shown by learning curves of animals escaping more rapidly over successive trials. This form of learning came to be known as operant, or instrumental, conditioning. He argues that this must be explained with associations between an idea or sense impression and an impulse to a particular action, rather than the “association of ideas,” as ideas themselves are inert (pg. 71). He expresses the belief that animals have conscious ideas but remains officially agnostic, and he emphasizes that humans are not merely animals plus reason; human associations are different from animal associations as well. Thus, he arrives at the basic idea that he later restated under the name “the law of effect”:

Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond. (1911, pg. 244)

While the name “law of effect” has stuck, it is worth noting that in his dissertation (1898) and his textbook (1905, pp. 199-203), Thorndike simply calls it the “law of association.”

Lloyd Morgan (1852-1936) also discusses “the association of ideas” in nonhuman animals. However, his most significant contribution to the use of the concept is indirect, through a methodological principle that came to be known as his “Canon”:

In no case may we interpret an action as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale. (Morgan 1894, pg. 53)

The behaviorists took Morgan’s Canon to encourage positing minimal mental processes. More generally, associative processes are usually thought to be among the “lowest,” or “simplest,” processes available. This means that an associative explanation will be preferred until it can be ruled out, a practice that remains in place (see sections 4 and 5).

b. John B. Watson (1878-1958)

Watson rang in the behaviorist era with his paper “Psychology as the Behaviorist Views It” (1913). In that work, he attacks the introspective method and claims about conscious feelings or thoughts. As he develops the view (1924/1930), he says that all of psychology can be reframed in terms of stimulus and response. The connection between them is a “reflex arc” of neural connections running from the sense organ to the muscles and glands necessary for a response. Watson thus identifies each stimulus with specific physical features, and each response with specific physiological changes or movements. This came to be known, following Tolman (1932), as the “molecular” definition of behavior, distinct from the “molar” definition, which characterizes behaviors more abstractly, either purposively (intentionally) or as a pattern of specific excitations and movements.

Watson applies the same system to humans and to nonhuman animals. He takes infants to be born with only a small stock of simple reflexes, or “unconditioned” stimulus-response pairs—nothing that could properly be called instinct. These basic reflex patterns are modified by conditioning. In conditioning, the new conditioned stimulus either “replaces” the original unconditioned stimulus as a cause of the response, like the musical notes in Pavlov’s experiments, or a new response is conditioned to an existing stimulus, as when one becomes afraid of a dog that had been previously seen as friendly. As these conditioned changes compound, stimulus-response sets can be coordinated in the ways that allow sophisticated behaviors in humans. He backs this up using experiments with infants, such as his ethically fraught Little Albert experiment: Watson conditioned a fear response to a white rat in 11-month-old Albert by making a loud noise every time the rat was produced (1924/1930, pp. 158-164).

Though Watson does not cast his own view in associative terms, his stimulus-response psychology effectively places association back at the center of psychology, and offhand references to association suggest he recognizes some connection. Even setting aside the specific points that S-R connections operate like associations, and classical conditioning like association by contiguity, Watson’s behaviorism shares with associationism an empiricist, anti-nativist orientation and an ideal of basing psychology on a single principle.

c. Edward S. Robinson (1893-1937)

Edward S. Robinson’s work Association Theory To-Day (1932) argues that associations themselves are the same in both behaviorism and the older associationist tradition. The difference is what answer one gives to the question, “What is associated?” Associationism had been rejected in large part because it was taken to be a relation between mentalistic ideas. Robinson takes this to be unfair, pointing to the diversity of views in earlier associationists. Robinson was far from the first to note the role of association in behaviorism (the earliest paper he cites as arguing along these lines is Hunter 1917; see also Guthrie 1930, discussed below), but he presents a systematic attempt to import previously existing associationist machinery to behaviorism.

An association is still an association, according to Robinson, whether it holds between ideas, stimuli and responses, or neural pathways. He adopts the generic term “psychological activities” to capture all of these, saying that association is a disposition of some activities to instigate particular others. He tentatively adopts a “molar” view of psychological activities over Watson’s molecular view because he doesn’t think existing research has actually shown associations between particular physiological activities. Thus, he argues that the relevant activities must be described at a more abstract level. Robinson does rely on behavioral evidence but does not proclaim the behaviorist rejection of all mentalistic postulates. He takes it to be an open empirical question which activities will be associated in the most effective version of the theory.

Robinson goes on to discuss several laws of association, describing how each should be viewed and summarizing relevant experimental findings. Contiguity, the first, is apparent in conditioning. He attributes the second, assimilation, to Thorndike’s observation that a person will give the same response when presented with sufficiently similar situations (pp. 81-82). Robinson denies this is the same as association by similarity proper, but it is the same basic role Bain gives similarity. Others include frequency, duration, context, acquaintance, composition, and individual differences. He takes the actual associative strength to be a sum of all of these features, lamenting the overemphasis on contiguity itself.

d. B. F. Skinner (1904-1990)

Skinner, like Watson, does not frame his understanding of behaviorism in terms of association. Nonetheless, his work is noteworthy for placing reinforcement at the center of learning. The focus here is on his early career. Skinner studied operant conditioning using an apparatus in which a rat would press a lever to receive food. The food, in this case, reinforces the action of pressing the lever. In Skinner’s view, reinforcement is necessary for operant learning. While this basic idea was known as part of Thorndike’s law of effect, it was not widely believed that effects could reinforce behavioral causes until Skinner. He went on to study reinforcement itself, especially the effects of various schedules of reinforcement (1938).

Skinner differentiated operant conditioning from Pavlovian, or classical, conditioning based on the sequences of stimulus/response (1935). Operant conditioning requires a four-step chain involving two reflexes: from a stimulus (sight of the lever) to an action (pressing the lever), which then causes another stimulus (food, the reinforcer), which leads to a final action (eating/salivating). In Pavlovian-style experiments, a stimulus (for example, a light) switches from triggering an arbitrary reflex (such as orienting towards the light) to triggering a reflex relevant to the reinforcer (such as salivation if food is the reinforcer). Reinforcement is necessary for both; it simply plays a different role. Thinking in associative terms, different types of conditioning are differentiable by the structure of their associations. But this again modifies the conception of the process of association. Simple contiguity is not enough; one of the stimuli involved must also play the role of reinforcer.

Later, Skinner abandoned the stimulus-response framing of operant conditioning, arguing that the action (lever press) need not be viewed as a direct response to a stimulus (seeing the lever). To explain behavior in such a case, one must look back to the history of reinforcement, rather than any particular eliciting stimulus (1978). Skinner generally opposed private mentalistic posits, but his views on this were not always clear or consistent. He did, like Watson, treat behavior as the only legitimate target of study, retain a generally empiricist picture of mind, and take the view to apply generally. He was able to show that “shaping” techniques based on operant conditioning could train animals to complete sophisticated tasks, and he took this to apply to humans as well (1953), including with regard to language (1957) and even society (1976).

e. Edwin Guthrie (1886-1959)

Edwin Guthrie argues that the core phenomenon of conditioning is just association by contiguity, which he views as the single principle of learning. He states the principle as such: “Stimuli acting at a given instant tend to acquire some effectiveness toward the eliciting of concurrent responses, and this effectiveness tends to last indefinitely” (1930, pg. 416). He goes on to argue that various empirical phenomena of learning, including even forgetting and insight, “may all be understood as instances of a very simple and very familiar principle, the ancient principle of association by contiguity in time” (1930, pg. 428). He later builds on this conception by arguing that stimuli to which animals pay attention will become associated. He takes this to be the actual action by which reinforcers work, dissatisfied by Skinner’s seemingly circular definition of the term “reinforcer.” He presents the new version in simplified form as follows: “What is being noticed becomes a signal for what is being done” (1959, pg. 186).

Guthrie takes the focus on behavior to be an abstraction intended to make psychology empirically tractable, in the same way that physics models frictionless planes. As such, his behaviorism could be seen as less extreme than Watson’s or Skinner’s, but perhaps more so than Robinson’s.

f. Themes and Lessons

Across behaviorist views, association remains the core concept. As in the previous section, though, some authors explicitly take on the associationist mantle while others ignore it. Also as above, there is a diversity in views on the actual structure of associations, how they develop, and what is taken to be associated. Skinner (1945) captured perhaps the largest division: that between the radical behaviorists and the methodological behaviorists. This division is easily cast in terms of their views on association. The radical behaviorists, exemplified by Watson and Skinner, aim to eliminate mentalistic concepts; association can allow this, via the minimal connection between stimulus and response. The methodological behaviorists, exemplified here by Guthrie and Robinson, take the emphasis on behavior to be a methodological abstraction or simplification necessary for scientific progress. By implication, association itself is an abstract relation, which in principle can subsume various possible mechanisms, rather than excluding them.

4. After the Cognitive Revolution (1950s-2000s)

As cognitivism came to dominate in the mid-twentieth century, association took up various roles in different literatures. The rise of cognitivism brought two key changes in psychology generally. First, internal mental states returned. However, these states were generally viewed as functionally defined representational states rather than as imagistic ideas, as in the empiricist associationists. Second, cognitivism views the mind in broadly computational terms. Cognitivists take many psychological processes, called “cognitive processes,” to be algorithms that operate by applying formal rules to symbolic representational states, perhaps in a manner similar to language. Cognitive processes are often contrasted with associative processes, setting up a general view in which association is one kind of psychological process among many. Association is thought to be limited, in particular, because it is too simple to account for complex, rational thought (see Dacey 2019a). Learning by contiguity cannot differentiate which experienced sequences reflect real-world relations and which are mere accidents. Associative sequences in thought do not allow flexible application; they must be rigidly followed. Thus, associative processes are usually posited in simpler systems, like nonhuman animals, or the human unconscious. However, as connectionist computational strategies began to bear fruit, some treated these as a new, revitalized form of general associationism.

This section discusses three research programs that each treat associations in different ways and collectively capture the main threads of late twentieth- and early twenty-first-century thought on association.

a. Semantic Networks

The first program represents semantic memory—memory for facts—as a network of linked concepts. Retrieval or recall of information in such a model is described by activation spreading through this network. When activation reaches some critical level, the information is retrieved and available for use or report. This program got its formal start in the late 1960s with work by Ross Quillian and Allen Collins (Collins and Quillian 1969), and subsequently John R. Anderson (1974) and Elizabeth Loftus (Collins and Loftus 1975). The general idea is that different patterns of association explain facts about information retrieval, such as when it succeeds or fails, and how long it takes. John Anderson generalized the basic idea as part of his Human Associative Memory (HAM) model (Anderson and Bower 1973) and Adaptive Control of Thought (ACT) model and its descendants (Anderson 1996). In more specific circumstances, this basic strategy has been applied in a number of phenomena where information is accessed automatically, including: cued recall, priming (McNamara 2005), word association task responses, false memory (Gallo 2013), reading comprehension (Ericsson and Kintsch 1995), creativity (Runco 2014), and implicit social bias (Fazio 2007, Gawronski and Bodenhausen 2006; see also section 5).
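The retrieval story just described can be made concrete with a minimal sketch. This is an illustration only, not any published model: the network, the link weights, the decay factor, and the retrieval threshold are all hypothetical values chosen for the example. Activation starts at a cued concept, spreads along weighted links, and a concept counts as retrieved once its activation reaches the threshold.

```python
from collections import defaultdict


def spread_activation(links, sources, decay=0.5, threshold=0.4, steps=2):
    """Spread activation from the source concepts through a weighted network.

    links maps each concept to a dict of {neighbor: link weight}.
    All numeric parameters are illustrative assumptions.
    Returns (activation levels, set of retrieved concepts).
    """
    activation = defaultdict(float)
    frontier = {}
    for node in sources:
        activation[node] = 1.0
        frontier[node] = 1.0

    for _ in range(steps):
        incoming = defaultdict(float)
        # Each active concept passes a decayed share of its activation
        # along every outgoing link.
        for node, level in frontier.items():
            for neighbor, weight in links.get(node, {}).items():
                incoming[neighbor] += level * weight * decay
        for node, gain in incoming.items():
            activation[node] += gain
        frontier = incoming

    # A concept is "retrieved" once activation reaches the critical level.
    retrieved = {node for node, level in activation.items() if level >= threshold}
    return dict(activation), retrieved


# Hypothetical toy network.
LINKS = {
    "canary": {"bird": 0.9, "yellow": 0.9},
    "bird": {"animal": 0.8, "wings": 0.7},
}

activation, retrieved = spread_activation(LINKS, ["canary"])
print(sorted(retrieved))
print(round(activation["animal"], 3))
```

In this toy run, the concepts directly linked to the cue cross the threshold, while more distant concepts receive some activation but stay below it. Differences in link pattern and distance are the kind of thing such models use to explain when retrieval succeeds and how long it takes.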

Spreading activation in a network manifests one side of the standard associative story. The difference from previous traditions is that associations relate concepts or propositions, and these networks usually include a possibility of subcritical activation of a concept that can facilitate later retrieval. These models rarely say anything explicitly about learning, but they sometimes carry implications for learning. Often, links are not taken to represent any particular relation, signifying only the disposition to spread activation. This is taken to indicate that the links are learned through a process like association by contiguity, which cannot encode meaningful real-world information. However, sometimes links are labeled with a meaningful relationship between concepts, which would imply a learning process capable of tracking that relation. In addition, some models that emerged out of related research, such as Latent Semantic Analysis (LSA) (Landauer and Dumais 1997) and Bound Encoding of the Aggregate Language Environment (BEAGLE) (Jones and Mewhort 2007), extract semantic information (for example, semantic similarity) about words in a linguistic corpus based on clustering patterns with other words.

b. Associative Learning and the Rescorla-Wagner Model

Work on learning proceeded largely separately from the work on semantic networks just described. After the cognitive revolution, conditioning effects remained a representative phenomenon of basic learning processes. They were, again, re-described. Since the associations were taken to be formed between internal mental representations, conditioning was subsumed under the heading of “contingency learning” or “associative learning”: the learning of relations between events that tend to co-occur. “Associative learning” is sometimes used in this literature to refer to this phenomenon, regardless of what mechanism is taken to produce it. In this literature, human and nonhuman animal research have long informed one another. However, the orientation can depend on the subjects. It has long been accepted that humans have complex cognitive processes running in parallel with any simple associative processes (Shanks 2007). The question in the human literature is often whether purely associative models can explain any human learning. Research on animal minds is still heavily influenced by Morgan’s Canon (section 3.a). As a result, associative explanations have been heavily favored. Thus, the question is often whether nonhuman animals have any processes that cannot be described in associative terms.

The Rescorla-Wagner model (1972) has dominated much of this research, either by itself or through its various modifications and descendants. This model includes a “prediction” that is made when the antecedent cue is produced. Associative strength is either increased or decreased based on whether that prediction is borne out. For instance, if an animal has a strong association between a cue and a target, the animal will expect the target once the cue is presented. If the target does not follow, the associative strength is reduced. This presents a different conception of association from those encountered so far, as a prediction-error process, contrasted with the footpath notion of contiguity and with reinforcement (Rescorla 1988; see also Danks 2014, pg. 20, arguing that the prediction itself is not usually taken realistically). It also makes the Rescorla-Wagner model more successful at predicting various phenomena in contingency learning than previous conceptions of association. For instance, it predicts the fact that existing associations can block new associations from forming (Miller, Barnet, and Grahame 1995). The computational precision and simplicity of associative models like the Rescorla-Wagner model are a major draw, and they have been further supported by neural evidence of prediction-error tracking in the brain (Schultz, Dayan, and Montague 1997).
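The prediction-error idea can be sketched in a few lines. The standard form of the update changes each present cue's strength V by alpha * beta * (lambda - sum of V over all present cues), where lambda is the maximum strength the outcome supports (zero when the outcome is absent). The sketch below is a simplification under stated assumptions: it uses a single learning rate for every cue, and the parameter values and trial sequences are hypothetical.

```python
def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam_present=1.0):
    """Run the Rescorla-Wagner update over a sequence of trials.

    trials is a list of (cues, outcome_present) pairs, where cues is a
    tuple of cue names. A single alpha for all cues is a simplifying
    assumption; parameter values are illustrative.
    Returns a dict of associative strengths per cue.
    """
    strengths = {}
    for cues, outcome_present in trials:
        lam = lam_present if outcome_present else 0.0
        # The "prediction" is the summed strength of all cues present.
        prediction = sum(strengths.get(cue, 0.0) for cue in cues)
        error = lam - prediction
        # Every present cue is adjusted by the same shared error term.
        for cue in cues:
            strengths[cue] = strengths.get(cue, 0.0) + alpha * beta * error
    return strengths


# Hypothetical trial sequences.
acquisition = [(("A",), True)] * 20

# Blocking: A is trained alone first, then A and B are presented together.
blocked = rescorla_wagner(acquisition + [(("A", "B"), True)] * 20)

# Control: A and B are only ever presented together.
control = rescorla_wagner([(("A", "B"), True)] * 20)

# Extinction: after training, A is presented without the outcome.
extinguished = rescorla_wagner(acquisition + [(("A",), False)] * 20)

print(round(blocked["B"], 4), round(control["B"], 4))
print(round(extinguished["A"], 4))
```

Because A already predicts the outcome almost perfectly by the time B appears, the error term is close to zero and B gains very little strength, whereas in the control condition B ends up sharing the available strength with A. The extinction run shows strength falling when the predicted target does not follow.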

However, one can also complicate models like this in various ways. Some models allow interactions between existing associations during learning (Dickinson 2001). Others allow interactions between association and other processes, like attention or background knowledge (Pearce and Mackintosh 2010, Dickinson 2012, Thorwart and Livesey 2016). Finally, one can also model interference between associations at retrieval, as in the SOCR (Sometimes-Competing Retrieval) model (Stout and Miller 2007).

Even with these complicated types of models, critics have argued that simple associative stories cannot capture the complexity of associative learning. For instance, some argue that the processes responsible for human associative learning must be propositional (Mitchell, De Houwer, and Lovibond 2009). Gallistel has been perhaps the most prominent opponent of associative theories of learning in animals generally, arguing that the processes responsible must be symbolic (Gallistel 1990, Gallistel and Gibbon 2002).

c. Connectionism

The arrival of connectionism as a major theory of mind in the 1980s was hailed as a revolution by many of its proponents (Rumelhart, McClelland, and PDP research group 1986). Connectionist models perform especially well in various kinds of categorization tasks. They are a kind of spreading activation model in which activation spreads through sequential layers of nodes. Though there were important precursors, especially Hebb (1949) and Rosenblatt (1962), connectionism came into its own when new techniques allowed much more computationally powerful three-layer networks. These networks include a “hidden” layer between “input” and “output” layers. The revolutionary claims of connectionism are usually based on the idea that the hidden layer represents information in a distributed manner, as a pattern of activation across multiple nodes. Thus, nodes are treated as “subrepresentational” units of information that also presumably correspond to something in the brain, such as neurons, sets or assemblies of neurons, or brain regions (Smolensky 1988). This is also thought to be a realistic view of representation in the brain, which is likely distributed. Unlike the other research programs discussed in this section, which take association to describe one kind of processing among many, connectionism, at least initially, purported to provide a general model of mind.
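A small sketch may help show what a hidden layer adds. The example below is illustrative only: the weights are set by hand rather than learned, the threshold activation function is a simplification, and the choice of exclusive-or (XOR) as the task is an assumption made for the example. XOR is a standard case of a mapping that a threshold network with no hidden layer cannot compute.

```python
def step(z):
    """Threshold activation: a node fires (1) only if its net input is positive."""
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """Activate one layer; each row of weights feeds one node in that layer."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


# Hand-set, hypothetical weights (not learned).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # two hidden nodes, each sees both inputs
HIDDEN_B = [-0.5, -1.5]              # first fires if any input is on, second if both are
OUTPUT_W = [[1.0, -2.0]]             # one output node reading both hidden nodes
OUTPUT_B = [-0.5]


def forward(inputs):
    """Pass activation from the input layer through the hidden layer to the output."""
    hidden = layer(inputs, HIDDEN_W, HIDDEN_B)
    output = layer(hidden, OUTPUT_W, OUTPUT_B)
    return hidden, output[0]


for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    hidden, output = forward(pattern)
    print(pattern, hidden, output)
```

In this sketch, neither hidden node on its own stands for the answer; the output depends on the pattern of activation across both of them, which is a toy version of the distributed representation described above.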

Connectionism has been treated as a version of associationism by both proponents (Bechtel and Abrahamsen 1991, Clark 1993) and opponents (Fodor and Pylyshyn 1988). This is because it implements a kind of spreading activation and because connectionist networks are able to learn, something symbolic systems struggle with. While the emphasis on learning aligns with a generally empiricist approach, the specific mechanism matters for what, exactly, to make of this. Perhaps the most common process, backpropagation, is not usually thought to be biologically realistic. Another common process, Hebbian learning, implements a version of association by contiguity (Hebb 1949). This is treated as more biologically plausible, but models implementing it are less powerful.
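The link between Hebbian learning and association by contiguity can be seen in the simplest form of the rule, in which the weight between two units grows in proportion to the product of their activations. The sketch below assumes that basic form; the learning rate, the unit counts, and the training pattern are hypothetical.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Return new weights after one application of a basic Hebbian rule.

    weights[i][j] connects presynaptic unit i to postsynaptic unit j and
    grows by rate * pre[i] * post[j], so it changes only when both units
    are active at the same time. The rate is an illustrative assumption.
    """
    return [
        [w + rate * pre[i] * post[j] for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


# Two input units and one output unit, starting with no connections.
weights = [[0.0], [0.0]]

# Only the first input is repeatedly active together with the output.
for _ in range(5):
    weights = hebbian_update(weights, pre=[1, 0], post=[1])

print(weights)
```

Only the connection between units that were active together is strengthened, and it is strengthened once per co-occurrence. No error signal or prediction is involved, which is one way to see why such models are simpler than error-driven learning.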

These networks modify the treatment of association by providing another set of answers to the question of what is associated. In this case, it is subrepresentational units or parts of the brain. While neural level stories have attended association throughout its history (see above sections on Hartley, Freud, and Watson; see also Sutton 1998 for discussion of similarities between connectionism and these historical views), they are usually secondary to a psychological-level story. Connectionists, in contrast, actually attempt to model neural-level phenomena.

In many networks, the number of hidden-layer nodes is chosen somewhat arbitrarily, and the network is tuned in whatever way gets the input-output mappings right. The question of what each node might represent in the brain is secondary, complicating their interpretation as actual models of the mind/brain. Arguably, later work during this period split between two approaches. Many researchers simply explore the framework as a computational tool, up to and including deep learning. These researchers are not primarily concerned with accurate modeling of brain processes, though they may view their models as “how-possibly” models (see Buckner 2018 for such a discussion of deep learning models and abstraction). Computational neuroscientists, on the other hand, generally start with neural information like single unit recordings, and model specific neural circuits, networks, or regions.

5. Ongoing Philosophical Discussion (2000s-2020s)

This section briefly surveys two debates that brought the concept of association back under philosophical scrutiny. These debates take place largely in the frameworks outlined in the last section.

a. Dual-Process Theories and Implicit Bias

One of the most philosophically important implications of early twenty-first-century work in psychology, especially social psychology, was the finding that much of our behavior is driven, or heavily influenced, by unconscious processes. Theorists generally captured these findings with Dual-Process theories, which separate the mind into two systems or processing types. Type 1 processing is fast, effortless, uncontrolled, and unconscious, while Type 2 processing is slow, effortful, controlled, and conscious. Association is often considered to be one among several Type 1 processes, but Type 1 processing is also sometimes treated as associative in general (Kahneman 2011; Uhlmann, Poehlman, and Nosek 2012). This stronger claim is controversial (Mandelbaum 2016), but it is often implicit in discussions of unconscious processing.

The conception of association involved largely stems from the semantic network program described above. These authors, however, tend to emphasize the simplicity of associative processing, and so take on board an associative account of learning as well. Thus, at stake is not just how one thinks about the mechanisms of unconscious processing, but how they relate to one’s agency and responsibility. It is often thought that unconscious processes cannot produce responsible action because they are associative and, as such, too inflexible (Levy 2014). How one understands and attributes associative models and associative processes is, as a result, significant for the conclusions one draws from this work (Dacey 2019b).

b. The Association/Cognition Distinction

The second discussion has occurred in relation to work in comparative animal psychology. In that literature, many debates are centered on whether the process responsible is associative or cognitive, with association gaining a default status due to Morgan’s Canon. As a result, associative processes are usually thought to be ubiquitous and sometimes can even potentially explain seemingly complex behavior (see Heyes 2012). Some authors have attacked the associative or cognitive framing as unproductive (Buckner 2011, Smith, Couchman, and Beran 2014, Dacey 2016). It remains an empirical question whether psychological processes cluster in ways that support a distinction between associative and cognitive processes. Nonetheless, there are reasons to reframe associative models as operating at either a lower, neural level (Buckner 2017) or a higher, more abstract level (Dacey 2016). Either move would, in principle, allow associative models and cognitive models to be applied to the same process, dissolving the problematic dichotomy.

6. Conclusion

Association is one of the most enduring concepts in the history of theorizing about the mind because it is one of the most flexible and one of the most powerful. The basic phenomena seem clear and indisputable: Some thoughts follow easily in sequence, and frequency of repetition is one reason for this. The models that formalize and articulate this insight seem capable of capturing many psychological phenomena. What this means is disputed and much less clear. There are questions pertaining to the specific mechanisms behind these phenomena, how many phenomena can be explained in these terms, what the associations are, and what is associated. The various views discussed above present very different answers to these questions.

7. References and Further Reading

Anderson, J. R. (1974). Retrieval of Propositional Information from Long-Term Memory. Cognitive Psychology, 6(4), 451-474.
Anderson, J. R. (1996). ACT: A Simple Theory of Complex Cognition. American Psychologist, 51(4), 355.
Anderson, J. R., and Bower, G. H. (1973). Human Associative Memory. Washington, D.C.: V. H. Winston and Sons.
Aristotle (2001). Aristotle’s On the Soul and On Memory and Recollection. J. Sachs (Trans.). Santa Fe: Green Lion Press.
Bain, A. (1868). The Senses and the Intellect. 3rd ed. London: Longman’s, Green, and Co.
Bain, A. (1887). On ‘Association’-Controversies. Mind, 12(46), 161-182.
Bechtel, W., and Abrahamsen, A. (1991). Connectionism and the Mind: Parallel Processing, Dynamics, and Evolution in Networks. Oxford: Blackwell Publishing.
Brown, T. (1820). Lectures on the Philosophy of the Human Mind. Edinburgh: W. and C. Tait.
Buckner, C. (2011). Two Approaches to the Distinction between Cognition and ‘Mere Association’. International Journal of Comparative Psychology, 24(4).
Buckner, C. (2017). Understanding Associative and Cognitive Explanations in Comparative Psychology. The Routledge Handbook of Philosophy of Animal Minds. Oxford: Routledge, 409-419.
Buckner, C. (2018). Empiricism without Magic: Transformational Abstraction in Deep Convolutional Neural Networks. Synthese, 195(12), 5339-5372.
Calkins, M. W. (1896). Association (II.). Psychological Review, 3(1), 32.
Calkins, M. W. (1901). An Introduction to Psychology. London: The Macmillan Company.
Clark, A. (1993). Associative Engine: Connectionism, Concepts, and Representational Change. Cambridge MA: MIT Press.
Collins, A. M., and Loftus, E. F. (1975). A Spreading-Activation Theory of Semantic Processing. Psychological Review, 82(6), 407.
Collins, A. M., and Quillian, M. R. (1969). Retrieval Time From Semantic Memory. Journal of Verbal Learning and Verbal Behavior, 8(2), 240-247.
Dacey, M. (2015). Associationism without Associative Links: Thomas Brown and the Associationist Project. Studies in History and Philosophy of Science Part A, 54, 31–40.
Dacey, M. (2016). Rethinking Associations in Psychology. Synthese, 193(12), 3763-3786.
Dacey, M. (2019a). Simplicity and the Meaning of Mental Association. Erkenntnis, 84(6), 1207-1228.
Dacey, M. (2019b). Association and the Mechanisms of Priming. Journal of Cognitive Science, 20(3), 281-321.
Danks, D. (2014). Unifying the Mind: Cognitive Representations as Graphical Models. Cambridge, MA: MIT Press.
Dickinson, A. (2001). Causal Learning: An Associative Analysis. The Quarterly Journal of Experimental Psychology, 54B(1), 3-25.
Dickinson, A. (2012). Associative Learning and Animal Cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2733–2742.
Ebbinghaus, H. (1885/1913). Memory: A Contribution to Experimental Psychology.
Ericsson, K. A., and Kintsch, W. (1995). Long-Term Working Memory. Psychological Review, 102(2), 211.
Fazio, R. (2007). Attitudes as Object-Evaluation Associations of Varying Strength. Social Cognition, 25(5), 603–637.
Fodor, J. A. (1998). Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Fodor, J. A., and Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28(1-2), 3-71.
Freud, S. (1953-1964). The Standard Edition of the Complete Psychological Works of Sigmund Freud (J. Strachey and A. Freud Eds.), 24 vols. London: The Hogarth Press and the Institute of Psycho-Analysis.
Includes the Project for a Scientific Psychology in Volume 1.
Gallistel, C. R. (1990). The Organization of Learning. Cambridge, MA: The MIT Press.
Gallistel, C. R., and Gibbon, J. (2002). The Symbolic Foundations of Conditioned Behavior. n. p.: Psychology Press.
Gallo, D. (2013). Associative Illusions of Memory: False Memory Research in DRM and Related Tasks. n. p.: Psychology Press.
Galton, F. (1879). Psychometric Experiments. Brain, 2(2), 149-162.
Gawronski, B., and Bodenhausen, G. V. (2006). Associative and Propositional Processes in Evaluation: An Integrative Review of Implicit and Explicit Attitude Change. Psychological Bulletin, 132(5), 692.
Guthrie, E. R. (1930). Conditioning as a Principle of Learning. Psychological Review, 37(5), 412.
Guthrie, E. (1959). Association by Contiguity. In Psychology: A Study of a Science. Vol. 2: General Systematic Formulations, Learning, and Special Processes. S. Koch (ed.). New York: McGraw Hill Book Company.
Hartley, D. (1749/1966). Observations on Man. Gainesville, FL: Scholars’ Facsimiles and Reprints.
Hebb, D. O. (1949). The Organization of Behavior. New York: Wiley.
Heyes, C. (2012). Simple Minds: A Qualified Defence of Associative Learning. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2695-2703.
Hobbes, T. (1651/1991). Leviathan, R. Tuck (ed.). Cambridge: Cambridge University Press.
Hoeldtke, R. (1967). The History of Associationism and British Medical Psychology. Medical History, 11(1), 46-65.
A history of associationism focusing on psychiatric applications.
Hume, D. (1739/1978). A Treatise of Human Nature. L. A. Selby-Bigge and P. H. Nidditch (eds.). Oxford: Clarendon Press.
Hume, D. (1748/1974). Enquiries concerning Human Understanding and concerning the Principles of Morals. L. A. Selby-Bigge (ed.). Oxford: Clarendon Press.
Hunter, W. S. (1917). A Reformulation of the Law of Association. Psychological Review, 24(3), 188.
James, W. (1890/1950). The Principles of Psychology. New York: Dover Publications.
Jones, M. N., and Mewhort, D. J. (2007). Representing Word Meaning and Order Information in a Composite Holographic Lexicon. Psychological Review, 114(1), 1.
Kahneman, D. (2011). Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kitcher, P. (1992). Freud’s Dream: A Complete Interdisciplinary Science of Mind. Cambridge, MA: MIT Press.
Landauer, T. K., and Dumais, S. T. (1997). A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. Psychological Review, 104(2), 211.
Levy, N. (2014). Consciousness and Moral Responsibility. New York: Oxford University Press.
Locke, J. (1700/1974). An Essay concerning Human Understanding. Peter H. Nidditch (ed.). Oxford: Clarendon Press.
Mandelbaum, E. (2016). Attitude, Inference, Association: On the Propositional Structure of Implicit Bias. Noûs, 50(3), 629-658.
McNamara, T. P. (2005). Semantic Priming: Perspectives from Memory and Word Recognition. n. p.: Psychology Press.
Mill, J. (1869). An Analysis of the Phenomena of the Human Mind. A. Bain and J. S. Mill (eds.). London: Longmans, Green and Dyer.
- This edition includes comments from both Alexander Bain and John Stuart Mill.
Mill, J. S. (1963-91). The Collected Works of John Stuart Mill. J. M. Robson. (Gen. Ed.) 33 vols. Toronto: University of Toronto Press.
Miller, R. R., Barnet, R. C., and Grahame, N. J. (1995). Assessment of the Rescorla–Wagner Model. Psychological Bulletin, 117(3), 363–386.
Mitchell, C. J., De Houwer, J., and Lovibond, P. F. (2009). The Propositional Nature of Human Associative Learning. Behavioral and Brain Sciences, 32(2), 183-198.
Morgan, C. Lloyd. (1894). An Introduction to Comparative Psychology. London: Walter Scott.
Mortera, E. L. (2005). Reid, Stewart and the Association of Ideas. Journal of Scottish Philosophy, 3(2), 157-170.
Pavlov, I. P. (1897/1902). The Work of the Digestive Glands. W. H. Thompson (Trans.). London: Charles Griffin and Company.
Pavlov, I. P. (1927). Conditional Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex. G. V. Anrep (Trans.). London: Oxford.
Pearce, J. M., and Mackintosh, N. J. (2010). Two Theories of Attention: A Review and a Possible Integration. In Attention and Associative Learning: From Brain to Behaviour. Oxford: Oxford University Press.
Rapaport, D. (1974). The History of the Concept of Association of Ideas. New York: International Universities Press, Inc.
- This history focuses on the prehistory of the idea of association, applying the term somewhat more broadly than the authors themselves do.
Reid, T. (1872). The Works of Thomas Reid, D.D. W. Hamilton (ed.). Edinburgh: Maclachlan and Stewart.
- Includes Essays on the Intellectual Powers of Man and William Hamilton’s history of association, discussed here.
Rescorla, R. A. (1988). Pavlovian Conditioning: It’s Not What You Think it Is. American Psychologist, 43(3), 151.
Rescorla, R. A., and Wagner, A. R. (1972). A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement. In A. H. Black and W. F. Prokasy (eds.), Classical Conditioning II (pp. 64–99). New York: Appleton-Century-Crofts.
Richardson, A. (2001). British Romanticism and the Science of the Mind. Cambridge: Cambridge University Press.
Robinson, E. S. (1932). Association Theory To-day: An Essay in Systematic Psychology. New York: The Century Co.
Rosenblatt, F. (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington: Spartan Books.
Rumelhart, D. E., McClelland, J. L., and PDP Research Group (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations. Cambridge, MA: MIT Press.
Runco, M.A. (2014). Creativity: Theories and Themes: Research, Development, and Practice. Amsterdam: Academic Press.
Schultz, W., Dayan, P., and Montague, P. R. (1997). A Neural Substrate of Prediction and Reward. Science, 275(5306), 1593-1599.
Shanks, D. R. (2007). Associationism and Cognition: Human Contingency Learning at 25. The Quarterly Journal of Experimental Psychology, 60(3), 291-309.
Skinner, B. F. (1935). Two Types of Conditioned Reflex and a Pseudo Type. Journal of General Psychology, 13(1), 66-77.
Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts, Inc.
Skinner, B. F. (1945). The Operational Analysis of Psychological Terms. Psychological Review, 52, 270-277, 291-294.
Skinner, B. F. (1953). Science and Human Behavior. London: Collier Macmillan Publishers.
Skinner, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts, Inc.
Skinner, B. F. (1976). Walden Two. Indianapolis: Hackett Publishing.
Skinner, B. F. (1978). The Experimental Analysis of Behavior (A History). In B. F. Skinner (ed.), Reflections on Behaviorism and Society (pp.113-126). Englewood Cliffs, NJ: Prentice-Hall.
Smith, J. D., Couchman, J. J., and Beran, M. J. (2014). Animal Metacognition: A Tale of Two Comparative Psychologies. Journal of Comparative Psychology, 128(2), 115.
Smolensky, P. (1988). On the Proper Treatment of Connectionism. Behavioral and Brain Sciences, 11(1), 1-23.
Spencer, H. (1898). Principles of Psychology Vol 1. New York: D. Appleton and Company.
- The substantially revised 3rd edition was first published in 1880 and also serves as Volume 4 of his System of Synthetic Philosophy.
Stewart, D. (1855). Philosophical Essays. In W. Hamilton (ed.), The Collected Works of Dugald Stewart (Vol. V) Edinburgh: Thomas Constable and Co.
Stout, G. F. (1899). A Manual of Psychology. New York: University Correspondence College Press.
Stout, S. C., and Miller, R. R. (2007). Sometimes-Competing Retrieval (SOCR): A Formalization of the Comparator Hypothesis. Psychological Review, 114(3), 759.
Sulloway, F. J. (1979). Freud, Biologist of the Mind: Beyond the Psychoanalytic Legend. New York: Basic Books, Inc.
Sutton, J. (1998). Philosophy and Memory Traces: Descartes to Connectionism. Cambridge: Cambridge University Press.
Tabb, K. (2019). Locke on Enthusiasm and the Association of Ideas. Oxford Studies in Early Modern Philosophy Vol 9. DOI: 10.1093/oso/9780198852452.003.0003
Thorndike, E. L. (1898). Animal Intelligence: An Experimental Study of the Associative Processes in Animals. Psychological Monographs: General and Applied, 2(4), i-109.
Thorndike, E. L. (1905). The Elements of Psychology. New York: A. G. Seiler.
Thorndike, E. L. (1911). Animal Intelligence: Experimental Studies. New York: The Macmillan Company.
Thorwart, A., and Livesey, E. J. (2016). Three Ways that Non-Associative Knowledge May Affect Associative Learning Processes. Frontiers in Psychology, 7, 2024.
Tolman, E. C. (1932/1967). Purposive Behavior in Animals and Men. New York: Irvington Publishers, Inc.
Uhlmann, E. L., Poehlman, T. A., and Nosek, B. (2012). Automatic Associations: Personal Attitudes or Cultural Knowledge? In Jon D. Hanson (ed.), Ideology, Psychology, and Law. New York: Oxford University Press, 228-260.
Warren, H. C. (1916). Mental Association from Plato to Hume. Psychological Review, 23(3), 208.
Warren, H. C. (1928). A History of the Association Psychology. New York: Charles Scribner’s Sons.
- The most complete history of associationism in existence, covering the period up to its publication. Includes more detail on views of most authors covered here, and many others.
Watson, J. B. (1913). Psychology as the Behaviorist Views it. Psychological Review, 20(2), 158.
Watson, J. B. (1924/1930). Behaviorism. Chicago: The University of Chicago Press.
Wundt, W. (1901/1902). Outlines of Psychology 4th ed. C. H. Judd (Trans.). Leipzig: Wilhelm Engelmann.
Wundt, W. (1911/1912). An Introduction to Psychology. R. Pintner (Trans.). London: George Allen and Company.
Young, R. M. (1970). Mind, Brain and Adaptation in the Nineteenth Century: Cerebral Localization and Its Biological Context from Gall and Ferrier. Oxford: Clarendon Press.

Author Information

Mike Dacey
Email: mdacey@bates.edu
Bates College
U. S. A.

The Philosophy of Climate Science

Climate change is one of the defining challenges of the 21st century. But what is climate change, how do we know about it, and how should we react to it? This article summarizes the main conceptual issues and questions in the foundations of climate science, as well as of the parts of decision theory and economics that have been brought to bear on issues of climate in the wake of public discussions about an appropriate reaction to climate change.

We begin with a discussion of how to define climate. Even though “climate” and “climate change” have become ubiquitous terms, both in the popular media and in academic discourse, the correct definitions of both notions are hotly debated topics. We review different approaches and discuss their pros and cons. Climate models play an important role in many parts of climate science. We introduce different kinds of climate models and discuss their uses in detection and attribution, roughly the tasks of establishing that the climate of the Earth has changed and of identifying specific factors that cause these changes. The use of models in the study of climate change raises the question of how well-confirmed these models are and of what their predictive capabilities are. All this is subject to considerable debate, and we discuss a number of different positions. A recurring theme in discussions about climate models is uncertainty. But what is uncertainty and what kinds of uncertainties are there? We discuss different attempts to classify uncertainty and to pinpoint their sources. After these science-oriented topics, we turn to decision theory. Climate change raises difficult questions such as: What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? What form should adaptation take? We discuss the framing of climate decision problems and then offer an examination of alternative decision rules in the context of climate decisions.

Introduction
Defining Climate and Climate Change
Climate Models
Detection and Attribution of Climate Change
Confirmation and Predictive Power
Understanding and Quantifying Uncertainty
Conceptualising Decisions Under Uncertainty
Managing Uncertainty
Conclusion
Glossary
References and Further Reading

1. Introduction

Climate science is an umbrella term referring to scientific disciplines studying aspects of the Earth’s climate. It includes, among others, parts of atmospheric science, oceanography, and glaciology. In the wake of public discussions about an appropriate reaction to climate change, parts of decision theory and economics have also been brought to bear on issues of climate. Contributions from these disciplines that can be considered part of the application of climate science fall under the scope of this article. At the heart of the philosophy of climate science lies a reflection on the methodology used to reach various conclusions about how the climate may evolve and what we should do about it. The philosophy of climate science is a new sub-discipline of the philosophy of science that began to crystallize at the turn of the 21st century when philosophers of science started taking a closer look at methods used in climate modelling. It comprises a reflection on almost all aspects of climate science, including observation and data, methods of detection and attribution, model ensembles, and decision-making under uncertainty. Since the devil is always in the detail, the philosophy of climate science operates in close contact with science itself and pays careful attention to the scientific details. For this reason, there is no clear separation between climate science and the philosophy thereof, and conferences in the field are often attended by both scientists and philosophers.

This article summarizes the main problems and questions in the foundations of climate science. Section 2 presents the problem of defining climate. Section 3 introduces climate models. Section 4 discusses the problem of detecting and attributing climate change. Section 5 examines the confirmation of climate models and the limits of predictability. Section 6 reviews classifications of uncertainty and the use of model ensembles. Section 7 turns to decision theory and discusses the framing of climate decision problems. Section 8 introduces alternative decision rules. Section 9 offers a few conclusions.

Two qualifications are in order. First, we review issues and questions that arise in connection with climate science from a philosophy of science perspective, and with special focus on epistemological and decision-theoretic problems. Needless to say, this is not the only perspective. Much can be said about climate science from other points of view, most notably science studies, sociology of science, political theory, and ethics. For want of space, we cannot review contributions from these fields.

Second, to guard against possible misunderstandings, it ought to be pointed out that engaging in a critical philosophical reflection on the aims and methods of climate science is in no way tantamount to adopting a position known as climate scepticism. Climate sceptics are a heterogeneous group of people who do not accept the results of ‘mainstream’ climate science, encompassing a broad spectrum from those who flat out deny the basic physics of the greenhouse effect (and the influence of human activities on the world’s climate) to a small minority who actively engage in scientific research and debate and reach conclusions at the lowest end of climate impacts. Critical philosophy of science is not the handmaiden of climate scepticism; nor are philosophers ipso facto climate sceptics. So, it should be stressed here that we do not endorse climate scepticism. We aim to understand how climate science works, reflect on its methods, and understand the questions that it raises.

2. Defining Climate and Climate Change

Climate talk is ubiquitous in the popular media as well as in academic discourse, and climate change has become a familiar topic. This veils the fact that climate is a complex concept and that the correct definitions of climate and climate change are a matter of controversy. To gain an understanding of the notion of climate, it is important to distinguish it from weather. Intuitively speaking, the weather at a particular place and a particular time is the state of the atmosphere at that place and at the given time. For instance, the weather in central London at 2 pm on 1 January 2015 can be characterised by saying that the temperature is 12 degrees centigrade, the humidity is 65%, and so forth. By contrast, climate is an aggregate of weather conditions: it is a distribution of particular variables (called the climate variables) arising for a particular configuration of the climate system.

The question is how to make this basic idea precise, and this is where different approaches diverge. 21st-century approaches to defining climate can be divided into two groups: those that define climate as a distribution over time, and those that define climate as an ensemble distribution. The climate variables in both approaches include those that describe the state of the atmosphere and the ocean, and sometimes also variables describing the state of glaciers and ice sheets [IPCC 2013].

Distribution over time. The state of the Earth depends on external conditions of the system such as the amount of energy received from the sun and volcanic activity. Assume that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. One can then define the climate over this time period as the distribution of the climate variables over that period under constant external conditions c [for example, Lorenz 1995]. Climate change then amounts to successive time periods being characterised by different distributions. However, in reality the external conditions are not constant and even when there are just slight fluctuations around c, the resulting distributions may be very different. Hence this definition is unsatisfactory [Werndl 2015].

This problem can be avoided by defining climate as the empirically observed distribution over a specific period of time, where external conditions are allowed to vary. Again, climate change amounts to different distributions for successive time periods. This definition is popular because it is easy to estimate from the observations, for example, from the statistics taken over thirty years that are published by the World Meteorological Organisation [Hulme et al. 2009]. A major problem of this definition can be illustrated by the example in which, in the middle of a period of time, the Earth is hit by a meteorite and becomes a much colder place. Clearly, the climates before and after the impact differ. Yet this definition has no resources to recognize this because all it says is that climate is a distribution arising over a specific time period.
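The idea of climate as an empirical distribution over successive periods can be sketched in a few lines. The example below is only an illustration: the temperature series is synthetic, and the function name `climate_summary`, the imposed drift, and the noise level are all assumptions made for the sketch rather than anything taken from the observational record.

```python
import random
import statistics

def climate_summary(series, start, length=30):
    """Summarise the distribution of one climate variable over a time window."""
    window = series[start:start + length]
    return {"mean": statistics.mean(window), "stdev": statistics.stdev(window)}

# Hypothetical annual-mean temperatures (deg C): 60 years with a small imposed drift.
rng = random.Random(0)
temps = [14.0 + 0.02 * year + rng.gauss(0.0, 0.2) for year in range(60)]

first = climate_summary(temps, 0)    # years 0-29
second = climate_summary(temps, 30)  # years 30-59

# On this definition, climate change is a difference between the two distributions.
change = second["mean"] - first["mean"]
print(f"shift in mean between periods: {change:.2f} deg C")
```

Note that the summary depends only on where the thirty-year window is placed, which is exactly why the definition cannot register an abrupt change that falls in the middle of a window.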

To circumvent this problem, Werndl [2015] introduces the idea of regimes of varying external conditions and suggests defining climate as the distribution over time of the climate variables arising under a specific regime of varying external conditions. The challenge for this account is to spell out what exactly is meant by a regime of varying external conditions.

Ensemble Distribution. An ensemble of climate systems (not to be confused with a model ensemble) is an infinite collection of virtual copies of the climate system. Consider the sub-ensemble of these that satisfy the condition that the present values of the climate variables lie in a specific interval around the values measured in the actual climate system (that is, the values compatible with the measurement accuracy). Now assume again that there is a period of time over which the external conditions are relatively stable in that they exhibit small fluctuations around a constant mean value c. Then climate at future time t is defined as the distribution of values of the climate variables that arises when all systems in the ensemble evolve from now to t under constant external conditions c [for example, Lorenz 1995]. In other words, the climate in the future is the distribution of the climate variables over all possible climates that are consistent with current observations under the assumption of constant external conditions c.

As we have seen previously, in reality, external conditions are not constant and even small fluctuations around a mean value can lead to different distributions [Werndl 2015]. This worry can be addressed by tracing the development of the initial condition ensemble under actual external conditions. The climate at future time t then is the distribution of the climate variables that arises when the initial condition ensemble is evolved forward for the actual path taken by the external conditions [for example, Daron and Stainforth 2013].
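The two ensemble definitions can be illustrated with a toy system. In the sketch below the logistic map stands in for the climate system purely for illustration; it is an assumption of the example, not a model used by the authors cited, and the names `evolve` and `ensemble_distribution` are hypothetical. A finite sample also replaces the infinite ensemble of the definition.

```python
import random

def evolve(x, forcing_path):
    """Toy 'climate system': one logistic-map step per value of the external condition."""
    for c in forcing_path:
        x = c * x * (1.0 - x)
    return x

def ensemble_distribution(measured, accuracy, forcing_path, size=1000, seed=0):
    """Evolve every member compatible with the measurement along the forcing path."""
    rng = random.Random(seed)
    members = [rng.uniform(measured - accuracy, measured + accuracy)
               for _ in range(size)]
    return [evolve(x, forcing_path) for x in members]

steps = 100
constant_path = [3.9] * steps                                  # constant conditions c
actual_path = [3.8 + 0.1 * k / (steps - 1) for k in range(steps)]  # varying conditions

future_constant = ensemble_distribution(0.3, 0.001, constant_path)
future_actual = ensemble_distribution(0.3, 0.001, actual_path)

spread = max(future_constant) - min(future_constant)
print(f"spread of the ensemble at time t: {spread:.2f}")
```

Even though all members start within the assumed measurement accuracy of one another, they end up widely dispersed, so the climate at time t is a genuine distribution rather than a single value. The dependence on `accuracy` also makes visible the first conceptual challenge discussed next: the resulting distribution depends on what we know about the present state.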

This definition faces a number of conceptual challenges. First, it makes the world’s climate dependent on our knowledge (via measurement accuracy), but this is counterintuitive because we think of climate as something objective that is independent of our knowledge. Second, the above definition is a definition of future climate, and it is difficult to see how the present and past climate should be defined. Yet without a notion of the present and past climate one cannot define climate change. A third problem is that ensemble distributions (and thus climate) do not relate in a straightforward way to the past time series of observations of the actual Earth and this would imply that the climate cannot be estimated from them [compare, Werndl 2015].

These considerations show that defining climate is nontrivial and there is no generally accepted or uncontroversial definition of climate.

3. Climate Models

A climate model is a representation of particular aspects of the climate system. One of the simplest climate models is an energy-balance model, which treats the Earth as a flat surface with one layer of atmosphere above it. It is based on the simple principle that in equilibrium the incoming and outgoing radiation must be equal (see Dessler [2011], Chapters 3-6, for a discussion of such models). This model can be refined by dividing the Earth into zones, allowing energy transfer between zones, or describing a vertical profile of the atmospheric characteristics. Despite their simplicity, these models provide a good qualitative understanding of the greenhouse effect.
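The equilibrium principle behind such a model can be written out as a short calculation. The sketch below assumes a solar constant of 1361 W/m², a planetary albedo of 0.3, and a single atmospheric layer that is transparent to sunlight and fully absorbing in the infrared; these values and the function names are assumptions of the example, not figures taken from the text.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def emission_temperature(solar_constant=1361.0, albedo=0.3):
    """Temperature at which outgoing radiation balances absorbed sunlight (no atmosphere)."""
    absorbed = solar_constant * (1.0 - albedo) / 4.0  # averaged over the whole sphere
    return (absorbed / SIGMA) ** 0.25

def surface_temperature_one_layer(solar_constant=1361.0, albedo=0.3):
    """Surface temperature with one infrared-opaque layer above the surface.

    The layer radiates both up and down, so in equilibrium the surface must emit
    twice the absorbed sunlight, which raises its temperature by a factor 2**0.25.
    """
    return 2.0 ** 0.25 * emission_temperature(solar_constant, albedo)

t_bare = emission_temperature()
t_surface = surface_temperature_one_layer()
print(f"no atmosphere: {t_bare:.0f} K, one-layer atmosphere: {t_surface:.0f} K")
```

The difference between the two temperatures is the qualitative greenhouse effect the model is meant to convey; the one-layer value overshoots the observed surface temperature because a fully absorbing layer is a crude idealisation.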

Modern climate science aims to construct models that integrate as much as possible of the known science (for an introduction to climate modelling see [McGuffie and Henderson-Sellers 2005]). Typically, this is done by dividing the Earth (both the atmosphere and ocean) into grid cells. As of 2020, global climate models had a horizontal grid scale of around 150 km. Climatic processes can then be conceptualised as flows of physical quantities such as heat or vapour from one cell to another. These flows are mathematically described by equations. These equations form the ‘dynamical core’ of a global circulation model (GCM). The equations typically are intractable with analytical methods, and powerful supercomputers are used to solve them. For this reason, they are often referred to as ‘simulation models’. To solve the equations numerically, time is discretised. Current state-of-the-art simulations use time steps of approximately 30 minutes, taking weeks or months in real time on supercomputers to simulate a century of climate evolution.
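The combination of grid cells, flows between cells, and discretised time can be illustrated with a deliberately tiny example. The sketch below uses a ring of six cells and a single exchange coefficient, both of which are assumptions chosen for readability; it bears no relation to the equations of an actual dynamical core.

```python
def step(temps, exchange=0.1):
    """Advance one time step: each cell exchanges heat with its two neighbours."""
    n = len(temps)
    return [
        temps[i] + exchange * (temps[(i - 1) % n] - 2.0 * temps[i] + temps[(i + 1) % n])
        for i in range(n)
    ]

# Hypothetical cell temperatures (deg C) on a ring of six grid cells.
cells = [30.0, 25.0, 10.0, -5.0, 10.0, 25.0]
total_before = sum(cells)

steps_per_day = 24 * 60 // 30  # 30-minute time steps
for _ in range(steps_per_day):
    cells = step(cells)

print(f"temperature contrast after one day: {max(cells) - min(cells):.3f} deg C")
```

What one cell loses its neighbours gain, so the total is conserved while contrasts between cells are smoothed out. The exchange coefficient must stay small relative to the time step for such an explicit scheme to remain stable, which is one reason finer grids demand shorter time steps and more computing time.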

In order to compute a single hypothetical evolution of the climate system (a ‘model run’), we also require an initial condition and boundary conditions. The former is a mathematical description of the state of the climate system (projected into the model’s own domain) at the beginning of the period being simulated. The latter are values for any variables which affect the system, but which are not directly output by the calculations. These include, for instance, the concentration of greenhouse gases, the amount of aerosols in the atmosphere at a given time, and the amount of solar radiation received by the Earth. Since these are drivers of climatic change, they are often referred to as external forcings or external conditions.

Where processes occur on a smaller scale than the grid, they may be included via parameterisation, whereby the net effect of the process is separately calculated as a function of the grid variables. For instance, cloud formation is a physical process that cannot be directly simulated because typical clouds are much smaller than the grid. So, the net effect of clouds is usually parameterised (as a function of temperature, humidity, and so forth) in each grid cell and fed back into the calculation. Sub-grid processes are one of the main sources of uncertainty in climate models.
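The logic of a parameterisation can be shown schematically. The functional form below, a linear ramp in relative humidity above a critical threshold, is a hypothetical stand-in chosen for simplicity; the names `cloud_fraction` and `reflected_sunlight` and all numerical values are assumptions of the example and do not describe the scheme of any particular model.

```python
def cloud_fraction(relative_humidity, critical=0.6):
    """Hypothetical sub-grid scheme: cloud cover as a function of grid-cell humidity."""
    if relative_humidity <= critical:
        return 0.0
    return min(1.0, (relative_humidity - critical) / (1.0 - critical))

def reflected_sunlight(incoming, relative_humidity, cloud_albedo=0.5):
    """Net effect of unresolved clouds, fed back into the grid-scale energy budget."""
    return incoming * cloud_albedo * cloud_fraction(relative_humidity)

grid_humidity = [0.4, 0.7, 0.9, 1.0]  # one value per grid cell
cover = [cloud_fraction(rh) for rh in grid_humidity]
reflected = [reflected_sunlight(340.0, rh) for rh in grid_humidity]
```

No individual cloud is simulated: only the resolved grid variables enter, and the free parameters (`critical`, `cloud_albedo`) have to be chosen by the modeller, which is where much of the uncertainty mentioned above arises.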

There are now dozens of global climate models under continuous development by national modelling centres like NASA, the UK Met Office, and the Beijing Climate Center, as well as by smaller institutions. An exact count is difficult because many modelling centres maintain multiple versions based on the same foundation. As an indication, in 2020 there were 89 model-versions submitted to CMIP6 (Coupled Model Intercomparison Project phase 6), from 35 modelling groups, though not all of these should be thought of as being “independent” models since assumptions and algorithms are often shared between institutions. In order to be able to compare outputs of these different models, the Coupled Model Intercomparison Project (CMIP) defines a suite of standard experiments to be run for each climate model. One standard experiment is to run each model using the historical forcings experienced during the twentieth century so that the output can be directly compared against real climate system data.

Climate models are used in many places in climate science, and their use gives rise to important questions. These questions are discussed in the next three sections.

4. Detection and Attribution of Climate Change

Every empirical study of climate has to begin by observing the climate. Meteorological observatories measure a number of variables such as air temperature near the surface of the Earth using thermometers. However, more or less systematic observations have been available only since about 1750, and hence to reconstruct the climate before then scientists have to rely on proxy data: data for climate variables that are derived from observing other natural phenomena such as tree rings, ice cores, and ocean sediments.

The use of proxy data raises a number of methodological problems centred around the statistical processing of such data, which are often sparse, highly uncertain, and several inferential steps away from the climate variable of interest. These issues were at the heart of what has become known as the Hockey Stick controversy, which broke at the turn of the century in connection with a proxy-based reconstruction of the Northern Hemisphere temperature record [Mann, Bradley and Hughes, 1998]. The sceptics pursued two lines of argument. They cast doubt on the reliability of the available data, and they argued that the methods used to process the data are such that they would produce a hockey-stick-shaped curve from almost any data [for example, McIntyre and McKitrick 2003]. The papers published by the sceptics raised important issues and stimulated further research, but they were found to contain serious flaws undermining their conclusions. There are now more than two dozen reconstructions of this temperature record using various statistical methods and proxy data sources. Although there is indeed a wide range of plausible past temperatures, due to the constraints of the data and methods, these studies do robustly support the consensus that, over the past 1400 years, temperatures during the late 20th century are likely to have been the warmest [Frank et al. 2010].

Do rising temperatures indicate that there is climate change, and if so, can the change be attributed to human action? These two problems are known as the problems of detection and attribution. The Intergovernmental Panel on Climate Change (IPCC) defines these as follows:

Detection of change is defined as the process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change. An identified change is detected in observations if its likelihood of occurrence by chance due to internal variability alone is determined to be small […]. Attribution is defined as ‘the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.’ [IPCC 2013]

These definitions raise a host of issues. The root cause of the difficulties is the clause that climate change has been detected only if an observed change in the climate is unlikely to be due to internal variability. Internal variability is the phenomenon that the values of climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of a change in external conditions, because of fluctuations in the frequency of storms, ocean currents, and so on.

Taken at face value, this definition of detection has the consequence that there cannot be internal climate change. The ice ages, for instance, would not count as climate change if they occurred because of internal variability. This is not only at odds with basic intuitions about climate and with the most common definitions of climate as a finite distribution over a relatively short time period (where internal climate change is possible); it also leads to difficulties with attribution: if detected climate change is ipso facto change not due to internal variability, then from the very beginning it is excluded that particular factors (namely, internal climate dynamics) can lead to a change in the climate, which seems to be an unfortunate conclusion.

For the case of the ice ages, many researchers would stress that internal variability is different from natural variability. Since, say, orbital changes explain the ice ages, and orbital changes are natural but external, this is a case of external climate change. While this move solves some of the problems, there remains the problem that there is no generally accepted way to separate internal and external factors, and the same factor is sometimes classified as internal and sometimes as external. For instance, glaciation processes are sometimes treated as internal factors and sometimes as prescribed external factors. Likewise, sometimes the biosphere is treated as an external factor, but sometimes it is also internally modelled and treated as an internal factor. One could even go so far as to ask whether human activity is an external forcing on the climate system or an internally-generated Earth system process. Research studies usually treat human activity as an external forcing, but it could consistently be argued that human activities are an internal dynamical process. The appropriate definition simply depends on the research question of interest. For a discussion of these issues, see Katzav and Parker [2018].

The effects of internal variability are present on all timescales, from the sub-daily fluctuations experienced as weather to the long-term changes due to cycles of glaciation. Since internal variability stems from processes in a highly complex nonlinear system, it is also unlikely that the statistical properties of internal variability are constant over time, which further compounds methodological difficulties. State-of-the-art climate models run with constant forcing show significant disagreements both on the magnitude of internal variability and the timescale of variations. (On http://www.climate-lab-book.ac.uk/2013/variable-variability/#more-1321 the reader finds a plot showing the internal variability of all CMIP5 models. The plot indicates that models exhibit significantly different internal variability, leaving considerable uncertainty.) A model must be deemed to simulate pre-industrial climate (including variability) sufficiently well before it can be used for such detection and attribution studies, but we do not have thousands of years of detailed observations upon which to base that judgement. Estimates of internal variability in the climate system are produced from climate models themselves [Hegerl et al. 2010], leading to potential circularity. This underscores the difficulties in making attribution statements based on the above definition, which recognises an observed change as climate change only if it is unlikely to be due to internal variability.

Since the IPCC’s definitions are widely used by climate scientists, the discussion about detection and attribution in the remainder of this section is based on these definitions. Detection relies on statistical tests, and detection studies are often phrased in terms of the likelihood of a particular event or sequence of events happening in the absence of climate change. In practice, the challenge is to define an appropriate null hypothesis (the expected behaviour of the system in the absence of changing external influences), against which the observed outcomes can be tested. Because the climate system is a dynamical system with processes and feedbacks operating on all scales, this is a non-trivial exercise. An indication of the importance of the null hypothesis is given by the results of Cohn and Lins [2005], who compare the same data against alternate null hypotheses, with results differing by 25 orders of magnitude of significance! This does not in itself show that either null is more appropriate, but it demonstrates the sensitivity of the result to the null hypothesis chosen. This, in turn, underscores the importance of the choice of null hypothesis and the difficulty of making any such choice if the underlying processes are poorly understood.
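The sensitivity to the null hypothesis can be made concrete with a small Monte Carlo experiment. The sketch below does not reproduce the analysis of Cohn and Lins; it merely compares two hypothetical nulls, uncorrelated noise and strongly persistent first-order autoregressive noise with the same variance, for one assumed observed trend. All names and numbers are assumptions of the example.

```python
import random

def ols_slope(series):
    """Least-squares trend (per time step) of an evenly spaced series."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def simulate_null(n, phi, rng):
    """One realisation of AR(1) noise with unit variance; phi = 0 gives white noise."""
    innovation_sd = (1.0 - phi ** 2) ** 0.5
    x = rng.gauss(0.0, 1.0)
    series = [x]
    for _ in range(n - 1):
        x = phi * x + rng.gauss(0.0, innovation_sd)
        series.append(x)
    return series

def p_value(observed_slope, phi, n=100, runs=2000, seed=0):
    """Fraction of null realisations whose trend is at least as large as observed."""
    rng = random.Random(seed)
    exceed = sum(ols_slope(simulate_null(n, phi, rng)) >= observed_slope
                 for _ in range(runs))
    return exceed / runs

observed = 0.01  # hypothetical trend, in noise standard deviations per time step
p_white = p_value(observed, phi=0.0)       # null 1: no persistence
p_persistent = p_value(observed, phi=0.9)  # null 2: strong persistence
print(p_white, p_persistent)
```

The same observed trend is far less surprising under the persistent null, because persistent noise produces trend-like excursions by chance much more often. Which null is appropriate cannot be read off from the trend itself; it depends on an understanding of the underlying processes.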

In practice, the best available null hypothesis is often the best available model of the behaviour of the climate system, including internal variability, which for most climate variables usually means a state-of-the-art GCM. This model is then used to perform long control runs with constant forcings in order to quantify the internal variability of the model (see discussion above). Climate change is then said to have been detected when the observed values fall outside a predefined range of the internal variability of the model. The difficulty with this method is that there is no single “best” model to choose: many such models exist, they are similarly well developed, but, as noted above, they have appreciably different patterns of internal variability.
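The detection step just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that internal variability is summarised by a list of trend values taken from a model's control run and that the "predefined range" is a central percentile interval; the function names and all numbers are hypothetical and are not taken from any detection study.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def is_detected(observed, control_values, coverage=90.0):
    """Return True if `observed` lies outside the central `coverage` percent
    range of the control-run values (the stand-in for internal variability)."""
    ordered = sorted(control_values)
    tail = (100.0 - coverage) / 2.0
    low = percentile(ordered, tail)
    high = percentile(ordered, 100.0 - tail)
    return observed < low or observed > high


# Hypothetical trends (degrees C per decade) from two control runs with
# constant forcing; model B has larger internal variability than model A.
control_a = [-0.05, -0.03, -0.02, -0.01, 0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
control_b = [-0.20, -0.12, -0.08, -0.04, 0.00, 0.03, 0.07, 0.11, 0.15, 0.21]
observed_trend = 0.12  # hypothetical observed trend

detected_with_a = is_detected(observed_trend, control_a)
detected_with_b = is_detected(observed_trend, control_b)
```

With these invented numbers the same observed trend counts as detected relative to model A but not relative to model B, which is the difficulty noted above: the verdict depends on which model supplies the estimate of internal variability.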

The differences between models are relatively unimportant for the clearest detection results such as recent increases in global mean temperature. Here, as stressed by Parker [2010], detection is robust across different models (for a discussion of robustness see Section 6), and, moreover, there is a variety of different pieces of evidence all pointing to the conclusion that the global mean temperature has increased beyond that which can be attributed to internal variability. However, the issues of which null hypothesis to use and how to quantify internal variability can be important for the detection of subtler local climate change.

If climate change has been detected, then the question of attribution arises. This might be an attribution of any particular change (either a direct climatic change such as increased global mean temperature, or an impact such as the area burnt by forest fires) to any identified cause (such as increased CO₂ in the atmosphere, volcanic eruptions, or human population density). Where an impact is considered, a two-step or multi-step approach may be appropriate. An example of this, taken from the IPCC Good Practice Guidance paper [Hegerl et al. 2010], is the attribution of coral reef calcification impacts to rising CO₂ levels, in which an intermediate stage is used by first attributing changes in the carbonate ion concentration to rising CO₂ levels, then attributing calcification processes to changes in the carbonate ion concentration. This also illustrates the need for a clear understanding of the physical mechanisms involved, in order to perform a reliable multi-step attribution in the presence of many potential confounding factors.

In the interpretation of attribution results, in particular those framed as a question of whether human activity has influenced a particular climatic change or event, there is a tendency to focus on whether the confidence interval of the estimated anthropogenic effect crosses zero. The absence of such a crossing indicates that the change is likely to be due to human factors. This results in conservative attribution statements, but it reflects the focus of the present debate where, in the eyes of the public and media, “attribution” is often understood as confidence in ruling out non-human factors, rather than as giving a best estimate or the relative contributions of different factors.

Statistical analysis quantifies the strength of the relationship, given the simplifying assumptions of the attribution framework, but the level of confidence in the simplifying assumptions must be assessed outside that framework. This level of confidence is standardised by the IPCC into discrete (though subjective) categories (“very high”, “high”, and so forth), which aim to take account of the process knowledge, data limitations, adequacy of models used, and the presence of potential confounding factors. The conclusion that is reached will then have a form similar to the IPCC’s headline attribution statement:

It is extremely likely [≥95% probability] that more than half of the observed increase in global average surface temperature from 1951 to 2010 was caused by the anthropogenic increase in greenhouse gas concentrations and other anthropogenic forcings together. [IPCC 2013; Summary for Policymakers, section D.3].

One attribution method is optimal fingerprinting. The method seeks to define a spatio-temporal pattern of change (fingerprint) associated with each potential driver (such as the effect of greenhouse gases or of changes in solar radiation), normalised relative to the internal variability, and then perform a statistical regression of observed data with respect to linear combinations of these patterns. The residual variability after observations have been attributed to each factor should then be consistent with the internal variability; if not, this suggests that an important source of variability remains unaccounted for. Parker [2010] notes that fingerprint studies rely on several assumptions. Chief among them is linearity, that is, that the response of the climate system when several forcing factors are present is equal to a linear combination of the effects of the forcings. Because the climate system is nonlinear, this is clearly a source of methodological difficulty, although for global-scale responses (in contrast to regional-scale responses) additivity has been shown to be a good approximation.
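The regression step at the heart of fingerprinting can be sketched as follows. This is a deliberately simplified illustration, not the method as used in practice: it assumes only two fingerprints, uses ordinary least squares, and takes the normalisation relative to internal variability to have been applied already. The pattern values, the noise, and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def fit_two_fingerprints(obs, pattern_1, pattern_2):
    """Ordinary least squares for obs ~= b1 * pattern_1 + b2 * pattern_2.

    Solves the 2x2 normal equations directly and returns (b1, b2, residuals).
    """
    a11 = dot(pattern_1, pattern_1)
    a12 = dot(pattern_1, pattern_2)
    a22 = dot(pattern_2, pattern_2)
    c1, c2 = dot(pattern_1, obs), dot(pattern_2, obs)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("fingerprints are (nearly) collinear")
    b1 = (c1 * a22 - c2 * a12) / det
    b2 = (a11 * c2 - a12 * c1) / det
    residuals = [y - b1 * p - b2 * q for y, p, q in zip(obs, pattern_1, pattern_2)]
    return b1, b2, residuals


# Hypothetical fingerprints: a steady rise and an oscillating pattern.
greenhouse = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
solar = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
# Hypothetical stand-in for internal variability.
noise = [0.02, -0.01, 0.00, 0.01, -0.02, 0.00]
# Synthetic "observations" built under the linearity (additivity) assumption.
observations = [1.0 * g + 0.5 * s + n for g, s, n in zip(greenhouse, solar, noise)]

scale_greenhouse, scale_solar, residuals = fit_two_fingerprints(
    observations, greenhouse, solar
)
```

The fitted scaling factors recover the values used to build the synthetic observations only because those observations were constructed additively; the residuals are what would then be compared with the estimated internal variability.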

Levels of confidence in these attribution statements are primarily dependent on physical understanding of the processes involved. Where there is a clear, simple, well-understood mechanism, there should be greater confidence in the statistical result; where the mechanisms are loose, multi-factored or multi-step, or where a complex model is used as an intermediary, confidence is correspondingly lower. The Guidance Paper cautions that,

Where models are used in attribution, a model’s ability to properly represent the relevant causal link should be assessed. This should include an assessment of model biases and the model’s ability to capture the relevant processes and scales of interest. [Hegerl et al. 2010, 5]

As Parker [2010] argues, there is also higher confidence in attribution results when the results are robust and there is a variety of evidence. For instance, the finding that late twentieth-century temperature increase was mainly caused by greenhouse gas forcing is found to be robust given a wide range of different models, different analysis techniques, and different forcings; and there is a variety of evidence all of which supports this claim. Thus our confidence that greenhouse gases explain global warming is high. (For further useful extended discussion of detection and attribution methods in climate science, see pages 872–878 of IPCC [2013] and the Good Practice Guidance paper by Hegerl et al. [2010], and for a discussion of how such hypotheses are tested see Katzav [2013].)

In addition to the large-scale attribution of climate change, attribution of the degree to which individual weather events have become either more likely or more extreme as a result of increasing atmospheric greenhouse gas concentrations is now common. It is of particular public interest because it is perceived as a way to communicate that climate impacts are happening already, perhaps to quantify risk numerically in order to price insurance, and to offer a motivation for climate mitigation. There is therefore also an incentive to conduct these studies quickly, to inform timely news articles, and some groups have formed to respond quickly to reports of extreme weather and conduct attribution studies immediately. This practice relies on the availability of data, may suffer from unclear definitions of exactly what category of event is being analysed, and is open to criticism for publicity prior to peer review. There are also statistical implications of choosing to analyse only those events which have happened and not those that did not happen. For a discussion of event attribution see Lloyd and Oreskes [2019] and Lusk [2017].

5. Confirmation and Predictive Power

Two questions arise in connection with models: how are models confirmed and what is their predictive power? Confirmation concerns the question of whether, and to what degree, a specific model is supported by the data. Lloyd [2009] argues that many climate models are confirmed by past data. Parker [2009] objects to this claim. She argues that the idea that climate models per se are confirmed cannot be seriously entertained because all climate models are known to be wrong and empirically inadequate. Parker urges a shift in thinking from confirmation to adequacy for purpose: models can only be found to be adequate for specific purposes, but they cannot be confirmed wholesale. For example, one might claim that a particular climate model adequately predicts the global temperature increase that will occur by 2100 (when run from particular initial conditions and relative to a particular emission scenario). Yet, at the same time, one might hold that the predictions of global mean precipitation by 2100 by the same model cannot be trusted.

Katzav [2014] cautions that adequacy for purpose assessments are of limited use. He claims that these assessments are typically unachievable because it is far from clear which of the model’s observable implications can possibly be used to show that the model is adequate for the purpose. Instead, he argues that climate models can at best be confirmed as providing a range of possible futures. Katzav is right to stress that adequacy for purpose assessments are more difficult than appears at first sight. But the methodology of adequacy for purpose cannot be dismissed wholesale; in fact, it is used successfully across the sciences (for example, when ideal gas models are confirmed to be useful for particular purposes). Whether or not adequacy for purpose assessment is possible depends on the case at hand.

If one finds that one model predicts specific variables well and another model doesn’t, then one would like to know the reasons why the first model is successful and the second not. Lenhard and Winsberg [2010] argue that this is often very difficult, if not impossible: For complex climate models a strong version of confirmation holism makes it impossible to tell where the failures and successes of climate models lie. In particular, they claim that it is impossible to assess the merits and problems of sub-models and the parts of models. There is a question, though, whether this confirmation holism affects all models and whether it is here to stay. Complex models have different modules for the atmosphere, the ocean, and ice. These modules can be run individually and also together. The aim of the many new Model Intercomparison Projects (MIPs) is, by comparing individual and combined runs, to obtain an understanding of the performance and physical merits of separate modules, which it is hoped will identify areas for improvement and eventually result in better performance of the entire model.

Another problem concerns the use of data in the construction of models. The values of model parameters are often estimated using observations, a process known as calibration. For example, the magnitude of the aerosol forcing is sometimes estimated from data. When data have been used for calibration, the question arises whether the same data can be used again to confirm the model. If data are used for confirmation that have not already been used for calibration, they are use-novel. If data are used for both calibration and confirmation, this is referred to as double-counting.

Scientists and philosophers alike have argued that double-counting is illegitimate and that data have to be use-novel to be confirmatory [Lloyd 2010; Shackley et al. 1998; Worrall 2010]. Steele and Werndl [2013] oppose this conclusion and argue that on Bayesian and relative-likelihood accounts of confirmation double-counting is legitimate. Furthermore, Steele and Werndl [2015] argue that model selection theory presents a more nuanced picture of the use of data than the commonly endorsed positions. Frisch [2015] cautions that Bayesian as well as other inductive logics can be applied in better and worse ways to real problems such as climate prediction. Nothing in the logic prevents facts from being misinterpreted and their confirmatory power exaggerated (as in ‘the problem of old evidence’ which Frisch [2015] discusses). This is certainly a point worth emphasising. Indeed, Steele and Werndl [2013] stress that the same data cannot inform a prior probability for a hypothesis and also further (dis)confirm the hypothesis. But they do not address all the potential pitfalls in applying Bayesian or other logics to the climate and other settings. Their argument must be understood as a limited one: there is no univocal logical prohibition against the same data serving for calibration and confirmation. As far as non-Bayesian methods of model selection go, there are two cases. First, there are methods such as cross-validation where the data are required to be use-novel. For cross-validation, the data are split up into two groups: the first group is used for calibration and the second for confirmation. Second, there are methods such as the Akaike Information Criterion for which the data need not be use-novel, although information criteria methods are hard to apply in practice to climate models because the number of degrees of freedom is poorly defined.
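The split between calibration data and use-novel data in cross-validation can be made concrete with a toy sketch. It assumes a hypothetical one-parameter model, response = s × forcing, calibrated by least squares; the data are invented and the function names are illustrative only.

```python
def calibrate_slope(xs, ys):
    """Least-squares estimate of s in the one-parameter model y = s * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_squared_error(slope, xs, ys):
    """Average squared misfit of the calibrated model on the given data."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical (forcing, response) pairs; not real observations.
forcing = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
response = [0.42, 0.78, 1.25, 1.55, 2.08, 2.35]

# First group: used for calibration. Second group: held out, hence use-novel.
calib_x, calib_y = forcing[:3], response[:3]
novel_x, novel_y = forcing[3:], response[3:]

slope = calibrate_slope(calib_x, calib_y)
# Misfit on the data already used for calibration (the double-counting case).
in_sample_error = mean_squared_error(slope, calib_x, calib_y)
# Misfit on data that played no role in calibration (the use-novel case).
use_novel_error = mean_squared_error(slope, novel_x, novel_y)
```

In this invented example the misfit on the held-out data is larger than the misfit on the calibration data. The sketch only shows the mechanics of the split; it does not settle whether double-counting is legitimate on other accounts of confirmation.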

This brings us to the second issue: prediction. In the climate context this is typically framed as the issue of projection. ‘Projection’ is a technical term in the climate modelling literature and refers to a prediction that is conditional on a particular forcing scenario and a particular initial conditions ensemble. The forcing scenario is specified either by the amount of greenhouse gas emissions and aerosols added to the atmosphere or directly by their atmospheric concentrations, and these in turn depend on future socioeconomic and technological developments.

Much research these days is undertaken with the aim of generating projections about the actual future evolution of the Earth system under a particular emission scenario, upon which policies are made and real-life decisions are taken. In these cases, it is necessary to quantify and understand how good those projections are likely to be. It is doubtful that this question can be answered along traditional lines. One such line would be to refer to the confirmation of a model against historical data (Chapter 9 of IPCC [2013] discusses model evaluation in detail) and argue that the ability of a model to successfully reproduce historical data should give us confidence that it will perform well in the future too. It is unclear at best whether this is a viable answer. The problem is that climate projections for high-forcing scenarios take the system well outside any previously experienced state, and at least prima facie there is no reason to assume that success in low-forcing contexts is a guide to success in high-forcing contexts; for example, a model calibrated on data from a world with the Arctic Sea covered in ice might no longer perform well when the sea ice is completely melted and the relevant dynamical processes are quite different. For this reason, calibration to past data has at most limited relevance for the assessment of a model’s predictive success [Oreskes et al. 1994; Stainforth et al. 2007a, 2007b; Steele and Werndl 2013].

This brings into focus the fact that there is no general answer to the question of the trustworthiness of model outputs. There is widespread consensus that predictions are better for longer time averages, larger spatial averages, low specificity and better physical understanding; and, all other things being equal, shorter lead times (nearer prediction horizons) are easier to predict than longer ones. Global mean temperature trends are considered trustworthy, and it is generally accepted that the observed upward trend will continue [Oreskes 2007], although the basis of this confidence is usually a physical understanding of the greenhouse effect with which the models are consistent, rather than a direct reliance on the output of models themselves. A 2013 IPCC report [IPCC 2013, Summary for Policymakers, section D.1] professes that modelled surface temperature patterns and trends are trustworthy on the global and continental scale, but, even in making this statement, assigns a probability of at least 66% (‘likely’) to the range within which 90% of model outcomes fall. In plainer terms, this is an expert-assigned probability of at least tens of percent that the models are substantially wrong even about global mean temperature.

There are still interesting questions about the epistemic grounds on which such assertions are made (and we return to them in the next section). A harder problem, however, concerns the use of models as providers of detailed information about the future local climate. The United Kingdom Climate Impacts Programme aims to make high-resolution probabilistic projections of the local climate up to the end of the century, and similar projects are run in many other countries [Thompson et al. 2016]. The Programme’s set of projections known as UKCP09 [Sexton et al. 2012, Sexton and Murphy 2012] covers the climate up to 2100 and is based on HadCM3, a global climate model developed at the UK Met Office Hadley Centre. Probabilities are given on a 25 km grid for finely defined specific events such as changes in the temperature of the warmest day in summer, the precipitation of the wettest day in winter, or the change in summer-mean cloud amount, with projections blocked into overlapping thirty-year segments which extend to 2100. It is projected, for instance, that under a medium emission scenario the probability for a 20-30% reduction in summer mean precipitation in central London in 2080 is 0.5. There is a question of whether these projections are trustworthy and policy relevant. Frigg et al. urge caution on grounds that many of the UKCP09’s foundational assumptions seem to be questionable [2013, 2015] and that structural model error may have significant repercussions on small scales [2014]. Winsberg [2018] and Winsberg and Goodwin [2016] criticise these cautionary arguments as overstating the limitations of such projections. In 2019, the Programme launched a new set of projections, known as UKCP18 (https://www.metoffice.gov.uk/research/collaboration/ukcp). It is an open question whether these projections are open to the same objections, and, if so, how severe the limitations are.

6. Understanding and Quantifying Uncertainty

Uncertainty features prominently in discussions about climate models, and yet it is a concept that is poorly understood and that raises many difficult questions. In the most general terms, uncertainty is a lack of knowledge. The first challenge is to circumscribe more precisely what is meant by ‘uncertainty’ and what the sources of uncertainty are. A number of proposals have been made, but the discussion is still in a ‘pre-paradigmatic’ phase. Smith and Stern [2011] identify four relevant varieties of uncertainty: imprecision, ambiguity, intractability and indeterminacy. Spiegelhalter and Riesch [2011] consider a five-level structure with three within-model levels (event, parameter and model uncertainty) and two extra-model levels concerning acknowledged and unknown inadequacies in the modelling process. Wilby and Dessai [2010] discuss the issue with reference to what they call the cascade of uncertainty, studying how uncertainties magnify as one goes from assumptions about future global emissions of greenhouse gases to the implications of these for local adaptation. Petersen [2012, Chapters 3 and 6] introduces a so-called uncertainty matrix listing the sources of uncertainty in the vertical and the sorts of uncertainty in the horizontal direction. Lahsen [2005] looks at the issue from a science studies point of view and discusses the distribution of uncertainty as a function of the distance from the site of knowledge production. And these are but a few of the many proposals.

The next problem is that of measuring and quantifying uncertainty in climate predictions. Among the approaches that have been devised in response to this challenge, ensemble methods occupy centre stage. Current estimates of climate sensitivity and increase in global mean temperature under various emission scenarios, for instance, include information derived from ensembles containing multiple climate models. Multi-model ensembles are sets of several different models which differ in mathematical structure and physical content. Such an ensemble is used to investigate how predictions of relevant climate variables vary (or do not vary) according to model structure and assumptions. A different kind of ensemble is known as a “perturbed parameter ensemble”. It contains models with the same mathematical structure in which particular parameters assume different values, thereby effectively conducting a sensitivity analysis on a single model by systematically varying some of the parameters and observing the effect on the outcomes. Early analyses such as the climateprediction.net simulations and the UKCP09 results rely on perturbed parameter ensembles only, due to resource limitations; international projects such as the Coupled Model Intercomparison Projects (CMIP) and the work that goes into the IPCC assessments are based on multi-model ensembles containing different model structures. The reason to use ensembles is the acknowledged uncertainties in individual models, which concern both the model structure and the values of parameters in the model. It is a common assumption that ensembles help understand the effects of these uncertainties either by producing and identifying “robust” predictions, or by providing estimates of this uncertainty about future climate change. (Parker [2013] provides an excellent discussion of ensemble methods and the problems that attach to them.)

A model-result is robust if all or most models in the ensemble show the same result; for general discussion of robustness analysis see Weisberg [2006]. If, for instance, all models in an ensemble show more than 4°C increase in global mean temperature by the end of the century when run under a specific emission scenario, this result is robust across the specified ensemble. Does robustness justify increased confidence? Lloyd [2010, 2015] argues that robustness arguments are powerful in connection with climate models and lend credibility at least to core claims such as the claim that there was global warming in the 20th century. Parker [2011], by contrast, reaches a more sober conclusion: ‘When today’s climate models agree that an interesting hypothesis about future climate change is true, it cannot be inferred […] that the hypothesis is likely to be true or that scientists’ confidence in the hypothesis should be significantly increased or that a claim to have evidence for the hypothesis is now more secure’ [ibid. 579]. One of the main problems is that if today’s models share the same technological constraints posed by today’s computer architecture and understanding of the climate system, then they inevitably share some common errors. Indeed, such common errors have been widely acknowledged (see, for instance, Knutti et al. [2010]) and studies have demonstrated and discussed the lack of model independence [Bishop and Abramowitz 2013; Jun et al. 2008a; 2008b]. But if models are not independent, then there is a question about how much epistemic weight agreement between them carries.

When ensembles do not yield robust predictions, the spread of results within the ensemble is sometimes used to estimate quantitatively the uncertainty of the outcome. There are two main approaches to this. The first approach aims to translate the histogram of model results directly into a probability distribution: in effect, the guiding principle is that the probability of an outcome is proportional to the fraction of models in the ensemble which produce that result. The thinking behind this method seems to be to invoke some sort of frequentist approach to probabilities. The appeal to frequentism presupposes that models can be treated as exchangeable sources of information (in the sense that there is no reason to trust one ensemble member any more than any other). However, as we have previously seen, the assumption that models are independent has been questioned. There is a further problem: multi-model ensembles (MMEs) are ‘ensembles of opportunity’, grouping together existing models. Even the best ensembles such as CMIP6 are not designed to systematically explore all possibilities. It is therefore not clear why the frequency of ensemble projections should double as a guide to probability. The IPCC acknowledges this limitation (see discussion in Chapter 12 of IPCC [2013]) and thus downgrades the assessed likelihood of ensemble-derived ranges, deeming it only “likely” (≥66%) that the real-world global mean temperature will fall within the 90% model range (for a discussion of this case see Thompson et al. [2016]).
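The guiding principle of the first approach, that the probability of an outcome is the fraction of ensemble members producing it, amounts to the following calculation. The sketch is illustrative only: the ten warming values, the bin width, and the function name are hypothetical.

```python
from collections import Counter


def histogram_probabilities(ensemble_results, bin_width):
    """Map each bin's lower edge to the fraction of ensemble members in that bin.

    Every ensemble member gets the same weight, which is exactly the
    exchangeability assumption questioned in the text.
    """
    counts = Counter(int(value // bin_width) for value in ensemble_results)
    total = len(ensemble_results)
    return {index * bin_width: count / total for index, count in sorted(counts.items())}


# Hypothetical end-of-century warming (degrees C) from ten ensemble members.
warming = [2.1, 2.4, 2.6, 2.9, 3.1, 3.2, 3.4, 3.6, 4.2, 4.8]
probabilities = histogram_probabilities(warming, bin_width=1.0)
```

Here four of the ten members fall between 2 and 3 degrees, four between 3 and 4, and two between 4 and 5, so the procedure assigns probabilities 0.4, 0.4 and 0.2. Nothing in the calculation registers whether the members are independent or whether the ensemble explores all possibilities.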

A more modest approach regards ensemble outputs as a guide to possibility rather than probability. In this view, the spread of an ensemble presents the range of outcomes that cannot be ruled out. The bounds of this set of results, often referred to as a ‘non-discountable envelope’, provide a lower bound of the uncertainty [Stainforth et al. 2007b]. In this spirit Katzav [2014] argues that a focus on prediction is misguided and that models ought to be used to show that particular scenarios are real possibilities.

While undoubtedly less committal than the probability approach, non-discountable envelopes also raise questions. The first is the relation between non-discountability and possibility. Non-discountable results are ones that cannot be ruled out. How is this judgment reached? Do results which cannot be ruled out indicate possibilities? If not, what is their relevance for estimating lower bounds? And could the model, if pushed more deliberately towards “interesting” behaviours, actually make that envelope wider? Furthermore, it is important to keep in mind that the envelope just represents some possibilities. Hence it does not indicate the complete range of possibilities, making particular types of formalised decision-making procedures impossible. For a further discussion of these issues see Betz [2009, 2010].

Finally, a number of authors emphasise the limitations of model-based methods (such as ensemble methods) and submit that any realistic assessment of uncertainties will also have to rely on other factors, most notably expert judgement. Petersen [2012, Chapter 4] outlines the approach of the Netherlands Environmental Assessment Agency (PBL), which sees expert judgment and problem framings as essential components of uncertainty assessment. Aspinall [2010] suggests using methods of structured expert elicitation.

In light of the issues raised above, how should uncertainty in climate science be communicated to decision-makers? The most prominent framework for communicating uncertainty is the IPCC’s, which is used throughout the Fifth Assessment Report (AR5), explicated in the ‘Guidance Note for Lead Authors of the IPCC Fifth Assessment Report on Consistent Treatment of Uncertainties’, and further explicated in Mastrandrea et al. [2011]. The framework appeals to two measures for communicating uncertainty. The first, a qualitative ‘confidence’ scale, depends on both the type of evidence and the degree of agreement amongst experts. The second measure is a quantitative scale for representing statistical likelihoods (or more accurately, fuzzy likelihood intervals) for relevant climate/economic variables. The following statement exemplifies the use of these two measures for communicating uncertainty in AR5: ‘The global mean surface temperature change for the period 2016–2035 relative to 1986–2005 is similar for the four RCPs and will likely be in the range 0.3°C to 0.7°C (medium confidence)’ [IPCC 2013]. A discussion of this framework can be found in Adler and Hirsch Hadorn [2014], Budescu et al. [2014], Mach et al. [2017], and Wüthrich [2017].

At this point, it should also be noted that the role of ethical and social values in relation to uncertainties in climate science is controversially debated. Winsberg [2012] appeals to complex simulation modelling to argue that it is infeasible for climate scientists to produce results that are not influenced by their ethical and social values. More specifically, he argues that assignments of probabilities to hypotheses about future climate change are influenced by ethical and social values because of the way these values come into play in the building and evaluating of climate models. Parker [2014] contends that pragmatic factors rather than social or ethical values often play a role in resolving these modelling choices. She further objects that Winsberg’s focus on precise probabilistic uncertainty estimates is misguided; coarser estimates like those used by the IPCC better reflect the extent of uncertainty and are less influenced by values. She concludes that Winsberg has exaggerated the influence of ethical and social values here but suggests that a more traditional challenge to the value-free ideal of science fits the climate case. Namely, one could argue that estimates of uncertainty are themselves always somewhat uncertain, and that the decision to offer a particular estimate of uncertainty thus might appropriately involve value judgments [compare, Douglas 2009].

7. Conceptualising Decisions Under Uncertainty

What is the appropriate reaction to climate change? How much should we mitigate? To what extent should we adapt? And what form should adaptation take? Should we build larger water reserves? Should we adapt houses, and our social infrastructure more generally, to a higher frequency of extreme weather events like droughts, heavy rainfalls, floods, and heatwaves, as well as to the increased incidence of extremely high sea levels or the more frequent occurrence of particularly hot days? The decisions that we make in response to these questions have consequences affecting both individuals and groups at different places and times. Moreover, the circumstances of many of these decisions involve uncertainty and disagreement that is sometimes both severe and wide-ranging, concerning not only the state of the climate (as discussed above) and the broader social consequences of any action or inaction on our part, but also the range of actions available to us and what significance we should attach to their possible consequences. These considerations make climate decision-making both important and hard. The stakes are high, and so too are the difficulties for standard decision theory: plenty of reason for philosophical engagement with this particular application of decision theory.

Let us begin by looking at the actors in the climate domain and the kinds of decision problems that concern them. When introducing decision theory, it is common to distinguish three main domains: individual decision theory (which concerns the decision problem of a single agent who may be uncertain of her environment), game theory (which focuses on cases of strategic interaction amongst rational agents), and social choice theory (which concerns procedures by which a number of agents may ‘think’ and act collectively). All three realms are relevant to the climate-change predicament, whether the concern is adapting to climate change or mitigating climate change or both.

Determining the appropriate agential perspective and type of engagement between agents is important, because otherwise decision-modelling efforts may be in vain. For instance, it may be futile to focus on the plight of individual citizens when the power to effect change really lies with states. It may likewise be misguided to analyse the prospects for a collective action on climate policy, if the supposed members of the group do not see themselves as contributing to a shared decision that is good for the group as a whole. It would also be misleading to exclude from an individual agent’s decision model the impact of others who perceive that they are acting in a strategic environment. This is not, however, to recommend a narrow view of the role of decision models, on which they must always represent the decisions of agents as they see them and can never be aspirational; the point is rather that we should not employ decision models with particular agential framings in a naïve way.

Getting the agential perspective right is just the first step in framing a decision problem so that it presents convincing reasons for action. There remains the task of representing the details of the decision problem from the appropriate epistemic and evaluative perspective. Our focus is individual decision theory, for reasons of space, and because most decision settings ultimately involve the decision of an individual, whether this be a single person or a group acting as an individual.

The standard model of (individual) decision-making under uncertainty used by decision theorists derives from the classic work of von Neumann and Morgenstern [1944] and Leonard Savage [1954]. It treats actions as functions from possible states of the world to consequences, these being the complete outcomes of performing the action in question in that state of the world. All uncertainty is taken to be uncertainty about the state of the world and is quantified by a single probability function over the possible states, where the probabilities in question measure either objective risk or the decision maker’s degrees of belief (or a combination of the two). The relative value of consequences is represented by an interval-scaled utility function over these consequences. Decision-makers are advised to choose the action with maximum expected utility (EU), where the EU for an action is the sum of the probability-weighted utility of the possible consequences of the action.
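
The expected-utility rule just described can be made concrete with a minimal sketch. The states, actions, probabilities and utilities below are hypothetical numbers chosen purely for illustration; they are not estimates drawn from the climate literature.

```python
from typing import Dict

def expected_utility(probs: Dict[str, float], utils: Dict[str, float]) -> float:
    """Sum of probability-weighted utilities over the possible states."""
    return sum(probs[state] * utils[state] for state in probs)

# Hypothetical single probability function over two states of the world.
probs = {"low_warming": 0.5, "high_warming": 0.5}

# Hypothetical utilities: each action maps every state to the utility
# of the consequence it produces in that state.
actions = {
    "mitigate": {"low_warming": 6.0, "high_warming": 4.0},
    "business_as_usual": {"low_warming": 10.0, "high_warming": -4.0},
}

eu = {name: expected_utility(probs, utils) for name, utils in actions.items()}
best = max(eu, key=eu.get)  # the action with maximum expected utility
print(eu, best)
```

With these made-up inputs the rule selects mitigation, but the recommendation is only as good as the single probability function and single utility function fed into it.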

It is our contention that this model is inadequate for many climate-oriented decisions, because it fails to properly represent the multidimensional nature and severity of the uncertainty that decision-makers face. To begin with, not all the uncertainty that climate decision-makers face is empirical uncertainty about the actual state of the world (state uncertainty). There may be further empirical uncertainty about what options are available to them and what the consequences of exercising each option are for each respective state (option uncertainty). In what follows we use the term ‘empirical uncertainty’ to cover both state uncertainty and option uncertainty. Furthermore, decision-makers face a non-empirical kind of uncertainty, namely ethical uncertainty, about what values to assign to possible consequences.

Let us now turn to empirical uncertainty. As noted above, standard decision theory holds that all empirical uncertainty can be represented by a probability function over the possible states of the world. There are two issues here. The first is that confining all empirical uncertainty to the state space is rather unnatural for complex decision problems such as those associated with climate change. In fact, decision models are less convoluted if we allow the uncertainty about states to depend on the actions that might be taken (compare Richard Jeffrey’s [1965] expected utility theory), and if we also permit further uncertainty about what consequence will arise under each state, given the action taken (an aspect of option uncertainty). For instance, consider a crude version of the mitigation decision problem faced by the global planner: it may be useful to depict the decision problem with a state-space partition in terms of possible increases in average global temperature over a given time period. In this case, our beliefs about the states (how likely they each are) would be conditional on the mitigation option taken. Moreover, for each respective mitigation option, the consequence arising in each of the states depends on further uncertain features of the world, for instance the extent to which, on average, regional conditions would be favourable to food production and whether social institutions would facilitate resilience in food production.
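
A small sketch may help to show what it means for beliefs about states to be conditional on the option taken. The temperature bands, conditional probabilities and utilities are hypothetical, and the sketch leaves out the further uncertainty about consequences within each state.

```python
def conditional_expected_utility(p_given_action, utility):
    """Expected utility where the state probabilities depend on the action."""
    return sum(p * utility[state] for state, p in p_given_action.items())

# Hypothetical P(state | action): states are bands of global temperature rise.
p_state_given_action = {
    "strong_mitigation": {"below_2C": 0.75, "above_2C": 0.25},
    "weak_mitigation": {"below_2C": 0.25, "above_2C": 0.75},
}

# Hypothetical utilities of each action in each state. Within any given
# state, weak mitigation scores higher because it is cheaper.
utility = {
    "strong_mitigation": {"below_2C": 8.0, "above_2C": 0.0},
    "weak_mitigation": {"below_2C": 10.0, "above_2C": 2.0},
}

eu = {
    action: conditional_expected_utility(p_state_given_action[action], utility[action])
    for action in p_state_given_action
}
print(eu)
```

In this toy case weak mitigation is better within each temperature band, yet strong mitigation has the higher expected utility, because it makes the favourable band more likely. A model that held the state probabilities fixed across actions would miss exactly this.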

The second issue is that using a precise probability function to represent uncertainty about states (and consequences) can misrepresent the severity of this uncertainty. For instance, even if one assumes that the position of the scientific community may be reasonably well represented by a precise probability distribution over the state space, conditional on the mitigation option, precise probabilities over the possible food productions and other economic consequences, given this option and average global temperature rise, are less plausible. Note that the global social planner’s mitigation decision problem is typically analysed in terms of a so-called Integrated Assessment Model (IAM), which does indeed involve dependencies between mitigation strategies and both climate and economic variables. There is some disparity in the representation of empirical uncertainty: Nordhaus’s [2008] reliance on ‘best estimates’ for parameters like climate sensitivity can be compared with Stern’s [2007] use of ‘confidence intervals’. But these are relatively minor differences. Critics argue that all extant IAMs inadequately represent the uncertainty surrounding projections of future wealth under the status quo and alternative mitigation strategies [see Weitzman 2009, Frisch 2013, Stern 2013]. In particular, both Nordhaus [2008] and Stern [2007] controversially assume increasing wealth over time (or positive consumption growth rate) even for the status quo where nothing is done to mitigate climate change.

Popular among philosophers is the use of sets of probability functions to represent severe uncertainty surrounding decision states/consequences, whether the uncertainty is due to evidential limitations or due to evidential/expert disagreement. This is a minimal generalisation of the standard decision model, in the sense that probability measures still feature: roughly, the more severe the uncertainty, the more probability measures over the space of possibilities are needed to conjointly represent the epistemic situation (see, for instance, Walley [1991]). For maximal uncertainty all possibilities are on a par: they are effectively assigned the probability interval [0, 1]. Indeed it is a strength of the imprecise probability representation that it generalises the two extreme cases, that is, the precise probabilistic as well as the possibilistic frameworks. (See Halpern [2003] for a thorough treatment of frameworks, both qualitative and quantitative, for representing uncertainty.) In some contexts, it may be suitable to weight the possible probability distributions in terms of plausibility (as required for some of the decision rules discussed below). The weighting approach may in fact match the IPCC’s representation of the uncertainty surrounding decision-relevant climate and economic variables. Indeed, an important question is whether and how the IPCC’s representation of uncertainty can be translated into an imprecise probabilistic framework, as discussed here and in the next section. An alternative to the aforementioned proposal is that the IPCC’s confidence and likelihood measures for relevant variables should be combined to form an unweighted imprecise set of probability distributions, or even a precise probability distribution, suitable for input into an appropriate decision model.

Decision makers face uncertainty not only about what will or could happen, but also about what value to attach to these possibilities; in other words, they face ethical uncertainty. Such value or ethical uncertainty can have a number of different sources. The most important ones arise in connection with judgments about how to distribute the costs and benefits of mitigation and adaptation amongst different regions and countries, about how to take account of persons whose existence depends on what actions are chosen now, and about the degree to which future wellbeing should be discounted. (For discussion and debate about the ethical significance of various climate outcomes, particularly at the level of global rather than regional or national justice, see the articles in Gardiner et al.’s [2010] edited collection, Climate Ethics.) Of these, the last has been the subject of the most debate, because of the extent to which (the global planner’s) decisions about how drastically to cut carbon emissions are sensitive to the discount rate used in evaluating the possible outcomes of doing so (as highlighted in Broome [2008]). Discounting thus provides a good illustration of the importance of ethical uncertainty.

In many economic models, a discount rate is applied to a measure of total wellbeing at different points in time (the ‘pure rate of time preference’), with a positive rate implying that future wellbeing carries less weight in the evaluations of options than present wellbeing. Note that the overall ‘social discount rate’ in economic models is the sum of the pure rate of time preference and a second term pertaining to the discounting of goods or consumption rather than wellbeing per se. See Broome [1992] and Parfit [1984] for helpful discussions of the reasons for discounting goods that do not imply discounting wellbeing. (The consumption growth rate is an important component of this second discounting term that is subject to empirical uncertainty, as discussed above; see Greaves [2017] for an examination of all the assumptions underlying the ‘social discount rate’ and its role in the standard economic method for evaluating policy options.) Many philosophers regard any pure discounting of future wellbeing as completely unjustified from an objective point of view. This is not to deny that temporal location may nonetheless correlate with features of the distribution of wellbeing that are in fact ethically significant. If people will be better off in the future, for instance, it is reasonable to be less concerned about their interests than those of the present generation, much as one might prioritise the less well-off within a single generation. But the mere fact of a benefit occurring at a particular time cannot be relevant to its value, at least from an impartial perspective.

Economists do nonetheless often discount wellbeing in their policy-oriented models, although they disagree considerably about what pure rate of time preference should be used. One view, exemplified by the Stern Review and representing the impartial perspective described above, is that only a very small rate (in the order of 0.5%) is justified, and this on the grounds of the small probability of the extinction of the human population. Other economists, however, regard a partial rather than an impartial point of view as more appropriate in their models. A view along these lines, exemplified by Nordhaus [2007] and Arrow [1995a], is that the pure rate of time preference should be determined by the preferences of current people. But typical estimates of average pure time discounting derived from observed market behaviour are much higher than the rate used by Stern (around 3% by Nordhaus’s estimate). Although the use of these data has been criticised for providing an inadequate measure of people’s reasoned preferences (see, for example, Sen [1982], Drèze and Stern [1990], Broome [1992]), the point remains that any plausible method for determining the current generation’s attitude to the wellbeing of future generations is likely to yield a rate higher than that advocated by the Stern Review. To the extent that this debate about the ethical basis for discounting remains unresolved, there will be ethical uncertainty about the discount rate in climate policy decisions. This ethical uncertainty may be represented analogously to empirical uncertainty, namely by replacing the standard precise utility function with a set of possible utility functions.
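
To see why the choice between the two rates mentioned above matters so much, the following sketch compares the weight each gives to wellbeing a century from now. It assumes simple annual compounding and a 100-year horizon, both of which are our illustrative assumptions, and it applies only the pure rate of time preference, ignoring the consumption-related term of the social discount rate.

```python
def discount_weight(rate: float, years: int) -> float:
    """Weight given to wellbeing `years` ahead under annual compounding."""
    return 1.0 / (1.0 + rate) ** years

HORIZON_YEARS = 100  # illustrative horizon, not taken from the text

w_low = discount_weight(0.005, HORIZON_YEARS)   # rate in the order of 0.5%
w_high = discount_weight(0.03, HORIZON_YEARS)   # rate of around 3%

print(round(w_low, 2), round(w_high, 2))
print(round(w_low / w_high, 1))  # how many times heavier the low-rate weight is
```

Under these assumptions, wellbeing a century ahead retains roughly three fifths of its weight at the lower rate but only about a twentieth at the higher one, so the same future harm counts more than ten times as much on the first view as on the second.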

8. Managing Uncertainty

How should a decision-maker choose amongst the courses of action available to her when she must make the choice under conditions of severe uncertainty? The problem that climate decision-makers face is that, in these situations, the precise utility and probability values required by standard EU theory may not be readily available.

There are, broadly speaking, three possible responses to this problem.

(1) The decision-maker can simply bite the bullet and try to settle on precise probability and utility judgements for the relevant contingencies. Orthodox decision theorists argue that rationality requires that decisions be made as if they maximise the decision maker’s subjective expectation of benefit relative to her precise degrees of belief and values. Broome [2012, 129] gives an unflinching defence of this approach: “The lack of firm probabilities is not a reason to give up expected value theory […] Stick with expected value theory, since it is very well-founded, and do your best with probabilities and values.” This approach may seem rather bold, not least in the context of environmental decision making. Weitzman [2009], for instance, argues that whether or not one assigns non-negligible probability to catastrophic climate consequences radically changes the assessment of mitigation options. Moreover, in many circumstances there remains the question of how to follow Broome’s advice: How should the decision-maker settle, in a non-arbitrary way, on a precise opinion on decision-relevant issues in the face of an effectively ‘divided mind’? There are two interrelated strategies: she can deliberate further and/or aggregate conflicting views. The former aims for convergence in opinion, while the latter aims for an acceptable compromise in the face of persisting conflict. (For a discussion of deliberation see Fishkin and Luskin [2005]; for more on aggregation see, for instance, Genest and Zidek [1986], Mongin [1995], Sen [1970], List and Puppe [2009]. There is a comparatively small formal literature on deliberation, a seminal contribution being Lehrer and Wagner’s [1981] model for updating probabilistic beliefs.)

(2) The decision-maker can try to delay making a decision, or at least postpone parts of it, in the hope that her uncertainty will become manageable as more information becomes available, or as disagreements resolve themselves through a change in attitudes. The basic motive for delaying a decision is to maintain flexibility at zero cost (see Koopmans [1962], Kreps and Porteus [1978], Arrow [1995b]). Suppose that we must decide between building a cheap but low sea wall or a high, but expensive, one, and that the relative desirability of these two courses of action depends on unknown factors, such as the extent to which sea levels will rise. In this case it would be sensible to consider building a low wall first but leave open the possibility of raising it in the future. If this can be done at no additional cost, then it is clearly the best option. In many adaptation scenarios, the analogue of the ‘low sea wall’ may in fact be social-institutional measures that enable a delayed response to climate change, whatever the details of this change turn out to be. In many cases, however, the prospect of cost-free postponement of a decision (or part thereof) is simply a mirage, since delay often decreases rather than increases opportunities due to changes in the background environment. This is often true for climate-change adaptation decisions, not to mention mitigation decisions.
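
The sea wall example can be sketched without any probabilities at all, by comparing costs state by state. All cost figures below are hypothetical units, and the option names are ours; the only point is to show how the verdict changes once raising the wall later carries a premium.

```python
def cost_table(raise_cost: float) -> dict:
    """Hypothetical total costs of each option in each sea-level state."""
    return {
        "high_wall_now": {"low_rise": 3.0, "high_rise": 3.0},
        # A low wall that is never raised suffers flood damage if seas rise a lot.
        "low_wall_only": {"low_rise": 1.0, "high_rise": 11.0},
        # A low wall now, raised later only if the high-rise state materialises.
        "low_wall_then_raise": {"low_rise": 1.0, "high_rise": 1.0 + raise_cost},
    }

def dominates(a: dict, b: dict) -> bool:
    """True if a costs no more than b in every state and less in at least one."""
    return all(a[s] <= b[s] for s in a) and any(a[s] < b[s] for s in a)

def flexible_option_dominates(raise_cost: float) -> bool:
    table = cost_table(raise_cost)
    flexible = table["low_wall_then_raise"]
    rivals = [costs for name, costs in table.items() if name != "low_wall_then_raise"]
    return all(dominates(flexible, rival) for rival in rivals)

print(flexible_option_dominates(2.0))  # raising later costs no more than building high now
print(flexible_option_dominates(4.0))  # raising later carries a premium
```

When raising the wall later costs no more than building it high at the outset, the flexible option is at least as good in every state, so no probability judgement is needed. Once delay carries a premium, that dominance disappears and the decision again turns on the uncertain extent of sea-level rise.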

(3) The decision-maker can employ a different decision rule from that prescribed by EU theory, one that is much less demanding in terms of the information it requires. A great many different proposals for such rules exist in the literature, involving more or less radical departures from the orthodox theory and varying in the informational demands they make. It should be noted from the outset that there is one widely-agreed rationality constraint on these non-standard decision rules: ‘(EU)-dominated options’ are not admissible choices; that is, if an option has lower expected utility than another option according to all permissible pairs of probability and utility functions, then the former, dominated option is not an admissible choice. This is a relatively minimal constraint, but it may well yield a unique choice of action in some decision scenarios. In such cases, the severe uncertainty is not in fact decision relevant. For example, it may be the case that, from the global planner’s perspective, a given mitigation option is better than continuing with business as usual, whatever the uncertain details of the climate system. This is even more plausible to the extent that the mitigation option counts as a ‘win-win’ strategy [Maslin and Austin 2012], that is, to the extent that it has other positive impacts, say, on air quality or energy security, regardless of mitigation results. In many more fine-grained or otherwise difficult decision contexts, however, the non-EU-dominance constraint may rule out only a few of the available options.
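
The non-EU-dominance constraint can be illustrated with a short sketch that filters out dominated options. The two probability functions, the two utility functions and the three options are hypothetical, and we read ‘lower expected utility’ strictly.

```python
from itertools import product

def eu(p: dict, u_action: dict) -> float:
    """Expected utility of one action under one probability function."""
    return sum(p[s] * u_action[s] for s in p)

def admissible(actions, prob_set, util_set):
    """Actions not EU-dominated under every permissible (probability, utility) pair."""
    pairs = list(product(prob_set, util_set))

    def dominated(a):
        return any(
            all(eu(p, u[b]) > eu(p, u[a]) for p, u in pairs)
            for b in actions if b != a
        )

    return [a for a in actions if not dominated(a)]

# Hypothetical set of permissible probability functions over two states.
prob_set = [{"s1": 0.25, "s2": 0.75}, {"s1": 0.75, "s2": 0.25}]

# Hypothetical set of permissible utility functions (action -> state -> utility).
util_set = [
    {"mitigate": {"s1": 4, "s2": 4}, "business_as_usual": {"s1": 2, "s2": 0},
     "adapt_only": {"s1": 8, "s2": 0}},
    {"mitigate": {"s1": 5, "s2": 5}, "business_as_usual": {"s1": 3, "s2": 1},
     "adapt_only": {"s1": 6, "s2": 2}},
]

actions = ["mitigate", "business_as_usual", "adapt_only"]
result = admissible(actions, prob_set, util_set)
print(result)
```

In this toy case business as usual is ruled out, since mitigation beats it under every permissible pair, but the constraint cannot decide between the two remaining options. That is the situation in which the further rules discussed next come into play.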

A consideration that is often appealed to in order to further discriminate between options is caution. Indeed, this is an important facet of the popular but ill-defined Precautionary Principle. (The Precautionary Principle is referred to in the IPCC [2014b] AR5 WGII report. See, for instance, Gardiner [2006] and Steele [2006] for discussion of what the Precautionary Principle does/could stand for.) Cautious decision rules give more weight to the ‘down-side’ risks, that is, the possible negative implications of a choice of action. The Maxmin-EU rule, for instance, recommends picking the action with greatest minimum expected utility (see Gilboa and Schmeidler [1989], Walley [1991]). The rule is simple to use, but arguably much too cautious, paying no attention at all to the full spread of possible expected utilities. The α-Maxmin rule, in contrast, recommends taking the action with the greatest α-weighted sum of the minimum and maximum expected utilities associated with it. The relative weights for the minimum and maximum expected utilities can be thought of as reflecting either the decision maker’s pessimism in the face of uncertainty or else their degree of caution (see Binmore [2009]). (For a comprehensive survey of non-standard decision theories for handling severe uncertainty in the economics literature, see Gilboa and Marinacci [2012]. For applications to climate policy see Heal and Millner [2014].)
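
The two cautious rules can be compared in a minimal sketch. Each action is represented simply by the list of expected utilities it receives across the permissible probability and utility functions; the numbers are hypothetical, and we adopt the convention (an assumption on our part) that α is the weight placed on the minimum.

```python
def maxmin_eu(eus_by_action: dict) -> str:
    """Pick the action whose minimum expected utility is greatest."""
    return max(eus_by_action, key=lambda a: min(eus_by_action[a]))

def alpha_maxmin(eus_by_action: dict, alpha: float) -> str:
    """Pick the action with the greatest alpha * min + (1 - alpha) * max."""
    def score(a):
        values = eus_by_action[a]
        return alpha * min(values) + (1.0 - alpha) * max(values)
    return max(eus_by_action, key=score)

# Hypothetical expected utilities under two permissible probability functions.
eus_by_action = {
    "risky_option": [2.0, 10.0],
    "safe_option": [3.0, 4.0],
}

cautious_choice = maxmin_eu(eus_by_action)
balanced_choice = alpha_maxmin(eus_by_action, alpha=0.5)
fully_pessimistic_choice = alpha_maxmin(eus_by_action, alpha=1.0)
print(cautious_choice, balanced_choice, fully_pessimistic_choice)
```

Here Maxmin-EU selects the safe option because it looks only at the worst expected utility, whereas an evenly weighted α-Maxmin rule selects the risky option because of its much higher best case. Setting α to 1 recovers Maxmin-EU.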

A more informationally demanding set of rules comprises those that draw on considerations of confidence and/or reliability. The thought here is that an agent is more or less confident about the various probability and utility functions that characterise her uncertainty. For instance, when the estimates derive from different models or experts, the decision maker may regard some models as better corroborated by available evidence than others or else some experts as more reliable than others in their judgments. In these cases, it is reasonable, ceteris paribus, to favour actions of which you are more confident that they will have beneficial consequences. One (rather sophisticated) way of doing this is to weight each of the expected utilities associated with an action in accordance with how confident you are about the judgements supporting them and then choose the action with the maximum confidence-weighted expected utility (see Klibanoff et al. [2005]). This rule is not very different from maximising expected utility and indeed one could regard confidence weighting as an aggregation technique rather than an alternative decision rule. But considerations of confidence may be appealed to even when precise confidence weights cannot be provided. Gärdenfors and Sahlin [1982/1988], for instance, suggest simply excluding from consideration any estimates that fall below a reliability threshold and then picking cautiously from the remainder. Similarly, Hill [2013] uses an ordinal measure of confidence that allows for stake-sensitive thresholds of reliability that can then be combined with varying levels of caution. This rule has the advantage of allowing decision-makers to draw on the confidence grading of scientific claims adopted by the IPCC (see Bradley et al. [2017]).
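
The difference between confidence weighting and a reliability threshold can be sketched as follows. The three models, their confidence scores and the expected utilities are hypothetical. The weighted rule is a deliberately simplified linear version: the smooth model of Klibanoff et al. additionally transforms each expected utility to capture attitudes to ambiguity, which is omitted here. The threshold rule is only loosely modelled on the Gärdenfors and Sahlin proposal.

```python
def confidence_weighted_choice(eu_by_model: dict, confidence: dict) -> str:
    """Choose the action with maximum confidence-weighted expected utility."""
    def score(action):
        return sum(confidence[m] * eu_by_model[action][m] for m in confidence)
    return max(eu_by_model, key=score)

def threshold_then_maxmin(eu_by_model: dict, confidence: dict, threshold: float) -> str:
    """Drop models below the reliability threshold, then pick cautiously (maxmin)."""
    kept = [m for m, c in confidence.items() if c >= threshold]
    return max(eu_by_model, key=lambda a: min(eu_by_model[a][m] for m in kept))

# Hypothetical confidence in three models (used as weights that sum to one).
confidence = {"model_1": 0.5, "model_2": 0.25, "model_3": 0.25}

# Hypothetical expected utility of each action according to each model.
eu_by_model = {
    "option_A": {"model_1": 6.0, "model_2": 8.0, "model_3": -4.0},
    "option_B": {"model_1": 4.0, "model_2": 4.0, "model_3": 2.0},
}

weighted = confidence_weighted_choice(eu_by_model, confidence)
lenient = threshold_then_maxmin(eu_by_model, confidence, threshold=0.2)
strict = threshold_then_maxmin(eu_by_model, confidence, threshold=0.3)
print(weighted, lenient, strict)
```

With a lenient threshold all three models survive and the cautious step favours option B, whose worst case is milder. With a stricter threshold only the most trusted model survives and option A is chosen, which is also what the weighted rule recommends here. The example shows how much can hang on where the reliability threshold is set.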

One might finally distinguish decision rules that are cautious in a slightly different way, namely those that compare options in terms of ‘robustness’ to uncertainty, relative to a problem-specific satisfactory level of expected utility. Better options are those that are more assured of having an expected utility that is good enough or regret-free, in the face of uncertainty. The ‘information-gap theory’ developed by Ben-Haim [2001] provides one formalisation of this basic idea that has proved popular in environmental management theory. Another prominent approach to robust decision-making is that developed by Lempert, Popper and Bankes [2003]. These two frameworks are compared in Hall et al. [2012]. Recall that the uncertainty in question may be multi-faceted, concerning probabilities of states/outcomes, or values of final outcomes. Most decision rules that appeal to robustness assume that a best estimate for the relevant variables is available, and then consider deviations away from this estimate. A robust option is one that has a satisfactory expected utility relative to a class of estimates that deviate from the best one to some degree; the wider the class in question, the more robust the option. Much depends on what expected utility level is deemed satisfactory. For mitigation decision making, one salient satisfactory level of expected utility is that associated with a 50% chance of average global temperature rise of 2 degrees Celsius or less. Note that one may otherwise interpret any such mitigation temperature target in a different way, namely as a constraint on what counts as a feasible option. In other words, mitigation options that do not meet the target are simply prohibited options, not suitable for consideration. For adaptation decisions, the satisfactory level would depend on local context, but roughly speaking, robust options are those that yield reasonable outcomes for all the inopportune climate scenarios that have non-negligible probability given some range of uncertainty. These are plausibly adaptation options that focus on resilience to any and all of the aforesaid climate scenarios, perhaps via the development of social institutions that can coordinate responses to variability and change. (Robust decision-making is endorsed, for instance, by Dessai et al. [2009] and Wilby and Dessai [2010], who indeed associate this kind of decision rule with resilience strategies. See also Linkov and others [2014] for discussion of resilience strategies vis-à-vis risk management.)
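
The idea of robustness relative to a satisfactory level can be sketched in a toy setting with two states. The sketch is in the spirit of the approaches described above rather than an implementation of either formal framework: it starts from a best-estimate probability of an adverse state, widens an interval around it step by step, and records the widest interval over which an option’s worst-case expected utility stays satisfactory. All numbers and names are hypothetical, and probabilities are handled as whole percentages to keep the arithmetic exact.

```python
def worst_case_eu(good: float, bad: float, p0_pct: int, h_pct: int) -> float:
    """Lowest expected utility when P(bad state) lies within h_pct of p0_pct."""
    lo = max(0, p0_pct - h_pct)
    hi = min(100, p0_pct + h_pct)
    # Expected utility is linear in the probability, so checking endpoints suffices.
    return min(((100 - p) * good + p * bad) / 100 for p in (lo, hi))

def robustness(good: float, bad: float, p0_pct: int, satisfactory: float):
    """Largest deviation (in percentage points) that keeps the option satisfactory."""
    widest = None  # None means unsatisfactory even at the best estimate
    for h_pct in range(0, 101):
        if worst_case_eu(good, bad, p0_pct, h_pct) < satisfactory:
            break
        widest = h_pct
    return widest

BEST_ESTIMATE_PCT = 20   # hypothetical best estimate of P(adverse scenario)
SATISFACTORY = 5.0       # hypothetical 'good enough' expected utility

resilient = robustness(good=6.0, bad=5.0, p0_pct=BEST_ESTIMATE_PCT, satisfactory=SATISFACTORY)
optimised = robustness(good=10.0, bad=0.0, p0_pct=BEST_ESTIMATE_PCT, satisfactory=SATISFACTORY)
print(resilient, optimised)
```

At the best estimate the optimised option has the higher expected utility, but it remains satisfactory only while the probability of the adverse scenario stays within 30 percentage points of that estimate. The resilient option remains satisfactory however far the estimate is off, which is what makes it the more robust choice on this criterion.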

9. Conclusion

This article reviewed, from a philosophy of science perspective, issues and questions that arise in connection with climate science. Most of these issues are the subject matter of ongoing research, and they indeed deserve further attention. Rather than repeating these points, we would like to mention a topic that has not received the attention that it deserves: the epistemic significance of consensus in the acceptance of results. As the controversy over the Cook et al. [2013] paper shows, many people do seem to think that the level of expert consensus is an important reason to believe in climate change given that they themselves are not experts; and conversely, attacking the consensus and sowing doubt is a classic tactic of the other side. The role of consensus in the context of climate change deserves more attention than it has received hitherto, but for some discussion of consensus see de Melo-Martín and Intemann [2014].

10. Glossary

Attribution (of climate change): The process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence.

Boundary conditions: Values for any variables which affect the system but which are not directly output by the calculations.

Calibration: The process of estimating values of model parameters which are most consistent with observations.

Climate model: A representation of certain aspects of the climate system.

Detection (of climate change): The process of demonstrating that climate or a system affected by climate has changed in some defined statistical sense without providing a reason for that change.

Double counting: The use of data for both calibration and confirmation.

Expected utility (for an action): The sum of the probability-weighted utility of the possible consequences of the action.

External conditions (of the climate system): Conditions that influence the state of the Earth such as the amount of energy received from the sun.

Initial conditions: A mathematical description of the state of the climate system at the beginning of the period being simulated.

Internal variability: The phenomenon that climate variables such as temperature and precipitation would change over time due to the internal dynamics of the climate system even in the absence of changing external conditions.

Null hypothesis: The expected behaviour of the climate system in the absence of changing external influences.

Projection: The prediction of a climate model that is conditional on a certain forcing scenario.

Proxy data: The data for climate variables that are derived from observing natural phenomena such as tree rings, ice cores and ocean sediments.

Robustness (of a result): A result is robust if separate (ideally independent) models or lines of evidence lead to the same conclusion.

Use-novel data: Data that are used for confirmation and have not been used for calibration.

11. References and Further Reading

Adler C. E. and G. Hirsch Hadorn. (2014). The IPCC and treatment of uncertainties: topics and sources of dissensus. Wiley Interdisciplinary Reviews: Climate Change 5.5, 663-676.
Arrow K. J. (1995a). Discounting Climate Change: Planning for an Uncertain Future. Lecture given at Institut d’Économie Industrielle, Université des Sciences Sociales, Toulouse. <http://idei.fr/doc/conf/annual/paper_1995.pdf>
Arrow K. J. (1995b). A Note on Freedom and Flexibility. Choice, Welfare and Development. (eds. K. Basu, P. Pattanaik, and K. Suzumura), 7-15. Oxford: Oxford University Press.
Aspinall W. (2010). A route to more tractable expert advice. Nature 463, 294-295.
Ben-Haim Y. (2001). Information-Gap Theory: Decisions Under Severe Uncertainty, 330 pp. London: Academic Press.
Betz G. (2009). What range of future scenarios should climate policy be based on? Modal falsificationism and its limitations. Philosophia Naturalis 46, 133-158.
Betz G. (2010). What’s the worst case? Analyse und Kritik 32, 87-106.
Binmore K. (2009). Rational Decisions, 216 pp. Princeton, NJ: Princeton University Press.
Bishop C. H. and G. Abramowitz. (2013). Climate model dependence and the replicate Earth paradigm. Climate Dynamics 41, 885-900.
Bradley R., C. Helgeson and B. Hill. (2017). Climate Change Assessments: Confidence, Probability and Decision. Philosophy of Science 84.3, 500-522.
Bradley R., C. Helgeson and B. Hill. (2018). Combining Probability with Qualitative Degree-of-Certainty Assessment. Climatic Change 149.3-4, 517-525.
Broome J. (2012). Climate Matters: Ethics in a Warming World, 192 pp. New York: Norton.
Broome J. (1992). Counting the Cost of Global Warming, 147 pp. Cambridge: The White Horse Press.
Broome J. (2008). The Ethics of Climate Change. Scientific American 298, 96-102.
Budescu D. V., H. Por, S. B. Broomell and M. Smithson. (2014). The interpretation of IPCC probabilistic statements around the world. Nature Climate Change 4, 508-512.
Cohn T. A. and H. F. Lins. (2005). Nature’s style: naturally trendy. Geophysical Research Letters 32, L23402.
Cook J. et al. (2013). Quantifying the consensus on the anthropogenic global warming in the scientific literature. Environmental Research Letters 8, 1-7.
Daron J. D. and D. Stainforth. (2013). On predicting climate under climate change. Environmental Research Letters 8, 1-8.
de Melo-Martín I., and K. Intemann (2014). Who’s afraid of dissent? Addressing concerns about undermining scientific consensus in public policy developments. Perspectives on Science 22.4, 593-615.
Dessai S. et al. (2009). Do We Need Better Predictions to Adapt to a Changing Climate? Eos 90.13, 111-112.
Dessler A. (2011). Introduction to Modern Climate Change. Cambridge: Cambridge University Press.
Drèze J., and Stern, N. (1990). Policy reform, shadow prices, and market prices. Journal of Public Economics 42.1, 1-45.
Douglas H. (2009). Science, Policy, and the Value-Free Ideal. Pittsburgh: University of Pittsburgh Press.
Fishkin J. S., and R. C. Luskin. (2005). Experimenting with a Democratic Ideal: Deliberative Polling and Public Opinion. Acta Politica 40, 284-298.
Frank D., J. Esper, E. Zorita and R. Wilson. (2010). A noodle, hockey stick, and spaghetti plate: A perspective on high-resolution paleoclimatology. Wiley Interdisciplinary Reviews: Climate Change 1.4, 507-516.
Frigg R. P., D. A. Stainforth and L. A. Smith. (2013). The Myopia of Imperfect Climate Models: The Case of UKCP09. Philosophy of Science 80.5, 886-897.
Frigg R. P., D. A. Stainforth and L. A. Smith. (2015). An Assessment of the Foundational Assumptions in High-Resolution Climate Projections: The Case of UKCP09 2015, draft under review.
Frigg R. P., S. Bradley, H. Du and L. A. Smith. (2014a). Laplace’s Demon and the Adventures of His Apprentices. Philosophy of Science 81.1, 31-59.
Frisch M. (2013). Modeling Climate Policies: A Critical Look at Integrated Assessment Models. Philosophy and Technology 26, 117-137.
Frisch M. (2015). Tuning climate models, predictivism, and the problem of old evidence. European Journal for Philosophy of Science 5.2, 171-190.
Gärdenfors P. and N.-E. Sahlin. [1982] (1988). Unreliable probabilities, risk taking, and decision making. Decision, Probability and Utility, (eds. P. Gärdenfors and N.-E. Sahlin), 313-334. Cambridge: Cambridge University Press.
Gardiner S. (2006). A Core Precautionary Principle. The Journal of Political Philosophy 14.1, 33-60.
Gardiner S., S. Caney, D. Jamieson and H. Shue. (2010). Climate Ethics: Essential Readings. Oxford: Oxford University Press.
Genest C. and J. V. Zidek. (1986). Combining Probability Distributions: A Critique and Annotated Bibliography. Statistical Science 1.1, 113-135.
Gilboa I. and M. Marinacci. (2012). Ambiguity and the Bayesian Paradigm. Advances in Economics and Econometrics: Theory and Applications, Tenth World Congress of the Econometric Society (eds. D. Acemoglu, M. Arellano and E. Dekel), 179-242. Cambridge: Cambridge University Press.
Gilboa I. and D. Schmeidler. (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18, 141-153.
Greaves H. (2017). Discounting for public policy: A survey. Economics and Philosophy 33.3, 391-439.
Hall J. W., Lempert, R. J., Keller, K., Hackbarth, A., Mijere, C., McInerney, D. J. (2012). Robust Climate Policies Under Uncertainty: A Comparison of Robust Decision-Making and Info-Gap Methods. Risk Analysis 32.10, 1657-1672.
Halpern J. Y. (2003). Reasoning About Uncertainty, 483 pp. Cambridge, MA: MIT Press.
Heal G. and A. Millner. (2014). Uncertainty and Decision Making in Climate Change Economics. Review of Environmental Economics and Policy 8, 120-137.
Hegerl G. C., O. Hoegh-Guldberg, G. Casassa, M. P. Hoerling, R. S. Kovats, C. Parmesan, D. W. Pierce, P. A. Stott. (2010). Good Practice Guidance Paper on Detection and Attribution Related to Anthropogenic Climate Change. Meeting Report of the Intergovernmental Panel on Climate Change Expert Meeting on Detection and Attribution of Anthropogenic Climate Change (eds. T. F. Stocker, C. B. Field, D. Qin, V. Barros, G.-K. Plattner, M. Tignor, P. M. Midgley and K. L. Ebi). Bern, Switzerland: IPCC Working Group I Technical Support Unit, University of Bern.
Held I. M. (2005). The Gap between Simulation and Understanding in Climate Modeling. Bulletin of the American Meteorological Society 86, 1609-1614.
Hill B. (2013). Confidence and Decision. Games and Economic Behavior 82, 675-692.
Hulme M., S. Dessai, I. Lorenzoni and D. Nelson. (2009). Unstable Climates: exploring the statistical and social constructions of climate. Geoforum 40, 197-206.
IPCC. (2013). Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
IPCC. (2014). Climate Change 2014: Impacts, Adaptation, and Vulnerability. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge and New York: Cambridge University Press.
Jeffrey R. (1965). The Logic of Decision, 231 pp. Chicago: University of Chicago Press.
Jun M., R. Knutti, D. W Nychka. (2008). Local eigenvalue analysis of CMIP3 climate model errors. Tellus A 60.5, 992-1000.
Katzav J. (2013). Severe testing of climate change hypotheses. Studies in History and Philosophy of Philosophy of Modern Physics 44.4, 433-441.
Katzav J. (2014). The epistemology of climate models and some of its implications for climate science and the philosophy of science. Studies in History and Philosophy of Modern Physics 46, 228-238.
Katzav, J. & W. S. Parker (2018). Issues in the Theoretical Foundations of Climate Science. Studies in History and Philosophy of Modern Physics 63, 141-149.
Klibanoff P., M. Marinacci and S. Mukerji. (2005). A smooth model of decision making under ambiguity. Econometrica 73, 1849-1892.
Klintman M. (2019). Knowledge Resistance: How We Avoid Insight From Others. Manchester: Manchester University Press.
Knutti R., R. Furrer, C. Tebaldi, J. Cermak, and G. A. Meehl. (2010). Challenges in Combining Projections from Multiple Climate Models. Journal of Climate 23.10, 2739-2758.
Koopmans T. C. (1962). On flexibility of future preference. Cowles Foundation for Research in Economics, Yale University, Cowles Foundation Discussion Papers 150.
Kreps D. M. and E. L. Porteus. (1978). Temporal resolution of uncertainty and dynamic choice theory. Econometrica 46.1, 185-200.
Lahsen M. (2005). Seductive Simulations? Uncertainty Distribution Around Climate Models. Social Studies of Science 35.6, 895-922.
Lehrer K. and Wagner, C. (1981). Rational Consensus in Science and Society, 165 pp. Dordrecht: Reidel.
Lempert R. J., Popper, S. W., Bankes, S. C. (2003). Shaping the Next One Hundred Years: New Methods for Quantitative Long-Term Policy Analysis, 208 pp. Santa Monica, CA: RAND Corporation, MR-1626-RPC.
Lenhard J. and E. Winsberg. (2010). Holism, entrenchment, and the future of climate model pluralism. Studies in History and Philosophy of Modern Physics 41, 253-262.
Linkov I. et al. (2014). Changing the resilience program. Nature Climate Change 4, 407-409.
List C. and C. Puppe. (2009). Judgment aggregation: a survey. Oxford Handbook of Rational and Social Choice (eds. P. Anand, C. Puppe and P. Pattanaik). Oxford: Oxford University Press.
Lorenz E. (1995). Climate is what you expect. Prepared for publication by NCAR. Unpublished, 1-33.
Lloyd E. A. (2010). Confirmation and robustness of climate models. Philosophy of Science 77, 971-984.
Lloyd E. A. (2015). Model robustness as a confirmatory virtue: The case of climate science. Studies in History and Philosophy of Science 49, 58-68.
Lloyd E. A. (2009). Varieties of Support and Confirmation of Climate Models. Proceedings of the Aristotelian Society Supplementary Volume LXXXIII, 217-236.
Lloyd, E., N. Oreskes (2019). Climate Change Attribution: When Does it Make Sense to Add Methods? Epistemology & Philosophy of Science 56.1, 185-201.
Lusk, G. (2017). The Social Utility of Event Attribution: Liability, Adaptation, and Justice-Based Loss and Damage. Climatic Change 143, 201–12.
Mach, K. J., M. D. Mastrandrea, P. T. Freeman, and C. B. Field (2017). Unleashing Expert Judgment in Assessment. Global Environmental Change 44, 1–14.
Mann M. E., R. S. Bradley and M.K. Hughes (1998). Global-scale temperature patterns and climate forcing over the past six centuries. Nature 392, 779-787.
Maslin M. and P. Austin. (2012). Climate models at their limit?. Nature 486, 183-184.
Mastrandrea M. D., K. J. Mach, G.-K. Plattner, O. Edenhofer, T. F. Stocker, C. B. Field, K. L. Ebi, and P. R. Matschoss. (2011). The IPCC AR5 guidance note on consistent treatment of uncertainties: a common approach across the working groups. Climatic Change 108, 675-691.
McGuffie K. and A. Henderson-Sellers. (2005). A Climate Modelling Primer, 217 pp. New Jersey: Wiley.
McIntyre S. and R. McKitrick. (2003). Corrections to the Mann et. al. (1998) proxy data base and northern hemispheric average temperature series. Energy & Environment 14.6, 751-771.
Mongin P. (1995). Consistent Bayesian Aggregation. Journal of Economic Theory 66.2, 313-51.
Nordhaus W. D. (2007). A Review of the Stern Review on the Economics of Climate Change. Journal of Economic Literature 45.3, 686-702.
Nordhaus W. C. (2008). A Question of Balance, 366 pp. New Haven, CT: Yale University Press.
Oreskes N. and E. M. Conway. (2012). Merchants of Doubt: How a Handful of Scientists Obscured the Truth on Issues from Tobacco Smoke to Global Warming, 355 pp. New York: Bloomsbury Press.
Oreskes N. (2007) The Scientific Consensus on Climate Change: How Do We Know We’re Not Wrong? Climate Change: What It Means for Us, Our Children, and Our Grandchildren (eds. J. F. C. DiMento and P. Doughman), 65-99. Boston: MIT Press.
Oreskes N., K. Shrader-Frechette and K. Belitz. (1994). Verification, validation, and confirmation of numerical models in the Earth Science. Science New Series 263.5147, 641-646.
Parfit D. (1984). Reasons and Persons, 560 pp. Oxford: Clarendon Press.
Parker W. S. (2009). Confirmation and Adequacy for Purpose in Climate Modelling. Aristotelian Society Supplementary Volume 83.1 233-249.
Parker W. S. (2010). Comparative Process Tracing and Climate Change Fingerprints. Philosophy of Science 77, 1083-1095.
Parker W. S. (2011). When Climate Models Agree: The Significance of Robust Model Predictions. Philosophy of Science 78.4, 579-600.
Parker W. S. (2013). Ensemble modeling, uncertainty and robust predictions. Wiley Interdisciplinary Reviews: Climate Change 4.3, 213-223.
Parker W. S. (2014). Values and Uncertainties in Climate Prediction, Revisited. Studies in History and Philosophy of Science Part A 46, 24-30.
Petersen A. C. (2012). Simulating Nature: A Philosophical Study of Computer-Simulation Uncertainties and Their Role in Climate Science and Policy Advice, 210 pp. Boca Raton, Florida: CRC Press.
Resnik M. (1987). Choices: an introduction to decision theory, 221 pp. Minneapolis: University of Minnesota Press.
Savage L. J. (1954). The Foundations of Statistics, 310 pp. New York: John Wiley & Sons.
Sen A. (1982). Approaches to the choice of discount rate for social benefit–cost analysis. Discounting for Time and Risk in Energy Policy (ed. R. C. Lind), 325-353. Washington, DC: Resources for the Future.
Sen A. (1970). Collective Choice and Social Welfar. San Francisco: Holden-Day Inc.
Sexton D. M. H., J. M. Murphy, M. Collins and M. J. Webb. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part I: Outline of Methodology. Climate Dynamics 38, 2513-2542.
Sexton D. M. H., and J. M. Murphy. (2012). Multivariate Probabilistic Projections Using Imperfect Climate Models. Part II: Robustness of Methodological Choices and Consequences for Climate Sensitivity. Climate Dynamics 38, 2543-2558.
Shackley S., P. Young, S. Parkinson and B. Wynne. (1998). Uncertainty, Complexity and Concepts of Good Science in Climate Change Modelling: Are GCMs the Best Tools? Climatic Change 38, 159-205.
Smith L. A. and N. Stern. (2011). Uncertainty in science and its role in climate policy. Phil. Trans. R. Soc. A 369.1956, 4818-4841.
Spiegelhalter D. J. and H. Riesch. (2011). Don’t know, can’t know: embracing deeper uncertainties when analysing risks. Phil. Trans. R. Soc. A 369, 4730-4750.
Stainforth D. A., M. R. Allen, E. R. Tredger and L. A. Smith. (2007a). Confidence, Uncertainty and Decision-support Relevance in Climate Predictions. Philosophical Transactions of the Royal Society A 365, 2145-2161.
Stainforth D. A., T. E. Downing, R. Washington, A. Lopez and M. New. (2007b). Issues in the Interpretation of Climate Model Ensembles to Inform Decisions. Philosophical Transactions of the Royal Society A 365, 2163-2177.
Steele K. (2006). The precautionary principle: a new approach to public decision-making?. Law Probability and Risk 5, 19-31.
Steele K. and C. Werndl. (2013). Climate Models, Confirmation and Calibration. The British Journal for the Philosophy of Science 64, 609-635.
Steele K. and C. Werndl. forthcoming (2015). The Need for a More Nuanced Picture on Use-Novelty and Double-Counting. Philosophy of Science.
Stern N. (2007). The Economics of Climate Change: The Stern Review, 692 pp. Cambridge: Cambridge University Press.
Stern, N. (2013). The Structure of Economic Modeling of the Potential Impacts of Climate Change: Grafting Gross Underestimation of Risk onto Already Narrow Scientific Models. Journal of Economic Literature 51.3, 838-859.
Thompson, Erica, Roman Frigg and Casey Helgeson. Expert Judgment for Climate Change Adaptation, Philosophy of Science 83(5), 2016, 1110-1121,
von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behaviour, 739 pp. Princeton: Princeton University Press.
Walley P. (1991). Statistical Reasoning with Imprecise Probabilities, 706 pp. New York: Chapman and Hall.
Weitzman M. L. (2009). On Modeling and Interpreting the Economics of Catastrophic Climate Change. The Review of Economics and Statistics 91.1, 1-19.
Werndl C. (2015). On defining climate and climate change. The British Journal for the Philosophy of Science, doi:10.1093/bjps/axu48.
Wilby R. L. and S. Dessai. (2010). Robust adaptation to climate change. Weather 65.7, 180-185.
Weisberg Michael. (2006). Robustness Analysis. Philosophy of Science 73, 730-742.
Winsberg E. (2012). Values and Uncertainties in the Predictions of Global Climate Models. Kennedy Institute of Ethics Journal 22, 111-127.
Winsberg, E. 2018. Philosophy and Climate Science. Cambridge: Cambridge University Press.
Winsberg, E and W. M. Goodwin (2016). The Adventures of Climate Science in the Sweet Land of Idle Arguments. Studies in History and Philosophy of Modern Physics 54, 9-17.
Worrall J. (2010). Error, Tests, and Theory Confirmation. Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability, and the Objectivity and Rationality of Science (eds. D. G. Mayo and A. Spanos), 125-154. Cambridge: Cambridge University Press.
Wüthrich, N. (2017). Conceptualizing Uncertainty: An Assessment of the Uncertainty Framework of the Intergovernmental Panel on Climate Change. In EPSA15 Selected Papers, 95-107. Cham: Springer.

Author Information

Richard Bradley
London School of Economics and Political Science
UK

Roman Frigg
London School of Economics and Political Science
UK

Katie Steele
Australian National University
Australia

Erica Thompson
London School of Economics and Political Science
UK

Charlotte Werndl
University of Salzburg
Austria
and
London School of Economics and Political Science
UK

Causation

The question, “What is causation?” may sound like a trivial question: it is as sure as common knowledge can ever be that some things cause others, that there are causes and that they necessitate certain effects. We say that we know that what caused the president’s death was an assassin’s shot. But when asked why, we will most certainly reply that it is because the shot was necessary for the death, an answer that, upon close philosophical examination, falls short of veracity. In a less direct way, the president’s grandmother’s giving birth to his mother was necessary for his death too. That, however, we would not describe as the cause of his death.

The first section of this article states the reasons why we should care about causation, including those that are non-philosophical. Sections 2 and 3 define the axis of the division into ontological and semantic analyses, with the Kantian and skeptical accounts as two alternatives. Set out there is also Hume’s pessimistic framework for thinking about causation—since before we ask what causation is, it is vital to consider whether we can come to know it at all.

Section 4 examines the semantic approaches, which analyze what it means to say that one thing causes another. The first of these, the regularity theories, turn out to be problematic when dealing with unrepeatable and implausibly enormous cases, among many others. Some of these problems also limit the ambitions of Lewis’s theory of causation as a chain of counterfactual dependence, which suffers in addition from the causal redundancy and causal transitivity objections. The scientifically minded interventionists try to reconnect our willingness to talk in terms of causation with our agency, while probabilistic theories accommodate the indeterminacy of quantum physics and relax the strictness of exception-unfriendly regularity accounts. Yet probabilistic theories risk confounding causation with probability.

The next section brings us back to ontology. Since causation is hardly a particular entity, nominalists define it with recurrence over and above instances. Realists bring forward the relation of necessitation, seemingly in play whenever causation occurs. Dispositionalism claims that to cause means to dispose to happen. Process theories base their analysis on the notions of process and transmission—for instance, of energy, which might capture well the nature of causation in the most physical sense.

Another historically significant family of approaches is the concern of Section 6, which examines how Kant removes causation from the domain of things-in-themselves to include it in the structure of consciousness. This has also inspired the agency views which claim agency is inextricably tied up with causal reasoning.

The seventh and final section deals with the most skeptical work on causation. Some, following Bertrand Russell, have tried to get rid of the concept altogether, believing it a relic of a past and timeworn metaphysical speculation. Pluralism and thickism hold that any attempt at defining causation is doomed, because what the word can mean is in fact a bundle of different concepts, or no single and meaningful concept at all.

What Is Causation and Why Do People Care?
Hume’s Challenge
A Family Tree of Causal Theories
Semantic Analyses
Ontological Stances
Kantian Approaches
1. Kant Himself
2. Agency Views
Skepticism
1. Russellian Republicanism
2. Pluralism and Thickism
References and Further Reading

1. What Is Causation and Why Do People Care?

Causation is a live topic across a number of disciplines, due to factors other than its philosophical interest. The second half of the twentieth century saw an increase in the availability of information about the social world, the growth of statistics and the disciplines it enables (such as economics and epidemiology), and the growth of computing power. This led, at first, to the prospect of much-improved policy and individual choice through analysis of all this data, and, especially in the early twenty-first century, to the advent of potentially useful artificial intelligence that might be able to achieve another step-change in the same direction. But in the background of all of this lurks the specter of causation. Using information to inform goal-directed action often seems to require more than mere extrapolation or projection. It often seems to require that we understand something of the causal nature of the situation. This has seemed painfully obvious to some, but not to others. Increasing quantities of information and abilities to process it force us to decide whether causation is part of this march of progress or an obstacle on the road.

So much for why people care about causation. What is this thing that we care about so much?

To paraphrase the great physicist Richard Feynman, it is safe to say that nobody understands causation. But unlike quantum physics, causation is not a useful calculating device yielding astoundingly accurate predictions, and those who wish to use causal reasoning for any actual purpose do not have the luxury of following the injunction, often attributed to Feynman, to “shut up and calculate”. The philosophers cannot be pushed into a room and left to debate causation; the door cannot be closed on conceptual debate.

The remainder of this section offers a summary of the main elements of disagreement. The next section presents a “family tree” of different historical and still common views on the topic, which may help to make some sense of the state of the debate.

First, some philosophers have asked what causation is; that is, they have asked an ontological question. Some of these have answered that it is something over and above (or at least of a different kind from) its instances: that there is a “necessitation relation” that is a universal rather than a particular thing, and in which cause-effect pairs participate, or of which they partake, or something similar “in virtue” of which they instantiate causation (Armstrong, 1983). These are realists about causation (noting that others discussed in this paragraph are also realists in a more general sense, but not about universals). Others, perhaps a majority, believe that causation is something that supervenes upon (or is ultimately nothing over and above) its instances (Lewis, 1983; Mackie, 1974). These are nominalists. Yet others believe that it is something somewhat different from either option: a disposition, or a bundle of dispositions, which are taken to be fundamental (Mumford & Anjum, 2011). These are dispositionalists.

Second, philosophers have sought a semantic analysis of causation, trying to work out what “cause” and cognates mean, in some deeper sense of “meaning” than a dictionary entry can satisfy. (It is worth bearing in mind, however, that the ontological and semantic projects are often pursued together, and cannot always be separated.) Some nominalists believe it is a form of regularity holding between distinct existences (Mackie, 1974). These are regularity theorists. Others, counterfactual theorists, believe it is a special kind of counterfactual dependence between distinct existences (Lewis, 1973a), and others hold that causes raise the probability of their effects in a special way (Eells, 1991; Suppes, 1970). Among counterfactual theorists are various subsets, notably interventionists (for example, Woodward, 2003) and contrastivists (for example, Schaffer, 2007). There is also an overlapping subset of thinkers with a non-philosophical motivation, and sometimes background, who develop technical frameworks for the purpose of performing causal inference and, in doing so, define causation, thus straying into the territory of offering semantic analysis (Hernán & Robins, 2020; Pearl, 2009; Rubin, 1974). Out of kilter with the historical motivation of those approaching counterfactual theorizing from a philosophical angle, some of those coming from practical angles appear not to be nominalists (Pearl & Mackenzie, 2018). Yet others, who may or may not be nominalists, hold that causation is a pre-scientific or “folk science” notion which, like “energy”, should be mapped onto a property identified by our current best science, even if that means deviating from the pre-scientific notion (Dowe, 2000).

Third, there are those who take a Kantian approach. While this is an answer to ontological questions about causation, it is reasonably treated in a separate category, different from the ontological approach mentioned first above in this section, because the question Kant tried to answer is better summarized not as “What sort of thing is causation?” but “Is causation a thing at all?” Kant himself thought that causation is a constitutive condition of experience (Kant, 1781), thus not a part of the world, but a part of us—a way we experience the world, without which experience would be altogether impossible. Late twentieth-century thinkers suggested that causation is not a necessary precondition of all experience but, more modestly, a dispositional property of us to react in certain ways—a secondary property, like color—arising from the fact that we are agents (Menzies & Price, 1993).

The fourth approach to causation is, in a broad sense, skeptical. Thus some have taken the view that it is a redundant notion, one that ought to be dispensed with in favor of modern scientific theory (Russell, 1918). Such thinkers do not have a standard name but might reasonably be called republicans, following a famous line of Bertrand Russell’s (see the first subsection of Section 7). Some (pluralists) believe that there is no single concept of causation but a plurality of related concepts which we lump together under the word “causation” for some reason other than that there is such a thing as causation (Cartwright, 2007; Stapleton, 2008). Yet another view, which might be called thickism and which may or may not be a form of pluralism, holds that causal concepts are “thick”, as some have suggested for ethical concepts (Anscombe, 1958; although Anscombe did not use this term herself). That is, the fundamental referents of causal judgements are not causes, but kicks, pushes, and so forth, out of which there is no causal component to be abstracted, extracted, or meaningfully studied (Anscombe, 1969; Cartwright, 1983).

Cutting across all these positions is a question as to what the causal relata are, if indeed causation is a relation at all. Some say they are events (Lewis, 1973a, 1986); others, aspects (Paul, 2004); still others, facts (Mellor, 1995); and there are further proposals besides.

Disagreement about fundamentals is great news if you are a philosopher, because it gives you plenty to work on. It is a field of engagement that has not settled into trench warfare between a few big guns and their troops. It is indicative of a really fruitful research area, one with live problems, fast-paced developments, and connections with real life—that specter that lurks in the background of philosophy seminar rooms and lecture halls, just as causation lurks in the background of more practical engagements.

However, confusion about fundamentals is not great news if you are trying to make the best sense of the data you have collected, looking for guidance on how to convince a judge that your client is or is not liable, trying to make a decision about whether to ban a certain food additive, or wondering how your investment will respond to the realization of a certain geopolitical risk. It is certainly not helpful if you are trying to decide what will be the most effective public health measure to slow the spread of an epidemic.

2. Hume’s Challenge

David Hume posed the questions that all the ideas discussed in the remainder of this article attempt to answer. He had various motivations, but a fair abridgement might be as follows.

Start with the obvious fact that we frequently have beliefs about what will happen, or is happening elsewhere right now, or has happened in the past, or, more grandly, what happens in general. One of Hume’s examples is that the sun will rise tomorrow. An example he gives of a general belief is that bread, in general, is nourishing. How do we arrive at these beliefs?

Hume argues that such beliefs derive from experience. We believe the sun will rise tomorrow because we have experienced it rising on all previous mornings. We believe bread is nourishing because it has always been nourishing when we have encountered it in our experience.

However, Hume argues that this is an inadequate justification on its own for the kind of inference in question. There is no contradiction in supposing that the sun will simply not rise tomorrow. This would not be logically incompatible with previous experience. Previous experience does not render it impossible. On the contrary, we can easily imagine such a situation, perhaps use it as the premise for a story, and so forth. Similar remarks apply to the nourishing effects of bread, and indeed to all our beliefs that cannot be justified logically (or mathematically, if that is different) from some indisputable principles.

In arguing thus, Hume might be understood as reacting to the rationalist component of the emerging scientific worldview, that component that emphasized the ability of the human mind to reach out and understand. Descartes believed that through the exercise of reason we could obtain knowledge of the world of experience. Newton believed that the world of experience was indeed governed by some kind of mathematical necessity or numerical pattern, which our reason could uncover, and thus felt able to draw universal conclusions from a little, local data. Hume rejected the confidence characteristic of both Descartes and Newton. Given the central role that this confidence about the power of the human mind played in the founding of modern science, Hume, and empiricists more generally, might be seen as offering not a question about common sense inferences, but a foundational critique of one of the central impulses of the entire scientific enterprise—perhaps not how twentieth and twenty-first-century philosophers in the Anglo-American tradition would like to see their ancestry and inspiration.

Hume’s argument was simple and compelling and instantiated what appears to be a reasonably novel argumentative pattern or move. He took a metaphysical question and turned it into an epistemological one. Thus he started with “What is necessary connection?” and moved on to “How do we know about necessary connection?”

The answer to the latter question, he claimed, is that we do not know about it at all, because the only kind of necessity we can make sense of is that of logical and mathematical necessity. We know about the necessity of logic and mathematics through examining the relevant “ideas”, or concepts, and seeing that certain combinations necessitate others. The contrary would be contradictory, and we can test for this by trying to imagine it. Gandalf is a wizard, and all wizards have staffs; we cannot conceive of these claims being true and yet Gandalf being staff-less. Once we have the ideas indicated in those claims, Gandalf’s staff ownership status is settled.

Experience, however, offers no necessity. Things happen, while we do not perceive their being “made” to happen. Hume’s argument to establish this is the flip side of his argument in favor of our knowledge of a priori truths. He challenges us to imagine causes happening without their usual effects: bread not nourishing, billiard balls going into orbit when we strike them (this example is a somewhat augmented form of Hume’s own), and so forth. It seems that we can do this easily. So we cannot claim to be able to access necessity in the empirical world in this way. We perceive and experience the constant conjunction of cause and effect, and we may find it fanciful to imagine stepping from a window and gently floating to the ground, but we can imagine it, and sometimes do so, both deliberately and involuntarily (who has not dreamed they can fly?). But Hume agrees with Descartes that we cannot even dream that two and two make five, at least if we clearly comprehend those notions in our dream. (Of course, one can have a fuzzy dream in which one accepts the claim that two and two make five, without having the ideas of two, plus, equals and five in clear focus.)

Hume’s skepticism about our knowledge of causation leads him to skepticism about the nature of causation: the metaphysical question is given an epistemological treatment, and then the answer returned to the metaphysical question is epistemologically motivated. His conclusion is that, for all we can tell, there is no necessary connection, there is only a series of constant conjunctions, usually called regularities. This does not mean that there is no causal necessity, only that there is no reason to believe that there is. For the Enlightenment project of basing knowledge on reason rather than faith, this is devastating.

The constraint of metaphysical speculation by epistemological considerations remains a central theme of twenty-first century philosophy, even if it has somewhat loosened its hold in this time. But Hume took his critique a step further, with further profound significance for this whole philosophical tradition. He asked what we even mean by “cause”, and specifically, by that component of cause he calls “necessary connection”. (He identifies two others: temporal order and spatiotemporal contiguity. These are also topics of philosophical and indeed physical debate, but are less prominent in early twenty-first century philosophy, and thus are not discussed in this article.) He argues that we cannot even articulate what it would be for an event in the world we experience to make another happen.

The argument reuses familiar material. We have a decent grasp on logical necessity; it is the incoherence of the denial of the necessity in question, which we can easily spot (in his view). But that is not the necessary connection we seek. The question remains open: what other kind of necessity could there be? If it does not involve the impossibility of what is necessitated, then in what sense is it necessitated? This is not a rhetorical question; it is a genuine request for explanation. Supplying one is, at best, difficult; at worst, it is impossible. Some have tried (several attempts are discussed throughout the remainder of the article) but most have taken the view that it is impossible. Hume’s own explanation is that necessary connection is nothing more than a feeling, the expectation created in us by endless experience of the same cause followed by the same effect. Granted, this is a meaning for “necessary connection”; but it is one that robs “necessary” of anything resembling necessity.

The move from “What is X?” to “What does our concept of X mean?” has driven philosophers even harder than the idea that metaphysical speculation must be epistemologically constrained. This is partly because philosophical knowledge was thought for a long time to be constrained to knowledge of meanings; but that is another story (see Chapter 10 of Broadbent, 2016).

This is the background to all subsequent work on causation as rejuvenated by the Anglo-American tradition, and also to the practical questions that arise. The ideas that we cannot directly perceive causation, and that we cannot reason logically from cause to effect, have repeatedly given rise to obstacles in science, law, policy, history, sports, business, politics—more or less any “world-oriented” activity you can think of. The next section summarizes the ways that people have understood this challenge: the most important questions they think it raises and their answers to these questions.

3. A Family Tree of Causal Theories

Here is a diagram indicating one possible way of understanding the relationship between different historically significant and still influential theories of, and approaches to, and even understandings of the philosophical problems posed by causation—and indeed some approaches that do not think the problems are philosophical at all.

Figure 1. A “family tree” of theories of causation

At the top level are approaches to causation corresponding to the kinds of questions one might deem important to ask about it. At the second level are theories that have been offered in response to these questions. Some of these theories have sub-theories which do not really merit their own separate level, and are dealt with in this article as variations on a theme (each receiving treatment in its own subsection).

Some of these theories motivate each other; in particular, nominalism and regularity theories often go hand in hand. Others are relatively independent, while some are outright incompatible. These compatibility relationships themselves may be disputed.

Two points should be noted regarding this family tree. First, an important topic is absent: the nature of the causal relata. This is because a stance about their nature does not on its own constitute a position about causation; it cuts across this family tree and features importantly in some theories but not in others. While some philosophers have argued that it is very important (Mellor, 1995, 2004; Paul, 2004; Schaffer, 2007), and featured it centrally in their theories of causation (second level on the tree), it does not feature centrally in any approach to causation (top level on the tree), except insofar as everyone agrees that the causal relata, whatever they are, must be distinct to avoid mixing up causal and constitutive facts. The topic is skipped in this article because, while it is interesting, it is somewhat orthogonal.

The second point to note about this family tree is that others are possible. There are many ways one might understand twenty-first-century work on causation, and thus there are other “family trees” implicit in other works, including other introductions to the topic. One might even think that no such family tree is useful. The one presented above is only a tool, one that the reader might find useful, but it should ultimately be treated as itself a topic for debate, dispute, amendment, or rejection.

4. Semantic Analyses

Semantic analyses of causation seek to give the meaning of causal assertions. They typically take “c causes e” to be the exemplary case, where “c” and “e” may be one of a number of things: facts, events, aspects, and so forth. (Here, lower case letters c and e are used to denote some particular cause and effect respectively. Upper case letters C and E refer to classes and yield general causal claims, as in “smoking causes lung cancer”.) Whatever they are, they are universally agreed to be distinct, since otherwise we would wrongly confuse constitutive with causal relations. My T-shirt’s having yellow bananas might end up as a cause of its having yellow shapes on it, for example, which is widely considered to be unacceptable—because it is quite different from my T-shirt’s yellow bananas causing the waitress bringing me my coffee to stare.

The three main positions are regularity theories, probabilistic theories, and counterfactual theories.

a. Regularity Theories

The regularity theory implies that causes and effects are not usually one-off pairs, but recurrent. Not only is the coffee I just drank causing me to perk up, but drinking coffee often has this effect. The regularity view claims that two facts suffice to explain causation: the fact that causes are followed by their effects, plus the fact that cause-effect pairs happen a lot. On the other hand, coincidental pairings do not typically recur. I scratched my nose while drinking the coffee, and this scratching was followed by me perking up. But nose-scratching is not generally followed by perking up, whereas coffee-drinking is. Coffee-drinking and perking up are part of a regularity; in Hume’s phrase they are constantly conjoined. The same cannot be said about nose-scratching and perking up.

Obviously, the tool needs sharpening. Most of the Cs that we encounter are not always followed by Es, and most of the Es that we encounter are not always (that is, not only) caused by Cs. The assassin shoots (c) the president, who dies (e). But assassins often miss. Moreover, presidents often die of things other than being shot.

David Hume is sometimes presented as offering a regularity theory of causation (usually on the basis of Pt 5 of Hume, 1748), but this is crude at best and downright false at worst (Garrett, 2015). More plausibly, he offered regularities as the most we can hope for in ontology of causation, that is, as the basis of any account of what there might be “in the objects” that most closely corresponds to the various causal notions we have. But his approach to semantics was revisionary; he took “cause” to express a feeling that the experience of regularity produces in us. Knowing whether such regularities continue in the objects beyond our experience requires that we know of some sort of necessary connection sustaining the regularity. And the closest thing to necessary connection that we know about is regularity. We are in a justificatory circle.

It was John Stuart Mill who took Hume’s regularity ontology and turned it into a regularity theory of causation (Mill, 1882). The first thing he did was to address the obvious point that causes and effects are not constantly conjoined, in either direction. He confined the direction of constancy to the cause-effect direction, so that causes are always followed by their effects, but effects need not be preceded by the same causes. He expanded the definition of “cause” to include the enormousness that suffices for the effect. So, if e is the president’s death, then to say that c caused e is not to say that Es are always preceded by Cs, but rather that Cs are always followed by Es. Moreover, when we speak of the president’s being shot as the cause, we are being casual and strictly inaccurate. Strictly speaking, c is not the cause, but c*, being the entirety of things that were in place, including the shot, such that this entirety of things is sufficient for the president’s death. Strictly speaking, c* is the cause of e. There is no mysterious necessitation invoked because “sufficient” here just means “is always followed by”. When the wind is as it was, and the president is where he was, and the assassin aims so, and the gun fires thus, and so on and so forth, the president always dies, in all situations of this kind.

Mill thought that exceptionless regularities could be achieved in this way. In fact, he believed that the “law of causality”, being the exceptionless regularity between causes (properly defined) and effects, was the only true law (Mill, 1882). All the laws of science, he believed, had exceptions: objects falling in air do not fall as Newton’s laws of motion say they should, for example (this example is not Mill’s own). But objects released just so, at just such a temperature and pressure, with just such a mass, shape and surface texture, always fall in this way. Thus, according to Mill, the law of causality was the fundamental scientific law.

This theory faces a number of objections, even setting aside the lofty claims about the status of the “law of causality”. The subsubsections below discuss four of them.

i. The Problem of Implausibly Enormous Causes

To be truly sufficient for an effect, a cause must be enormous. It must include everything that, if on another occasion it is different, yields an overall condition that is followed by a different effect. It is questionable that “cause” is reasonably understood as referring to such an enormousness.

Enormousness poses problems for more than just the analysis of the common meaning of “cause”. It also makes it unclear how we can arrive at and use knowledge of causes. These are such gigantic things that they are bound to be practically unknowable to us. What justifies our merry inference from a shot by an ace assassin who has never yet missed to the imminent death of the president is not the fact that the assassin has never yet missed, since this constancy is incidental; the causal regularity is between huge preceding conditions. In the previous cases where the assassin shot, these conditions may well not have been at all the same.

It is not clear that such objections are compelling, however. The idea of Mill’s account concerns the nature of causation and not our knowledge of it, much less our causal inferences, which might well depend on highly contingent and local regularities, which might be underwritten by truly causal ones without instantiating them. Mill himself provides a lengthy discussion of the use of causal language to pick out one part of the whole cause. As for getting the details right, Mill’s central idea seems to admit of other implementations, and an advocate would want to try these.

There was a literature in the early-to-middle twentieth century trying, in effect, to mend Mill’s account so as to get the blend of necessity and sufficiency just right for correctly characterizing the semantics of “cause”, against a background assumption that Millian regularity was the ultimate ontological truth about causation. This literature took its final form in John Mackie’s INUS analysis (Mackie, 1974).

Mackie offered more than one account of causation. His INUS analysis was an account of causation “in the objects”, that is, an account in the Humean spirit of offering the closest possible objective characterization of what we appear to mean by causal judgements, without necessarily supposing that causal judgements are ultimately meaningful or that they ultimately refer to anything objective.

Mackie’s view was that a cause was an insufficient but necessary part of an unnecessary but sufficient condition for the effect. Bear in mind that “necessary” and “sufficient” are to be understood materially, non-modally, as expressing regularities: “x is necessary for y” means “y is always accompanied (or in the causal case, preceded) by x” and “x is sufficient for y” means “x is always accompanied (or in the causal case, followed) by y”. If we set aside temporal order, necessity and sufficiency are thus inter-definable; for x to be sufficient for y is for y to be necessary for x, and vice versa.

Considering our assassin, how does his shot count as a cause, according to the INUS account?

Take the I of INUS first. The assassin’s shot was clearly Insufficient for the president’s death. The president might suddenly have dipped his head to bestow a medal on a citizen (Forsyth, 1971). All sorts of things can and do intervene on such occasions. Shots of this nature are not universally followed by deaths. c is Insufficient for e.

Second, take the N. The shot is clearly Necessary in some sense for the death. In that situation, without the shot, there would have been no death. In strict regularity-talk, such situations are not followed by deaths in the absence of a shot. At the same time, we can hardly say that shots are required for presidents to die; most presidents find other ways to discharge this mortal duty. Mackie explains this limited necessity by saying not that c is Necessary for e, but that c is a Necessary part of a larger condition that preceded e.

Moving to the U, this larger condition is Unnecessary for the effect. There are plenty of presidential deaths caused by things other than shots, as just discussed; this was the reason we saw for not saying that the shot is necessary for the death. c is an Insufficient but Necessary part of an Unnecessary condition for e.

Finally, the S. The condition of which c is a part is unnecessary (so far as the occurrence of e is concerned), but it is sufficient. e happens, and it is no coincidence that it does. In strict regularity talk, every such condition is followed by an E. There is no way for an assassin to shoot just so, in just those conditions, which include the non-ducking of the president, his lack of a bullet proof vest, and so forth, and for the president not to die. Thus c is an Insufficient but Necessary part of an Unnecessary but Sufficient condition for e. To state it explicitly:

c is a cause of e if and only if c is a necessary but insufficient part of an unnecessary but sufficient condition for e.
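
The definition can be made concrete with a small sketch. It assumes, purely for illustration, that we are handed a list of minimal sufficient conditions for the effect, each modelled as a set of factor names. The factor names and the helper functions `is_sufficient` and `is_inus` are hypothetical and are not Mackie’s own formalism. “Sufficient” is read materially, as in the text: a set of factors counts as sufficient when it contains one of the listed conditions.

```python
from typing import FrozenSet, List

Condition = FrozenSet[str]


def is_sufficient(factors: Condition, minimal_conditions: List[Condition]) -> bool:
    """A set of factors suffices for e if it contains some minimal sufficient condition."""
    return any(m <= factors for m in minimal_conditions)


def is_inus(c: str, condition: Condition, minimal_conditions: List[Condition]) -> bool:
    """Check whether c is an INUS condition for e relative to `condition`."""
    if c not in condition:
        return False
    # S: the whole condition is sufficient for e.
    sufficient = is_sufficient(condition, minimal_conditions)
    # U: e can also come about by a route that does not include the whole condition.
    unnecessary = any(not condition <= m for m in minimal_conditions)
    # N: once c is removed, what remains of the condition is no longer sufficient.
    necessary_part = not is_sufficient(condition - {c}, minimal_conditions)
    # I: c on its own is not sufficient.
    insufficient_alone = not is_sufficient(frozenset({c}), minimal_conditions)
    return sufficient and unnecessary and necessary_part and insufficient_alone


# Hypothetical factors for the assassination example (assumed for illustration only).
SHOOTING = frozenset({"shot", "president_does_not_duck", "no_bulletproof_vest"})
POISONING = frozenset({"poison", "no_antidote"})
WAYS_TO_DIE = [SHOOTING, POISONING]

print(is_inus("shot", SHOOTING, WAYS_TO_DIE))                                  # True
print(is_inus("nose_scratch", SHOOTING | {"nose_scratch"}, WAYS_TO_DIE))       # False
```

On these assumptions the shot passes all four tests, while an incidental nose-scratch fails the N test: removing it leaves a condition that is still sufficient.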

In essence, Mackie borrows Mill’s “whole cause” idea, but drops the implausible idea that “cause” strictly refers to the “whole cause”. Instead, he makes “cause” refer to a part of the whole cause, one that satisfies the special conditions.

As well as addressing the problem of enormousness, which is fundamentally a plausibility objection, Mackie intends his INUS account to address the further and probably more pressing objections which follow.

ii. The Problem of the Common Cause

An immediate problem for any regularity account of causation is that, just as effects have many causes, causes also have many effects, and these effects may accompany each other very regularly. Recall Mill’s clarification that effects need not be constantly preceded by the same causes, and that “constant conjunction” was in this sense directional: same causes are followed by same effects, but not vice versa. This is strongly intuitive—as the saying goes, there is more than one way to skin a cat. Essentially, Mill tells us that we do not have to worry that effects are not always preceded by the same causes.

However, we are still left in a predicament, even with this unidirectional constant conjunction of same-cause-same-effect. When combined with the fact that a single cause always has multiple effects, we seem to land up with the result that constant conjunctions will also obtain between these effects. Cs are always followed by E₁s, and Cs are always followed by E₂s. So, whenever there is a C, we have an E₁ and an E₂, meaning that whenever we have an E₁, we have an E₂, and vice versa.

How does a regularity theory get out of this without dropping the fundamental analytical tool it uses to distinguish cause from coincidence, the unfailing succession of same effect on same cause, knowing that the singular “effect” should actually be substituted with the plural “effects”?

Here is an example of the sort of problem for naïve regularity theories that Mackie’s account is supposed to solve. My alarm sounds, and I get out of bed. Shortly afterwards, our young baby starts to scream. This happens daily: the alarm wakes me up, and I get out of bed; but it also wakes the baby up. I know that it is not my getting out of bed that causes the baby to scream. How? Because I get out of bed in the night at various other times, and the baby does not wake up on those occasions; because my climbing out of bed is too quiet for a baby in another room to hear; and for other such reasons. Also, even when I sleep through the alarm (or try to), the baby wakes up. But what if the connections were as invariable as each other—there were no (or equally many) exceptions?

Consider this classic example. The air pressure drops, and my barometer’s needle indicates that there will be a storm. There is a storm. My barometer’s needle dropping obviously does not cause the storm. But, as a reliable storm-predictor, it is followed by a storm regularly—that is the whole point of barometers.
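
A toy calculation shows why a naive regularity test cannot tell the barometer from a cause. The records below and the function `always_followed_by` are illustrative assumptions, not data or a method drawn from the literature; the test simply checks whether every occurrence of one factor is accompanied by the other.

```python
from typing import Dict, List

Record = Dict[str, bool]


def always_followed_by(records: List[Record], a: str, b: str) -> bool:
    """Naive regularity test: every record in which a occurs is one in which b occurs."""
    return all(r[b] for r in records if r[a])


# Invented observations in which a pressure drop is the common cause.
DAYS: List[Record] = [
    {"pressure_drop": True, "barometer_falls": True, "storm": True},
    {"pressure_drop": True, "barometer_falls": True, "storm": True},
    {"pressure_drop": False, "barometer_falls": False, "storm": False},
    {"pressure_drop": False, "barometer_falls": False, "storm": False},
]

print(always_followed_by(DAYS, "pressure_drop", "storm"))    # True
print(always_followed_by(DAYS, "barometer_falls", "storm"))  # True
```

The regularity between barometer and storm is exactly as exceptionless as the one between pressure drop and storm, so constant conjunction alone does not separate the two.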

Mackie’s INUS theory supplies the following answer. The barometer’s falling is not an INUS condition for the storm’s occurrence, because situations that are exactly similar except for the absence of a barometer can and do occur. The falling of the barometer may be a part of a sufficient condition for the storm to occur, but it is not a necessary part of that condition. Storms happen even when no barometer is there to predict them. (Likewise, the storm is not an INUS condition for the barometer falling, in case that is a worry despite the temporal order, because barometers can be induced to fall in vacuum chambers.)

Thus the intuition I have in the alarm/baby case is the correct one; the regularity between alarm and baby waking is persistent regardless of my getting out of bed, and that between my getting out of bed and the baby waking fails in otherwise similar occasions where there is no alarm.

However, this all depends on a weakening of the initial idea behind the regularity theory, since it amounts to accepting that there are many cases of apparent causation without underlying regularity, which are therefore true, not in virtue of match-strikes being followed by flames, but for a more complicated reason, albeit one that makes use of the notion of regularity. Hume’s idea that we observe like causes followed by like effects suffers a blow, and together with it, the epistemological motivation of the regularity theory, as well as its theoretical elegance. It is to this extent a concession on the part of the regularity theory. There are other cases where we do want to say that c causes e even though Cs are not always followed by Es.

In fact, such is the majority of cases. Striking the match causes it to light even though many match-strikes fail, producing no spark, breaking the match, and so forth. There are similar scenarios in which the match is struck but there is no flame; yet the apparent conclusion that the match strike does not cause the flame cannot be accepted. Perhaps we must insist that the scenarios differ because the match is not struck exactly so, but now we are not analyzing the meaning of “striking the match caused it to light”, since we are substituting an unknown and complicated event for “striking the match”, for the sake of insisting that causes are always followed by their effects, which is a failing of the analytical tool.

Common cause situations thus present prima facie difficulties for the regularity account. Mackie’s account may solve the problem; nonetheless, if there were an account of causation that did not face the problem in the first place, or that dealt with the problem with less cost to the guiding idea of the regularity approach and with less complication, it would be even more attractive. This is one of the primary advantages claimed by the two major alternatives, counterfactual and probabilistic accounts, which are discussed in their two appropriate subsections below.

iii. The Problem of Overdetermination

As noted in the subsubsection on the problem of the common cause, many effects can be caused in more than one way. A president may be assassinated with a bullet or a poison. The regularity theory can deal with this easily by confining the relevant kind of regularity to one direction. In Mackie’s account, causes are not necessary for their effects, which may occur in other ways. But the problem has another form. If an effect may occur in more than one way, what is to stop more than one of these ways from being present at the same time? Assassin 1 shoots the president, but Assassin 2’s on-target bullet would have done the job if Assassin 1 had missed. c causes e, but c’ would have caused e otherwise.

Such situations are referred to by various names. This article uses the term redundancy as a catch-all for any situation like this, in which a cause is “redundant” in the sense that the effect would have occurred without the actual cause. (Strictly, all that is required is that the effect might still have occurred, because the negation of “would not” is “might” (Lewis, 1973b).) Within redundancy, we can distinguish symmetric from asymmetric overdetermination. Symmetric overdetermination occurs when two causes appear absolutely on a par. Suppose two assassins shoot at just the same time, and both bullets enter the president’s heart at just the same time. Either would have sufficed, but in the event, both were present. Neither is “more causal”. The example is not contrived. Such situations are quite common. You and I both shout “Look out!” to the pedestrian about to step in front of a car, and both our shouts are loud enough to cause the pedestrian to jump back. And so forth.

In asymmetric overdetermination, one of the events is naturally regarded as the cause, while the other is not, but both are sufficient in the circumstances for the effect. One is a back-up, which would have caused the effect had the actual cause not done so. For example, suppose that Assassin 2 had fired a little later than Assassin 1, and that the president was already dead by the time Assassin 2’s bullet arrived. Assassin 2’s shot did not kill the president, but had Assassin 1 not shot (or had he not shot accurately enough), Assassin 2’s shot would still have killed the president. Such cases are more commonly referred to as preemption, which is the terminology used in this article since it is more descriptive: the first cause preempts the second one. Again, preemption examples need not be contrived or far-fetched. Suppose I shout “Look out!” a moment after you, but still soon enough for the pedestrian to step back. Your shout caused the pedestrian to step back, but had you not shouted, my shout would have caused the pedestrian to step back. There is nothing outlandish about this; such things happen all the time.

The difficulty here is that there are two INUS conditions where there should be one. Assassin 1’s shot is a necessary part of a sufficient condition for the effect. But so is Assassin 2’s shot. However, Assassin 1’s shot is the true cause.

In the symmetric overdetermination case, one may take the view that they are both equally causes of the effect in question. However, there is still the preemption case, where Assassin 1 did the killing and not Assassin 2. (If you doubt this, imagine they are a pair of competitive twins, counting their kills, and thus racing to be first to the president in this case; Assassin 1 would definitely insist on chalking this one up as a win).

Causal redundancy has remained a thorn in the side of all mainstream analyses of causation, including the counterfactual account (see the appropriate subsection). What makes it so troubling is that we use this feature of causation all the time. Just as we exploit the fact that causes have multiple effects when we are devising measuring instruments, we exploit the fact that we can bring a desired effect about in more than one way every time we set up a failsafe mechanism, a Plan B, a second line of defense, and so forth. Causal redundancy is no mere philosopher’s riddle: it is a useful part of our pragmatic reasoning. Accounting for the fact that we use “cause” in situations where there is also a redundant would-be cause thus seems central to explicating “cause” at all.

iv. The Problem of Unrepeatability

This is less discussed than the problems of the common cause and overdetermination, but it is a serious problem for any regularity account. The problem was elegantly formulated by Bertrand Russell, who pointed out that, once a cause is specified so fully that its effect is inevitable, it is at best implausible and perhaps (physically) impossible that the whole cause occur more than once (Russell, 1918). The fundamental idea of the regularity approach is that cause-effect pairs instantiate regularities in a way that coincidences do not. This objection tells against this fundamental idea. It is not clear what the regularity theorist can reply. She might weaken the idea of regularity to admit of exceptions, but then the door is open to coincidences, since my nose-scratching just before the president’s death might be absent on another such occasion, and yet this might no longer count against its candidacy for cause. At any rate, the problem is a real one, casting doubt on the entire project of analyzing causation in terms of regularity.

We might respond by substituting a weaker notion than true sufficiency: something like “normally followed by”. Nose-scratchings are not normally followed by presidents’ deaths. However, this is not a great solution for regularity theories, because (a) the weaker notion of sufficiency is a departure from the sort of clarity that regularity theorists would otherwise celebrate, and (b) a similar battery of objections will apply: we can find events that, by coincidence, are normally followed by others, merely by chance. Indeed, if enough things happen, so that there are enough events, we can be very confident of finding at least some such patterns of events.

b. Counterfactual Theories

Mackie developed a counterfactual theory of the concept of causation, alongside his theory of causation in objects as regularity. However, at almost exactly the same time, a philosopher at the other end of his career (near the start) developed a theory sharing deep convictions about the fundamental nature of regularities, the priority of singular causal judgements, and the use of counterfactuals to supply their semantics, and yet setting the study of causation on an entirely new path. This approach has dominated the philosophical landscape for nearly half a century at the time of writing, not only as a prominent theory of causation, but as an outstanding piece of philosophical work. It has thus served as an exemplar for analytic metaphysicians and as a central part of the story of the emboldening of analytic metaphysics in the 1970s, following years in exile while positivism reigned.

David Lewis’s counterfactual theory of causation (Lewis, 1973a) starts with the observation that, commonly, if the cause had not happened, the effect would not have happened. To build a theory from this observation, Lewis had three major tasks. First, he had to explain what “would” means in this context; he had to provide a semantics for counterfactuals. Second, he had to deal with cases where counterfactuals appear to be true without causation being present, so that counterfactual dependence appears not to be sufficient for causation (since if it were, a lot of non-causes would be counted as causes). Third, he had to deal with cases where it appears that, if the cause had not happened, the effect would still have happened anyway: cases of causal redundancy, where counterfactual dependence appears not to be necessary for causation.

For a considerable period of time, the consensus was that Lewis had succeeded with the first two tasks but failed the third. In the early years of the twenty-first century, however, the second task, establishing that counterfactual dependence is sufficient for causation, also received critical scrutiny.

Lewis’s theory of causation does not state that effect counterfactually depends on cause, but rather, that c causes e if and only if there is a causal chain running from c to e whose links consist in a chain of counterfactual dependence. Chains are used in order to respond to the problem of preemption, as explained in the subsection covering the problem of causal redundancy. Counterfactual dependence is thus not a necessary condition for causation. However, it is a sufficient condition, since whenever we do find counterfactual dependence (of the “right sort”), we find causation. On his view, counterfactual dependence is thus sufficient but not necessary for causation; what is necessary is a chain of counterfactual dependence, but not necessarily the overarching dependence of effect on cause.
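
The difference between direct dependence and a chain of dependence can be sketched as a reachability check. The dependence facts below are stipulated by hand for a preemption-style case and are assumptions made for illustration; whether they hold in a given example is exactly what the semantics for counterfactuals has to settle. The function name `has_dependence_chain` and the event names are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def has_dependence_chain(c: str, e: str, dependents: Dict[str, Set[str]]) -> bool:
    """True if stepwise counterfactual dependence leads from c to e.

    `dependents[x]` is the set of events stipulated to depend counterfactually on x.
    """
    queue = deque([c])
    seen = {c}
    while queue:
        event = queue.popleft()
        for later in dependents.get(event, set()):
            if later == e:
                return True
            if later not in seen:
                seen.add(later)
                queue.append(later)
    return False


# Stipulated facts: the death does not depend directly on Assassin 1's shot
# (Assassin 2 is waiting), but each link of the chain depends on the one before.
DEPENDENTS: Dict[str, Set[str]] = {
    "shot_1": {"bullet_1_in_flight"},
    "bullet_1_in_flight": {"death"},
}

print("death" in DEPENDENTS.get("shot_1", set()))            # False
print(has_dependence_chain("shot_1", "death", DEPENDENTS))   # True
print(has_dependence_chain("shot_2", "death", DEPENDENTS))   # False
```

On these stipulations Assassin 1’s shot counts as a cause by way of the chain even though the effect does not depend on it directly, while the back-up shot starts no chain at all.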

The best way to understand Lewis’s theory is through his responses to problems (as he himself sets it out). This is the approach taken in the remainder of this subsection.

i. The Problems of Common Cause, Enormousness and Unrepeatability

Lewis takes his theory to be able to deal easily with the problem of the common cause, which he parcels with another problem he calls the problem of effects. This is the problem that causes might be thought to counterfactually depend on their effects as well as the other way around. Not so, says Lewis, because counterfactual dependence is almost always forward-tracking (Lewis, 1973a, 1973b, 1979). The cases where it is not are easily identifiable, and these occurrences of counterfactual dependence are not apt for analyzing causation, just as counterfactuals representing constitutive relations (such as “If I were not typing, I would not be typing fast”) are not apt.

Lewis’s argument for the ban on backtracking is as follows. Suppose a spark causes a fire. We can imagine a situation where, with a small antecedent change, the fire does not occur. This change may involve breaking a law of nature (Lewis calls such changes “miracles”) but after that, the world may roll on exactly as it would under our laws (Lewis, 1979). This world is therefore very similar to ours, differing in one minor respect.

Now consider what we mean when we start a sentence with “If the fire had not occurred…” By saying so, we do not mean that the spark would not have occurred either. For otherwise, we would also have to suppose that the wire was never exposed, and thus that the careless slicing of a workman’s knife did not occur, and therefore that the workman was more conscientious, perhaps because his upbringing was different, and that of his parents before him, and…? Lewis says: that cannot be. When we assert a counterfactual, we do not mean anything like that at all. Rather, we mean that the spark still occurred, along with most other earlier events; but for some or other reason, the fire did not.

Why this is so is a matter of considerable debate, and much further work by Lewis himself. For these purposes, however, all that is needed is the idea that, by the time when the fire occurs, the spark is part of history, and there will be some other way to stop the fire—some other small “miracle”—that occurs later, and thus preserves a larger degree of historical match with the actual world, rendering it more similar.

The problem of the common cause is then solved by way of a simple parallel. It might appear that there is counterfactual dependence between the effects of a common cause: between barometer falling and storm, for example. Not so. If the barometer had not fallen, the air pressure, which fell earlier, would remain fallen; and the storm would have occurred anyway. If the barometer had not fallen, that would be because some tiny little “miracle” would have occurred shortly beforehand (even Lewis’s account requires at least this tiny bit of backtracking, and he is open about that). This would lead to its not falling when it should. In a nutshell, if the barometer had not fallen, it would have been broken.

Put that way, the position does not sound so attractive; on the contrary, it sounds somewhat artificial. Indeed, this argument, and Lewis’s theory of causation as a whole, depend heavily on a semantics for counterfactuals according to which the closest world at which the antecedent is true determines the truth of the counterfactual. If the consequent is true at that world, the counterfactual is true; otherwise, not. (Where there is no world at which the antecedent is true, we have vacuous truth.) This semantics is complex and subject to many criticisms, but it is also an enormous intellectual achievement, partly because a theory of causation drops out of it virtually for free, or so it appears when the package is assembled. There is no space here to discuss the details of Lewis’s theory of counterfactuals (for critical discussions see in particular: Bennett, 2001, 2003; Elga, 2000; Hiddleston, 2005), but if we accept that theory, then his solution to the problem of effects follows easily.
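
A minimal sketch of the closest-world evaluation, applied to the barometer case, may help. The worlds and their distances from actuality are stipulated by hand and are assumptions: they encode Lewis’s claim that a late, small “miracle” leaves a world closer to ours than one in which earlier history differs, and they are not derived from his similarity criteria. The function `would` is a hypothetical helper.

```python
from typing import Callable, Dict, List, Tuple

World = Dict[str, bool]
Proposition = Callable[[World], bool]


def would(antecedent: Proposition, consequent: Proposition,
          worlds: List[Tuple[float, World]]) -> bool:
    """Evaluate "if antecedent had been so, consequent would have been so"."""
    candidates = [(d, w) for d, w in worlds if antecedent(w)]
    if not candidates:
        # Assumption: with no antecedent-world on the list, count the claim as vacuously true.
        return True
    closest = min(d for d, _ in candidates)
    # The consequent must hold at every antecedent-world tied for closest.
    return all(consequent(w) for d, w in candidates if d == closest)


WORLDS: List[Tuple[float, World]] = [
    # The actual world.
    (0.0, {"pressure_drop": True, "barometer_falls": True, "storm": True}),
    # Late small miracle: the barometer is broken; everything earlier is unchanged.
    (1.0, {"pressure_drop": True, "barometer_falls": False, "storm": True}),
    # Backtracking world: the pressure never dropped (stipulated to be further away).
    (5.0, {"pressure_drop": False, "barometer_falls": False, "storm": False}),
]

no_storm = lambda w: not w["storm"]
print(would(lambda w: not w["barometer_falls"], no_storm, WORLDS))  # False
print(would(lambda w: not w["pressure_drop"], no_storm, WORLDS))    # True
```

With these distances the storm depends counterfactually on the pressure drop but not on the barometer. The verdict is only as good as the stipulated ordering, which is where the criticisms of the semantics bite.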

Lewis deals even more easily with the problems of enormousness and unrepeatability that trouble regularity theories. The problem of enormousness is that, to ensure a truly exceptionless regularity, we must include a very large portion of the universe indeed into the cause (Mill’s doctrine of the “whole cause”). According to Mill, strictly speaking, this is what “cause” means. But according to common usage, it most certainly is not what “cause” means: when I say that the glass of juice quenched my thirst, I am not talking about Jupiter, the Andromeda galaxy, and all the other heavenly bodies exerting forces on the glass, the balance of which was part of the story of the glass rising to my lips. I am talking about a glass of juice.

The counterfactual theory deals with this easily. If I had not drunk the juice, my thirst would not have been quenched. This is what it means to say that drinking the juice caused my thirst to be quenched; which is what I mean when I say that it quenched my thirst. There is no enormousness. There are many other causes, because there are many other things that, had they not been so, would have resulted in my thirst not being quenched. But, Lewis says, a multiplicity of causes is no problem; we may have all sorts of pragmatic reasons for singling some out rather than others, but these do not have implications for the underlying concept of cause, nor indeed for the underlying causal facts.

The problem of unrepeatability was that, once inflated to the enormous scale of a whole cause, it becomes incredible that such things recur at all, let alone regularly. Again, there is no problem here: ordinary events like the drinking of juice can easily recur.

While later subsubsections discuss the problems that have proved less tractable for counterfactual theories, we should firstly note that even if we set aside criticisms of Lewis’s theory of counterfactuals, his solution to the problem of the common cause is far less plausible on its own terms than Lewis and his commentators appear to have appreciated. It is at least reasonable to suggest that we use barometers precisely because they track the truth of what they predict (Lipton, 2000). It does not seem wild to think that if the barometer had not fallen, the storm would not after all have been going to occur: more naturally, the storm would not after all have been impending. Lewis’s theory implies that in the nearest worlds where the barometer does not fall, my picnic plans would have been rained out. If I believed that, I would immediately seek a better barometer.

Empirical evidence suggests that there is a strong tendency for this kind of reasoning in situations where causes and their multiple effects are suitably tightly connected (Rips, 2010). Consider a mechanic wondering why the car will not start. He tests the lights which also do not work. So he infers that it is probably the battery. It is. But in Lewis’s closest world where the lights do work, the battery is still flat: an outrageous suggestion for both the mechanic and any reasonable similarity-based semantics of counterfactuals (for another instance of this objection see Hausman, 1998). Or, if not, then he must accept that the lights’ not lighting causes the car not to start (or vice versa). Philosophers are not usually very practical and sometimes this shows up; perhaps causation is a particularly high-risk area in this regard, given its practical utility.

ii. The Problem of Causal Redundancy

If Assassin 1 had not shot (or had missed) then the president still would (or might) have died, because Assassin 2 also shot. Recall that two important kinds of redundancy can be distinguished (as discussed in the subsubsection on the problem of the common cause). One is symmetric overdetermination, where the two bullets enter the heart at the same time. Lewis says that in this case our causal intuitions are pretty hazy (Lewis, 1986). That seems right; imagine a firing squad—what would we say about the status of Soldier 1’s bullet, Soldier 2’s bullet, Soldier 3’s, … when they are all causally sufficient but none of them causally necessary? We would probably want to say that it was the whole firing squad that was the cause of the convict’s death. So we should in those residual overdetermination cases that cannot be dealt with in other ways, says Lewis. Assassin 1 and Assassin 2 are together the cause. The causal event is the conjunction of these two events. Had that event not occurred, the effect would not have occurred. Lewis runs into some trouble with the point that the negation of a conjunction is achieved by negating just one of its conjuncts, and thus Assassin 1’s not shooting is enough to render the conjunctive event absent—even if Assassin 2 had still shot and the president would still have died. Lewis says that we have to remove the whole event when we are assessing the relevant counterfactuals.

This starts to look less than elegant; it lacks the conviction and sense of insight that characterize Lewis’s bolder propositions. However, our causal intuitions are so unclear that we should take the attitude that spoils go to the victor (meaning, the account that has solved all the cases where our intuitions are clear). Even if this solution to symmetric overdetermination is imperfect, which Lewis does not admit, the unclarity of our intuitions would mean that there is no basis to contest the account that is victorious in other areas.

Preemption is the other central kind of causal redundancy, and it has proved a persistent problem for counterfactual approaches to causation. It cannot be set aside as a “funny case” in the way of symmetric overdetermination, because we do have clear ideas about the relevant causal facts, but they do not play nicely with counterfactual analysis. Assassins 1 and 2 may be having a competition as to who can chalk up more “kills”, in which case they will be deeply committed to the truth of the claim that the preempting bullet really did cause the death, despite the subsequent thudding of the loser’s bullet into the presidential corpse. A second later or a day later—it would not matter from their perspective.

Lewis’s attempted solution to the problem of preemption seeks, once again, to apply features of his semantics for counterfactuals. The two features applied are non-transitivity and, once again, non-backtracking.

Counterfactuals are unlike indicative conditionals in not being transitive (Lewis, 1973b, 1973c). For indicatives, the pattern If A then B, if B then C, therefore if A then C is valid. But not so for counterfactuals. If Bill had not gone to Cambridge (B), he would have gone to Oxford (C); and if Bill had been a chimpanzee (A), he would not have gone to Cambridge (B). If counterfactuals are transitive, then it can be concluded that, if Bill had been a chimpanzee (A), he would have gone to Oxford (C). Notwithstanding its prima facie appeal, this argument has not been found compelling, and the moral usually drawn is that transitivity fails for counterfactuals.

Lewis thus suggests that causation consists in a chain of counterfactual dependence, rather than a single counterfactual. Suppose we have a cause c and an effect e, connected by a chain of intermediate events d₁, … d_n. Lewis says: it can be false that if c had not occurred then e would not have occurred, yet true that c causes e, provided that there are events d₁, … d_n such that if c had not occurred then d₁ would not have occurred, and… if d_n (henceforth d_n is simply called d for readability) had not occurred, then e would not have occurred.

This is one step of the solution, because it provides for the effect to fail to counterfactually depend upon Assassin 1’s shot, yet Assassin 1’s shot to still be the cause. Provides for, but does not establish. The obvious remaining task is to establish that there is a chain of true counterfactuals from Assassin 1’s shot to the president’s death—and, if there is, that there is not also a chain from Assassin 2’s shot.

This is where the second deployment of a resource from Lewis’s semantics for counterfactuals comes into play (and this is sometimes omitted from explanations of how Lewis’s solution to preemption is supposed to work). His idea is that, at the time of the final event in the actual causal chain, d, the would-be causal chain has already been terminated, thanks to something in the actual causal chain. The corresponding event in the would-be chain, call it d*, has already failed to happen, so to speak: its time has passed. So “~d → ~e” is true, because d* would not occur in the absence of d. Worlds where d does not occur but d* does are further from actuality than some worlds where d does not occur and, as in actuality, d* does not occur either.

This solution may work for some cases; these have become known as early preemption cases. But it does not work for others, and these have become known as late preemption. Consider the moment when Assassin 1’s bullet strikes the president, and suppose that this is the last event, d, in the causal chain from Assassin 1’s shot c to the president’s death e. Then ask what would have happened if this event had not happened—by a small miracle the bullet deviated at the last moment, for example. At that moment, Assassin 2’s bullet was speeding on its lethal path towards the president. On Lewis’s view, after the small miracle by which Assassin 1’s bullet does not strike (after ~d), the world continues to evolve as if the actual laws of nature held. So Assassin 2’s bullet strikes the president a moment later, killing him.
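The failure of dependence in late preemption can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not part of Lewis’s own apparatus: the function names are hypothetical, death is assumed to occur if at least one bullet strikes, and a crude “flip one variable and hold the rest fixed” test stands in for the similarity-based evaluation of counterfactuals.

```python
def president_dies(bullet1_strikes: bool, bullet2_strikes: bool) -> bool:
    """Toy assumption: the president dies if at least one bullet strikes."""
    return bullet1_strikes or bullet2_strikes


def depends_on(effect, actual: dict, variable: str) -> bool:
    """Crude but-for test: flip one variable, hold the rest fixed, compare."""
    counterfactual = {**actual, variable: not actual[variable]}
    return effect(**actual) != effect(**counterfactual)


# Late preemption: bullet 2 is already in flight when bullet 1 strikes,
# so it still strikes in the world where bullet 1 deviates.
late_preemption = {"bullet1_strikes": True, "bullet2_strikes": True}

# Comparison case with no rival assassin.
no_rival = {"bullet1_strikes": True, "bullet2_strikes": False}

print(depends_on(president_dies, late_preemption, "bullet1_strikes"))  # False
print(depends_on(president_dies, no_rival, "bullet1_strikes"))         # True
```

In the late preemption case the death does not depend even on the last event in Assassin 1’s chain, so no chain of stepwise dependence is available to rescue the verdict that Assassin 1’s shot was the cause.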

Various solutions have been tried. We might specify the president’s death very precisely, as the death that occurred just then, a moment earlier than the death that would have occurred had Assassin 2’s bullet struck; and the angle of the bullet would have been a bit different; and so forth. In short: that death would not have occurred, but for Assassin 1, even if some other, similar death, involving the same person and a similar cause, would have occurred in its place. But Lewis himself provides a compelling response, which is simply that this is not at all what phrases like “the president died” or “the death of the president” refer to when we use them in a causal statement. Events may be more or less tightly specified, and there can be a distinction drawn between actual and counterfactual deaths, tightly specified. But that is not the tightness of specification we actually use in this causal judgement, as in many others.

A related idea is to accept that the event of the president’s death is the same in the actual and counterfactual cases, but to appeal to small differences in the actual effect that would have happened if the actual cause had been a bit different. Thus, in very close worlds, where Assassin 1 shot just a moment earlier or later, but still soon enough to beat Assassin 2, or where a fly in the bullet’s path had caused just a minuscule deviation, or similar ones, the death would have been just minutely different. It still counts as the same death-event, but with just slightly different properties. Influence is what Lewis calls this counterfactual co-variance of event properties, and he suggests that a chain of influence connects cause and effect, but not preempted cause and effect (Lewis, 2004a).

However, there even seem to be cases where influence fails, notably the trumping cases pressed in particular by Jonathan Schaffer (2004). Merlin casts a spell to turn the prince into a frog at the stroke of midnight. Morgana casts the same spell, but at a later point in the day. It is the way of magic, suppose, that the first spell cast is the one that operates; had Morgana cast a spell to turn the prince into a toad instead, the prince would nevertheless have turned into a frog, because Merlin’s earlier spell takes priority. Yet she in fact specified a frog. If Merlin had not cast his spell, the prince would still have turned into a frog—and there would have been no difference at all in the effect. There is no chain of influence.

We do not have to appeal to magic for such instances. I push a button to call an elevator, which duly illuminates, but even so, an impatient or unobservant person arriving directly after me pushes it again. The elevator arrives. It does so in just the same way and in just the same time as if I had not pushed it, or had pushed it just a tiny moment earlier or later, more or less forcefully, and so forth. In today’s world, where magic is rarely observed, electrical mediation of cause and effects is a fruitful hunting ground for cases of trumping.

There is a large literature on preemption, because the generally accepted conclusion is that, despite Lewis’s extraordinary ingenuity, the counterfactual analysis of causation cannot be completed. Many philosophers are still attracted to a counterfactual approach: indeed it is an active area of research outside philosophy (as in interdisciplinary work), offering as it does a framework for technical development and thus for operationalization in the business of inferring causes. But for analyzing causation—for providing a semantic analysis, for saying what “causation” means—there is general acceptance that some further resource is needed. Counterfactuals are clearly related to causation in a tight way, but the nature of that connection still appears frustratingly elusive.

iii. A New Problem: Causal Transitivity

Considerably more could be said about counterfactual analysis of causation; it dominated philosophical attention for decades, and drew more attention than any other approach after superseding the regularity theories in the 1970s. Since discussions of preemption dried up, attention has shifted to the less controversial claim that counterfactual dependence is sufficient for causation. One problem arising in this connection is briefly introduced here: transitivity.

In Lewis’s account, and more broadly, causation is often supposed to be transitive, even though counterfactual dependence is not. This is central to Lewis’s response to the problem of preemption. It also seems to tie with the “non-discriminatory” notion of cause, according to which my grandmother’s birth is among the causes, strictly speaking, of my writing these words, even if we rarely mention it.

To say that a relation R is transitive is to say that if R(x,y) and R(y,z) then R(x,z). There seem to be cases showing that causation is not like this after all. Hiker sees a boulder bounding down the slope towards him, ducks, and survives. Had the boulder not bounded, he would not have ducked, and had he not ducked, he would have died. There is a chain of counterfactual dependence, and indeed a chain of causation. But there is not an overarching causal relation. The bounding boulder did not cause Hiker’s survival.
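The structure of the boulder case can be checked mechanically. The sketch below is only an illustration under assumptions that go beyond the text: the two functions are a hypothetical toy model of the story, and dependence is tested by changing one input while holding earlier events fixed, in the non-backtracking spirit of Lewis’s semantics.

```python
def hiker_ducks(boulder_falls: bool) -> bool:
    """Toy assumption: Hiker ducks exactly when the boulder comes down."""
    return boulder_falls


def hiker_survives(boulder_falls: bool, ducks: bool) -> bool:
    """Toy assumption: Hiker dies only if the boulder falls and he fails to duck."""
    return (not boulder_falls) or ducks


# Step 1: does the ducking depend on the boulder?
step1 = hiker_ducks(True) != hiker_ducks(False)

# Step 2: does survival depend on the ducking, holding the boulder fixed?
step2 = hiker_survives(True, True) != hiker_survives(True, False)

# End to end: does survival depend on the boulder?
actual = hiker_survives(True, hiker_ducks(True))
without_boulder = hiker_survives(False, hiker_ducks(False))
end_to_end = actual != without_boulder

print(step1, step2, end_to_end)  # True True False
```

Each link of the chain exhibits dependence, yet the endpoints do not, which is the pattern that puts pressure on the assumption that causation is transitive.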

Cases of this kind, known as double prevention, have provoked various solutions, not all of which involve an attempt to “fix” the Lewisian approach. Ned Hall suggests that there are two concepts of causation, which conflict in cases like this (Hall, 2004). Alex Broadbent suggests that permitting backtracking counterfactuals in limited contexts permits introducing as a necessary condition on causation the dependence of cause on effect, which cases of this kind fail (Broadbent, 2012). But their significance remains unclear.

c. Interventionism

There is a very wide range of other approaches to the analysis of causation, given the apparent dead ends that the big ideas of regularity and counterfactual dependence have reached. Some develop the idea of counterfactual dependence, but shift the approach from conceptual analysis to something less purely conceptual, more closely related to causal reasoning, in everyday and scientific contexts, and perhaps more focused on investigating and understanding causation than producing a neat and complete theory. Interventionism is the best known of these approaches.

Interventionism starts with the idea that causation is fundamentally connected to agency: to the fact that we are agents who make decisions and do things in order to bring about the goals we have decided upon. We intervene in the world in order to make things happen. James Woodward sets out to remove the anthropocentric component of this observation, to devise a characterization of interventions in broadly objective terms, and to use this as the basis for an account of how causal reasoning works, meaning how it manages to track how the world works and thus enables us to make things happen (Woodward, 2003, 2006).

Woodward’s interests are thus focused on causal explanation in particular, trying to answer the questions of what causal explanations amount to, what information they carry, what they mean. The notion of explanation he arrived at is analyzed and unpacked in detail. The target of analysis shifts from “c causes e”, not merely to “c explains e” (which was the target of much previous work in the philosophy of explanation), but to a full paragraph of explanation of why and how the temperature in a container increases when the volume is reduced in terms of the ideal gas law and the kinetic theory of heat.

Interventionism offers a different approach to thinking about causation, and perhaps the most difficult thing for someone approaching it from the perspective of the Western philosophical canon is to work out what exactly it achieves, or aims to achieve. It does not tell us precisely what causation itself is. It may help us understand causation; but if it does, it is unclear what the upshot is: a series of interesting observations, akin to those of J. L. Austin and the ordinary language philosophers, or an operationalizable concept of causation, one that might be converted into a fully automatic causal reasoning “module” to be implemented in a robot. The latter appears to be the goal of some in the non-philosophical world, such as Judea Pearl. Such efforts are ambitious and interesting, potentially illuminating the nature of causal inference, even if this potential is yet to be fully realized, and even if their significance remains in question so long as implementation remains hard to conceive.

Perhaps what interventionist frameworks offer is a language for talking about causation more precisely. So it is with Pearl, who is also a kind of interventionist, holding that causal facts can be formally represented in diagrams called Directed Acyclic Graphs displaying counterfactual dependencies between variables (Pearl, 2009; Pearl & Mackenzie, 2018). These counterfactual dependencies are assessed against what would happen if there were an intervention, a “surgical”, hypothetical one, to alter the value of only a (or some) specified variable(s). Formulating causal hypotheses in this way is meant to offer mathematical tools for analyzing empirical data, and such tools have indeed been developed by some, notably in epidemiology. In epidemiology, the Potential Outcomes Approach, which is a form of interventionism and a relative of Woodward’s philosophical account, attracts a devoted following. The primary insistence of its followers is on the precise formulation of causal hypotheses using the language of interventions (Hernán, 2005, 2016; Hernán & Taubman, 2008), which is a little ironic, given that a basis for Woodward’s philosophical interventionism was the idea of moving away from the task of strictly defining causation. The Potential Outcomes Approach constitutes a topic of intense debate in epidemiology (Blakely, 2016; Broadbent, 2019; Broadbent, Vandenbroucke, & Pearce, 2016; Krieger & Davey Smith, 2016; Vandenbroucke, Broadbent, & Pearce, 2016; VanderWeele, 2016), and its track record of actual discoveries remains limited; its main successes have been in re-analyzing old data which was wrongly interpreted at the time, but where the mistake is either already known or no longer matters.
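The contrast between observing a variable and “surgically” intervening on it can be made concrete with the barometer example. What follows is a rough sketch under invented assumptions, not Pearl’s or Woodward’s own formalism: the three-variable graph (air pressure affecting both barometer and storm), the probabilities, and the names `prob_storm`, `do_barometer` and `see_barometer` are all hypothetical choices made for illustration.

```python
from itertools import product

# Invented numbers for a toy graph: pressure -> barometer, pressure -> storm.
P_PRESSURE_FALLS = 0.3
P_BAROMETER_FALLS = {True: 0.95, False: 0.05}  # keyed by whether pressure falls
P_STORM = {True: 0.8, False: 0.1}              # keyed by whether pressure falls


def prob_storm(do_barometer=None, see_barometer=None):
    """Probability of a storm when the barometer is set (do) or observed (see)."""
    num = den = 0.0
    for pressure, barometer, storm in product([True, False], repeat=3):
        weight = P_PRESSURE_FALLS if pressure else 1 - P_PRESSURE_FALLS
        if do_barometer is None:
            # No intervention: the barometer responds to the air pressure.
            p_b = P_BAROMETER_FALLS[pressure]
            weight *= p_b if barometer else 1 - p_b
        elif barometer != do_barometer:
            # Intervention: the arrow from pressure to barometer is cut and
            # the barometer takes the imposed value with probability 1.
            continue
        weight *= P_STORM[pressure] if storm else 1 - P_STORM[pressure]
        if see_barometer is not None and barometer != see_barometer:
            continue
        den += weight
        if storm:
            num += weight
    return num / den


# Observing a falling barometer is evidence of a storm ...
print(prob_storm(see_barometer=True) > prob_storm(see_barometer=False))
# ... but forcing the barometer up or down makes no difference to the storm.
print(abs(prob_storm(do_barometer=True) - prob_storm(do_barometer=False)) < 1e-12)
```

On this toy model the barometer is informative about the storm without being a cause of it in the interventionist sense, since manipulating it alone leaves the probability of the storm unchanged.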

If this sounds confusing, that is because it is. This is a very vibrant area of research. Those interested in interventionism are strongly advised not to confine themselves to the philosophical literature but to read at least a little of Judea Pearl’s (albeit voluminous) corpus, and engage with the epidemiological debate on the Potential Outcomes Approach. Even if it is yet to see its most concise and conceptually organized formulation on which work is ongoing, the initial lack of organization of a field of study is indicative of its ongoing development—exactly the kind of field one who is looking to make a mark, or at least a contribution, should take an interest in. Once the battle lines are drawn up, and the trenches are dug, the purpose of the entire war is called into question.

d. Probabilistic Theories

Probabilistic theories (for example: Eells, 1991; Salmon, 1993; Suppes, 1970) start with the idea that causes raise the probability of their effects. Striking a match may not always be followed by its lighting, but certainly makes it more likely; whereas coincidental antecedents, such as my scratching my nose, do not.

Probabilistic theories originate in part as an attempt to soften the excesses of regularity theories, given the absence of observable exceptionless regularities. More importantly, however, they are motivated by the observation that the world itself may be fundamentally indeterministic, if quantum physics withstands the test of time. A probabilistic theory could cope with a deterministic world as well as an indeterministic one, but a regularity theory could not. Moreover, given the shift in odds towards an indeterministic universe, the fights about regularity start to look scholastic, concerning the finer details of a superstructure whose foundations, never before critically examined, have crumbled upon exposure to the fresh air of empirical science.

Probabilistic approaches may be combined with other accounts, such as agency approaches (Price, 1991). Alternatively, probability may be taken as the primary analytical tool, and this approach has given rise to its own literature on probabilistic theories.

The first move of a probabilistic theory is to deal with the problem that effects raise the probability of other effects of a shared cause. To do so, the notion of screening off is introduced (Suppes, 1970). A cause has many effects, and conditionalizing on the cause alters their probabilities even if we hold the other effects fixed. But not so if we conditionalize on an effect while holding the cause fixed. The probability of the storm occurring, given that the air pressure does not fall, is lower than the probability given that the air pressure does fall, even if we hold fixed the falling of the barometer; and vice versa. But if we hold fixed the air pressure falling (at, say, 1 atmosphere, as in actuality) while conditionalizing on the barometer, we do not see any difference in the probability of the storm between the case where the barometer falls and the case where it does not.

To unpack this a bit, consider all the occasions on which air pressure has fallen, all those on which barometers have fallen, and all those on which storms have occurred (and barometers have been present). The problem could then be stated like this. When air pressure falls, storm occurrences are very much more common than when it does not. Moreover, storm occurrences are very much more common in cases where barometers have fallen than in cases where they (have been present but) have not. Thus it appears that both air pressure and barometers cause storms. But do they truly do so? Or is one of these a case of spurious causation?

The screening-off solution says you should proceed as follows. First, consider how things look when you hold the barometer status fixed. In cases where the barometer does not fall, but air pressure does, storm occurrences are more frequent than in cases where neither the barometer nor air pressure falls. Likewise in cases where barometers do fall. Now hold fixed air pressure status. Among cases where air pressure does not fall, storms are not more common where barometers do fall than where they do not. Likewise, among cases where air pressure does fall, storms are not more common in cases where barometers do fall than in cases where they do not.

Thus, air pressure screens off the barometer falling from the storm. Once you settle on the behavior of the air pressure, and look only at cases where the air pressure behaves in a certain way, the behavior of the barometer is irrelevant to how commonly you find storms. On the other hand, if you settle on a certain barometer behavior, the status of the air pressure remains relevant to how commonly you encounter storms.

This asymmetry determines the direction of causation. Effects raise the probability of their causes, and indeed of other effects—that is why we can perform causal inference, and can infer the impending storm from the falling barometer. But causes “screen off” their effects from each other, while effects do not: the probability of the storm stops tracking the behavior of the barometer as soon as we fix the air pressure, which screens the storm from the barometer; whereas the probability of the storm continues to track the air pressure even when we fix the barometer (and likewise for the barometer when we fix the storm).
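The screening-off asymmetry can be verified on a small worked example. This is a sketch under invented assumptions rather than anything drawn from the cited literature: the probabilities are made up, the barometer and the storm are assumed to depend only on the air pressure, and the names `joint` and `prob_storm` are hypothetical.

```python
from itertools import product

# Invented numbers: pressure is the common cause of barometer and storm.
P_PRESSURE_FALLS = 0.3
P_BAROMETER_FALLS = {True: 0.95, False: 0.05}  # keyed by whether pressure falls
P_STORM = {True: 0.8, False: 0.1}              # keyed by whether pressure falls


def joint(pressure: bool, barometer: bool, storm: bool) -> float:
    """Joint probability of one complete assignment in the toy model."""
    p = P_PRESSURE_FALLS if pressure else 1 - P_PRESSURE_FALLS
    p_b = P_BAROMETER_FALLS[pressure]
    p *= p_b if barometer else 1 - p_b
    p_s = P_STORM[pressure]
    p *= p_s if storm else 1 - p_s
    return p


def prob_storm(**given) -> float:
    """P(storm | given), where `given` may fix 'pressure' and/or 'barometer'."""
    num = den = 0.0
    for pressure, barometer, storm in product([True, False], repeat=3):
        case = {"pressure": pressure, "barometer": barometer}
        if any(case[name] != value for name, value in given.items()):
            continue
        weight = joint(pressure, barometer, storm)
        den += weight
        if storm:
            num += weight
    return num / den


# Unconditionally, a falling barometer raises the probability of a storm.
print(prob_storm(barometer=True) > prob_storm(barometer=False))

# Fix the air pressure: the barometer no longer makes any difference.
for pressure in (True, False):
    gap = prob_storm(pressure=pressure, barometer=True) - prob_storm(
        pressure=pressure, barometer=False
    )
    print(abs(gap) < 1e-12)

# Fix the barometer: the air pressure still makes a difference.
for barometer in (True, False):
    print(
        prob_storm(barometer=barometer, pressure=True)
        > prob_storm(barometer=barometer, pressure=False)
    )
```

All five checks print `True` on these numbers: the air pressure screens off the barometer from the storm, while the barometer does not screen off the air pressure.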

One major source of doubt about probabilistic theories is simply that probability and causation are different things (Gillies, 2000; Hesslow, 1976; Hitchcock, 2010). Causes may indeed raise probabilities of effects, but that is because causes make things happen, not because making things happen and raising their probabilities are the same thing. This general objection may be motivated by various counterexamples, of which perhaps the most important are chance-lowering causes.

Chance-lowering causes reduce the probability of their effects, but nonetheless cause them (Dowe & Noordhof, 2004; Hitchcock, 2004). Taking birth control pills reduces the probability of pregnancy. But it is not always a cause of non-pregnancy. Suppose that, as it happens, reproductive cycles are the cause. Or suppose that there is an illness causing the lack of pregnancy. Or suppose a man takes the pills. In such cases, provided the probability of pregnancy is not already zero, the pill may reduce the probability of pregnancy (albeit slightly), while the cause may be something else. In another well-worn example, a golfer slices a ball which veers off the course, strikes a tree, and bounces in for a hole in one. Slicing the ball lowered the probability of a hole in one but nonetheless caused it. Many attempts to deal with chance-lowering causes have been made, but none has secured general acceptance.

5. Ontological Stances

Ontological questions concern the nature of causation, meaning, in a phrase that is perhaps equally obscure, the kind of thing it is. Typically, ontological views of causation seek not only to explain the ontological status for its own sake, but to incorporate causation into a favored ontological framework.

There is a methodological risk in starting with, for example, “I’m a realist…” and then looking for a way to make sense of causation from this perspective. The risk is similar to that of a scientist who begins committed to a hypothesis and looks for a way to confirm it. This approach can be useful, leading to ingenuity in the face of discouraging evidence, and has led to some major scientific breakthroughs (such as Newtonian mechanics and germ theory, to take two quite different examples). It does not entail confirmation bias; indeed, the breakthrough cases are characterized by an obsession with the evidence that does not seem to fit, and by dissatisfaction with a weight of extant confirming evidence that might have convinced a lesser investigator. (Darwin’s sleepless nights about the male peacock’s tail amount to an example; the male peacock’s tail is a cumbersome impediment to survival, and Darwin had no rest until he found an explanation in terms of a mechanism differing from straightforward natural selection, namely, sexual selection.) However, in less gifted hands, setting out to show how your theory can explain the object of investigation carries an obvious risk of confirmation bias; indeed, sometimes it turns the activity into something that does not deserve to be called an investigation at all. Moreover, it can make for frustrating discussions.

One question about “the nature of causation” is whether causation is something that exists over and above particular things that are causally related, in any sense at all. Nominalism says no, realism says yes, and dispositionalism seeks to explain causation by realism about dispositions, which are things that nominalists would not countenance, but that are different from universals (or at least from the necessitation relation that realists endorse). Process theories offer something different again, seeking to identify a basis for causation in our current best science, thus remaining agnostic (within certain bounds) on larger metaphysical matters, and merely denying the need for causal theory to engage metaphysical resources (as do causal realism and dispositionalism) or to commit to a daunting reductive project (as does nominalism).

a. Nominalism

Nominalists believe that there is nothing (or very little) other than what Lewis calls “distinct existences” (Lewis, 1983, 1986). According to nominalism, causation is obviously not a particular thing because it recurs. So it is not a thing at all, existing over and above its particular instances.

The motivation for nominalism is the same as the motivation for regularity theories, that is, David Hume’s skeptical attack on necessary connection. The nominalist project is to show that sense can be made of causation, and knowledge of it obtained, without this notion. Ultimately, the goal is to show that (or at least to show to what extent) the knowledge that depends on causal knowledge is warranted.

Nominalism thus depends fundamentally on the success of the semantic projects discussed in the previous section. Attacks on those projects amount to attacks on, or challenges for, nominalism. They are not rehearsed here. The remainder of this section considers alternatives to nominalism.

b. Realism

Realists believe that there are real things, usually called universals, that exist in abstraction from particulars. Nominalists deny this. The debate is one of the most ancient in philosophy and this article is not the place to introduce it. Here, the topic is realism and nominalism about causation.

Realists believe that there is something often called the necessitation relation which holds between causes and effects, but not between non-causal pairs. Nominalists think that there is no such thing, but that causation is just some sort of pattern among causes and effects, for instance, that causes are always followed by their effects, distinguishing them from mere coincidences (see the subsection on regularity theories).

Before continuing, a note on the various meanings of “realism” is necessary. It is important not to confuse realism about causation (and, similarly, about laws of nature) with metaphysical realism. To be realist about something is to assert its mind-independent existence. In the case of universals, the debate is about whether they exist aside from particulars. The emphasis is on existence. In debates about metaphysical realism, the emphasis is on mind-independence. The latter is contrasted with relativist positions such as epistemic relativism, according to which there are no facts independent of a knower (Bloor, 1991, 2008), or Quine’s ontological relativity, according to which reference is relative to a frame of reference (Quine, 1969), which is best understood as either being or arising from a conceptual framework.

Nominalists may or may not be metaphysical anti-realists of one or another kind. In fact, unlike Quine (a nominalist, that is, an anti-realist about universals, and also a metaphysical anti-realist), the most prominent proponents of nominalism about causation (which is a kind of causal anti-realism) are metaphysical realists. For instance, the nominalist David Lewis believes that there is nothing (or relatively little) other than what he calls distinct existences, but he is realist about these existences (Lewis, 1984). In this area of the debate about causation, however, broad metaphysical realism is a generally accepted background assumption. The question is then whether or not causation is to be understood as some pattern of distinct existences, whether actual or counterfactual, or whether on the contrary it is to be understood as a universal: the “necessitation relation”.

The classic statements of realism about causation are by David Armstrong and Michael Tooley (Heathcote & Armstrong, 1991; Tooley, 1987). These also concern laws of nature, which, on their accounts, underlie causal relations. The core of such accounts of laws and causation is the postulation of a kind of necessity that is not logical necessity. In other words, they refuse to accept Hume’s skeptical arguments about the unintelligibility or unknowability of non-logical necessity (which are presented in the subsection on regularity theories). On Armstrong’s view, there is a second-order universal he calls the necessitation relation which relates first order universals, which are regular properties and relations such as being a massive object or having a certain velocity relative to a given frame of reference. If it is a law that sodium burns with a yellow flame, that means that the necessitation relation holds between the universals (or complexes of them) denoted by the predicates “is sodium” and “burns with a yellow flame”. Being sodium and burning necessitate a yellow flame.

Causal relations derive from the laws. The burning sodium causes there to be a yellow flame, because of the necessitation relation that holds between the universals. Where there is sodium, and it burns, there must be a yellow flame. The kind of necessity is not logical, nor is it strictly exceptionless. But there is a kind of necessity, nonetheless.

How, exactly, are the laws meant to underlie causal relations? Michael Tooley considers the relation between causation and laws, on the realist account of both, in detail (Tooley, 1987). But even if it can be answered, the most obvious question for realism about universals is what exactly they are (Heathcote & Armstrong, 1991).

For the realist account of causation, saying what universals are is particularly important. That is because the necessitation relation seems somewhat unlike other universals. Second order universals such as, for example, shape, of which particular shapes partake, are reasonably intelligible. I have a grasp on what shape is, even if I struggle to say what it is apart from giving examples of actual shapes. At least, I think I know what “shape” means. But I do not know what “necessitates” means. David Lewis puts the point in the following oft-cited passage:

The mystery is somewhat hidden by Armstrong’s terminology. He uses ‘necessitates’ as a name for the lawmaking universal N; and who would be surprised to hear that if F ‘necessitates’ G and a has F, then a must have G? But I say that N deserves the name of ‘necessitation’ only if, somehow, it really can enter into the requisite necessary connections. It can’t enter into them just by bearing a name, any more than one can have mighty biceps just by being called ‘Armstrong’. (Lewis, 1983, p. 366)

Does realism help with the problems that nominalist semantic theories encounter? One advantage of realism is that it makes semantics easy. Causal statements are made true by the obtaining, or not, of the necessitation relation between cause and effect. This relation holds between the common cause of two effects, but not between the effects; between the preempting, but not the preempted, cause and the effect. Classic problems evaporate; they are an artefact of the need arising from nominalism to analyze causation in terms of distinct events, a project that realists are too wise to entertain.

But that may, in a way, appear as cheating. For it hardly sounds any different from the pre-theoretic statement that causes cause their effects, while effects of a common cause do not cause each other, and that preempted events are not causes.

One way to press this objection is to look at whether realism assists people who face causal problems, outside of philosophical debate. When people other than metaphysicians encounter difficulties with causation, they do not typically find themselves assisted by the notion of a relation of necessitation. Lawyers may apply a counterfactual “but for” test: but for the defendant’s wrongful act, would the harm have occurred? In doing so, they are not adducing more empirical evidence, but offering a different way to approach, analyze, or think through the evidence. They do not, however, find it useful to ask whether the defendant’s wrongful act necessitated the harm. In cases where the “but for” test fails, other options have been tried, including asking whether the wrongful act made the harm more probable; and scientific evidence is sometimes adduced to confirm that there is, in general, a possibility that the wrongful act could have caused the harm. But lawyers never ask anything like: did the defendant’s act necessitate the harm? Not only would this seem far too strong for any prosecution in its right mind to introduce; it would also not seem particularly helpful. The judge would almost certainly want help in understanding “necessitate”, which in non-obvious cases is as obscure and philosophical-sounding as “cause”, and then we would be back with the various legal “tests” that have been constructed.
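The counterfactual “but for” test can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the function names `harm_occurs` and `but_for`, and the two-factor model in which the harm follows from either the defendant's act or a hypothetical backup cause. It is not a rendering of any actual legal test.

```python
def harm_occurs(act: bool, backup: bool) -> bool:
    """Toy model (an assumption): the harm occurs if either the
    defendant's act or an independent backup cause is present."""
    return act or backup


def but_for(backup: bool) -> bool:
    """Return True if the harm occurs with the act but would not have
    occurred without it, holding the backup cause fixed."""
    return harm_occurs(True, backup) and not harm_occurs(False, backup)


# No backup cause: without the act there is no harm, so the test is passed.
simple_case = but_for(backup=False)

# A backup cause would have produced the harm anyway, so the test fails,
# even if the act is what actually brought the harm about.
overdetermined_case = but_for(backup=True)

print(simple_case, overdetermined_case)
```

The second case is the kind of situation in which, as noted above, other options such as asking about raised probability have been tried.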

The realist might reply that metaphysics is not noted for its practical utility, and that the underlying metaphysical explanation for regularities and counterfactuals is the existence of a necessitation relation. Fair enough, but it is interesting that offering counterfactuals or probabilities in place of causal terms is thought to elucidate them, and that there is not a further request to elucidate the counterfactuals or probabilities; whereas there would be a further request (presumably) to explicate necessitation. Realists seem to differ not just from nominalists but from everyone else in seeing universals as explaining all these things, while not seeing any need for further explication of universals.

c. Dispositionalism

Dispositionalism is a relatively newly explored view, aiming to take a different tack from nominalism and realism (Mumford & Anjum, 2011). Dispositions are fundamental constituents of reality on this view (Mumford, 1998). Counterfactuals are to be understood in terms of dispositions (and not the other way round (Bird, 2007)). Causation may also be explained in this way, and without dog-legging through counterfactuals, which averts the problems attendant on counterfactual analyses of causation.

To cause an effect is, in essence, to dispose the effect to happen. Effects do not have to happen. But causes dispose them to. This is how their probabilities are raised. This is why, had the cause not occurred, the effect would not have occurred.

The literature on dispositionalism is relatively new and developing in the 21st century, with the position receiving a book-length formulation only in the 2010s (see Mumford & Anjum, 2011). Interested readers are invited to consult that work, which offers a much more useful introduction to the subtleties of this new approach than is effected here.

d. Process Theories

A further approach which has developed an interesting literature but which is not treated in detail in this article is the process approach. Wesley Salmon suggested that causation be identified with some physical quantity or property, which he characterized as the transmission of a “mark” from cause to effect (Salmon, 1998). This idea was critiqued and then developed by Phil Dowe, who suggested that the transmission of energy should be identified as the underlying physical quantity (Dowe, 2000). Dowe’s approach has the merits of freeing itself from the restrictions of conceptual analysis, while at the same time solving some familiar problems. Effects of a common cause transmit no energy to each other. Preempted events transmit no energy to the effects of the preempting causes, which, on the other hand, do so.

The attraction of substituting a scientific concept, or a bundle of concepts, for causation is obvious. Such treatments have proved fruitful for other pre-theoretic notions like “energy”, and they offer to fit causation, which arguably (see the subsection on Russellian Republicanism) does not appear in our best scientific accounts of reality, into a scientific worldview.

On the other hand, the account does face objections. Energy is in fact transmitted from Assassin 2’s shot to the president, as light bounces off and travels faster than a speeding bullet. Accounts like Dowe’s must be careful to specify the right physical process in order to remain plausible as accounts of causation and then to justify the choice of this particular process on some objective, and ultimately scientific, basis. There is also the problem that, in ordinary talk, we often regard absences or lacks as causes. It is my lack of organizational ability that caused me to miss the deadline. Whether absences can cause is a contested topic (Beebee, 2004; Lewis, 2004b; Mellor, 2004) and one reason for this is that they appear to be a problem for this account of causation.

6. Kantian Approaches

a. Kant Himself

Kant responded to Hume by taking further the idea that causation is not part of the objective world (Kant, 1781).

Hume argued that the only thing in the objects was regularity, and that this fell far short of fulfilling any notion of necessary connection. He further argued that our idea of necessary connection was merely a feeling of expectation. But Hume was (arguably) a realist about the world, and about the regularities it contains, even if he doubted our justification for believing in regularities and doubted that causation was anything in the world beyond a feeling we sometimes get.

Kant, however, took a different view of the world itself, of which causation is meant to be a part. His view is transcendental idealism, the view that space and time are ways in which we experience the world, but not features of the world itself. According to this view, the world exists but it is wholly unknowable. The world constrains what we experience, but what we experience does not tell us about what it is like in itself, that is, independent of how we experience it.

Within this framework, Kant was an empirical realist. That is to say, given the constraints that the noumenal world imposes on what we experience, there are facts about how the phenomenal world goes. Facts about this world are not simply “up to us”. They are partly determined by the noumenal world. But they are also partly determined by the ways we experience things, and thus we are unable to comprehend those things in themselves, apart from the ways we experience them. A moth bangs into a pane of glass, and cannot simply fly through it; the pane of glass constrains it. But clearly the moth’s perceptual modalities also constrain what kind of thing it takes the pane of glass to be. Otherwise, it would not keep flying into the glass.

Kant argued that causation is not an objective thing, but a feature of our experience. In fact, he argued that causation is essential to any kind of experience. The ordering of events in time only amounts to experience if we can distinguish within the general flow of events or of sensory experiences, some streams that are somehow connected. We see a ship on a river. We look away, and look back a while later, to see the ship further down the river (the example is discussed in the Second Analogy in Kant’s Critique of Pure Reason). Only if we can see this as the same ship, moved further along the river, can we see this as a ship and a river at all. Otherwise it is just a series of frames, no more comprehensible than a row of impressionist paintings in an art gallery.

Kant used causation as the exemplar of a treatment he extended to shape, number, and various other apparent features of reality which, in his view, are actually fundamental elements of the structure of experience. His argument that causation is a necessary component of all experience is no longer compelling. It seems that very young children have experience, but not much by way of a concept of causation. Some animals may be able to reason causally, but some clearly cannot, or at least cannot to any great extent. It is a separate question whether they have experience, and some seem to. Thus he seems to have over-extended his point. On the other hand, the insight that there is a fundamental connection between causation and some aspect of us and our engagement with the world may have something to it, and this has subsequently attracted considerable attention.

b. Agency Views

On agency views of causation, the fact that we are agents is inextricably tied up with the fact that we have a causal concept, think causally, make causal judgements, and understand the world as riddled with causality. Agents have goals, and seek to bring them about, through exercising what at least to them seems like their free will. They make decisions, and they do something about them.

Agency theories have trouble providing answers to certain key questions, which renders them very unpopular. If a cause is a human action, then what of causes that are not human actions, like the rain causing the dam to flood? If such events are causes by analogy, then that prompts troublesome questions for agency theories: in what respect are things like rain analogous to human actions? Did someone or something decide to “do” the rain? If not, then in what does the analogy consist?

The most compelling response to these questions lies in the work of Huw Price, beginning with a paper he co-wrote with Peter Menzies (Menzies & Price, 1993). They argue that causation is (or is like) a secondary property, like color. Light comes in various wavelengths, some of which we can perceive. We are able to differentiate among wavelengths to some level of accuracy. This differentiated perception is what we call “color”. We see color, not “as” wavelengths of light (whatever exactly that would be), but as a property of the things off which light bounces or from which it emanates. Color is thus not just a wavelength of light: it is a disposition that we have to react in a certain way to a wavelength of light; alternatively, it is a disposition of light to provoke a certain reaction in us.

Causation, they suggest, is a bit like this. It has some objective basis in the world, but it also depends on us. It is mediated not by our being conscious beings, as in the case of color, but by our being agents. Certain patterns of events in the world, or at least certain features of the world, produce a “cause-effect” response in us. We cannot just choose what causes what. At the same time, this response is not reducible to features of the world alone; our agency is part of the story.

This approach deals with the anthropomorphism objection by retaining the objective basis of causes and effects, while confirming that the interpretation of this objective basis as causal is contributed by us due to the fact that we are agents.

This approach is insufficiently taken up in the literature, and there is not a well-developed literature of objections and responses, beyond the point that the approach remains suggestive and not completely made out. Price has argued for a perspectivalism about causation, arguing that entropic or other asymmetries account for the asymmetries that we project onto time and counterfactual dependence.

Yet this is a sensible direction of exploration, given our inability to observe causation in objects, and our apparent failure to find an objective substitute. It departs from the kind of realism that is dominant in early twenty-first century philosophy of causation, but perhaps that departure is due.

7. Skepticism

a. Russellian Republicanism

Bertrand Russell famously argued that causation was “a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm” (Russell, 1918). He advanced arguments against the Millian regularity view of causation that was dominant at the time, one of which is the unrepeatability objection discussed above in this article. But his fundamental point is a simple one: our theories of the fundamental nature of reality have no place for the notion of cause.

One response is simply to deny this, and to point out that scientists do use causal language all the time. It is however doubtful that this defense deflects the skeptical blow. Whether or not physicists use the word “cause”, there is nothing like causation in the actual theories which are expressed by equations. As Judea Pearl points out, mathematical equations are symmetrical (Pearl & Mackenzie, 2018). You can rearrange them to solve for different variables. They still say the same thing, in all their arrangements. They express a functional relationship between variables. Causation, on the other hand, is asymmetric. The value of the causal variable(s) sets the value of the effect variable(s). In a mathematical equation, however, “setting” is universal. If one changes the value of the pressure in a fixed mass of gas, then, according to the ideal gas law, either volume or temperature must change (or both). But there is no way to increase the pressure except through adjusting the volume or temperature. The equations do not tell us that.
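The symmetry point can be illustrated with a small sketch. It assumes the ideal gas law in the familiar form PV = nRT; the function names and the numerical values are illustrative assumptions only. Each function is a rearrangement of the same equation, and nothing in the arithmetic marks any one variable as the one that gets “set” by the others.

```python
import math

R = 8.314  # molar gas constant in J/(mol*K), rounded


def solve_for_pressure(n, volume, temperature):
    # P = nRT / V
    return n * R * temperature / volume


def solve_for_volume(n, pressure, temperature):
    # V = nRT / P
    return n * R * temperature / pressure


def solve_for_temperature(n, pressure, volume):
    # T = PV / (nR)
    return pressure * volume / (n * R)


# Illustrative values: 1 mol of gas, volume in cubic metres, temperature in kelvin.
n, V, T = 1.0, 0.0224, 273.15
P = solve_for_pressure(n, V, T)

# Rearranging and solving "backwards" recovers the starting values.
V_back = solve_for_volume(n, P, T)
T_back = solve_for_temperature(n, P, V)

print(math.isclose(V_back, V), math.isclose(T_back, T))
```

All three functions encode the same functional relationship. The further fact that pressure can be raised only by way of volume or temperature would have to be added as separate, directional information.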

A better objection might pick up on this response by saying that this example shows that there are causal facts. If physics does not capture them, then it should.

This response is not particularly plausible at a fundamental level, where the prospect of introducing such an ill-defined notion as cause into the already rather strange world of quantum mechanics is not appealing. But it might be implemented through a reductionist strategy. Huw Price offers something like this, suggesting that certain asymmetries, notably the arrow of time, might depend jointly upon our nature as agents and the temporal asymmetry of the universe. Such an approach is compatible with Russell’s central insight but dispenses with his entertaining, overly enthusiastic dismissal of the utility of causation. Causation remains useful, despite being non-fundamental; and its usefulness can be explained. This is perhaps the most promising response to Russell’s observation, and one that deserves more attention and development in the literature.

b. Pluralism and Thickism

Pluralists believe that there is no single concept of causation, but a plurality of related concepts which we lump together under the word “causation” (Anscombe, 1971; Cartwright, 1999). This view tends to go hand-in-hand with a refusal to accept a basic premise of Hume’s challenge, which is that we do not observe causation. We do observe causation, say the objectors. We see pushes, kicks, and so forth. Therefore, they ask, in what sense are we not observing causation?

This line of thought is compelling to some but somewhat inscrutable to many, who remain convinced that pushes and kicks look just the same as coincidental sequences like the sun coming out just before the ball enters the goal or the shopping cart moves—until we have learned, from experience, that there is a difference. Thus, most remain convinced that Hume’s challenge needs a fuller answer. Most also agree with Hume that there is something that causes have in common, and that one needs to understand this if one is to distinguish the kicks and pushes of the world from the coincidences.

A related idea, a form of pluralism, one might call thickism. In an ethical context, some have proposed the existence of “thick” ethical concepts characterized by their irreducibility into an evaluative and factual component. (This is a critique of another Humean doctrine, the fact-value distinction.) Thus, generosity is both fundamentally good and fundamentally an act of giving. It is not a subset of acts of giving defined as those which are good; some of these might be rather selfish, but better than nothing; others might be gifts of too small a kind to count as generous; others might be good for other reasons, because they bring comfort rather than because they are generous (bringing a bunch of flowers to a sick person is an act of kindness but not really generosity). Generosity is thick.

The conclusion one might draw from the existence of thick concepts is that there is not (or not necessarily) a single property binding all the thick concepts together, and thus that it is fruitless to try to identify or analyze it. Similar remarks might be applied to causes. Transitive verbs are commonly causal. To push the cart along is not analyzable into, say, to move forward and at the same time to cause the cart to move. One could achieve this by having a companion push the cart when you move forward, and stop when you stop. Pushes (in the transitive sense) are causal, but the causal element cannot be extracted for analysis.

Against this contention is the point that, in a practical context, the extraction of causation seems exactly what is at issue. In the statistics-driven sciences, in law, in policy decisions, the non-causal facts seem clear, but the causal facts not. The question is exactly whether the non-causal facts are accompanied by causation. There does seem to be an important place in our conceptual framework for a detached concept of cause, because we apply that concept beyond the familiar world of kicks and pushes. As for those familiar causes, the tangling up of a kind of action with a cause hardly shows that there is no distinction between causes and non-causes. If we do not call a push-like action a cause on one occasion (when my friend pushes the cart according to my movements) while we do on another (when I push the cart), this could just as easily be taken to show that we need a concept of causation to distinguish pushing from mere moving forward.

8. References and Further Reading

Anscombe, G. E. M. (1958). Modern moral philosophy. Philosophy, 33(124), 1–19.
Anscombe, G. E. M. (1969). Causality and Extensionality. The Journal of Philosophy, 66(6), 152–159.
Anscombe, G. E. M. (1971). Causality and Determination. Cambridge: Cambridge University Press.
Armstrong, D. (1983). What Is a Law of Nature? Cambridge: Cambridge University Press.
Beebee, H. (2004). Causing and Nothingness. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 291–308). Cambridge, Massachusetts: MIT Press.
Bennett, J. (2001). On Forward and Backward Counterfactual Conditionals. In G. Preyer, & F. Siebelt (Eds.), Reality and Humean Supervenience (pp. 177–203). Maryland: Rowman and Littlefield.
Bennett, J. (2003). A Philosophical Guide to Conditionals. Oxford: Oxford University Press.
Bird, A. (2007). Nature’s Metaphysics. Oxford: Oxford University Press.
Blakely, T. (2016). DAGs and the restricted potential outcomes approach are tools, not theories of causation. International Journal of Epidemiology, 45(6), 1835–1837.
Bloor, D. (1991). Knowledge and Social Imagery (2nd ed.). Chicago: University of Chicago Press.
Bloor, D. (2008). Relativism at 30,000 feet. In M. Mazzotti (ed.), Knowledge as Social Order: Rethinking the Sociology of Barry Barnes (pp. 13–34). Aldershot: Ashgate.
Broadbent, A. (2012). Causes of causes. Philosophical Studies, 158(3), 457–476. https://doi.org/10.1007/s11098-010-9683-0
Broadbent, A. (2016). Philosophy for Graduate Students: Core Topics from Metaphysics and Epistemology. https://doi.org/10.4324/9781315680422
Broadbent, A. (2019). The C-word, the P-word, and realism in epidemiology. Synthese. https://doi.org/10.1007/s11229-019-02169-x
Broadbent, A., Vandenbroucke, J. P., & Pearce, N. (2016). Response: Formalism or pluralism? A reply to commentaries on “causality and causal inference in epidemiology.” International Journal of Epidemiology, 45(6), 1841–1851. https://doi.org/10.1093/ije/dyw298
Cartwright, N. (1983). Causal Laws and Effective Strategies. Oxford: Clarendon Press.
Cartwright, N. (1999). The Dappled World: A Study of the Boundaries of Science. Cambridge: Cambridge University Press.
Cartwright, N. (2007). Hunting Causes and Using Them: Approaches in Philosophy and Economics. New York: Cambridge University Press.
Dowe, P. (2000). Physical Causation. Cambridge: Cambridge University Press.
Dowe, P., & Noordhof, P. (2004). Cause and Chance: Causation in an Indeterministic World. London: Routledge.
Eells, E. (1991). Probabilistic Causality. Cambridge: Cambridge University Press.
Elga, A. (2000). Statistical Mechanics and the Asymmetry of Counterfactual Dependence. Philosophy of Science (Proceedings), 68(S3), S313–S324.
Forsyth, F. (1971). The Day of the Jackal. London: Hutchinson.
Garrett, D. (2015). Hume’s Theory of Causation. In D. C. Ainslie, & A. Butler (Eds.), The Cambridge Companion to Hume’s Treatise (pp. 69–100). https://doi.org/10.1017/CCO9781139016100.006
Gillies, D. (2000). Philosophical Theories of Probability. London: Routledge.
Hall, N. (2004). Two Concepts of Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 225–276). Cambridge, Massachusetts: MIT Press.
Hausman, D. (1998). Causal Asymmetries. Cambridge: Cambridge University Press.
Heathcote, A., & Armstrong, D. M. (1991). Causes and Laws. Noûs, 25(1), 63–73. https://doi.org/10.2307/2216093
Hernán, M. A. (2005). Invited Commentary: Hypothetical Interventions to Define Causal Effects—Afterthought or Prerequisite? American Journal of Epidemiology, 162(7), 618–620.
Hernán, M. A. (2016). Does water kill? A call for less casual causal inferences. Annals of Epidemiology, 26(10), 674–680.
Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If. Retrieved from https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
Hernán, M. A., & Taubman, S. L. (2008). Does obesity shorten life? The importance of well-defined interventions to answer causal questions. International Journal of Obesity, 32, S8–S14.
Hesslow, G. (1976). Two Notes on the Probabilistic Approach to Causality. Philosophy of Science, 43(2), 290–292.
Hiddleston, E. (2005). A Causal Theory of Counterfactuals. Noûs, 39(4), 632–657.
Hitchcock, C. (2004). Routes, processes and chance-lowering causes. In P. Dowe, & P. Noordhof (Eds.), Cause and Chance (pp. 138–151). London: Routledge.
Hitchcock, C. (2010). Probabilistic Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/fall2010/entries/causation-probabilistic/
Hume, D. (1748). An Enquiry Concerning Human Understanding (1st ed.). London: A. Millar.
Kant, I. (1781). The Critique of Pure Reason (1st ed.).
Krieger, N., & Davey Smith, G. (2016). The ‘tale’ wagged by the DAG: broadening the scope of causal inference and explanation for epidemiology. International Journal of Epidemiology, 45(6), 1787–1808. https://doi.org/10.1093/ije/dyw114
Lewis, D. (1973a). Causation. Journal of Philosophy, 70(17), 556–567.
Lewis, D. (1973b). Counterfactuals. Cambridge, Massachusetts: Harvard University Press.
Lewis, D. (1973c). Counterfactuals and Comparative Possibility. Journal of Philosophical Logic, 2(4), 418–446.
Lewis, D. (1979). Counterfactual Dependence and Time’s Arrow. Noûs, 13(4), 455–476.
Lewis, D. (1983). New Work for a Theory of Universals. Australasian Journal of Philosophy, 61(4), 343–377.
Lewis, D. (1984). Putnam’s Paradox. Australasian Journal of Philosophy, 62(3), 221–236.
Lewis, D. (1986). Philosophical Papers (vol. II). Oxford: Oxford University Press.
Lewis, D. (2004a). Causation as Influence. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 75–106). Cambridge, Massachusetts: MIT Press.
Lewis, D. (2004b). Void and Object. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 277–290). Cambridge, Massachusetts: MIT Press.
Lipton, P. (2000). Tracking Track Records. Proceedings of the Aristotelian Society ― Supplementary Volume, 74(1), 179–205.
Mackie, J. (1974). The Cement of the Universe. Oxford: Oxford University Press.
Mellor, D. H. (1995). The Facts of Causation. Abingdon: Routledge.
Mellor, D. H. (2004). For Facts As Causes and Effects. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 309–324). Cambridge, Massachusetts: MIT Press.
Menzies, P., & Price, H. (1993). Causation as a Secondary Quality. The British Journal for the Philosophy of Science, 44(2), 187–203.
Mill, J. S. (1882). A System of Logic, Ratiocinative and Inductive (8th ed.). New York and Bombay: Longman’s, Green, and Co.
Mumford, S. (1998). Dispositions. Oxford: Oxford University Press.
Mumford, S., & Anjum, R. L. (2011). Getting Causes from Powers. London: Oxford University Press.
Paul, L. A. (2004). Aspect Causation. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 205–223). Cambridge, Massachusetts: MIT Press.
Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). Cambridge: Cambridge University Press.
Pearl, J., & Mackenzie, D. (2018). The Book of Why. New York: Basic Books.
Price, H. (1991). Agency and Probabilistic Causality. The British Journal for the Philosophy of Science, 42(2), 157–176.
Quine, W. V. (1969). Ontological Relativity and Other Essays. New York: Columbia University Press.
Rips, L. J. (2010). Two Causal Theories of Counterfactual Conditionals. Cognitive Science, 34(2), 175–221. https://doi.org/10.1111/j.1551-6709.2009.01080.x
Rubin, D. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology, 66(5), 688–701.
Russell, B. (1918). On the Notion of Cause. London: Allen and Unwin.
Salmon, W. C. (1993). Probabilistic Causality. In E. Sosa, & M. Tooley (Eds.), Causation (pp. 137–153). Oxford: Oxford University Press.
Salmon, W. C. (1998). Causality and Explanation. Oxford: Oxford University Press.
Schaffer, J. (2004). Trumping Preemption. In J. Collins, N. Hall, & L. A. Paul (Eds.), Causation and Counterfactuals (pp. 59–74). Cambridge, Massachusetts: MIT Press.
Schaffer, J. (2007). The Metaphysics of Causation. Stanford Encyclopedia of Philosophy. Retrieved from https://plato.stanford.edu/archives/win2007/entries/causation-metaphysics/
Stapleton, J. (2008). Choosing What We Mean by “Causation” in the Law. Missouri Law Review, 73(2), 433–480. Retrieved from https://scholarship.law.missouri.edu/mlr/vol73/iss2/6
Suppes, P. (1970). A Probabilistic Theory of Causality. Amsterdam: North-Holland.
Tooley, M. (1987). Causation: A Realist Approach. Oxford: Clarendon Press.
Vandenbroucke, J. P., Broadbent, A., & Pearce, N. (2016). Causality and causal inference in epidemiology: the need for a pluralistic approach. International Journal of Epidemiology, 45(6), 1776–1786. https://doi.org/10.1093/ije/dyv341
VanderWeele, T. J. (2016). Commentary: On Causes, Causal Inference, and Potential Outcomes. International Journal of Epidemiology, 45(6), 1809–1816.
Woodward, J. (2003). Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
Woodward, J. (2006). Sensitive and Insensitive Causation. The Philosophical Review, 115(1), 1–50.

Author Information

Alex Broadbent
Email: abbroadbent@uj.ac.za
University of Johannesburg
Republic of South Africa

Kit Fine (1946—)

Kit Fine is an English philosopher who is among the most important philosophers of the turn of the millennium. He is perhaps most influential for reinvigorating a neo-Aristotelian turn within contemporary analytic philosophy. Fine’s prolific work is characterized by a unique blend of logical acumen, respect for appearances, ingenious creativity, and originality. His vast corpus is filled with numerous significant contributions to metaphysics, philosophy of language, logic, philosophy of mathematics, and the history of philosophy.

Although Fine is well-known for favoring ideas familiar from the neo-Aristotelian tradition (such as dependence, essence, and hylomorphism), his work is most distinctive for its methodology. Fine’s general view is that metaphysics is not best approached through the study of language. Roughly put, Fine’s approach focuses on providing a rigorous account of the apparent phenomena themselves, and not just how we represent them in language or thought, prior to any attempt to discern the reality underlying them. Furthermore, a strong and ecumenical respect for the intelligible options demands patience for the messy details, even when they resist tidying or systematization. All this leads to a steadfastness in refusing to allow epistemic qualms about how we know what we seem to know to interfere with our attempts to clarify just what it is that we seem to know.

This article surveys the wide variety of Fine’s contributions to philosophy, and it conveys what Fine’s distinctive methodology is and how it informs his contributions to philosophy.

Biography
Fine Philosophy
Metaphysics
1. Modality
2. Essence
3. Ontology
4. Mereology
5. Realism
6. Ground
7. Tense
Philosophy of Language
Logics and Mathematics
History
References and Further Reading

1. Biography

Fine was born in England on March 26, 1946. He earned a B.A. in Philosophy, Politics, and Economics at the University of Oxford in 1967. He was then appointed to a position at the University of Warwick. There he was mentored by Arthur Prior. Although Fine was never enrolled in a graduate program, his Ph.D. thesis For Some Proposition and So Many Possible Worlds was examined and accepted by William Kneale and Dana Scott just two years later.

Since then, Fine has held numerous academic appointments, including at: University of Warwick; St John’s College, University of Oxford; University of Edinburgh; University of California, Irvine; University of Michigan, Ann Arbor; and University of California, Los Angeles. Fine joined New York University’s philosophy department in 1997, where he is now Silver Professor and University Professor of Philosophy and Mathematics. He is currently also a Distinguished Research Professor at the University of Birmingham. Fine also held visiting positions at: Stanford University; University of Toronto; University of Arizona; Australian National University; University of Melbourne; Princeton University; Harvard University; New York University at Abu Dhabi; University of Aberdeen; and All Souls College, University of Oxford.

He has served the profession as an editor or an editorial board member of Synthese; The Journal of Symbolic Logic; Notre Dame Journal of Formal Logic; and Philosophers’ Imprint.

Fine’s contributions to philosophy have been recognized by numerous awards, including a Guggenheim Foundation Fellowship, American Council of Learned Societies Fellowship, Fellow of the American Academy of Arts and Sciences, Fellow at the National Center for the Humanities, Corresponding Fellow at the British Academy, an Anneliese Maier Research Award from the Alexander von Humboldt Foundation, and a Leibowitz Award (with Stephen Yablo).

Fine’s corpus is enormous. By mid-2020 he had published over 130 journal articles and 5 books. At least half a dozen articles and 8 monographs are forthcoming. His work has both great breadth and great depth, spanning many core areas of philosophy and engaging its topics with great erudition and technical sophistication. His trailblazing work is highly original, rarely concerned with wedging into topical or parochial debates but rather with making novel advances to the field in creative and unexpected ways. This article de-emphasizes his technical contributions, and it focuses upon his more distinctive or influential work.

2. Fine Philosophy

When engaging with the work of any prolific philosopher exhibiting great breadth and originality, it is tempting to look for some core “philosophical attractors” that animate, unify, or systematize their work. These attractors may then serve as useful aids to understanding their work and highlighting its most distinctive features.

Perhaps the most familiar form a philosophical attractor might take is that of a doctrine. These “doctrinal attractors” are polarized, pulling in some views while repelling others. Their “magnetic” tendencies are what systematize a thinker’s thought. In the history of modern philosophy, two obvious examples are Spinoza and Leibniz. Their commitment to the principle of sufficient reason, the doctrine that everything has a reason or cause, underwrites vast swaths of their respective philosophies (Spinoza 1677; Leibniz 1714). A good example in the twentieth century is David Lewis. One can scarcely imagine understanding Lewis’s philosophy without placing at its core the doctrines of Humean supervenience and modal realism (Lewis 1986).

Another form a philosophical attractor might take is that of a methodology. These methodological attractors are also polarized, but they exert their force less on views and more on which data to respect and which to discard, which distinctions to draw and which to ignore, how weighty certain considerations should be or not, and the like. Hume is an example in the history of modern philosophy. His commitment to respecting only that which makes an observable difference guides much of his philosophy (Hume 1739). Saul Kripke is an example in the twentieth century. One can scarcely imagine understanding his philosophy without placing at its core a respect for common sense and intuitions about what we should say of actual and counterfactual situations (Kripke 1972).

There is no question that Fine is well-known for his association with certain doctrines or topics. These include: actualism, arbitrary objects, essentialism, ground, hylomorphism, modalism, procedural postulationism, semantic relationism, (formerly) supervaluationism, three-dimensionalism, and truthmaker semantics. But as important as these may be to understanding Fine’s work, they do not serve individually or jointly as doctrinal attractors in the way that, for example, Humean supervenience or modal realism did so vividly for Lewis.

Instead, Fine’s work is better understood in terms of a distinctive “Finean” cluster of methodological attractors. Fine himself has not spelled out the details of the cluster explicitly. But some explicit discussion of it can be found in his early work (1982c: §A2). There are also discussions suggestive of the cluster scattered across many of his later works. But perhaps the strongest impression emerges by osmosis from sustained engagement with a range of his work.

The Finean cluster may be roughly summarized by the following methodological “directives”:

1. Provide a rigorous account of the appearances first before trying to discern the reality underlying them.
2. Focus on the phenomenon itself and not just how we represent or express it in language or thought.
3. Respect what’s at issue by not allowing worries about what we can mean to prevent us from accepting the intelligibility of notions that strike us as intelligible.
4. Be patient with the messy details even when they resist tidying or systematization.
5. Don’t allow epistemic worries about how we know what we seem to know to interfere with or distract us from clarifying what it is that we seem to know.

Some of these directives interact or overlap. Even so, separating them helps highlight their different emphases. Bearing them in mind both individually and jointly is crucial to understanding Fine’s distinctive approach to the vast array of topics covered in his work.

Sometimes the influence of the directives is rather explicit. For example, the first directive clearly influences Fine’s views on realism and the nature of metaphysics. Implicit in this directive is a distinction between appearance and reality. Fine suggests that each is the focus of its own branch of metaphysics. Naïve metaphysics studies the appearances whereas foundational metaphysics studies their underlying reality. Because we have not yet achieved rigorous clarification of the appearances, Fine believes it would be premature to investigate the reality underlying them.

Other times, however, the directives exert their influence in more implicit ways. To illustrate, consider the first directive’s emphasis on providing a rigorous account of the appearances. Although Fine’s tremendous technical skill is clear in his work in formal logic, it also suffuses his philosophical work. Claims or ideas are often rigorously formalized in appendices or sometimes in the main text. Even when Fine’s prose is informal at the surface, it is evident that his technical acuity and logical rigor support it from beneath.

The second directive is perhaps most evident in Fine’s focus on the phenomena. Even in our post-positivistic times, some philosophers still lose their nerve when attempting to do metaphysics and, instead, retreat to our language or thought about it. An aversion to this is implicit throughout Fine’s work. Sometimes Fine makes his aversion explicit (2003a: 197):

…in this paper…I have been concerned, not with material things themselves, but with our language for talking about material things. I feel somewhat embarrassed about writing such a strongly oriented linguistic paper in connection with a metaphysical topic, since it is my general view that metaphysics is not best approached through the study of language.

Behind Fine’s remarks is a view that the considerations relevant to language often differ from those relevant to its subject matter. Only confusion can result from this sort of mismatch. So Fine’s apology is perhaps best explained by his unapologetic insistence that our interest is in the phenomena. However esoteric or unruly they may be, we should boldly resist swapping them out for the pale shadows they cast in language or thought.

The third directive is implicit in Fine’s frequent objections to various doctrines for not properly respecting the substantiveness, or even intelligibility, of certain positions. To illustrate, Fine defends his famous counterexamples against modal conceptions of essence by applying the third directive (1994b: 5):

Nor is it critical to the example that the reader actually endorse the particular modal and essentialist claims to which I have made appeal. All that is necessary is that he should recognize the intelligibility of a position which makes such claims.

Even if the claims are incorrect, their intelligibility is still enough to establish that there is a genuine non-modal conception of essence. Considerations like these illustrate Fine’s ecumenical approach. But this ecumenicity does not imply that anything goes, as Fine makes clear elsewhere when discussing fundamentality (2013a: 728):

Of course, we do not want to be able to accommodate any old position on what is and is not fundamental. The position should be coherent and it should perhaps have some plausibility. It is hard to say what else might be involved, but what seems clear is that we should not exclude a position simply on the grounds that it does not conform to our theory…

There appears to be a sort of humility driving Fine’s applications of the third directive. Philosophy aspires to the highest standards of clarity, precision, and rigor. This is why philosophical progress is so hard to achieve, and so modest when it is achieved. Thus, at least at this early stage of inquiry, there is a sort of arrogance in justifying one’s disregard for certain positions by appealing to one’s doctrinal commitments. Perhaps this also explains the scarcity of doctrinal attractors in Fine’s work.

The fourth directive often manifests in Fine’s work as an openness—perhaps even a fondness—for drawing many subtle distinctions. To some extent, this is explained by Fine’s keen eye for detail and his respect for nuance. But a deeper rationale derives from an interaction between the first two directives. For if these subtle distinctions belong to the appearances, then we must ultimately expect a rigorous account of the latter to include the former. This helps explain Fine’s patient and sustained interest in these distinctions, even when they resist analysis, raise difficulties of their own, or are just unpopular.

The fifth directive helps explain what might otherwise seem like a curious gap in Fine’s otherwise broad corpus. With only a few exceptions (2005d; 2018a), Fine has written little directly on epistemology. When Fine’s work indirectly engages epistemology, it is often with ambivalence. And epistemic considerations rarely play any serious argumentative role. For example, one scarcely finds him ever justifying a claim by arguing that it would be easier to know than its competitors. Fine’s distance from epistemic concerns does not stem from any disdain for them. It rather stems from the influence of the other directives. It would be premature to attempt to account for our knowledge of the appearances prior to providing a rigorous account of what they are. As Fine has sometimes quipped in conversation, “Metaphysics first, epistemology last”.

3. Metaphysics

Fine is widely regarded as having played a pivotal role in the recent surge of interest in broadly neo-Aristotelian metaphysics. It is, however, not easy to say just what neo-Aristotelian metaphysics is. One might characterize it as a kind of resistance to the “naturalistic” approaches familiar in much of late 20th century metaphysics. Granted, it is not straightforward how those approaches fit within the Aristotelian tradition. But the complexities of Aristotle’s own approach to metaphysics and the natural world suggest that any such characterization is, at best, clumsy and oversimplified. Another characterization of neo-Aristotelian metaphysics might associate it with certain distinctive topics, including essence, substance, change, priority, hylomorphism, and the like. Granted, these topics do animate typical examples of neo-Aristotelian metaphysics. But it is also clear that these topics are not its exclusive property. Perhaps the best way to characterize neo-Aristotelian metaphysics is to engage with the metaphysics of one of its most influential popularizers and practitioners in contemporary times: Kit Fine.

What is metaphysics? Fine believes it is the confluence of five features (2011b). First, the subject of metaphysics is the nature of reality. But physics, mathematics, aesthetics, epistemology, and many other areas of inquiry are also concerned with the nature of reality. What distinguishes metaphysics from them is its aim, its methods, its scope, and its concepts. The aim of metaphysics is to provide a foundation for what there is. The method of metaphysics is characteristically a priori. The scope of metaphysics is as general as can be. And the concepts of metaphysics are transparent in the sense that there is no “gap” between the concept itself and what it is about.

The distinction between appearance and reality plays a prominent role in Fine’s conception of metaphysics (1982c: §A2; 2017b). Given such a distinction, one aim of metaphysics is to characterize how things are in reality. In Aristotelian fashion, this begins with the appearances. We start with how things appear, and the task is then to vindicate the appearances as revelatory of the underlying reality, or else to explain away the appearances in terms of some altogether different underlying reality. Both the revelatory and the reductionist projects presuppose the appearances, and so it is vital to get straight on what they are first. Fine calls this project naïve metaphysics. Only once adequate progress has been made on the naïve metaphysics of a subject will we be in a position to consider how it relates to fundamental reality. Fine calls this second project foundational metaphysics. Much of Fine’s work in metaphysics is best regarded as contributing to the naïve metaphysics of various topics (modality, part/whole, persistence) or to clarifying what conceptual tools (essence, reality, ground) will be needed to relate naïve metaphysics to foundational metaphysics. As Fine puts it (2017b: 108):

In my own view, the deliverances of foundational metaphysics should represent the terminus of philosophical enquiry; and it is only once we have a good handle on the corresponding questions within naïve metaphysics, with how things appear, that we are in any position to form an opinion on their reality.

Fine often suggests doubts about our having made anywhere near enough progress in naïve metaphysics to embark yet on foundational metaphysics. Because Fine suspects it would be premature to pursue foundational metaphysics at this early (for philosophy!) stage of inquiry, one should resist interpreting his work as pronouncing upon the ultimate nature of reality or the like. These sentiments are microcosmic embodiments of the five directives characterizing Fine’s philosophical approach.

a. Modality

Much of Fine’s earliest work focused on technical questions within formal logic, especially modal logic. But in the late 1970s, Fine’s work began increasingly to consider applications of formal methods (especially the tools of modal logic) to the philosophy of modality. This shift produced a variety of influential contributions.

One of Fine’s earliest contributions to modality was to develop an ontological theory of extensional and intensional entities (1977b). The theory assumes a familiar possible worlds account of its intensional entities: properties are sets of world-individual pairs, propositions are sets of worlds, and so on. This approach is often taken to disregard any internal “structure” in the entities for which it accounts. But Fine resourcefully argues that a great deal of “structure” may still be discerned, including existence, being qualitative, being logical, having individual constituents, and being essentially modal. This work, together with Fine’s developments of Prior’s form of actualism (1977a), prefigured the recent debate between necessitists who assert that necessarily everything is necessarily something and contingentists who deny this (Williamson 2013b).

Fine continued exploring the applications of modal logic in the work that followed. The technical development of first-order modal theories is explored in one trio of papers (1978a; 1978b; 1981b). A second trio of papers explores applications of first-order modal theories to the formalization of various metaphysical theories of sets (1981a), propositions (1980), and facts (1982b). The second trio contains a wealth of distinctions and arguments. Some of them, with the benefit of hindsight, prefigure what would later become some of Fine’s more influential ideas.

For one example, the formalizations in 1981a are explicitly intended to capture plausible essentialist views about the identity or nature of sets. It is not difficult to view some of Fine’s remarks in this paper as anticipating his later celebrated set-theoretic counterexamples to the modal theory of essence (1994b).

For another example, 1982b argues against the still-common view that identifies facts with true propositions. The proposition that dogs bark exists regardless of whether dogs bark, whereas the fact that dogs bark exists only if they do.

In discussing these and related topics, Fine also introduced a general argumentative strategy against a variety of controversial metaphysical views. To illustrate, consider a modal variant of the preceding view that identifies possible facts with possibly true propositions. Suppose possible objects are abstracta. If a possible object is thus-and-so, then possibly it is actually thus-and-so. In particular, a possible donkey is possibly an actual donkey. Now, an actual donkey is a concrete object. So, we then have an abstract object (a possible donkey) that is possibly concrete. But no abstractum is possibly concrete. And so not all possible objects are abstracta. This sort of argument can also be used to show that possible facts are not propositions and that possible worlds are not abstract.
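The steps of this argument can be regimented as a reductio. This is only a sketch: the predicate letters P (is a possible object), A (is abstract), C (is concrete), D (is a donkey), and the name d for an arbitrary possible donkey are labels assumed here for illustration, and reading premise 3 as a necessitated conditional is likewise an assumption of the sketch rather than Fine’s own formulation.

```latex
\begin{align*}
&1.\ \forall x\,(Px \rightarrow Ax) && \text{supposition: possible objects are abstracta}\\
&2.\ Pd \wedge \Diamond Dd && \text{$d$ is a possible donkey, so possibly an actual donkey}\\
&3.\ \Box\,(Dd \rightarrow Cd) && \text{necessarily, if $d$ is a donkey then $d$ is concrete}\\
&4.\ \Diamond Cd && \text{from 2 and 3}\\
&5.\ Ad && \text{from 1 and 2}\\
&6.\ \forall x\,(Ax \rightarrow \neg\Diamond Cx) && \text{no abstractum is possibly concrete}\\
&7.\ \neg\Diamond Cd && \text{from 5 and 6}\\
&8.\ \neg\forall x\,(Px \rightarrow Ax) && \text{from 4 and 7, discharging 1}
\end{align*}
```

The same schema carries over to possible facts and possible worlds by changing what P, D, and C stand for.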

Fine’s work on modality is animated by a commitment to modal actualism (see his introduction to 2005b). This combines two theses. The first, modalism, is that modal notions are intelligible and irreducible to non-modal notions. The second, actualism, is that actuality is prior to mere possibility.

One of modalism’s most infamous detractors was Quine. Fine provides detailed reconstructions of Quine’s arguments against the intelligibility of de re modality and equally detailed criticisms of them (1989c; 1990). Quine’s arguments and Fine’s criticisms involve disentangling delicate issues concerning the modal question of de re modality, the semantic question of singular (or direct) reference, and the metaphysical question of transworld identity. These issues, according to Fine, have often been conflated in the literature (2005e).

One of the main problems facing actualism is to explain how to make sense of discourse about the merely possible, or “possibilist discourse”, given that mere possibilia are ultimately unreal. Fine takes up the challenge of reducing possibilist discourse to actualist discourse in a series of articles (1977a; 1985c; 2002b). A notable theme of Fine’s reductive strategy is a resistance to “proxy reduction”. Roughly, a proxy reduction attempts to reduce items of a target domain by associating them one-by-one with items from a more basic domain. In this case, a proxy reduction of possibilist discourse would reduce a merely possible object by associating it with an actual object. Although it is often assumed that reduction must proceed in this way by “proxy”, Fine argues that it needn’t. Instead, Fine pursues a different approach. The idea is to reduce the claim that a possible object has a feature to the claim that possibly some object (actually) has that feature. Thus, the claim that Wittgenstein’s possible daughter loathed philosophy is reduced to the claim that possibly Wittgenstein’s daughter (actually) loathed philosophy. This is not a proxy reduction because it does not associate Wittgenstein’s possible daughter with any actual object. Criticisms of the approach from Williamson 2013b and others recently prompted Fine to develop a new “suppositional” approach (2016c).

Although modalists often distinguish between various kinds of modality, they have often thought that the varieties can ultimately be understood in terms of a single kind of modality. Fine, however, argues against this sort of “monism” about modality (2002c). Modality is, instead, fundamentally diverse. There are, argues Fine, at least three diverse and irreducible modal domains: the metaphysical, the normative, and the nomological.

In addition to this diversity in the modal domains, Fine also argues that there is diversity within a given modal domain (2005c). This emerges in considering a puzzle of how it is possible that Socrates is a man but does not exist, given that it is necessary that Socrates is a man but possible that Socrates does not exist. Just as there is a distinction between sempiternal truths that hold at each time (for example, ‘Trump lies or not’) and eternal truths that hold regardless of the time (for example, ‘2+2=4’), so too there are worldly necessities that hold at each world (for example, ‘Trump lies or not’) and unworldly or transcendent necessities that hold regardless of the world (for example, ‘2+2=4’). The puzzle can then be resolved by taking ‘Socrates is a man’ to be an unworldly necessity while taking ‘Socrates does not exist’ to be a worldly (contingent) possibility. The distinction between worldly and unworldly necessities provides for three grades of modality. The unextended grade concerns the purely worldly necessities, the extended grade concerns the purely worldly necessities and the purely unworldly necessities, and the superextended grade concerns “hybrids” of the first two grades. Fine argues that the puzzle’s initial appeal depends upon confusing these three grades of modality.
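The inference that generates the puzzle can be set out explicitly. In the sketch below, M abbreviates ‘Socrates is a man’ and E abbreviates ‘Socrates exists’. These letters, and the use of a single box and diamond for both premises, are assumptions made for illustration.

```latex
\begin{align*}
&1.\ \Box M && \text{it is necessary that Socrates is a man}\\
&2.\ \Diamond \neg E && \text{it is possible that Socrates does not exist}\\
&3.\ \Box(M \rightarrow E) && \text{supposition for reductio: } \neg\Diamond(M \wedge \neg E)\\
&4.\ \Box E && \text{from 1 and 3 by the K axiom}\\
&5.\ \neg\Diamond\neg E && \text{from 4 by duality, contradicting 2}\\
&6.\ \Diamond(M \wedge \neg E) && \text{from 3--5 by reductio}
\end{align*}
```

The derivation is valid only if the box in line 1 and the diamond in line 2 are duals of one and the same modality. On the resolution described above, line 1 states an unworldly necessity and line 2 a worldly possibility, so the step to line 4 is one place where the grades of modality would be run together.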

b. Essence

Perhaps one of Fine’s most well-known contributions to metaphysics is to rehabilitate the notion of essence. A notable antecedent was Kripke 1972. Positivism’s antipathy to metaphysics was still exerting much influence on philosophy when Kripke powerfully advocated for the legitimacy of a distinctively metaphysical notion of modality. Kripke used this notion to suggest various essentialist theses. Among them were that a person’s procreative origin was essential to them and that an artifact’s material origin was essential to it. These essentialist theses, however, were usually taken to be theses of metaphysical necessity. The implicit background conception of essence was accordingly modal. On one formulation of it, an item has some feature essentially just in case it is necessary that it has that feature. Thus, Queen Elizabeth’s procreative origin is essential to her just in case it is necessary that she have that origin.

One of Fine’s distinctive contributions to rehabilitating essence was to argue against the modal conception of it (1994b). To do so, Fine introduced what is now a famous example. Consider the singleton set {Socrates} (the set whose sole member is Socrates). It is necessary that, if this set exists, then it has Socrates as a member. And so, by the modal conception, the set essentially has Socrates as a member. But, Fine argues, on plausible assumptions, it is also necessary that Socrates is a member of {Socrates}. And so, by the modal conception, it follows that Socrates is essentially a member of {Socrates}. This, however, is highly implausible: it is no part of what Socrates is that he should be a member of any set whatsoever. Fine raises a battery of similar counterexamples to the modal conception. His diagnosis of where it goes awry is that it is insensitive to the source of necessity. It lies in the nature of the singleton {Socrates}, not Socrates, that it has Socrates as a member. This induces an asymmetry in essentialist claims: {Socrates} essentially contains Socrates, but it is not the case that Socrates is essentially contained by {Socrates}. No modal conception of essence can capture this asymmetry because the two claims are both equally necessary.
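The structure of the counterexample can be displayed schematically. The notation is assumed for illustration: s names Socrates, E is an existence predicate, the plain box expresses metaphysical necessity, and the subscripted box is read ‘it is true in virtue of the nature of x that’.

```latex
\begin{align*}
&\text{Modal conception:} && x \text{ is essentially } F \iff \Box Fx\\
&\text{Necessities:} && \Box\,(E\{s\} \rightarrow s \in \{s\}), \qquad \Box\,(s \in \{s\})\\
&\text{Modal verdicts:} && \{s\} \text{ essentially contains } s, \qquad s \text{ is essentially contained in } \{s\}\\
&\text{Fine's verdicts:} && \Box_{\{s\}}\,(s \in \{s\}), \qquad \neg\Box_{s}\,(s \in \{s\})
\end{align*}
```

The plain box has no place to record a source, so it cannot distinguish the two claims in the last row. Indexing the operator to an object is one way to mark the asymmetry.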

Even if the modal conception of essence fails, it is not as if essence and modality are unconnected. Indeed, Fine provocatively suggests a reversal of the traditional connection. Whereas the modal approach attempted to characterize essence in terms of modality, Fine suggests instead that metaphysical necessities hold in virtue of the essences of things (1994b).

Whether or not this suggestion is correct, separating essence from modality already implies that the study of essence cannot be subsumed under the study of modality. Instead, it would seem essence must be studied as a subject in its own right. Toward this end, Fine discusses a wealth of distinctions involving essence including the distinctions between constitutive and consequential essence, immediate and mediate essence, and more (1994d).

An especially important application of essence is to the notion of ontological dependence. What something is may depend upon what another thing is. In this ontological sense of dependence, a set may depend on its members, or an instance of a feature may depend upon the individual bearing it. Fine has explored this notion of ontological dependence in detail and used it to provide a characterization of substance (1995b). Additionally, he has also developed the formal logic and semantics of essence (1995a; 2000c).

c. Ontology

Ontology is often taken to concern what there is, or what exists. Some, however, have argued that there is a significant difference between being (what there is) and what exists. When being and existence are distinguished, it is often to claim that some things that have being nevertheless do not exist.

A recurring theme in Fine’s work is an openness to consider the being or nature of items regardless of whether they exist (1982b: §1; 1982c: §E1). This is most evident in the case of items that we are convinced do not exist. Like many others, Fine believes that, ultimately, there are no non-existents. But, perhaps unlike many others, Fine also believes that this is no obstacle to exploring their status or their nature (1982c). Fine’s explorations of this are rich in distinctions. The three most prominent are between Platonism and empiricism, literalism and contextualism, and internalism and externalism. The Platonist says non-existents do not depend on us or our activities, whereas the empiricist says they do. The literalist says non-existents literally have the properties they are said to have (for example, Sherlock Holmes literally lives in London), whereas the contextualist says instead that these properties are at most only had in a relevant context (namely, the Holmes stories). The internalist individuates non-existents solely in terms of the properties they have “internally” to the contexts in which they occur, whereas the externalist does not. Fine believes that all eight combinations of views are possible. But he focuses on developing and arguing against the four internalist views. A notable counterexample Fine gives to internalism is a story in which we imagine twins Dum and Dee who are indiscernible internally to the story but are nevertheless distinct. Two follow-up papers developing and defending externalism (Fine’s own favored combination conjoins empiricism, contextualism, and externalism) and comparing it to alternatives were planned but have not yet appeared (although 1984a further discusses related issues in the context of a critical review).

Behind Fine’s openness to considering the being or nature of items regardless of whether they exist is a general conception of ontology (2009). At least since Quine 1948, the dominant view has been that ontology’s central question, “What exists?”, should be understood as the question “What is there?”, and that this in turn should be understood as a quantificational question. Thus, to ask “Do numbers exist?” is to ask “Is there an x such that x is a number?”. Fine argues against this approach. One difficulty is that it seems to mischaracterize the logical form of ontological claims. Suppose we wish to answer “Yes, numbers exist”. It does not seem enough for this answer that merely some number, say 13, exists. But that is all that is required for the quantificational answer to be correct. Instead, it seems our answer must be that all the numbers exist. This answer has the form “For every x, if x is a number, then x exists”. If ‘x exists’ is understood in the Quinean way in terms of a quantifier (namely: x exists =df ∃y(x = y)), then it expresses a triviality that fails to capture the intended significance of the ontological question. Fine suggests that the intended significance can be restored by appealing to the notion of reality. The ontological, as opposed to quantificational, question “Do numbers exist?” asks whether it is true that “For every x, if x is a number, then there is some feature that, in reality, x has”. This question is not answered by basic mathematical facts, but instead by whether numbers are part of the facts constituting reality.
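The logical forms at issue can be set side by side. In this sketch N is ‘is a number’, E is ‘exists’, and R[…] is a sentential operator read ‘in reality, …’. The letters and the bracket notation are assumed here for illustration.

```latex
\begin{align*}
&\text{Quantificational reading:} && \exists x\, Nx\\
&\text{Universal form:} && \forall x\,(Nx \rightarrow Ex)\\
&\text{Quinean existence:} && Ex \;=_{df}\; \exists y\,(x = y)\\
&\text{Resulting triviality:} && \forall x\,(Nx \rightarrow \exists y\,(x = y))\\
&\text{Appeal to reality:} && \forall x\,(Nx \rightarrow \exists \varphi\, R[\varphi x])
\end{align*}
```

The fourth line is a theorem of classical first-order logic with identity, which is why it cannot express a substantive ontological commitment. The fifth line is not a theorem, since its truth turns on what holds in reality.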

Many ontologies are “constructional”. Some of their objects are accepted for being constructs of other accepted objects (perhaps with some objects as “given”: accepted but not on the basis of anything else). For example, we may accept subatomic particles into our ontology because they “construct” atoms, and we may also accept forests into our ontology because they are “constructed by” trees. Fine pursues an abstract study of constructional ontologies (1994e). The theory Fine develops can distinguish between actual and possible ontologies, as well as between absolute and relativist ontologies.

Relations have long puzzled philosophers. An especially difficult class of relations are those that appear to be non-symmetric. Unrequited love provides an example: although Scarlett loves Rhett, Rhett does not love Scarlett. It may seem that the relation loves is “biased” in that its first relatum is the lover and the second relatum the beloved. But it seems we must also recognize a converse is loved by relation “biased” in that its first relatum is the beloved and the second relatum the lover. Now, when Scarlett loves Rhett, is this because Scarlett and Rhett in this order stand in the loves relation, or because Rhett and Scarlett in that order stand in the is loved by relation? It seems we must say at least one, but either alone is arbitrary and both together is profligate. Fine develops a solution in terms of unbiased or “neutral” relations (2000b).

d. Mereology

Fine has made a variety of important contributions to abstract mereology (the theory of part and whole) as well as to its applications to various sorts of objects. Sometimes the term ‘mereology’ is used for a specific theory of mereology, namely classical extensional mereology. But an important theme in Fine’s work on mereology is to argue that this theory, and indeed much other thinking on mereology, is unduly narrow. Instead, Fine believes there is a highly general mereological framework that may accommodate a plurality of notions of part-whole (2010c). Different notions of part-whole correspond to different operations that may compose wholes from their parts. The notion of fusion from classical extensional mereology is but one of these compositional operations (and not a uniquely interesting one, he thinks). But there are other compositional operations that may apply even to abstract objects outside space and time. For example, the set-builder operation may be regarded as building a whole (the set) from its parts (its members). (Unlike Lewis 1991’s similar suggestion, Fine does not take the set-builder operation to be the fusion operation.) Fine contends that the general mereological framework for the plurality can be developed in abstraction from any of these particular applications of it.

Much of Fine’s work on mereology, however, has concerned its application to the objects of ordinary life and, in particular, to material things. Many have wanted to regard a material thing as identical with its matter. Perhaps the main objection to this view is the sheer wealth of counterexamples. A statue may be well-made although its matter is not. Fine has defended counterexamples like these at length (2003a). Even if a material thing and its matter are not identical, it may still seem as if they can occupy the same place at the same time. After all, the statue is now where its matter is. And some, including Locke (1689), have claimed that it is impossible for any two things (at least of the same sort) to occupy the same place at the same time. But Fine presents counterexamples even to this Lockean thesis (2000a). One can imagine, for instance, two letters being written on two sides of the same sheet of paper (or even written using the same words, which have dual meanings). The two letters then coincide but are distinct.

Even if material things are not identical to their matter, it may still be maintained that they are somehow aggregates of their matter. An aggregate of objects exists at a place or at a time exactly whenever or wherever some of those objects do too. If a quantity of gold, for example, is an aggregate of its left and right parts, then the quantity will exist whenever its left or right parts exist and wherever its left or right parts exist. But, Fine argues, if the left part is destroyed, the quantity will cease to exist although the aggregate will not. In general, then, ordinary material things are not aggregates but are instead compounds (1994a).

These considerations extend to how material things persist through time. A common view is that they persist by having (material) temporal parts. This view takes the existence of objects in time to be essentially like their extension in space: aggregative. Objects persist through time in much the same way as events unfold. But Fine argues, partly on the basis of mereological considerations, that this delivers highly implausible results, and suggests that instead we must recognize that the existence of objects in time is fundamentally different from their extension in space (1994a; 2006a).

The lesson Fine draws from the preceding considerations is that a material thing neither is identical with, nor a mere aggregation of, its matter. Instead, Fine believes that the correct mereology of material things will be a version of hylomorphism: a material thing will be a compound of matter and form (2008a). Fine’s first applications of hylomorphism to acts, objects, and events provide an early glimpse of its broad scope (1982a). But the full breadth of its scope only emerged with Fine’s development of a general hylomorphic theory (1999). Its key notion is that of an embodiment. An embodiment may be either timeless (rigid) or temporal (variable). A rigid embodiment r = a,b,c,…/R is the object resulting from the objects a,b,c,… being in the relation R. A rigid embodiment is a hylomorphic compound that exists timelessly just in case its “matter” (the objects a,b,c,…) is in the requisite “form” (the relation R). So, for example, the statue (r) is identical with the hylomorphic compound of its clay parts (a,b,c,…) in the form of a statue (R). By contrast, a variable embodiment corresponds to a principle uniting its manifestations across times. Thus, a variable embodiment v = /V/ is a function V from times to things (which may themselves be rigid embodiments). Thus, for example, the statue over time (v) is a series of states at a time.
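The two kinds of embodiment can be pictured with a small data-structure sketch. It is only an illustrative toy, not Fine’s formal theory: the class names, the relation `stacked`, and the arrangement facts in `ON_TOP_OF` are all hypothetical, and times are simply modeled as integers.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class RigidEmbodiment:
    """Toy model of r = a,b,c,.../R: some objects together with a relation."""
    matter: Tuple[Any, ...]    # the objects a, b, c, ...
    form: Callable[..., bool]  # the relation R

    def exists(self) -> bool:
        # The compound exists just in case its matter is in the requisite form.
        return self.form(*self.matter)


@dataclass(frozen=True)
class VariableEmbodiment:
    """Toy model of v = /V/: a function V from times to manifestations."""
    principle: Callable[[int], Any]

    def manifestation_at(self, time: int) -> Any:
        return self.principle(time)


# Hypothetical arrangement facts: which piece of clay sits on which.
ON_TOP_OF = {("head", "torso"), ("torso", "base")}


def stacked(*parts: str) -> bool:
    """Stand-in for the statue-form: each part sits on the next one."""
    return all(pair in ON_TOP_OF for pair in zip(parts, parts[1:]))


statue = RigidEmbodiment(("head", "torso", "base"), stacked)
scrambled = RigidEmbodiment(("base", "head", "torso"), stacked)
repaired = RigidEmbodiment(("new head", "torso", "base"), stacked)

# The statue over time: one manifestation before time 2, another afterwards.
statue_over_time = VariableEmbodiment(lambda t: statue if t < 2 else repaired)

print(statue.exists())     # True
print(scrambled.exists())  # False
print(statue_over_time.manifestation_at(3) is repaired)  # True
```

The sketch keeps the two notions separate on purpose: the rigid compound is fixed by its matter and form, whereas the variable embodiment is fixed by the principle that selects a manifestation at each time.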

e. Realism

Fine has made influential contributions to debates about realism (2001). In general, the realist claims that some domain (for example, the mental or the moral) is real, whereas the antirealist claims that it is unreal. Although debates between realists and antirealists are common throughout philosophy, a precise and general characterization of their debate has been elusive. Fine argues against a variety of approaches familiar from the literature before settling on a distinctively metaphysical approach. What makes it distinctively metaphysical is its essential appeal to a metaphysical (as opposed to epistemic, conceptual, or semantic) notion of reality as well as to relatedly metaphysical notions of factuality and ground.

We may illustrate Fine’s approach by example. Set aside the moral error-theorist who believes that there are no moral facts whatsoever. Suppose, instead, that there are moral facts. One of them might be, we may suppose, that pointless torture is morally wrong. Moral realists and antirealists alike may agree that this fact is moral for containing some moral constituents (such as the property moral wrongness). And, unlike the error-theorist, they may agree that this fact obtains. What they dispute, however, is the fact’s status as real or unreal. Antirealism may come in either of two forms. The antirealist reductionist may, for example, accept the moral fact but insist that it is grounded in non-moral, naturalist facts that do not contain any moral constituents. The moral fact is unreal because it is grounded in non-moral facts. And the antirealist nonfactualist may, for example, accept the moral fact but insist that it is “nonfactual” in the sense that it does not represent reality but is rather a sort of “projection” of our attitudes, expressions, activities, or practices. The moral fact is unreal because it is neither real nor grounded in what is real. By contrast, the realist position consists in taking the moral fact as neither reducible nor nonfactual. The dispute between the realist, the antirealist reductionist, and the antirealist nonfactualist therefore turns on considerations of what grounds the moral facts. And, in general, debates over realism are, in effect, debates over what grounds what and therefore may be settled by determining what grounds what.

The framework Fine devised for debates over realism has proven rich in its implications. For one illustration, the metaphysical notion of reality figures prominently in other parts of Fine’s philosophy. Fine believes that the notion of reality plays a prominent role in ontological questions. And Fine uses the notion of reality to characterize the debate in the philosophy of time over the reality or unreality of tense. But the notion of ground provides an even more vivid illustration. In addition to playing a central role in realist debates, ground has become a topic of intense interest in its own right.

f. Ground

Ground, as Fine conceives of it, is a determinatively explanatory notion. To say that Aristotle’s being rational and his being animal ground his being a rational animal is to say that Aristotle is a rational animal because of, or in virtue of, his being rational and his being animal. Questions of ground enjoy a prominent place not only in realist debates but also within philosophy as a whole. Are moral facts grounded in naturalist facts? Are mental facts grounded in physical facts? Are facts of personal identity grounded in facts of psychological continuity? These and other questions of ground are among the biggest and most venerable questions in philosophy.

It is therefore a curiosity of recent times that ground has become a “hot topic” with a rapidly-expanding literature (Raven 2020). This is perhaps partly explained by the anti-metaphysical sentiments that swept over twentieth-century analytic philosophy. Although philosophers did not entirely turn their backs on questions of ground, the anti-metaphysical sentiments created a climate in which many felt the need to reinterpret them as questions of another sort (such as conceptual analysis, supervenience, or truthmaking). Fine, however, played a highly influential role in changing this climate. This is partly because Fine’s work not only discussed ground in its application to other topics (such as realism), but also treated ground as a topic worthy of study in its own right (see Raven 2019 for further discussion). Fine provided a detailed exploration of ground, introducing many now familiar distinctions of ground and its connections to related topics, such as essence (2012c). Additionally, Fine has developed the so-called “pure logic” of ground (2012d). He also problematized ground by discovering some puzzles involving ground and its relation to classical logic (2010b).

Although Fine had recognized certain similarities between essence and ground, he was initially inclined to separate them (2012c: 80):

The two concepts [essence and ground] work together in holding up the edifice of metaphysics; and it is only by keeping them separate that we can properly appreciate what each is on its own and what they are capable of doing together.

But not long after, Fine changed his view (2015b: 297):

I had previously referred to essence and ground as the pillars upon which the edifice of metaphysics rests…, but we can now see more clearly how the two notions complement one another in providing support for the very same structure.

The unification appeals to a conception of constitutively necessary and sufficient conditions on arbitrary objects (1985d). For example, for true belief to be essential to knowledge is for it to be a constitutively necessary condition on an arbitrary person’s knowing something that they truly believe it. And, for another example, for a set’s having no members to ground its being identical with the null set is for it to be a constitutively sufficient condition on an arbitrary set’s having no members that it is identical with the null set.

This previous example illustrates an identity criterion: a statement of the conditions in virtue of which two items are the same. Many philosophers have been tempted to reject identity criteria for being pointless, trivial, or unintelligible. But Fine argues against such rejections and, instead, defends the intelligibility and, indeed, the potential substantivity of identity criteria by appealing to ground and arbitrary objects (2016b). Roughly, an identity criterion states that, given two arbitrary objects, they are the same when the fact that they are identical is grounded in the fact that they satisfy a specified condition. For example, given two arbitrary sets, they are the same when their identity is grounded in their having the same members.

g. Tense

One striking application of Fine’s work on realism and ground is to the philosophy of time. McTaggart 1908 notoriously argued for the unreality of time. Although McTaggart’s argument generated considerable discussion, the general impression has been that whatever challenge it posed to the reality of time can somehow be met. Fine argues that the challenge lurking within McTaggart’s argument is more formidable than usually thought (2005f, of which 2006b is an abridgement). Taking inspiration from McTaggart, Fine formulates his own argument against the reality of tense. The argument relies on four assumptions that each make essential appeal to the notion of reality:

Realism	Reality is constituted (at least, in part) by tensed facts.
Neutrality	No time is privileged, the tensed facts that constitute reality are not oriented towards one time as opposed to another.
Absolutism	The constitution of reality is an absolute matter, not relative to a time or other form of temporal standpoint.
Coherence	Reality is not contradictory; it is not constituted by facts with incompatible content.

Reality contains some tensed facts (Realism). Because things change, these will be diverse. Although you are reading, you aren’t always reading. So, one of these tensed facts is that you are reading whereas another of them is that you are not reading. None of these tensed facts is oriented toward any particular time (Neutrality). Nor do they obtain relative to any particular time (Absolutism). So reality is constituted by incompatible facts. But reality cannot be incoherent like that (Coherence). And so the four assumptions conflict. The antirealist reaction is to reject Realism, and so the reality of tense. The realist accepts Realism, and so must reject another assumption. The challenge is to explain which. The “standard” realist denies Neutrality by privileging the present time. But Fine argues that there are two overlooked “nonstandard” responses. The relativist denies Absolutism, and so takes the constitution of reality to be irreducibly relative to a time. The fragmentalist denies Coherence, and so takes reality to divide into incompatible temporal “fragments”. Fine argues that the nonstandard realisms (and, in particular, fragmentalism) are, despite their obscurity, more defensible than standard realism.

Fine relates these considerations to the vexing case of first-personal realism. Standard realism about first-personal facts implausibly privileges a first-personal perspective. Overlooking nonstandard realisms, one may then draw the antirealist conclusion that there are no first-personal facts. But Fine’s apparatus reveals two nonstandard realist options: relativism and fragmentalism. According to Fine, these options (and, in particular, fragmentalism) are especially intuitive in the first-personal case. Indeed, Fine suggests that the question of the reality of tense might have more in common with the question of the reality of the first-personal, despite its more familiar association with the question of the reality of the modal.

4. Philosophy of Language

Fine has made four main contributions to the philosophy of language. The first two are in support of the referentialist tradition. One is to bolster arguments against the competing Fregean tradition. The other is to develop a novel version of referentialism, semantic relationism, that he argues is superior to its referentialist competitors. The third contribution is to the nature of vagueness. And the fourth contribution is the development of an original approach to semantics, truthmaker semantics.

a. Referentialism

The referentialist tradition takes certain terms, especially names, to refer without the mediation of any Fregean sense or other descriptive information. Fine has made two main contributions in support of referentialism.

Fine’s first contribution to referentialism is to bolster arguments against Fregeanism. This includes a variety of supporting arguments scattered throughout his book Semantic Relationism (2007b). Perhaps the most notable of these is a thought experiment against the existence of the senses the Fregean posits (2007b: 36). The scenario involves a person in a universe that is perfectly symmetrically arranged around her center of vision. Her visual field therefore perfectly duplicates whatever is visible on the left to the right, and on the right to the left. When she is approached by two identical twins, she may name each ‘Bruce’. It seems she may refer by name to each. The Fregean can agree only if there is a pair of senses, one for the left ‘Bruce’ and the other for the right ‘Bruce’. But given the symmetry of the scenario, it seems there is no possible basis for thinking that the pair exists.

b. Semantic Relationism

Fine’s second contribution to referentialism is to introduce and develop what he argues is its most viable form: semantic relationism. The view is developed in his book Semantic Relationism, which expands on his John Locke Lectures delivered at the University of Oxford in 2003 (2007b).

Semantic relationism is representational in that it aims to account for the meanings of expressions in terms of what they represent (objects, properties, states of affairs, and so on). But it differs significantly from other representational approaches. These have typically (and implicitly) assumed that the meaning of an expression is intrinsic to it and so one is never required to consider any other expressions in accounting for the meaning of a given expression. Semantic relationism denies this. Instead, the meaning of (at least some) expressions at least partly consists in their “coordinative” relations to other meaningful expressions. This is different from typical kinds of semantic holism, which usually characterize an expression’s meaning not in representational terms but in terms of its inferential role.

One of the main benefits of semantic relationism is that it provides solutions to a variety of vexing puzzles, including the antinomy of the variable (2003b), Frege’s puzzle (Frege 1892), and Kripke’s puzzle about belief (Kripke 2011). To illustrate, Frege observed that an identity statement, like ‘Cicero is Cicero’, could be uninformative whereas another, like ‘Cicero is Tully’, could be informative despite the names ‘Cicero’ and ‘Tully’ being coreferential. Frege’s own solution was to bifurcate semantics into a level of sense and a level of reference. This enabled him to claim that the names ‘Cicero’ and ‘Tully’ differ in sense but not in reference. But powerful arguments from Kripke 1972 and others convinced many that the semantics of names involves only reference, not sense. How could one reconcile this referentialism about the semantics of names with Frege’s observation? Semantic relationism offers a novel answer. The pair ‘Cicero’, ‘Cicero’ in ‘Cicero is Cicero’ are coordinated: it is a semantic requirement that they co-refer. By contrast, the pair ‘Cicero’, ‘Tully’ in ‘Cicero is Tully’ are uncoordinated: it is not a semantic requirement that they co-refer. This difference in coordination among the pairs of expressions explains the difference in their informativeness. But it is only by considering the pairs in relation to one another that this difference can even be recognized. The notion of semantic requirement involves a distinctive kind of semantic modality that Fine argues should play a significant role in semantic theorizing (2010a).

c. Vagueness

Fine provided what is widely considered to be the locus classicus for the so-called supervaluationist approach to vagueness (1975d). On this approach, vagueness is a kind of deficiency in meaning. What makes the deficiency specific to vagueness is that it gives rise to “borderline cases”. For example, the vague predicate ‘is bald’ admits of borderline cases. These are cases in which the predicate’s meaning does not settle whether it applies or does not apply to, say, a man with a receding hairline and thinning hair. Borderline cases pose an initial problem for classical logic. For if the predicate ‘is bald’ neither truly applies nor falsely applies in such cases, how could it be true to say ‘That man is bald or is not bald’? Supervaluationism answers by considering the admissible ways in which a vague predicate can be completed or made more precise. The sentence ‘That man is bald’ is “super-true” if true under every such “precisification”, “super-false” if false under every “precisification”, and neither otherwise. It can then be argued that ‘That man is bald or is not bald’ will be super-true because it will be true under every precisification, despite neither disjunct being super-true. This in turn helps supervaluationism provide a response to the Sorites Paradox.
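The supervaluationist truth conditions just described can be made concrete with a short sketch. It rests on simplifying assumptions that are not part of Fine’s account: each precisification of ‘is bald’ is modeled as a numerical cutoff on hair count, and the range of admissible cutoffs and the borderline man’s hair count are hypothetical.

```python
from typing import Callable, Iterable, Optional

# Assumption: a precisification of 'is bald' is a cutoff; a man counts as
# bald under it exactly when he has fewer hairs than the cutoff.
Precisification = int
Sentence = Callable[[Precisification], bool]


def super_value(sentence: Sentence,
                admissible: Iterable[Precisification]) -> Optional[bool]:
    """Return True if super-true, False if super-false, None if neither."""
    values = {sentence(p) for p in admissible}
    if values == {True}:
        return True
    if values == {False}:
        return False
    return None


ADMISSIBLE_CUTOFFS = range(1_000, 50_001, 1_000)  # hypothetical admissible cutoffs
HAIRS = 20_000                                    # hypothetical borderline case


def is_bald(cutoff: Precisification) -> bool:
    return HAIRS < cutoff


def is_not_bald(cutoff: Precisification) -> bool:
    return not is_bald(cutoff)


def bald_or_not_bald(cutoff: Precisification) -> bool:
    return is_bald(cutoff) or is_not_bald(cutoff)


print(super_value(is_bald, ADMISSIBLE_CUTOFFS))           # None
print(super_value(is_not_bald, ADMISSIBLE_CUTOFFS))       # None
print(super_value(bald_or_not_bald, ADMISSIBLE_CUTOFFS))  # True
```

Neither disjunct is super-true, because the admissible cutoffs disagree about the borderline man, yet the disjunction comes out true under every cutoff and so is super-true.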

In more recent work, Fine has given up on supervaluationism and instead developed an alternative approach. Fine’s reasons for rejecting supervaluationism are not specific to it but rather derive from a more far-reaching argument. Fine presents an apparent proof of the impossibility of vagueness (2008b). The challenge is to explain where the proof goes awry, since there is no question that vagueness is possible. But, Fine argues, standard accounts of vagueness, including especially supervaluationism, cannot satisfactorily meet this challenge. So, an alternative account is needed.

Fine develops such an alternative account that relies on a distinction between global and local vagueness (2015a). Global vagueness is vagueness over a range of cases, such as a series of indiscernible but distinct color tiles arranged incrementally from orange to red. Local vagueness is vagueness in a single case, such as in a single vermilion tile midway between the orange and red tiles. Given the distinction, there is a strong temptation to reduce global vagueness to local vagueness. But Fine argues against this. His own “globalist” approach, he argues, is able not only to meet the challenge of explaining the possibility of vagueness but also to explain why it does not succumb to the Sorites Paradox.

d. Truthmaker Semantics

In a series of articles, Fine develops a novel semantic approach he calls truthmaker semantics. The approach is in some ways like the more familiar possible-worlds semantics and, especially, situation semantics. But truthmaker semantics diverges from both. The contrast with possible-worlds semantics is especially vivid. On the latter approach, the truth-value of a sentence is evaluated with respect to a possible world in its entirety, no matter how irrelevant parts of that world might be to making the sentence true. Thus, ‘Fido barks’ will be true with respect to an entire possible world just in case it is a world in which Fido barks. Such a world includes much that is irrelevant to Fido’s barking, including sea turtle migration, weather patterns in sub-Saharan Africa, and distant galaxies. Truthmaker semantics departs in two ways from this. First, and like situation semantics, it replaces worlds with states which may, to a first approximation, be regarded as parts of worlds. So, for example, it is not the entire world (sea turtle migration, sub-Saharan weather, and distant galaxies included) that verifies or makes ‘Fido barks’ true, but rather just the state of Fido’s barking. What’s more, this state, unlike the entire world itself, does not verify any truths about sea turtles, sub-Saharan weather, or distant galaxies. Second, and unlike situation semantics, it is required that a state verifying a sentence must be wholly or exactly relevant to its truth. So, for example, the state that Fido barks and it’s raining in Djibouti will not verify ‘Fido barks’ because it includes an irrelevant part about Djibouti’s weather.
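The contrast between exact verification and a looser, situation-style notion can be sketched as follows. The modeling choices are assumptions made for illustration only: a state is represented as a set of atomic facts, parthood as the subset relation, fusion as union, and the table `EXACT_VERIFIERS` is a hypothetical assignment of verifiers to one sentence.

```python
from typing import Dict, FrozenSet, Set

# Assumption: a state is a set of atomic facts, and one state is part of
# another when it is a subset of it.
State = FrozenSet[str]

FIDO_BARKING: State = frozenset({"Fido barks"})
RAIN_IN_DJIBOUTI: State = frozenset({"it is raining in Djibouti"})


def fuse(*states: State) -> State:
    """The fusion of some states, modeled as the union of their facts."""
    return frozenset().union(*states)


# Hypothetical assignment of exact verifiers to sentences.
EXACT_VERIFIERS: Dict[str, Set[State]] = {"Fido barks": {FIDO_BARKING}}


def exactly_verifies(state: State, sentence: str) -> bool:
    """The state is wholly relevant to the sentence's truth."""
    return state in EXACT_VERIFIERS[sentence]


def inexactly_verifies(state: State, sentence: str) -> bool:
    """Looser notion, for contrast: some part of the state is an exact verifier."""
    return any(verifier <= state for verifier in EXACT_VERIFIERS[sentence])


barking_and_rain = fuse(FIDO_BARKING, RAIN_IN_DJIBOUTI)

print(exactly_verifies(FIDO_BARKING, "Fido barks"))        # True
print(exactly_verifies(barking_and_rain, "Fido barks"))    # False
print(inexactly_verifies(barking_and_rain, "Fido barks"))  # True
```

The fused state fails to verify ‘Fido barks’ exactly because it carries an irrelevant part about Djibouti’s weather, even though it contains a state that does the verifying.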

The general framework of truthmaker semantics is developed over the course of numerous articles (but see 2017c for an overview). An important feature of it is its abstractness. The semantics is specified in terms of a space of states, or a state space. The state space is assumed to have some mereological structure. But the assumptions are minimal and, in particular, no assumptions are made about the nature of the states themselves. This makes the framework highly abstract. This in turn grants the framework enormous flexibility in its potential range of applications. Indeed, Fine believes the main benefits of the general framework emerge from its wealth of applications to a wide variety of topics. These include: analytic entailment (2016a), counterfactuals (2012a; 2012b), ground (2020a), intuitionistic logic (2014b), semantic content (2017a; 2017b), the is-ought gap (2018b), verisimilitude (2019d; 2020b), impossible worlds (2019c), deontic and imperative statements (2014a; 2019a; 2019b), and more. This is not the place for a comprehensive survey of these applications. Still, one may get a sense of them by considering three applications in more detail.

First, consider counterfactuals. The standard semantics for counterfactuals derives from Stalnaker 1968 and Lewis 1973. According to Lewis’s version of it, the counterfactual ‘If A then it would be that C’ is true just in case no possible world in which A but not C is true is closer to actuality than any in which both A and C are true. Fine’s opposition to this semantics is evident from his critical notice (1975a) of Lewis’s book. There Fine introduced the so-called “future similarity objection”. It takes the form of a counterexample showing that small changes can make for great dissimilarities. Fine’s celebrated case was the counterfactual ‘If Nixon had pressed the button, then there would have been a nuclear holocaust’. Although it seems true, the standard semantics struggles to validate it. The great dissimilarities of a world where Nixon pressed the button causing nuclear holocaust ensure it is further from actuality than a world where Nixon pressed the button without nuclear holocaust. Fine’s critical notice also contained the seeds of ideas that later emerged in his work on truthmaker semantics. There he also objects that the standard semantics is committed to unsound implications because it permits the substitution of tautologically equivalent statements. This objection was prescient in anticipating a similar difficulty later developed in greater detail against the standard semantics (2012a; 2012b). Fine argues that the difficulty can be avoided by providing a truthmaker semantics for counterfactuals. Roughly, ‘If A then it would be that C’ is true just in case any possible outcome of a state verifying A also contains a state verifying C.
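The future similarity objection can be illustrated on a tiny finite model. The sketch below is only one common finite rendering of a Lewis-style truth condition (the counterfactual holds vacuously if there are no A-worlds, and otherwise holds when some world with A and C is strictly closer than every world with A but not C). The `World` class, the three worlds, and their distance values are hypothetical stand-ins for a ranking by overall similarity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class World:
    facts: FrozenSet[str]  # the atomic sentences true at the world
    distance: float        # hypothetical dissimilarity from actuality


def would(a: str, c: str, worlds: Iterable[World]) -> bool:
    """Lewis-style evaluation of 'If A then it would be that C'."""
    worlds = list(worlds)
    a_and_c = [w.distance for w in worlds if a in w.facts and c in w.facts]
    a_not_c = [w.distance for w in worlds if a in w.facts and c not in w.facts]
    if not a_and_c:
        # Vacuously true when there are no A-worlds at all, false otherwise.
        return not a_not_c
    # Some A-and-C world must be strictly closer than every A-but-not-C world.
    return all(min(a_and_c) < d for d in a_not_c)


PRESS = "Nixon presses the button"
HOLOCAUST = "there is a nuclear holocaust"

# Assumed ranking by overall similarity: a holocaust makes a world very
# unlike the actual one, so that world is placed far away.
by_overall_similarity = [
    World(frozenset(), 0.0),                    # actuality: no press
    World(frozenset({PRESS}), 1.0),             # press, but nothing happens
    World(frozenset({PRESS, HOLOCAUST}), 10.0),  # press and holocaust
]

print(would(PRESS, HOLOCAUST, by_overall_similarity))  # False
```

On this ranking the counterfactual comes out false although it seems true, which is the shape of the difficulty Fine’s Nixon case presses against the standard semantics.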

Second, consider intuitionistic logic. Realists and antirealists alike tend to agree that certain technical aspects of intuitionistic logic provide a natural home for antirealism. This would be a mistake, however, if intuitionistic logic could be given a realist semantic foundation. Fine shows how truthmaker semantics can be used to provide just such a realist semantics for intuitionistic logic (2014b).

Third, consider the is-ought gap. Hume 1739 famously argued for a gap between ‘is’ and ‘ought’ statements: one cannot validly derive any statement about what ought to be from any statements about what is. Despite the appeal of such a gap, it has not been easy to formulate it clearly. What’s more, standard formulations are vulnerable to superficial but resilient counterexamples (Prior 1960). Fine shows how truthmaker semantics can be used to formulate the gap in a way that avoids such superficial counterexamples (2018b).

5. Logics and Mathematics

Fine has made a variety of seminal technical contributions to formal logic as well as to philosophical logic and the philosophy of mathematics. These contributions may be organized into three major groups: formal logic (especially modal logics), arbitrary objects, and the foundations of mathematics (broadly construed so as to include the theory of sets and classes).

a. Logics

Most of Fine’s earliest work focused on technical questions within formal logic, especially on modal logics. A detailed synopsis of Fine’s technical work is beyond the scope of this article. But a very brief summary of it can be given here:

various results about modal logics with propositional quantifiers (1970, which presents results from Fine’s Ph.D. dissertation 1969);
a completeness proof for a predicate logic without identity but with primitive numerical quantifiers (1972a);
early developments of graded modal logic (1972b);
various results about S4 logics (those with reflexive and transitive Kripke frames) and certain extensions of them (1971; 1972c; 1974a; 1974b);
the application of normal forms to a general completeness proof for “uniform” modal logics (1975b);
a seminal “canonicity theorem” for modal logics (1975c);
completeness results for logics containing K4 (those with transitive Kripke frames) (1974c; 1985a);
failure of Craig’s interpolation lemma for various quantified modal logics (1979);
the underivability of a quantifier permutation principle in certain modal systems without identity (1983b);
an exploration into whether truth can be defined without the notion of satisfaction (joint work with McCarthy 1984b);
incompleteness results for standard semantics for quantified relevance logic and an alternative semantics for it that is complete (1988; 1989a);
the development of stability (or “felicitous”) semantics for the conception of “negation as failure” in logic programming and computer science (1989b); and
general results about how properties of “monomodal” logics containing a single modal operator may transfer to a “multimodal” logic joining them (joint work with Schurz 1996).

Fine also wrote several articles in economic theory (1972d; 1973a), including two with his brother, economist Ben Fine (1974d; 1974e).

b. Arbitrary Objects

We often speak of arbitrary objects—an arbitrary integer, an arbitrary American, and so on. But at least since Berkeley 1710, the notion of an arbitrary object has been thought to be dispensable, if not outright incoherent. But in Fine’s book Reasoning with Arbitrary Objects, he argued that familiar opposition to arbitrary objects is misplaced and that they can, contrary to received wisdom, be given a rigorous theoretical foundation (1985d and its abridgements 1983a; 1985b).

The matter is not a mere intellectual curiosity. For it turns out, according to Fine, that arbitrary objects have various important applications. One salient application is to natural deduction and, especially, the logic of generality (1985d; 1985b). To illustrate, consider how one might explain the rule of universal generalization to students of a first formal logic course. One might say that if one can show that an arbitrary item a satisfies some condition f, then one may deduce that every item whatsoever satisfies that condition: ∀x f(x). Standard glosses on the rule ultimately attempt to avoid any appeal to the arbitrary item in favor of some alternative construal. But given Fine’s defense of arbitrary objects, there is no need to avoid appealing to them, and, in fact, it may be argued that they provide a more direct and satisfying account of the rule than alternative accounts do. Fine has also explored other applications to mathematical logic, the philosophy of language, and the history of ideas (1985d).
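For readers who know a proof assistant, the rule of universal generalization has a familiar shape there. The Lean 4 fragment below is only an illustration of the rule as it is ordinarily used; it takes no stand on whether the item introduced should be read as one of Fine’s arbitrary objects. The conditions f and g and the hypotheses h₁ and h₂ are hypothetical.

```lean
-- f and g are hypothetical conditions on natural numbers.
example (f g : Nat → Prop)
    (h₁ : ∀ x, f x) (h₂ : ∀ x, f x → g x) : ∀ x, g x := by
  intro a            -- let a be an arbitrary item
  exact h₂ a (h₁ a)  -- a satisfies g, so every item does
```

The `intro a` step plays the role of “let a be an arbitrary item”: nothing is assumed about a beyond its being a natural number, which is why the result generalizes.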

More recently, Fine has found new applications for arbitrary objects. One is to Cantor’s abstractionist constructions of cardinal numbers and order types. The constructions have faced formidable objections. But, according to Fine, the objections can be overcome by appealing to the theory of arbitrary objects (1998). In a belated companion article, Fine argues that his theory of arbitrary objects combined with the Cantorian approach can be extended to provide a general theory of types or forms, of which structural universals end up being a special case (2017a). Fine also puts arbitrary objects to use in attempting to provide a paradox-free construction of sets or classes that allows for the existence of a universal class and for the Frege-Russell cardinal numbers (2005a), in characterizing identity criteria (2016b), and in providing unified foundations for essence and ground (2015b). Fine is currently preparing a revised version of Reasoning with Arbitrary Objects.

c. Philosophy of Mathematics

Most of Fine’s contributions to the philosophy of mathematics concern various foundational issues. Much recent interest in these issues derives from Frege’s infamous attempt to secure the foundations of mathematics by deriving it from logic alone. Frege’s attempt foundered in the early 1900s with the discovery of the set-theoretic paradoxes. Much of Fine’s work in the philosophy of mathematics concerns the prospects for reviving Frege’s project without paradox.

At the heart of Frege’s own attempt was the notion of abstraction. Just as we may abstract the direction of two lines from their being parallel, so too we may abstract the number of two classes from their equinumerosity. Frege’s own use of abstraction ultimately led to paradox. But since then, neo-Fregeans (such as Fine’s colleague Crispin Wright and Bob Hale) have attempted to salvage much of Frege’s project by refining the use of abstraction in various ways. Fine has provided a detailed exploration of a general theory of abstraction as well as its prospects for sustaining neo-Fregean ambitions (2002a).

The discovery of the set-theoretic paradoxes generated turmoil within the foundations of mathematics and for associated philosophical programs. Since then, there have been a variety of attempts to provide a paradox-free construction of sets or classes. These attempts usually assume a notion of membership in their construction of the ontology. But Fine reverses the direction and constructs notions of membership in terms of the assumed ontology. This, Fine argues, has various advantages over standard constructions (2005a).

Many have thought that a central lesson of the aforementioned set-theoretic paradoxes is that quantification is inevitably restricted. Were it possible to quantify unrestrictedly over absolutely everything, then paradox would result. Instead, we may indefinitely extend the range of quantification without ever paradoxically quantifying over absolutely everything. So, it seems, quantification is always restricted, albeit indefinitely extendible. A persistent difficulty in sustaining this point of view, however, is the apparent arbitrariness of any restriction. Fine argues that the difficulty can be avoided (2006c). Quantification’s being absolute and its being unrestricted are often conflated. But Fine argues that they are distinct. Distinguishing them allows us to conceive of the possibility of quantification that is unrestricted but not absolute.

A recurring theme in some of the preceding papers is an approach to mathematics that Fine calls procedural postulationism. Traditional versions of postulationism take the existence of mathematical items and the truths about them to derive from certain propositions we postulate. But Fine’s procedural postulationism takes these postulates to be imperatival instead (for example, “For each item in the domain that is a number, introduce another number that is its successor”). Fine believes this one difference helps postulationism provide a more satisfactory metaphysics, semantics, and epistemology of mathematics. Although procedural postulationism is hinted at in the previous articles, it is treated in more detail in connection with our knowledge of mathematical items (2005d). Fine has indicated that the core ideas of procedural postulationism may extend more generally, and he briefly discusses their application to the metaphysics of material things (2007a).

6. History

It is not hard to find Aristotle’s influence in much of Fine’s work. But in addition to developing various Aristotelian themes, Fine has also directly contributed to more exegetical scholarship on Aristotle’s own work. These contributions have primarily focused on developing an account of Aristotle’s views on substance and what we may still learn from them. This begins with an attempt to formalize Aristotle’s views on matter (1992). Fine later raises a puzzle for Aristotle (and other neo-Aristotelians) concerning how the matter now composing one hylomorphic compound, say Callias, could later come to compose another hylomorphic compound, say Socrates (1994c). According to Aristotle, the world contains elements that may compose mixtures, and these mixtures in turn compose substances. Fine argues against conceptions of mixtures that take them to be at the same level as the elements composing them and, instead, defends a conception on which they are at a higher level (1995d). Finally, Fine argues that the best interpretation of a vexing discussion in Metaphysics Theta.4 is that Aristotle was attempting to introduce a novel conception of modality (2011a).

Additionally, Fine has written on Husserl’s discussions from the Logical Investigations on part and whole and the related topics of dependence, necessity, and unity (1995c). Fine also has work in preparation on Bolzano’s conception of ground.

7. References and Further Reading

Berkeley, George. 1710. A Treatise Concerning the Principles of Human Knowledge.
Fine, Kit. 1969. For Some Proposition and So Many Possible Worlds. University of Warwick.
Fine, Kit. 1970. “Propositional Quantifiers in Modal Logic.” Theoria 36 (3): 336-46.
Fine, Kit. 1971. “The Logics Containing S4.3.” Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 17 (1): 371-76.
Fine, Kit. 1972a. “For So Many Individuals.” Notre Dame Journal of Formal Logic 13 (4): 569-72.
Fine, Kit. 1972b. “In So Many Possible Worlds.” Notre Dame Journal of Formal Logic 13 (4): 516-20.
Fine, Kit. 1972c. “Logics Containing S4 without the Finite Model Property.” In Conference in Mathematical Logic–London ’70, edited by W. Hodges. New York: Springer-Verlag.
Fine, Kit. 1972d. “Some Necessary and Sufficient Conditions for Representative Decision on Two Alternatives.” Econometrica 40 (6): 1083-90.
Fine, Kit. 1973a. “Conditions for the Existence of Cycles under Majority and Non-minority Rules.” Econometrica 41 (5): 889-99.
Fine, Kit. 1974a. “An Ascending Chain of S4 Logics.” Theoria 40 (2): 110-16.
Fine, Kit. 1974b. “An Incomplete Logic Containing S4.” Theoria 40 (1): 23-29.
Fine, Kit. 1974c. “Logics Containing K4 – Part I.” The Journal of Symbolic Logic 39 (1): 31-42.
Fine, Kit. 1975a. “Critical Notice: Counterfactuals, by David Lewis.” Mind 84 (335): 451-58. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1975b. “Normal Forms in Modal Logic.” Notre Dame Journal of Formal Logic 16 (2): 229-34.
Fine, Kit. 1975c. “Some Connections Between Elementary and Modal Logic.” In Proceedings of the Third Scandinavian Logic Symposium, edited by S. Kanger. Amsterdam: North-Holland.
Fine, Kit. 1975d. “Vagueness, Truth and Logic.” Synthese 30: 265-300.
Fine, Kit. 1977a. “Prior on the Construction of Possible Worlds and Instants.” In Worlds, Times and Selves, edited by A. N. Prior and K. Fine. London: Duckworth. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1977b. “Properties, Propositions and Sets.” Journal of Philosophical Logic 6: 135-91.
Fine, Kit. 1978a. “Model Theory for Modal Logic – Part I: The De Re/De Dicto Distinction.” Journal of Philosophical Logic 7 (1): 125-56.
Fine, Kit. 1978b. “Model Theory for Modal Logic – Part II: The Elimination of De Re Modality.” Journal of Philosophical Logic 7 (1): 277-306.
Fine, Kit. 1979. “Failures of the Interpolation Lemma in Quantified Modal Logic.” The Journal of Symbolic Logic 44 (2): 201-06.
Fine, Kit. 1980. “First-order Modal Theories II – Propositions.” Studia Logica 39 (2/3): 159-202.
Fine, Kit. 1981a. “First-order Modal Theories I – Sets.” Noûs 15 (2): 177-205.
Fine, Kit. 1981b. “Model Theory for Modal Logic – Part III: Existence and Predication.” Journal of Philosophical Logic 10 (3): 293-307.
Fine, Kit. 1982a. “Acts, Events and Things.” In Language and Ontology, edited by W. Leinfellner, E. Kraemer and J. Schank. Wien: Hölder-Pichler-Tempsky, as part of the proceedings of the Sixth International Wittgenstein Symposium 23rd to 30th August 1981, Kirchberg/Wechsel (Austria).
Fine, Kit. 1982b. “First-order Modal Theories III – Facts.” Synthese 53: 43-122.
Fine, Kit. 1982c. “The Problem of Non-existents.” Topoi 1: 97-140.
Fine, Kit. 1983a. “A Defence of Arbitrary Objects.” Proceedings of the Aristotelian Society, Supplementary Volume 57: 55-77.
Fine, Kit. 1983b. “The Permutation Principle in Quantificational Logic.” Journal of Philosophical Logic 12 (1): 33-37.
Fine, Kit. 1984a. “Critical Review of Parsons’ Non-Existent Objects.” Philosophical Studies 45 (1): 95-142.
Fine, Kit. 1985a. “Logics Containing K4 – Part II.” The Journal of Symbolic Logic 50 (3): 619-51.
Fine, Kit. 1985b. “Natural Deduction and Arbitrary Objects.” Journal of Philosophical Logic 14: 57-107.
Fine, Kit. 1985c. “Plantinga on the Reduction of Possibilist Discourse.” In Alvin Plantinga, edited by J. E. Tomberlin and P. van Inwagen. Dordrecht: Reidel. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1985d. Reasoning with Arbitrary Objects. Oxford: Blackwell.
Fine, Kit. 1988. “Semantics for Quantified Relevance Logic.” Journal of Philosophical Logic 17 (1): 27-59.
Fine, Kit. 1989a. “Incompleteness for Quantified Relevance Logics.” In Directions in Relevant Logics, edited by R. Sylvan and J. Norman. Dordrecht: Kluwer.
Fine, Kit. 1989b. “The Justification of Negation as Failure.” In Proceedings of the Congress on Logic, Methodology and the Philosophy of Science VIII, edited by J. Fenstad, T. Frolov and R. Hilpinen. Amsterdam: Elsevier Science Publishers B. V.
Fine, Kit. 1989c. “The Problem of De Re Modality.” In Themes from Kaplan, edited by J. Almog, J. Perry and H. Wettstein. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1990. “Quine on Quantifying In.” In Proceedings of the Conference on Propositional Attitudes, edited by C. A. Anderson and J. Owens. Stanford: CSLI. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 1992. “Aristotle on Matter.” Mind 101 (401): 35-57.
Fine, Kit. 1994a. “Compounds and Aggregates.” Noûs 28 (2): 137-58.
Fine, Kit. 1994b. “Essence and Modality.” Philosophical Perspectives 8: 1-16.
Fine, Kit. 1994c. “A Puzzle Concerning Matter and Form.” In Unity, Identity, and Explanation in Aristotle’s Metaphysics, edited by T. Scaltsas, D. Charles and M. L. Gill. Oxford: Oxford University Press.
Fine, Kit. 1994d. “Senses of Essence.” In Modality, Morality and Belief: Essays in Honor of Ruth Barcan Marcus, edited by W. Sinnott-Armstrong. Cambridge: Cambridge University Press.
Fine, Kit. 1994e. “The Study of Ontology.” Noûs 25 (3): 263-94.
Fine, Kit. 1995a. “The Logic of Essence.” Journal of Philosophical Logic 24: 241-73.
Fine, Kit. 1995b. “Ontological Dependence.” Proceedings of the Aristotelian Society 95: 269-90.
Fine, Kit. 1995c. “Part-Whole.” In The Cambridge Companion to Husserl, edited by B. Smith and D. Woodruff. Cambridge: Cambridge University Press.
Fine, Kit. 1995d. “The Problem of Mixture.” Pacific Philosophical Quarterly 76 (3-4): 266-369.
Fine, Kit. 1998. “Cantorian Abstraction: A Reconstruction and Defense.” The Journal of Philosophy 95 (12): 599-634.
Fine, Kit. 1999. “Things and Their Parts.” Midwest Studies in Philosophy 23: 61-74.
Fine, Kit. 2000a. “A Counter-example to Locke’s Thesis.” The Monist 83 (3): 357-61.
Fine, Kit. 2000b. “Neutral Relations.” The Philosophical Review 109 (1): 1-33.
Fine, Kit. 2000c. “Semantics for the Logic of Essence.” Journal of Philosophical Logic 29 (6): 543-84.
Fine, Kit. 2001. “The Question of Realism.” Philosophers’ Imprint 1 (2): 1-30.
Fine, Kit. 2002a. The Limits of Abstraction. Oxford: Clarendon Press.
Fine, Kit. 2002b. “The Problem of Possibilia.” In Handbook of Metaphysics, edited by D. Zimmerman. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 2002c. “The Varieties of Necessity.” In Conceivability and Possibility, edited by T. S. Gendler and J. Hawthorne. Oxford: Oxford University Press. Reprinted in Modality and Tense: Philosophical Papers.
Fine, Kit. 2003a. “The Non-Identity of a Material Thing and Its Matter.” Mind 112 (446): 195-234.
Fine, Kit. 2003b. “The Role of Variables.” The Journal of Philosophy 100 (12): 605-31.
Fine, Kit. 2005a. “Class and Membership.” The Journal of Philosophy 102 (11): 547-72.
Fine, Kit. 2005b. Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
Fine, Kit. 2005c. “Necessity and Non-existence.” In Modality and Tense: Philosophical Papers.
Fine, Kit. 2005d. “Our Knowledge of Mathematical Objects.” In Oxford Studies in Epistemology, edited by T. S. Gendler and J. Hawthorne. Oxford: Clarendon Press.
Fine, Kit. 2005e. “Reference, Essence, and Identity.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
Fine, Kit. 2005f. “Tense and Reality.” In Modality and Tense: Philosophical Papers. Oxford: Clarendon Press.
Fine, Kit. 2006a. “In Defense of Three-Dimensionalism.” The Journal of Philosophy 103 (12): 699-714.
Fine, Kit. 2006b. “The Reality of Tense.” Synthese 150 (3): 399-414.
Fine, Kit. 2006c. “Relatively Unrestricted Quantification.” In Absolute Generality, edited by A. Rayo and G. Uzquiano. Oxford: Clarendon Press.
Fine, Kit. 2007a. “Response to Kathrin Koslicki.” dialectica 61 (1): 161-66.
Fine, Kit. 2007b. Semantic Relationism. Oxford: Blackwell Publishing.
Fine, Kit. 2008a. “Coincidence and Form.” Proceedings of the Aristotelian Society, Supplementary Volume 82 (1): 101-18.
Fine, Kit. 2008b. “The Impossibility of Vagueness.” Philosophical Perspectives 22 (Philosophy of Language): 111-36.
Fine, Kit. 2009. “The Question of Ontology.” In Metametaphysics: New Essays on the Foundations of Ontology, edited by D. Chalmers, D. Manley and R. Wasserman. Oxford: Oxford University Press.
Fine, Kit. 2010a. “Semantic Necessity.” In Modality: Metaphysics, Logic, and Epistemology, edited by B. Hale and A. Hoffmann. Oxford: Oxford University Press.
Fine, Kit. 2010b. “Some Puzzles of Ground.” Notre Dame Journal of Formal Logic 51 (1): 97-118.
Fine, Kit. 2010c. “Towards a Theory of Part.” The Journal of Philosophy 107.
Fine, Kit. 2011a. “Aristotle’s Megarian Manoeuvres.” Mind 120 (480): 993-1034.
Fine, Kit. 2011b. “What is Metaphysics?” In Contemporary Aristotelian Metaphysics, edited by T. E. Tahko. Cambridge: Cambridge University Press.
Fine, Kit. 2012a. “Counterfactuals without Possible Worlds.” The Journal of Philosophy 109 (3): 221-46.
Fine, Kit. 2012b. “A Difficulty for the Possible Worlds Analysis of Counterfactuals.” Synthese 189 (1): 29-57.
Fine, Kit. 2012c. “Guide to Ground.” In Metaphysical Grounding: Understanding the Structure of Reality, edited by F. Correia and B. Schnieder. Cambridge: Cambridge University Press.
Fine, Kit. 2012d. “The Pure Logic of Ground.” The Review of Symbolic Logic 5 (1): 1-25.
Fine, Kit. 2013a. “Fundamental Truths and Fundamental Terms.” Philosophy and Phenomenological Research 87 (3): 725-32.
Fine, Kit. 2014a. “Permission and Possible Worlds.” dialectica 68 (3): 317-36.
Fine, Kit. 2014b. “Truth-Maker Semantics for Intuitionistic Logic.” Journal of Philosophical Logic 43: 549-77.
Fine, Kit. 2015a. “The Possibility of Vagueness.” Synthese 194 (10): 3699-725.
Fine, Kit. 2015b. “Unified Foundations for Essence and Ground.” Journal of the American Philosophical Association 1 (2): 296-311.
Fine, Kit. 2016a. “Angellic Content.” Journal of Philosophical Logic 45 (2): 199-226.
Fine, Kit. 2016b. “Identity Criteria and Ground.” Philosophical Studies 173 (1): 1-19.
Fine, Kit. 2016c. “Williamson on Fine on Prior on the Reduction of Possibilist Discourse.” Canadian Journal of Philosophy 46 (4-5): 548-70.
Fine, Kit. 2017a. “Form.” The Journal of Philosophy 114 (10): 509-35.
Fine, Kit. 2017b. “Naive Metaphysics.” Philosophical Issues 27: 98-113.
Fine, Kit. 2017c. “Truthmaker Semantics.” In A Companion to the Philosophy of Language, edited by B. Hale, C. Wright and A. Miller. West Sussex: Wiley-Blackwell.
Fine, Kit. 2018a. “Ignorance of Ignorance.” Synthese 195 (9): 4031-45.
Fine, Kit. 2019c. “Constructing the Impossible.” To appear in a collection of papers for Dorothy Edgington.
Fine, Kit. 2020a. “Semantics.” In The Routledge Handbook of Metaphysical Grounding, edited by M. J. Raven. New York: Routledge.
Fine, Kit, and Ben Fine. 1974d. “Social Choice and Individual Rankings I.” Review of Economic Studies 41: 303-22.
Fine, Kit, and Ben Fine. 1974e. “Social Choice and Individual Rankings II.” Review of Economic Studies 41: 459-75.
Fine, Kit, and Timothy McCarthy. 1984b. “Truth without Satisfaction.” Journal of Philosophical Logic 13 (4): 397-421.
Fine, Kit, and Gerhard Schurz. 1996. “Transfer Theorems for Multimodal Logics.” In Logic and Reality: Essays on the Legacy of Arthur Prior, edited by J. Copeland. Oxford: Clarendon.
Frege, Gottlob. 1892. “On Sense and Reference.” In Translations from the Philosophical Writings of Gottlob Frege, edited by P. T. Geach and M. Black. Oxford: Blackwell.
Hume, David. 1739. A Treatise of Human Nature, edited by L. A. Selby-Bigge and P. H. Nidditch. Oxford: Clarendon Press.
Kripke, Saul. 1972. Naming and Necessity. Cambridge, MA: Harvard University Press.
Kripke, Saul. 2011. “A Puzzle about Belief.” In Philosophical Troubles: Collected Papers, Volume I. Oxford: Oxford University Press.
Leibniz, Gottfried Wilhelm. 1714. Monadology.
Lewis, David. 1973. Counterfactuals. Oxford: Blackwell Publishers.
Lewis, David. 1986. On the Plurality of Worlds. Oxford: Blackwell Publishers.
Lewis, David. 1991. Parts of Classes. Oxford: Blackwell.
Locke, John. 1689. An Essay Concerning Human Understanding.
McTaggart, J. M. E. 1908. “The Unreality of Time.” Mind 17: 457-74.
Prior, A. N. 1960. “The Autonomy of Ethics.” Australasian Journal of Philosophy 38 (3): 199-206.
Quine, Willard Van Orman. 1948. “On What There Is.” Review of Metaphysics 2: 21-38. Reprinted in From a Logical Point of View, 2nd ed., Harvard: Harvard University Press, 1980, 1-19.
Raven, Michael J. 2019. “(Re)discovering Ground.” In Cambridge History of Philosophy, 1945 to 2015, edited by K. M. Becker and I. Thomson. Cambridge: Cambridge University Press.
Raven, Michael J., ed. 2020. The Routledge Handbook of Metaphysical Grounding. New York: Routledge.
Spinoza, Baruch. 1677. Ethics, Demonstrated in Geometrical Order.
Stalnaker, Robert. 1968. “A Theory of Conditionals.” In Studies in Logical Theory, edited by N. Rescher. Oxford: Blackwell.
Williamson, Timothy. 2013b. Modal Logic as Metaphysics. Oxford: Oxford University Press.

Author Information

Mike Raven
Email: mike@mikeraven.net
University of Victoria
Canada

Immanuel Kant: Logic

For Immanuel Kant (1724–1804), formal logic is one of three paradigms for the methodology of science, along with mathematics and modern-age physics. Formal logic owes this role to its stability and relatively finished state, which Kant claims it has possessed since Aristotle. Kant’s key contribution lies in his focus on the formal and systematic character of logic as a “strongly proven” (apodictic) doctrine. He insists that formal logic should abstract from all content of knowledge and deal only with our faculty of understanding (intellect, Verstand) and our forms of thought. Accordingly, Kant considers logic to be short and very general but, on the other hand, apodictically certain. In distinction to his contemporaries, Kant proposed excluding from formal logic all topics that do not properly belong to it (for example, psychological, anthropological, and metaphysical problems). At the same time, he distinguished the abstract certainty (that is, certainty “through concepts”) of logic (and philosophy in general) from the constructive evidence of mathematical knowledge. The idea of formal logic as a system led Kant to fundamental questions, including questions about the first principles of formal logic, redefinitions of logical forms with respect to those first principles, and the completeness of formal logic as a system. Through this approach, Kant raised some essential problems that later motivated the rise of modern logic. Kant’s remarks and arguments on a system of formal logic are spread throughout his works (including his lectures on logic). Nonetheless, he never published an integral, self-contained presentation of formal logic as a strongly proven doctrine. A lively dispute has thus developed among scholars about how to reconstruct his formal logic as an apodictic system, in particular concerning his justification of the completeness of his table of judgments.

One of Kant’s main results is his establishment of transcendental logic, a foundational part of philosophical logic that concerns the possibility of the strictly universal and necessary character of our knowledge of objects. Formal logic provides transcendental logic with a basis (“clue”) for establishing its fundamental concepts (categories), which can be obtained by reinterpreting the logical forms of judgment as the forms of intuitively given objects. Similarly, forms of inference provide a “clue” for transcendental ideas, which lead to higher-order and meta-logical perspectives. Transcendental logic is crucial to and forms the largest part of Kant’s foundations of metaphysics, as they are critically investigated and presented in his main work, the Critique of Pure Reason.

This article focuses on Kant’s formal logic in the systematic order of logical forms and outlines Kant’s approach to the foundations of formal logic. The main characteristics of Kant’s transcendental logic are presented, including his system of categories and transcendental ideas. Finally, a short overview is given of the subsequent role of Kant’s logical views.

Introduction
The Concept of Formal Logic
Concept
Judgment
Inference
General Methodology
The Foundations of Logic
Transcendental Logic (Philosophical Logic)
Influences and Heritage
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Introduction

Presentations of the history of logic published at the beginning of the 21st century seem to positively re-evaluate Kant’s role, especially with regard to his conceptual work that led to a new development of logic (see, for example, Tiles 2004). Although older histories of logic written from the standpoint of mathematical logic did appreciate Kant’s restitution of the formal side of logic, they ascribed to Kant a relatively unimportant role. They criticized him for what seemed to be his view that logic does not, in principle, exceed its traditional, Aristotelian boundaries (Kneale and Kneale 1991) and for his principled separation of logic and mathematics (Scholz 1959). Nevertheless, during the 20th century, some Kant scholars confirmed and extensively elaborated on his relevance to mathematical logic (for example, Wuchterl 1958, Schulthess 1981). Moreover, it is significant that several founders of modern logic (including Frege, Hilbert, Brouwer, and Gödel) explicitly referred to and built upon aspects of Kant’s philosophy.

According to Kant, formal logic appears to be an already finished science (accomplished by Aristotle), in which essentially no further development is possible (B VIII). In fact, some of Kant’s statements leave the impression that his views of formal logic may have been largely compiled from contemporary logic textbooks (B 96). Nonetheless, Kant mentions that the logic of his contemporaries was not free of insufficiencies (Prolegomena IV:323). He organized the existing material of formal logic in a specific way; he separated the extraneous (for instance, the psychological, anthropological, and metaphysical) material from formal logic proper. What is particularly important for Kant are his redefinitions of logical forms in terms of formal unity and consciousness. These redefinitions are indispensable for his main contributions: his systematic view of formal logic and the application of this view in transcendental logic.

It also became apparent, primarily due to K. Reich’s 1948 monograph, that Kant’s systematic view of formal logic assumed, as an essential component, a justification of the completeness of formal logic with respect to the forms of our thinking. This conforms with Kant’s critique of Aristotle for his unsystematic, “rhapsodical” approach in devising the list of categories, since Kant intended to repair this deficiency by setting up a system of categories specifically on the basis of formal logic.

Finally, the contemporary development of logic, where logic has far exceeded the shape of a standard (“classical”) mathematical logic, has made it technically possible to explore some features of Kant’s logic that have largely escaped the attention of the earlier, “classically” based perception of Kant’s logic.

Although formal logic is the starting point of Kant’s philosophy, there is no separate text in which Kant systematically, in a strictly scientific way, presented formal logic as a doctrine. Essential parts of this doctrine, however, are contained in his published works, especially those on the foundations of metaphysics, in his handwritten lecture notes on logic (with the addition of Jäsche’s compilation), and in the transcripts of Kant’s lectures on logic. These lectures are based primarily on the textbook by G. F. Meier; and, according to the custom of the time, they include a large amount of material that does not strictly pertain to formal logic. Kant’s view was that it was harmful to beginners to receive instruction in a highly abstract form, in contrast to their concrete and intuitive way of thinking (compare II:305‒306). Nevertheless, many places in Kant’s texts and lectures are pertinent to or reflect the systematic aspect of logic. On this ground, it is possible to reconstruct and describe most of the crucial details of Kant’s doctrine of formal logic.

The reason Kant did not write a systematic presentation of formal logic can be attributed to his focus on metaphysics and the possibility of its foundations. Besides, he might have presumed that the systematic doctrine of formal logic could be recognized from the sections and remarks he had included about it in his written work, at least to the extent to which formal logic was necessary to understand his argument on the foundations of metaphysics. Furthermore, Kant thought that once the principles were determined, a formal analysis (as is required in logic) and a complete derivation of a system could be relatively easily accomplished with the additional help of existing textbooks (see B 27‒28, 108‒109, A XXI: “more entertainment than labor”).

We first present Kant’s doctrine of formal logic, that is, his theory of concepts, judgments and inference and his general methodology. Then, we address the question of the foundations of logic and its systematic character. Finally, we outline Kant’s transcendental logic (that is, logical foundations of metaphysics), especially in relation to formal logic, and give a brief overview of his historical influence.

2. The Concept of Formal Logic

What we here term “formal logic” Kant usually calls “general logic” (allgemeine Logik), in accordance with some of his contemporaries and predecessors (Jungius, Leibniz, Knutzen, Baumgarten). Kant only rarely uses the terms “formal logic” (B 170, also mentioned by Jungius) or “formal philosophy” (Groundwork of the Metaphysics of Morals IV:387), and he preferred to define “logic” in this general sense as a science of the “formal rules of thinking,” rather than merely a general doctrine of understanding (Verstand) (XVI refl. 1624; see B IX, 78, 79, 172). Let us note the distinction between Kant’s use of the term “formal philosophy” and its contemporary use (philosophy in which modern formalized methods are applied).

The following are the essential features of Kant’s formal logic (see B 76‒80):

(1) Formal logic is general inasmuch as it disregards the content of our thought and the differences between objects. Instead, it deals only with the form and general rules of thought and can serve only as a canon for judging the correctness of thought. By contrast, a special logic pertains to a special kind of object and is conjoined with some special science as its organon to extend the content of knowledge.

(2) Formal logic is pure, as it is not concerned with the psychological empirical conditions under which we think and that influence our thought. These psychological conditions are dealt with in applied logic. In general, pure logic does not incorporate any empirical principles, and according to Kant, it is only in this way that it can be established as a science that proves its propositions with strong certainty.

Formal logic should abstract from the distinction of whether the content to which logical forms apply is pure or empirical. Therefore, formal logic is distinguished from transcendental logic, which is a special logic of pure (non-empirical) thinking and which deals with the origin of our cognitions that is independent of given objects. However, transcendental logic is, in a sense, also general, because it deals with the general content of our thought—that is, with the categories that determine all objects.

It is clear that Kant conceives logical forms, as forms of thought, in mentalistic, albeit not in psychological terms. For him, forms of thought are ways of establishing a unity of our consciousness with respect to a given variety of representations. In this context, consciousness comes into play quite abstractly as the most general instance of unity, since ultimately it is we ourselves, in our own consciousness, who are uniting and linking representations given to us. This abstract (non-empirical) unity is to be distinguished from a mere psychological association of representations, which is dispersed and dependent on changing subjective states, and thus cannot establish unity.

By using a mentalistic approach, Kant stresses the operational character of logic. For him, a logical form is a result of the abstract operations of our faculty of understanding (Verstand), and it is through these operations that a unity of our representations can be established. In connection with this, Kant defines function as “the unity of the action [Handlung] of ordering different representations under a common one” (B 93) and he considers logical forms to be based on functions. We see in more detail below how Kant applies his concept of function to logical forms. Further historical development and modifications of Kant’s notion of function can be traced in Frege’s notion of “concept” and Russell’s “propositional functions.”

3. Concept

According to Kant, the unity that a concept establishes from a variety of representations is a unity in a common mark (nota communis) of objects. The form of a concept as a common mark is universality, and its subject matter is objects. Three types of operations of understanding bring about a concept: comparison, reflection, and abstraction.

(1) Through comparison, as a preparatory operation, we become conscious of the identity and difference of objects, and come to an identical mark that is contained in representations of many things. This is a common mark of these things, which is a “partial concept” contained in their representations; other marks may also be contained in these representations, making the things different from one another.

(2) Through reflection, which is essential for concept formation, we become conscious of a common mark as belonging to and holding of many objects. This is a “ground of cognition” (Erkenntnisgrund) of objects, which universally holds of them. Universality (“universal validity”) is the form through which we conceive many objects in one and the same consciousness.

(3) Through abstraction, we leave out (“abstract from”) the differences between objects and retain only their common mark in our consciousness.

Kant characterizes the sort of unity that is established by a concept in the following, foundational way. Each concept, as a common mark that is found in many representations, has an analytic unity (identity) of consciousness “on itself.” At the same time, the concept is presupposed to belong to these, possibly composed, representations, where it is combined (synthesized) with the other component marks. That is, each concept presupposes a synthetic unity of consciousness (B 134 footnote).

On the ground of this functional theory of concepts, Kant explains the distinction between the content (intension) and the extension (sphere) of a concept. This distinction stems from the so-called Port-Royal logic (by A. Arnauld and P. Nicole) of the 17th century and has since become standard in so-called traditional logic (that is, in logic before or independent of its transformation starting with Boole and Frege’s methodology of formalization). According to Kant, concept A has a content in the sense that A is a “partial concept” contained in the representation of an object; concept A has extension (sphere) in the sense that A universally holds of many objects that are contained under A (Jäsche Logic §7 IX:95, XVI refl. 2902, Reich 1948 p. 38). The content of A can be complex, that is, it can contain many marks in itself. The content and extension of a concept A stand in an inversely proportional relationship: the more concept A contains under itself, the less A contains in itself, and vice versa.
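
The inverse relationship between content and extension can be illustrated with a small sketch. It rests on a simplifying assumption that is not Kant's own formulation: a concept's content is modeled as a finite set of marks, and an object falls under the concept when its representation contains all of those marks. The toy domain `OBJECTS` and the function name `extension` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy domain: each object is represented by the set of marks
# contained in its representation (an illustrative assumption).
OBJECTS = {
    "oak": {"body", "living", "plant"},
    "dog": {"body", "living", "animal"},
    "stone": {"body"},
}


def extension(content):
    """Return the names of the objects contained under a concept.

    An object falls under the concept when every mark in the concept's
    content is also a mark of the object.
    """
    return {name for name, marks in OBJECTS.items() if content <= marks}


# Three concepts with successively richer content.
body = frozenset({"body"})
living_body = frozenset({"body", "living"})
animal = frozenset({"body", "living", "animal"})

for concept in (body, living_body, animal):
    print(sorted(concept), "->", sorted(extension(concept)))
```

In this model, adding a mark to a concept's content can never enlarge its extension, which is one way of reading the claim that the more a concept contains in itself, the less it contains under itself.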

A traditional doctrine (mainly originating from Aristotle) of the relationship between concepts can also be built on the basis of Kant’s theory of concepts. A concept B can be contained under A if A is contained in B, that is, as Kant says, if A is a note (a “ground of cognition”) of B. In this case, Kant calls A a higher concept with respect to B, and B a lower concept with respect to A. Kant also says that A is a “mark of a mark” B (a distant mark). Obviously, A is not meant as a second-order mark but rather as a mark of the same order as B. Also, A is a genus of B, while B is a species of A. Through abstraction, we ascend to higher and higher concepts; through determination, we descend to lower and lower concepts. The relationship between higher and lower concepts is subordination, and the relationship between lower concepts among themselves without mutual subordination is coordination. According to Kant, there is no lowest species, because we can always add a new mark to a given concept and thus make it more specific. Finally, with respect to extension, a higher concept is wider, and a lower concept is narrower. The concepts with the same extension are called reciprocal.
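
The relations between higher and lower concepts described above can be sketched in the same set-of-marks style. This is only an illustrative model under the assumption that a concept's content is a finite set of marks; the function names `is_higher`, `coordinated`, and `determine` are hypothetical and do not come from Kant's text.

```python
def is_higher(a, b):
    """A is a higher concept than B when A's content is properly contained in B's."""
    return a < b


def coordinated(a, b):
    """Two concepts are coordinated here when neither is subordinate to the other."""
    return not (a <= b) and not (b <= a)


def determine(concept, new_mark):
    """Descend to a lower concept by adding one more mark to the content."""
    return concept | {new_mark}


body = frozenset({"body"})
animal = frozenset({"body", "living", "animal"})
plant = frozenset({"body", "living", "plant"})

print(is_higher(body, animal))     # body is a genus of animal
print(coordinated(animal, plant))  # neither is subordinate to the other

# Determination can always be repeated, which mirrors the claim that
# there is no lowest species.
quadruped = determine(animal, "four-footed")
print(is_higher(animal, quadruped))
```

Because `determine` can be applied to any concept with a mark it does not yet contain, the model never reaches a concept that cannot be made more specific.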

4. Judgment

Judgment is for Kant the way to bring given representations to the objective unity of self-consciousness (see B 141, XVI refl. 3045). Because of this unifying of a manifold (of representations) in one consciousness, Kant conceives judgment as a rule (Prolegomena §23 IV:305, see Jäsche Logic §60 IX:121). For example, the objective unity is the meaning of the copula “are” in the judgment “All bodies are heavy”; what is meant is not our subjective feeling of heaviness, but rather the objective state of affairs that bodies are heavy (see B 142), which is representable by a thinking agent (“I”) irrespective of the agent’s changeable psychological states.

As Kant points out, there is no other logical use of concepts except in judgments (B 93), where a concept, as a predicate, is related to objects by means of another representation, a subject. No concept is related to objects directly (like intuition). In a judgment, a concept becomes an assertion (predicate) that is related to objects under some condition (subject) by means of which objects are represented. A logical unity of representations is thus established in the following way: many objects that are represented by means of some condition A are subsumed under some general assertion B, under which other conditions A’, A”, . . . too can possibly be subsumed. The unity of a judgment is objective, since it is conditioned by a representation (a subject concept or a judgment) that is objective or related to objects. The objective unity in a judgment is generalized by Kant so as to hold not merely between concepts (subject and predicate), but also between judgments themselves (as parts of a hypothetical or a disjunctive judgment).

According to Kant, the aspects and types of the unity of representations in a judgment can be exhaustively and systematically described and brought under the four main “titles”: quantity, quality, relation, and modality. This is a famous division of judgments that became standard in traditional logic after Kant.

a. Quantity and Quality

The assertion of a judgment can be related to its condition of objectivity without any exception or with a possible exception. In the first case, the judgment is universal (for example, “All A are B”), and in the second case, it is particular (for example, “Some A are B”).

With respect to a given condition of objectivity, an assertion is combined or not combined with it. In the first case, the judgment is affirmative (for example, “Some A are B”), while in the second case, it is negative (for example, “Some A are not B”).

If taken together, quantity and quality yield the four traditionally known (Aristotelian) types of judgment: universal affirmative (“All A are B,” AaB), universal negative (“No A is B,” AeB), particular affirmative (“Some A are B,” AiB), and particular negative (“Some A are not B,” AoB).

b. Relation

In a judgment, an assertion is brought under some condition of objective validity. There are three possible relations of the condition of objective validity to the assertion—subject–predicate, antecedent–consequent, and whole–members—each one represented by an appropriate exponent (“copula” in a wider sense).

(1) In a categorical judgment, a concept (B) as a predicate is brought under the condition of another concept (A) that is a subject that represents objects. Predicate B is an assertion that obtains its objective validity by means of the subject A as the condition:

x, which is contained under A, is also under B (XVI refl. 3096, Jäsche Logic §29 IX:108, symbols modified).

The relation of a categorical judgment is represented by the copula “is.” A categorical judgment stands under the principle of contradiction, which is formulated by Kant in the following way:

No predicate contradictory of a thing can belong to it (B 190).

Hence, there is no violation of the principle of contradiction in stating “A is B and non-B” as long as neither B nor non-B contradicts A. However, “and” is not a logical operator for Kant, since it can be relativized by time: “A is B” and “A is non-B” can both be true, but at different moments in time (B 192). (Thus, Kant’s logic of categorical judgments can be considered as “paraconsistent,” in the sense that p and not-p, not violating the law of contradiction, do not entail an arbitrary judgment.)

(2) In a hypothetical judgment, some judgment (say, categorical), q, is an assertion that obtains its objective validity under the condition of another judgment, p: q is called a consequent, p its antecedent (ground), while their relation is what Kant calls (in accordance with other logics of the time) consequence. The exponent of the hypothetical judgment is “if . . . then . . .,” but it need not correspond to the main operator of a judgment in the sense of the syntax in modern logic. This means that a hypothetical judgment is not simply a conditional, since, for instance, it should also include universally quantified propositions like “If the soul is not composite, then it is not perishable,” which could be naturally formalized as ∀x ((Sx ∧ ¬Cx) → ¬Px) (compare Dohna-Wundlacken Logic XXIV-II:763; see examples in LV-I:203, LV-II:472). Let us note that “If something is a human, then it is mortal” is for Kant a hypothetical judgment, in distinction to the categorical judgment “All humans are mortal” (Vienna Logic XXIV-II:934, Hechsel Logic LV-II:31).

A hypothetical judgment stands under the principle of sufficient reason:

Each assertion has its reason.

Not having a reason contradicts the concept of assertion. By this principle (to be distinguished from Leibniz’s ontological principle of the same name), q and not-q are excluded as consequents of the same antecedent: they cannot be grounded on one and the same reason. As can be seen, only now do we come to a version of the Aristotelian principle of contradiction, according to which no predicate can “simultaneously” belong and not belong to the same subject. On the other hand, we have no guarantee that there will always be an antecedent sufficient to decide between some p and not-p as its possible consequents. (In this sense, it could be said that Kant’s logic of assertions is “paracomplete.”)

(3) In a disjunctive judgment, the component judgments are parts of some whole (the disjunctive judgment itself) as their condition of objective validity. That is, the objectively valid assertion is one of the mutually exclusive but complementary parts of the whole, for example:

x, which is contained under A, is contained either under B or C, etc. (XVI refl. 3096, Jäsche Logic §29 IX:108).

The exponent of the disjunctive relation is “either . . . or . . .” in the exclusive sense, and, again, it should not be identified with the main operator in the modern sense. To see this, let us take Kant’s example of a disjunctive judgment, “A learned man is learned either historically or rationally,” which would, in a modern formalization, give a universally quantified sentence ∀x (Lx → (Hx ∨ Rx)) (Jäsche Logic §29 IX:107).

In a disjunctive judgment, under the condition of an objective whole, some of its parts hold with the exclusion of the rest of the parts. A disjunctive judgment stands under the principle of excluded middle between p and not-p, since it is a contradiction to assert (or to deny) both p and not-p.

Remark. With respect to relation, a judgment is gradually made more and more determinate: from allowing mutually contradictory predicates, to excluding such contradictions on some ground but allowing undecidedness among them, to positing exactly one of the contradictory predicates by excluding the others. Through the three relations in a judgment, we step by step upgrade the conditions of a judgment, improve its unity, and strengthen logical laws, starting from paraconsistency and paracompleteness to finally come to a sort of classical logic.

In general, we can see that relation is what the objective unity of consciousness in a judgment basically consists in: it is a unifying function that (in three ways) relates a manifold of given representations to some condition of their objectivity. Since judgment is generally defined as a manner of bringing our representations to the objective unity of consciousness, the relation of a judgment makes the essential aspect of a judgment.

c. Modality

This is one of the most distinctive parts of Kant’s logic, revealing its purely intensional character. One and the same judgment structure (quantity, quality, and relation of a judgment) can be thought by means of varying and increasing strength as possible, true, and necessary. Correspondingly, Kant distinguishes

(1) problematic,

(2) assertoric, and

(3) apodictic

judgments (assertoric judgment is called “proposition,” Satz). For example, the antecedent p of a hypothetical judgment is thought merely as problematic (“if p”); secondly, p can also occur outside a hypothetical judgment as, for some reason, an already accepted judgment, that is, as assertoric; finally, p can occur as necessarily accepted on the ground of logical laws, thus apodictic.

These modes of judgment pertain just to how a judgment is thought, that is, to the way the judgment is accepted by understanding (Verstand). Kant says that (1) problematic modality is a “free choice,” an “arbitrary assumption,” of a judgment; (2) assertoric modality (in a proposition) is the acceptance of a judgment as true (logical actuality); while (3) apodictic modality consists in the “inseparable” connection with understanding (see B 101).

There is no special exponent (or operator) of modality; modality is just the “value,” “energy,” of how the existing exponent of a relation in a judgment is thought. Modality is in an essential sense distinguished from the quantity, quality, and relation, which, in distinction, constitute the logical content of a judgment (see B 99‒100; XVI refl. 3084).

Despite its very specific nature, modality is correlated in a significant way, through logical laws, with the relation of a judgment:

(1) logical possibility of a problematic judgment is judged with respect to the principle of contradiction—no judgment that violates this principle is logically possible;

(2) logical actuality (truth) of an assertoric judgment is judged with respect to the grounding of the judgment on some sufficient reason;

(3) logical necessity of an apodictic judgment is judged with respect to the decidability of the judgment on the ground of the principle of excluded middle

(see Kant’s letter to Reinhold from May 19, 1789 XI:45; Reich 1948 pp. 73‒76).

The interconnection of relation and modality is additionally emphasized by the fact that Kant sometimes united these two aspects under the title of queity (quaeitas) (XVI refl. 3084, Reich 1948 pp. 60‒61).

d. Systematic Overview


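Kant gives an overview of his formal logical doctrine of judgments by means of the following table of judgments:

Quantity: universal, particular

Quality: affirmative, negative

Relation: categorical, hypothetical, disjunctive

Modality: problematic, assertoric, apodictic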

In his transcendental logic, Kant adds singular and infinite judgments as special judgment types. In formal logic (as was usual in logic textbooks of Kant’s time), they are subsumed under universal and affirmative judgments, respectively (see B 96‒97). A characteristic departure from the custom of 17th- and 18th-century logic textbooks is Kant’s (generalized) aspect of relation, which is not reducible to the subject–predicate relation, and directly comprises categorical, hypothetical, and disjunctive judgments—bypassing, for example, subdivision into simple and compound judgments. Another divergence from the custom of the time is Kant’s understanding of modality as independent of explicit modal expressions (“necessarily,” “contingently,” “possibly,” “impossibly”). Instead, Kant understands modality as an intrinsic moment of each judgment (for example, the antecedent and the consequent of a hypothetical judgment are as such problematic, and the consequence between them is assertoric), in distinction to the customary division into “pure” and “modal” propositions. The result of this was a more austere system of judgments that is reduced to strictly formal criteria in Kant’s sense and avoids the admixture of psychological, metaphysical, or anthropological aspects (B VIII).

Kant’s table of judgments has a systematic value within his formal logic. The fact that Kant uses the tabular method to give an overview of the doctrine of judgments shows, according to his methodological view on the tabular method (Section 6), that he is only summarizing a systematic whole of knowledge. Formal logic, as a system, is a “demonstrated doctrine” (Section 6), where everything “must be certain completely a priori” (B 78, compare many other places like B IX; A 14; Prolegomena IV:306; Groundwork of the Metaphysics of Morals IV:387; XVI refl. 1579 p. 21, 1587, 1620 p. 41, 1627, 1628; Preisschrift XX:271). Kant’s text supports the view that his formal logic should include a systematic, a priori justification of his table of judgments, despite dispute among scholars about how this justification can be reconstructed (see Section 7).

5. Inference

In an inference, a judgment is represented as “unfailingly” (that is, a priori, necessarily) connected with (and “derived” from) another judgment that is its ground (see B 360).

Kant distinguishes two ways we can derive a judgment (conclusion) from its ground:

(a) by the formal analysis of a given judgment (ground, premise), without the aid of any additional judgment—such an inference, which is traditionally known as immediate consequence, Kant calls an inference of understanding (Verstandesschluß, B 360);

(b) by the subsumption under some already accepted judgment (major premise) with the aid of some mediate judgment (additional, minor premise)—this is an inference of reason (Vernunftschluß), that is, a syllogism (B 360, compare, for example, XVI refl. 3195, 3196, 3198, 3201).

Kant distinguishes between “understanding” (Verstand) and “reason” (Vernunft) in the following way: understanding is the faculty of the unity of representations (“appearances”) by means of rules, while “reason” is the faculty of the unity of rules by means of principles (see B 359, 356, 361). Obviously, inference of understanding essentially remains at the unity already established by means of a given judgment (rule), whereas inference of reason starts from a higher unity (principle) under which many judgments can be subsumed.

Additionally, we can infer a conclusion by means of a presumption on the ground of already accepted judgments. This inference Kant names inference of the power of judgment (Schluß der Urteilskraft), but he does not consider it to belong to formal logic in a proper sense, since its conclusion, because of possible exceptions, does not follow with necessity.

a. Inference of Understanding (Immediate Consequence)

This part of Kant’s logical theory includes a variant of the traditional (Aristotelian) doctrine of immediate consequence, but as grounded in Kant’s previously presented theory of judgment. According to Kant, in an inference of understanding, we merely analyze a given judgment with respect to its logical form. Thus, Kant divides inference of understanding in accordance with his division of judgments:

(a) with respect to the quantity of a judgment, an inference is possible by subalternation: from a universal judgment to its corresponding particular judgment of the same quality (AaB / AiB, AeB / AoB);

(b) with respect to the quality of a judgment, an inference is possible according to the square of opposition (which usually includes subalternation): of the contradictories (AaB and AoB, AeB and AiB), one is true and the other false; of the contraries (AaB and AeB), at least one is false; of the subcontraries (AiB and AoB), at least one is true;

(c) with respect to the relation of a judgment, there is an inference by conversion (simple or changed): if B is (not) predicated of A, then A is (not) predicated of B (AaB / BiA, AeB / BeA, AiB / BiA);

(d) with respect to modality, an inference is possible by contraposition (for example AaB / non-BeA); Kant assigns contraposition to modality because the contraposition changes the logical actuality of the premise (proposition) to the necessity of the conclusion; that is, granted the premise, the conclusion expresses the exclusion (opposite) of self-contradiction (XVI refl. 3170, Hechsel Logic LV-II:448): granted AaB, non-B contradicts A (also, granted AeB or AoB, universal exclusion of non-B contradicts A, that is, non-BiA follows).

These inferences are valid on the ground of Kant’s assumption of the non-contradictory subject concept. Otherwise, if the subject concept is self-contradictory (nothing can be thought by it), then both contradictories would be false. For example, “A square circle is round” and “A square circle is not round” are both false due to the principle of contradiction (Prolegomena §52b IV:341, B 821: “both what one asserts affirmatively and what one asserts negatively of the object [of an impossible concept] are incorrect”; see B 819, 820‒821).
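The dependence of these immediate inferences on a non-contradictory subject concept can be roughly illustrated by a finite check. The sketch below is a modern, extensional stand-in and not Kant’s own (intensional) account: concepts are modeled as subsets of an arbitrary three-element domain, and a subject that nothing falls under is modeled as the empty set. All function and variable names are hypothetical.

```python
from itertools import combinations

UNIVERSE = (0, 1, 2)  # arbitrary small domain, chosen only for this check


def subsets(items, allow_empty=True):
    """All subsets of items, optionally excluding the empty set."""
    smallest = 0 if allow_empty else 1
    return [set(c) for r in range(smallest, len(items) + 1)
            for c in combinations(items, r)]


def a(A, B):
    return A <= B          # AaB: all A are B


def e(A, B):
    return not (A & B)     # AeB: no A is B


def i(A, B):
    return bool(A & B)     # AiB: some A are B


def o(A, B):
    return bool(A - B)     # AoB: some A are not B


RULES = {
    "subalternation": lambda A, B: (not a(A, B) or i(A, B)) and (not e(A, B) or o(A, B)),
    "contradictories": lambda A, B: a(A, B) != o(A, B) and e(A, B) != i(A, B),
    "contraries": lambda A, B: not (a(A, B) and e(A, B)),
    "subcontraries": lambda A, B: i(A, B) or o(A, B),
    "conversion": lambda A, B: ((not a(A, B) or i(B, A))
                                and (not e(A, B) or e(B, A))
                                and (not i(A, B) or i(B, A))),
}


def holds(rule, nonempty_subject=True):
    """Check a rule for every subject A and predicate B over the small domain."""
    subjects = subsets(UNIVERSE, allow_empty=not nonempty_subject)
    return all(rule(A, B) for A in subjects for B in subsets(UNIVERSE))


for name, rule in RULES.items():
    print(f"{name}: nonempty subject={holds(rule)}, any subject={holds(rule, False)}")
```

In this model, every rule passes when the subject is non-empty, while subalternation, contrariety, subcontrariety, and conversion of AaB fail once an empty subject is admitted. The model does not reproduce Kant’s claim that both contradictories are false for a self-contradictory subject: on the set-theoretic reading, AaB and AoB still have opposite truth values. This marks a limit of the extensional stand-in, not a result about Kant’s logic.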

b. Inference of Reason (Syllogism)

Kant considers inference of reason within a variant of traditional theory of syllogisms, which includes categorical syllogism (substantially reduced to the first syllogistic figure), hypothetical syllogism, and disjunctive syllogism, everything shaped and modified in accordance with his theory of judgments and his conception of logic in general.

Each syllogism starts from a judgment that has the role of the major premise. In Kant’s view, the major premise is a general rule under the condition of which (for example, of its subject concept) a minor premise is subsumed. Accordingly, the condition of the minor premise itself (for example, its subject concept) is subsumed in the conclusion under the assertion of the major premise (for example, its predicate) (B 359‒361, B 386‒387). The major premise becomes in a syllogism a (comparative) principle from which other judgments can be derived as conclusions (see B 357, 358). Since there are three species of judgments with respect to relation, Kant distinguishes three species of syllogisms according to the relation of the major premise (B 361, XVI refl. 3199):

(a) Categorical syllogism. Kant starts from a standard doctrine of first syllogistic figure, where the major concept (predicate of the major premise) is put in relation to the minor concept (subject of the minor premise) by means of the middle concept (the subject of the major and the predicate of the minor premise): MaP, SaM / SaP; MeP, SaM / SeP; MaP, SiM / SiP; MeP, SiM / SoP. Kant insists that only the first figure of the categorical syllogism is an inference of reason, whereas in other figures there is a hidden immediate inference (sometimes reductio ad absurdum is needed) by means of which a syllogism can be transformed into the first figure (B 142 footnote, XVI refl. 3256; see The False Subtlety of the Four Syllogistic Figures in II).
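As a rough illustration of the four first-figure forms just listed, the following sketch searches a small domain for counterexamples. It assumes a modern set-theoretic reading of the four judgment types (not Kant’s own account), and a failed search over a three-element domain is only a sanity check, not a proof of validity. The function name and the mood strings are hypothetical conventions: each string gives the letters of the major premise, minor premise, and conclusion.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # arbitrary small domain for the search
SUBSETS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
           for c in combinations(UNIVERSE, r)]

# Set-theoretic stand-ins for the four judgment types "X a Y", "X e Y", "X i Y", "X o Y".
JUDGMENT = {
    "a": lambda X, Y: X <= Y,
    "e": lambda X, Y: not (X & Y),
    "i": lambda X, Y: bool(X & Y),
    "o": lambda X, Y: bool(X - Y),
}


def first_figure_counterexample(major, minor, conclusion):
    """Look for terms M, P, S with 'M major P' and 'S minor M' true but 'S conclusion P' false."""
    for M, P, S in product(SUBSETS, repeat=3):
        premises_true = JUDGMENT[major](M, P) and JUDGMENT[minor](S, M)
        if premises_true and not JUDGMENT[conclusion](S, P):
            return M, P, S
    return None


# The four forms from the text, plus one deliberately invalid form (MaP, SeM / SeP) for contrast.
for mood in ("aaa", "eae", "aii", "eio", "aee"):
    found = first_figure_counterexample(*mood)
    print(mood, "no counterexample found" if found is None else "counterexample exists")
```

The search finds no counterexample for the four listed forms and does find one for the contrasting form, which is the expected behavior if the set-theoretic reading is accepted.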

(b) Hypothetical syllogism. The major premise is a hypothetical judgment, in which the antecedent and the consequent are problematic. Subsumption is accomplished by means of the change of the modality of the antecedent (or of the negation of the consequent) to an assertoric judgment (minor premise), from where in the conclusion the assertoric modality of the consequent (or of the negation of the antecedent) follows. The inference from the affirmation of the antecedent to the affirmation of the consequent is modus ponens, and the inference from the negation of the consequent to the negation of the antecedent is modus tollens of the hypothetical syllogism.

(c) Disjunctive syllogism. The major premise is a disjunctive judgment, where the disjuncts are problematic. Subsumption is carried out by the change of the problematic modality of some disjuncts (or their negations) to assertoric modality, from where in the conclusion the assertoric modality of the negation of other disjuncts (the assertoric modality of other disjuncts) follows. The inference from the affirmation of one part of the disjunction to the negation of the other part is modus ponendo tollens, and the inference from the negation of one part of the disjunction to the affirmation of the other part is modus tollendo ponens of the disjunctive syllogism.

In hypothetical and disjunctive syllogisms, there is no middle term (concept). As explained, the subsumption under the rule of the major premise is carried out just by means of the change of the modality of one part (or of its negation) of the major premise (see XVI refl. 3199).
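The four modes of the hypothetical and disjunctive syllogism can be checked with a small truth table. This is a deliberately modern simplification: the hypothetical judgment is replaced by the material conditional and the disjunctive judgment by a two-place exclusive “or,” whereas for Kant neither exponent is simply a truth-functional operator, and the change of modality from problematic to assertoric is not modeled at all. All names are hypothetical.

```python
from itertools import product


def valid(premises, conclusion):
    """True if the conclusion holds in every valuation of (p, q) that satisfies all premises."""
    return all(conclusion(p, q)
               for p, q in product((True, False), repeat=2)
               if all(premise(p, q) for premise in premises))


def if_then(p, q):
    return (not p) or q   # material conditional, a modern stand-in


def either_or(p, q):
    return p != q         # exclusive disjunction, as the text requires


def inclusive_or(p, q):
    return p or q         # used only to show why exclusivity matters


CHECKS = {
    "modus ponens": valid([if_then, lambda p, q: p], lambda p, q: q),
    "modus tollens": valid([if_then, lambda p, q: not q], lambda p, q: not p),
    "modus ponendo tollens": valid([either_or, lambda p, q: p], lambda p, q: not q),
    "modus tollendo ponens": valid([either_or, lambda p, q: not p], lambda p, q: q),
    "ponendo tollens with inclusive or": valid([inclusive_or, lambda p, q: p], lambda p, q: not q),
    "affirming the consequent": valid([if_then, lambda p, q: q], lambda p, q: p),
}

for name, result in CHECKS.items():
    print(f"{name}: {'valid' if result else 'invalid'}")
```

The last two entries are included for contrast: modus ponendo tollens depends on the exclusive sense of “either . . . or . . .,” and inferring the antecedent from the consequent is not among the modes of the hypothetical syllogism.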

In Kant’s texts, we can find short indications on how a theory of polysyllogisms should be built (for example, B 364, B 387‒389). Inference can be continued on the side of conditions by means of a prosyllogism, whose conclusion is a premise of a given syllogism (an ascending series of syllogisms), as well as on the side of what is conditioned by means of an episyllogism, whose premise is the conclusion of a given syllogism (a descending series of syllogisms). In order to derive, by syllogisms, a given judgment (conclusion), the ascending totality of its conditions should be assumed (either with some first unconditioned condition or as an unlimited but unconditioned series of all conditions) (B 364). In distinction, a descending series from a given conclusion could be only a potential one, since the acceptance of the conclusion, as given, is already granted by the ascending totality of conditions (B 388‒389). By requiring a given, completed ascending series of syllogisms, we advance towards the highest, unconditioned principles (see B 358). In this way, the logical unity of our representations increases towards a maximum: our reason aims at bringing the greatest manifold of representations under the smallest number of principles and to the highest unity (B 361).

c. Inference of the Power of Judgment (Induction and Analogy)

The inference of the power of judgment is only a presumption (“empirical inference”), and its conclusion a preliminary judgment. On the ground of the accordance in many special cases that stand under some common condition, we presume some general rule that holds under this common condition. Kant distinguishes two species of such an inference: induction and analogy. Roughly,

(a) by induction, we conclude from A in many things of some genus B, to A in all things of genus B: from a part of the extension of B to the whole extension of B;

(b) by analogy, we conclude from many properties that a thing x has in common with a thing y, to the possession by x of all properties of y that have their ground in C as a genus of x and y (C is called tertium comparationis): from a part of a concept C to the whole concept C

(see XVI refl. 3282‒3285).

What justifies such reasoning is the principle of our power of judgment, which requires that many cases of accordance should have some common ground (by means of belonging to the extension of the same concept or by having the marks of the same concept). However, since we do not derive this common ground with logical necessity, no objective unity is established, but only presumed, as a result of our subjective way of reflecting.

d. Fallacious Inference

For Kant, fallacious inferences should be explained by illusion (Schein, B 353): an inference may seem to be correct if judged on the ground of its appearance (species, Pölitz Logic XXIV-II:595, Warsaw Logic LV-II:649), although the real form of this proposed inference may be incorrect (just an “imitation” of a correct form, B 353, 354). Through such illusions, logic illegitimately becomes an organon to extend our knowledge outside the limits of the canon of logical forms. Kant calls dialectic the part of logic that deals with the discovery and solutions of logical illusions in fallacious inferences (for example, B 390, 354), in distinction to mere analytic of the forms of thought. Formal logic gives only negative criteria of truth (truth has to be in accordance with logical laws and forms), but cannot give any general material criterion of truth, because material truth depends on the specific knowledge about objects (B 83‒84). Formal logic, which is in itself a doctrine, becomes in its dialectical part the critique of fallacies and of logical illusion. In his logic lectures and texts, Kant addresses some traditionally well-known fallacies (for example, sophisma figurae dictionis, a dicto secundum quid ad dictum simpliciter, sophisma heterozeteseos, ignoratio elenchi, Liar). Below, in connection with Kant’s transcendental logic, we mention some of his own characteristic, systematically important examples of fallacies.

6. General Methodology

Since, according to Kant, formal logic abstracts from the differences of objects and hence cannot focus on the concrete content of a particular science, it can only give a short and very general outline of the form of a science, as the most comprehensive logical form. This outline is a mere general doctrine on the formal features of a method and on the systematic way of thinking. On the other hand, many interesting distinctions can be found in Kant’s reflections on general methodology that cast light on Kant’s approach to logic, philosophy, and mathematics.

Building on his concept of the faculty of reason, Kant defines method in general as the unity of a whole of knowledge according to principles (or as “a procedure in accordance with principles,” B 883). By means of a method, knowledge obtains the form of a system and transforms into a science. Non-methodical thinking (without any order), which Kant calls “tumultuar,” serves in combination with a method the variety of knowledge (whereas method itself serves the unity). In a wider sense, Kant speaks of a fragmentary (rhapsodical) method, which consists only in a subjective and psychological connection of thinking (it does not establish a system, but only an aggregate of knowledge, not a science, but merely ordinary knowledge).

In further detail, Kant’s general methodology includes the doctrine of definition, division, and proof: mainly traditional material, redefined and given Kant’s own systematic form.

Let us first say that for Kant a concept is clear if we are conscious of its difference from other concepts. Also, a concept is distinct if its marks are clearly known. Now, definition is, according to Kant, a clear, distinct, complete, and precise (“reduced to a minimal number of marks”) presentation of a concept. Since all these requirements for a definition can be strictly fulfilled only in mathematics, Kant distinguishes various forms of clarification that only partially fulfill the above-mentioned requirements, such as exposition, which is clear and distinct, but need not be precise and complete (see XVI refl. 2921, 2925, 2951; B 755‒758). Division is the representation of a manifold under some concept and as interrelated, by means of mutual opposition, within the whole sphere of the concept (see XVI refl. 3025).

Proof provides certainty to a judgment by making distinct the connection of the judgment with its grounds (see XVI refl. 2719). Proofs can be distinguished with respect to the grade of certainty they provide. (1) A proof can be apodictic (strong), in a twofold way: as a demonstration (proof by means of the construction in an intuition, in concreto, as in mathematics) or as a discursive proof (by means of concepts, in abstracto, as in philosophy). In addition, a strong proof can be direct (ostensive), by means of the derivation of a judgment from its ground, or indirect (apagogical), by means of showing the untenability of a consequent of the judgment’s contradictory. In his philosophy, Kant focuses on the examples where indirect proofs are not applicable due to the possibility of dialectical illusion (contraries and subcontraries that only subjectively and deceptively appear to be contradictories, which is impossible in mathematics, B 819‒821). (2) Sometimes the grounds of proof give only incomplete certainty, for instance, empirical certainty (as in induction and analogy), probability, possibility (hypothesis), or merely apparent certainty (fallacious proof) (see Critique of Judgment §90 V:463).

Furthermore, Kant distinguishes the syllogistic and tabular methods. The syllogistic method derives knowledge by means of syllogisms. An already established systematic whole of knowledge is presented in its whole articulation (branching) by the tabular method (as is the case, for example, with Kant’s tables of judgments and categories; see, for example, Pölitz Logic XXIV-II:599, Dohna-Wundlacken Logic XXIV-II:80, Hechsel Logic LV-II:494). In addition, the division of the syllogistic method into the synthetic (progressive) and analytic (regressive) is important. The former proceeds from the principles to what is derived, from elements (the simple) to the composed, from reasons to what follows from them, whereas the latter proceeds the other way around, from what is given to its reasons, elements, and principles. (For the application of these two syllogistic methods in metaphysics, see, for instance, B 395 footnote.)

Finally, Kant comes to the following three general methodological principles (B 685‒688):

(1) the principle of “homogeneity of the manifold under higher genuses”;

(2) the principle of specification, that is, of the “variety of the homogeneous under lower species”;

(3) the principle of continuity of the transition to higher genuses and to lower species.

These principles correspond to the three interests of the faculty of reason: the interests of unity, manifold, and affinity. Again, all three principles are just three sides of one and the same, most general, principle of the systematic (“thoroughgoing”) unity of our knowledge (B 694).

The end result of the application of methodology in our knowledge is a “demonstrated doctrine,” which derives knowledge by means of apodictic proofs. It is accompanied by a corresponding discipline, which, by means of critique, prevents and corrects logical illusion and fallacies.

7. The Foundations of Logic

As stated by Kant, formal logic itself should be founded and built according to strict criteria, as a demonstrated doctrine. It should be a “strongly proven,” “exhaustively presented” system (B IX), with the “a priori insight” into the formal rules of thinking “through mere analysis of the actions of reason into their moments” (B 170). Since in formal logic “the understanding [Verstand] has to do with nothing further than itself and its own form” (B IX), formal logic should be grounded in the condition of the possibility of the understanding in the formal sense, and this condition is technically (operationally) defined by Kant as the unity of pure (original) self-consciousness (apperception) (B 131, compare XVI refl. 1579 p. 21: logical rules should be “proven from the reason [Vernunft]”). This unity is the fundamental, qualitative unity of the act of thinking (“I think”) as opposed to a given manifold (variety) of representations. The operational “one-many” opposition, as well as the further analysis of its general features and structure, should be appropriate as a foundational starting point from which a system of logic could be strongly derived. The basic step of the analysis of this fundamental unity is Kant’s distinction between the analytic and synthetic unity of self-consciousness (see, for example, B §§15‒19): at first, the act of thinking (“I think”) appears simply to accompany all our representations. It is the identity of my consciousness in all my representations, termed by Kant the analytic unity of self-consciousness. But this identity of consciousness would, for me (as a thinking subject), not be possible if I did not conjoin (synthesize) one representation with another and were not conscious of this synthesis. Thus, the analytic unity of self-consciousness is possible only under the condition of the synthetic unity of self-consciousness (B 133).

Kant further shows that the synthetic unity is objective, because it devises a concept of object with respect to which we synthesize representations into a unity. This unity is necessary and universally valid, that is, independent of any changeable, psychological state.

In Kant’s words: “the synthetic unity of apperception is the highest point to which one must affix all use of the understanding, even the whole logic and, after it, transcendental philosophy; indeed this faculty is the understanding itself” (B 134 footnote; see A 117 footnote and Opus postumum XXII:77). (For a formalization of Kant’s theory of apperception according to the first edition of the Critique of Pure Reason, see Achourioti and van Lambalgen 2011.)

Kant himself did not write a systematic presentation of formal logic, and the form and interpretation of Kant’s intended logical system are disputed among Kant scholars. Nevertheless, it is evident that each logical form is conceived by Kant as a type of unity of given representations, that this unity is an act of thinking and consciousness, and that each logical form is therefore essentially related to the “original” unity of self-consciousness. Some scholars, starting from the concept of the original unity of self-consciousness—that is, from the concept of understanding (as confronted with a given “manifold” of our representations)—proposed various lines of a reconstruction of Kant’s assumed completeness proof of his logical forms (or supplied such a proof on their own), in particular, of his table of judgments (see a classical work by Reich 1948, and, for example, Wolff 1995, Achourioti and van Lambalgen 2011, Kovač 2014). There are authors who offer arguments that the number and the species of the functions of our understanding are for Kant primitive facts, and can be at most indicated (Indizienbeweis) on the ground of the “functional unity” of a judgment (Brandt 1991; see a justification of Kant’s table of judgments in Krüger 1968).

8. Transcendental Logic (Philosophical Logic)

Besides formal logic, Kant considers a branch of philosophical logic that deals with the foundations of ontology and the rest of metaphysics and shows how objects are constituted in our knowledge by means of logical categorization. This branch of logic Kant names “transcendental logic.”

a. A Priori–A Posteriori; Analytic–Synthetic

Kant’s transcendental logic is based on two important distinctions, which exerted great influence in the ensuing history of logic and philosophy: the distinction between a priori and a posteriori knowledge, and the distinction between synthetic and analytic judgments (see B 1‒3).

Knowledge is a priori if it is possible independently of any experience. For instance, “Every change has its cause.” As the example shows, knowledge can be a priori, but about an empirical concept, like “change,” since given a change, we independently of any experience know that it should have a cause. A priori knowledge is pure if it has no empirical content, like, for example, mathematical propositions.

Knowledge is a posteriori (empirical) if it is possible only by means of experience. An example is “All bodies are heavy,” since we cannot know without experience (just from the concept of body) whether a body is heavy.

Kant gives two certain, mutually inseparable marks of a priori knowledge: (1) it is necessary and derived (if at all) only from necessary judgments; (2) it is strictly universal, with no exceptions possible. In distinction, a posteriori knowledge (1) permits that the state of affairs that is thought of can also be otherwise, and (2) it can possess at most assumed and comparative universality, with respect to the already perceived cases (as in induction) (B 3‒4).

Analytic and synthetic judgments are distinguished with respect to their content: a judgment is analytic if it adds nothing to the content of the knowledge given by the condition of the judgment; otherwise, it is synthetic.

That is, analytic judgments are merely explicative with respect to the content given by the condition of the judgment, while synthetic judgments are expansive with respect to the given content (see Prolegomena §2a IV:266, B 10‒11). Kant exemplifies this distinction on affirmative categorical judgments: such a judgment is analytic if its predicate does not contain anything that is not contained in the subject of the judgment; otherwise, the judgment is synthetic: its predicate adds to the content of the subject what is not already contained in it. An example of analytic judgments is “All bodies are extended” (“extended” is contained in the concept “body”); an example of synthetic judgments is the empirical judgment “All bodies are heavy” (“heavy” is not contained in the concept “body”).

We note that Kant’s formal logic should contain only analytic judgments, although its laws and principles refer to and hold for all judgments (analytic and synthetic) in general (see Reich 1948, 14‒15, 17). Conversely, analytic knowledge is based on formal logic, affirming (negating) only what should be affirmed (negated) on pain of contradiction. Let us remark that for Frege, unlike for Kant, this notion of analytic knowledge holds also for arithmetic.

b. Categories and the Empirical Domain

The subject matter of Kant’s transcendental logic is the pure forms of thinking insofar as they refer a priori to objects (B 80‒82). That is, it should be shown in which necessary and strictly universal ways our understanding determines objects, independently of, and prior to, all experience. In Kant’s technical language, this means that transcendental logic should contain synthetic judgments a priori.

According to Kant’s restriction on transcendental logic, objects can be given to us only in a sensible intuition. These objects can be conceived as making Kant’s only legitimate, empirical domain of theoretical knowledge. Hence, the task is to discover which pure forms of our thought (categories, “pure concepts of understanding”), and in which way, determine the empirically given objects. Kant obtains categories from his table of logical forms of judgment (“metaphysical deduction of categories,” B §10, see §§20, 26) because these forms, besides giving unity to a judgment, are also what unite a sensibly given manifold into a concept of an object. Technically expressed, a form of a judgment is a “function of unity” that can serve to synthesize a manifold of an intuition. The manifold is synthesized into a unity that is a concept of an object given in the intuition. To “deduce” categories, Kant introduces some small emendations into his table of the logical functions in judgments. These emendations are needed because what is focused on in transcendental logic is not merely the form of thought, but also the a priori content of thought. Thus, Kant extends the division of “moments” under the titles of quantity and quality of judgments by adding singular and infinite judgments, respectively (for instance, “Plato is a philosopher”; “The soul is non-mortal”). He also changes the term “particular judgment” for “plurative,” since the intended content is not an exception from totality (which is the logical form of a particular judgment), but plurality independently from totality. With respect to the content, Kant reverses the order under the title of quantity (Prolegomena §20 footnote IV:302).

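In correspondence with the 12 forms of judgments, Kant obtains 12 categories:

Quantity: unity, plurality, totality;

Quality: reality, negation, limitation;

Relation: substance, cause, community;

Modality: possibility, existence, necessity

(Prolegomena §21 IV:303).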

Sometimes, the order of the categories of quality is also changed: reality, limitation, full negation (Prolegomena §39 IV:325). In the Critique of Pure Reason, the table is more explicative. Under “Relation,” Kant lists:

(a) inherence and subsistence (substantia et accidens);

(b) causality and dependence (cause and effect);

(c) community (reciprocity between agent and patient).

Under “Modality,” he adds negative categories of impossibility, non-existence, and contingency (B 106). (For a possible reconstruction of a deduction of categories from the synthetic unity of self-consciousness as the first principle, see Schulting 2019.)

Kant further shows that all objects of a sensible intuition in general (be it in space and time or not) presuppose a synthetic unity (in self-consciousness) of a manifold according to categories. On the ground of this premise, he also shows that all objects of our experience, too, stand under categories. Briefly, in the proof of this result, Kant shows, first, that each of our empirical intuitions presupposes a synthetic unity according to which space and time are determined in this intuition. We then abstract from the space-time form of our empirical intuition, isolate just the synthetic unity, and, by subsumption under the first premise (on intuitions in general), conclude that this synthetic unity is based on the categories, which are applicable to our space-time intuition (“transcendental deduction of categories,” B §§20, 21, 26, B 168‒169).

In addition, transcendental logic comprises a theory of judgments a priori and of their principles. These principles determine how categories, which are pure concepts, are applied to objects given in our intuition and make our knowledge of objects possible. For Kant, there is no way to come to a theoretical knowledge of objects other than by means of experience, which includes, as its formal side, categories as well as space and time. Accordingly, there are a priori judgments about how categories can have objective validity in application to what can be given in our space-time intuition. As Kant puts it: the conditions (including categories) of the possibility of experience are at the same time the conditions of the possibility of the objects of experience, and thus have objective validity (B 197).

Kant systematically elaborates the principles of the pure faculty of understanding in consonance with his table of judgments. According to these principles, different moments that constitute our experience (1. intuition; 2. sensation; 3. perception of permanence, change, and simultaneity; 4. formal and material conditions in general) are subsumed under corresponding categories (1. extensive magnitude, 2. intensive magnitude, 3. categories of relation, 4. modal categories).

Kant emphasizes that concepts themselves cannot be conceived as objects (noumena) in the same (empirical) domain of objects (appearances, phaenomena) to which they as concepts apply. That is, in modern terms, we can speak of noumena only within a second-order regimentation of domains, with the lower (empirical) domain as ontologically preferred.

c. Transcendental Ideas

There are further concepts to which we are led, not by our faculty of understanding and the forms of judgment, but by our faculty of reason and its forms of inference. In distinction to categories, which are applicable to the domain of our experience, the concepts of the faculty of reason do not have their corresponding objects given in our intuition; their correspondents can only be purported objects “in themselves” (Dinge an sich), which transcend all our experience. A concept of the “unconditioned” (“absolute,” referring to the totality of conditions) for a given “conditioned” thing or state is termed by Kant a transcendental idea. Transcendental ideas, although going beyond our experience, have a regulative role to direct and lead our empirical thought towards the paradigm of the unconditioned synthetic unity of knowledge. According to the three species of inference of reason (categorical, hypothetical, and disjunctive), there are three classes of transcendental ideas (B 391, 438‒443):

(1) the unconditioned unity of the subject (the idea of the “thinking self”) that is not a predicate of any further subject;

(2) the unconditioned unity of the series of conditions of appearance (the idea of “world”), which further divides into four ideas in correspondence with the four classes of categories:

(a) the unconditioned completeness of the composition of the whole of appearances,

(b) the unconditioned completeness of the division of a given whole in appearance,

(c) the unconditioned completeness of the origin of an appearance,

(d) the unconditioned completeness of the dependence of appearances regarding their existence;

(3) the unconditioned unity of the ground of all objects of thinking, in accordance with the principle of complete determination of an object regarding each possible predicate (the idea of “being of all beings”).

These transcendental ideas are in a natural way connected with a dialectic of our faculty of reason, where reason aims towards the knowledge of empirically unverifiable objects (B 397‒398).

(1) Through transcendental paralogisms, we come to think of the formal subject of our thought as of a substance.

(2) Through the antinomies of pure reason, the following opposites (seeming contradictions) remain undecided:

(a) the world has a beginning – the world is infinite;

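(b) each composed thing consists of simple parts – there is nothing simple in things (they are infinitely divisible);

(c) there is causality through freedom – everything happens solely according to the laws of nature;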

(d) there is an absolutely necessary being – everything is contingent.

(3) The ideal of pure reason leads us to found the principle of complete determination on the idea of the most perfect being. In addition, Kant assumes here that “existence” is not a real predicate—that is, it does not contribute to the determination of a thing.

Kant insists on separating and excluding (1) the formal logical subject (“I think”) of all our thought from the empirical objects (substances) about which the subject can think; (2) the domain of experience from the members of this domain; and (3) the totality of concepts applicable to the domain from these concepts themselves. Thus, Kant’s transcendental dialectic includes and deals with logical problems connected with the possible disregarding of what we could today call type-theoretical distinctions and the distinction between a theory and its metatheory.

Let us add a methodological remark about the relationship between mathematical and transcendental logical knowledge. The rigor of mathematical evidence (intuitive certainty, B 762) is based, according to Kant, on the possibility of constructing mathematical concepts in intuition. This construction can be ostensive (geometric) or “symbolic” (“characteristic”, B 745, 762, as in arithmetic and algebra). However, as Kant points out, this is not available for transcendental logic, where knowledge should also be apodictic and a priori, but confined to the abstract, conceptual “exposition” (without a construction in intuition, albeit with an application of concepts to intuition). For this reason, definitions and demonstrations in the strictest sense are possible in mathematics, but not in transcendental logic (B 758‒759, 762‒763).

9. Influences and Heritage

Although Kant’s logic, if taken literally, is in form and content largely traditional as well as significantly dependent on the science of his time, it offered new essential and foundational perspectives that are deeply (and often unknowingly) built into modern logic.

Kant required a formal, though not mathematical, rigor in logic, purifying it of psychological and anthropological admixtures. This rigor was required in two ways: (a) in the sense of functionally defined logical forms, and (b) in the sense of a systematic, scientific form of logic. Kant’s transcendental logic is characterized by the strict distinction of formal logical and metaphysical aspects of concepts, as well as by defined standards of the justification of concepts and of their application in an empirical model of knowledge. Nevertheless, Kant strictly separated mathematical and philosophical rigor. It is in the aspect of the possibilities of the “symbolic construction” of concepts that modern logic has made great advances in comparison to Kant’s logic.

Let us give some examples of Kant’s influence on the subsequent development of logic and philosophy.

Kant’s table of judgments influenced a large part of traditional or reformed traditional logic deep into the 20th century. Besides, although Frege criticized Kant’s table of judgment as contentual and grammatical, in Frege’s distinction between “judging” and the content of judgment, Kant’s distinction between modality and the logical content of the judgment can be traced. Kant’s restriction of the importance of categorical judgments, with an emphasis also on the logical relation between judgments, announced the future development of truth-functional propositional logic. Kant’s criterion of sensible intuition for the givenness of objects inspired Hilbert’s finitistic formalism with “concrete signs” and their shapes as the immediately intuitively given of his metamathematics. Kant’s foundational theory of the unity of apperception (in application to time) inspired the emergence of intuitionism (Brouwer). Kant’s undecidability of geometry by analytic means, properly corrected and reinterpreted, anticipates Gödel’s incompleteness results.

Kant’s distinctions of the analytic and the synthetic, and of the a priori and the a posteriori, had a deep impact on philosophical and mathematical logic, and have delineated an important part of philosophical discussions after Kant. Frege especially praised Kant’s analytic-synthetic distinction, despite his departure from Kant, according to whom arithmetic was, like geometry, synthetic. The analytic-synthetic distinction was a crucial subject of discussion and revision, for example, in Carnap’s, Gödel’s, Quine’s, and Kripke’s philosophies of logic, language, and knowledge.

Kant’s duality of the conceptual system and empirical model, with differentiated logical (and ontological) orders of concepts and their (intended) corresponding objects, already leads into the area of solving logical antinomies and of incompleteness (see Tiles 2004). With his conception of successively upgrading logical laws (from the law of contradiction, to the law of sufficient reason, to the law of excluded middle), Kant implicitly offered a general picture of possible logics that exceeds classical logic—as far as it was possible with the tools available to him. His logical foundations of philosophy can still inspire modern logical-philosophical investigations.

10. References and Further Reading

a. Primary Sources

Kant, Immanuel. 1910–. Kant’s gesammelte Schriften. Königlich Preussische Akademie der Wissenschaften (ed.). Berlin: Reimer, Berlin and Leipzig: de Gruyter. Also Kants Werke I–IX, Berlin: de Gruyter, 1968 (Anmerkungen, 2 vols., Berlin: de Gruyter, 1977).
Cited by volume number (I, II, etc.); Kritik der reinen Vernunft, 1st ed. = A, 2nd ed. = B.
Kant, Immanuel. 1998. Critique of Pure Reason. Cambridge, UK: Cambridge University Press. Transl. and ed. by Paul Guyer and Allen W. Wood.
Kant, Immanuel. 1992. Lectures on Logic. Cambridge, UK: Cambridge University Press. Transl. and ed. by J. Michael Young.
Kant, Immanuel. 1998. Logik-Vorlesungen: Unveröffentlichte Nachschriften I‒II. Hamburg: Meiner. Ed. by T. Pinder.
Cited as LV.
Kant, Immanuel. 2004. Prolegomena to Any Future Metaphysics. Cambridge, UK: Cambridge University Press. Transl. and ed. by Gary Hatfield.

b. Secondary Sources

Achourioti, Theodora and van Lambalgen, Michiel. 2011. “A Formalization of Kant’s Transcendental Logic.” The Review of Symbolic Logic. 4: 254–289.
Béziau, Jean-Yves. 2008. “What is ʻFormal Logicʼ?” in Proceedings of the XXII Congress of Philosophy, Myung-Hyung-Lee (ed.), Seoul: Korean Philosophical Association, 13: 9–22.
Brandt, Reinhard. 1991. Die Urteilstafel. Kritik der reinen Vernunft A 67‒76; B 92‒101. Hamburg: Meiner.
Capozzi, Mirella and Roncaglia, Gino. 2009. “Logic and Philosophy of Logic from Humanism to Kant” in Leila Haaparanta (ed.), The Development of Modern Logic. New York: Oxford University Press, pp. 78–158.
Conrad, Elfriede. 1994. Kants Vorlesungen als neuer Schlüssel zur Architektonik der Kritik der reinen Vernunft. Stuttgart-Bad Cannstatt: Frommann-Holzboog.
Friedman, Michael. 1992. Kant and the Exact Sciences. Cambridge (Ma), London: Harvard University Press.
Kneale, William and Kneale, Martha. 1991. The Development of Logic. Oxford: Oxford University Press. First published 1962.
Kovač, Srećko. 2008. “In What Sense is Kantian Principle of Contradiction Non-classical?” Logic and Logical Philosophy. 17: 251–274.
Kovač, Srećko. 2014. “Forms of Judgment as a Link between Mind and the Concepts of Substance and Cause” in Substantiality and Causality, Mirosław Szatkowski and Marek Rosiak (eds.), Boston, Berlin, Munich: de Gruyter, pp. 51–66.
Krüger, Lorenz. 1968. “Wollte Kant die Vollständigkeit seiner Urteilstafel beweisen?” Kant-Studien. 59: 333–356.
Lapointe, Sandra (ed.). 2019. Logic from Kant to Russell: Laying the Foundations for Analytic Philosophy. New York, London: Routledge.
Longuenesse, Beatrice. 1998. Kant and the Capacity to Judge: Sensibility and Discursivity in the Transcendental Analytic of the Critique of Pure Reason. Princeton: Princeton University Press. Transl. by Charles T. Wolfe.
Loparić, Željko. 1990. “The Logical Structure of the First Antinomy.” Kant-Studien. 81: 280–303.
Lu-Adler, Huaping. 2018. Kant and the Science of Logic: A Historical and Philosophical Reconstruction. New York: Oxford University Press.
MacFarlane, John. 2002. “Frege, Kant, and the Logic in Logicism.” The Philosophical Review. 111: 25–65.
Mosser, Kurt. 2008. Necessity and Possibility: The Logical Strategy of Kant’s Critique of Pure Reason. Washington, DC: Catholic University of America Press.
Newton, Alexandra. 2019. “Kant’s Logic of Judgment” in The Act and Object of Judgment, Brian Ball and Christoph Schuringa (eds.), New York, London: Routledge, pp. 66–90.
Reich, Klaus. 1948. Die Vollständigkeit der kantischen Urteilstafel. 2nd ed. Berlin: Schoetz. (1st ed. 1932).
English: The Completeness of Kant’s Table of Judgments, transl. by Jane Kneller and Michael Losonsky, Stanford University Press, 1992.
Scholz, Heinrich. 1959. Abriß der Geschichte der Logik. Freiburg, München: Alber. (1st ed. 1931).
Schulthess, Peter. 1981. Relation und Funktion: Eine systematische und entwicklungsgeschichtliche Untersuchung zur theoretischen Philosophie Kants. Berlin, New York: de Gruyter.
Schulting, Dennis. 2019. Kant’s Deduction from Apperception: An Essay on the Transcendental Deduction of Categories. 2nd revised ed. Berlin, Boston: de Gruyter.
Stuhlmann-Laeisz, Rainer. 1976. Kants Logik: Eine Interpretation auf der Grundlage von Vorlesungen, veröffentlichten Werken und Nachlaß. Berlin, New York: de Gruyter.
Tiles, Mary. 2004. “Kant: From General to Transcendental Logic” in Handbook of the History of Logic, vol. 3, Dov M. Gabbay and John Woods (eds.). Amsterdam etc: Elsevier, pp. 85–130.
Tolley, Christian. 2012. “The Generality of Kant’s Transcendental Logic.” Journal of the History of Philosophy. 50: 417‒446.
Tonelli, Giorgio. 1966. “Die Voraussetzungen zur Kantischen Urteilstafel in der Logik des 18. Jahrhunderts” in Kritik und Metaphysik, Friedrich Kaulbach and Joachim Ritter (eds). Berlin: de Gruyter, pp. 134–158.
Tonelli, Giorgio. 1994. Kant’s Critique of Pure Reason within the Tradition of Modern Logic: A Commentary on its History. Hildesheim, Zürich, New York: Olms.
Wolff, Michael. 1995. Die Vollständigkeit der kantischen Urteilstafel: Mit einem Essay über Frege’s Begriffsschrift. Frankfurt a. M.: Klostermann.
Wuchterl, Kurt. 1958. Die Theorie der formalen Logik bei Kant und in der Logistik. Inaugural-Dissertation, Ruprecht-Karl-Universität zu Heidelberg.

Author Information

Srećko Kovač
Email: skovac@ifzg.hr
Institute of Philosophy, Zagreb
Croatia

Bernardino Telesio (1509—1588)

Dubbed “the first of the new philosophers” by Francis Bacon in 1613, Bernardino Telesio was one of the most eminent thinkers of Renaissance Italy, along with figures such as Pico, Pomponazzi, Cardano, Patrizi, Bruno, Doni, and Campanella.

The young Telesio spent the early decades of his life under the guidance of his uncle Antonio (1482-1534), a fine humanist who was determined to go beyond the strict disciplinary division between literary and philosophical texts. Before the printing of the first edition of his principal work, De natura iuxta propria principia (On the Nature of Things According to their Own Principles) (Rome, 1565), Telesio assimilated the basics of ancient scientific thought (both Greek and Latin), as well as those of Plato’s and Aristotle’s Scholastic commentators. In the second half of the 16th century, he began to be recognized as an adversary of Aristotle’s thought, insofar as he upheld a conception of man and nature that attempted to replace the principles of Aristotle’s natural philosophy. His starting point was the definition of a new role for the notion of sense perception in animal cognition. Using the Stoic notion of spiritus (translating the Greek word pneuma), he criticized Aristotle’s hylomorphism. As a fiery substance and an internal principle of motion, spiritus is the principle of sensitivity: by way of heat, it pervades the entire cosmos, so that all beings are capable of sensation. In addition to grounding Telesio’s epistemology, then, the notion of spiritus lies at the core of his natural philosophy. During the time span extending from 1565 to 1588, he overturned the traditional conception of the relationship between sensus and intellectus, as championed by the Scholastic followers of Aristotle. Telesio denied that the human brain possesses a faculty able to grasp the forms or essentiae of natural beings from simple passive sensible data of experience. On the contrary, sense perception has an active role: it is the first form of understanding the natural world. It is by the “way of senses” that mental representations of natural things are selected and shaped.
This process happens in strict cooperation with the corporeal principle of self-organization of the material soul. In human beings as well as in animals, the brain is the main source of this principle, which governs the cognitive process without the support of a superior immaterial agent. This active form of “sentience” constitutes the primary causal connection between the brain and the external world. Founded on a reassessment of the categories of sense perception, Telesio’s philosophy of mind led to an empiricist approach to the study of natural phenomena.

Life and Times
Psychology and Theory of Knowledge
Cosmology
Influence and Legacy
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Life and Times

Bernardino Telesio was born in Cosenza (Northern Calabria) to Giovanni Battista, a noble man of letters, and Vincenza Garofalo, the daughter of a lawyer. Bernardino was the first-born of eight sons, and as a child was sent to his uncle Antonio (1482-1534) to be educated. In 1517, they went to the Duchy of Milan, where the young Telesio became acquainted with the most illustrious pupils of his uncle. He also met some eminent men of letters, like Matteo Bandello (1485-1561), who in his Novelle (1554) would recall Antonio’s knack for entertaining the members of the intellectual circles led by such gentlewomen as Camilla Scarampa Guidoboni (ca.1454-ca.1518) and Ippolita Sforza Bentivoglio (1481-ca.1520).

In 1523 Bernardino and Antonio moved to Rome, entering the intellectual milieu of the papal court and of the Vatican library, which was animated by philosophers and humanists such as Paolo Giovio (1483-1552), Marco Girolamo Vida (1485-1566), Marcello Cervini (1501-1555), Coriolano Martirano (1503-1558), and Giovanni Antonio Pantusa (1501-1562). Bernardino left the Studium Urbis in 1527, soon after the “sack of Rome”. Then he moved to Padua, where his uncle had been appointed professor of Latin by the municipality of Venice (October 17, 1527).

During his early education, Bernardino was deeply influenced by his uncle. Antonio was a fine humanist, whose works circulated widely across Europe. For example, Antonio’s De coloribus libellus (Venice, 1528) rose to great fame. Following the Venetian first edition, at least ten editions of the work were released in Paris by scholar-printers such as Chrestien Wechel, Jacob Gazel and Robert Estienne (Stephanus); and a further five appeared in Basel. In particular, the Basel reprints were released by such renowned humanists as Hieronymus Froben and Johannes Herbst Oporinus. Thus, the young Bernardino could benefit from the mastery of some of the finest Italian connoisseurs of ancient Greek and Latin literature, soon becoming himself an expert reader of classic authors such as Virgil, Cicero, Seneca, Pliny, and Lucretius.

It is important to note that the materialist and empiricist approach Telesio displayed in his early works did not emerge at once; its main source was an open-minded reading of the texts written by the early commentators on Aristotle’s works, such as Alexander of Aphrodisias, recently revisited by a new generation of scholars such as Pietro Pomponazzi. At the University of Padua, the young Telesio could learn the new critical approach to Aristotle’s works. During the time spent in Padua and Venice, he did not gain the title of magister medicinae et artium, yet he started to develop a serious interest in mathematics, medicine, and natural philosophy.

At the end of the Venetian period (1527-1529), Telesio came back to Calabria. After some time spent in Naples (probably from 1532 or 1533 up to the spring of 1534, when his uncle passed away), Bernardino moved to Rome (1534-1535), living in the papal environment of Paolo III Farnese. Then, between 1536 and 1538 he spent a fruitful period of study at the Benedictine monastery of Seminara in the South of Calabria. There he began to develop his arsenal of anti-Aristotelian arguments, partly taken from Presocratic, Hippocratic, Epicurean and Stoic ideas. From there he went back to Rome, meeting some illustrious members of the papal court. Benedetto Varchi, Annibal Caro, Niccolò Ardinghelli, Ippolito Capilupi, Alessandro Farnese, Gasparo Contarini, Niccolò Gaddi, Giovanni Della Casa, and the Orsini brothers soon became acquainted with the philosopher of Cosenza. The significant number of letters written by these figures in the 1540s allows us to follow Bernardino’s movements between Rome, Naples, and Padua (Sergio 2014; Simonetta 2015). By the early 1540s, Telesio was already renowned as an anti-Aristotelian philosopher.

It was during that time that Telesio started to study Vesalius’s program of reform of the ancient ars medendi, including both Galen’s legacy and the Corpus Hippocraticum. Between 1541 and 1542, he spent some time in Padua, during which he met the anatomist and physician Matteo Realdo Colombo (1516-1559). Telesio’s interest in the nova ars medendi, and, more specifically, in the physiology of sense perception, is attested in a work “Contra Galenum” entitled Quod animal universum ab unica Animae substantia gubernatur, written in the 1560s, and posthumously edited and published by Antonio Persio (1542-1612) in Varii de naturalibus rebus libelli (Telesio 1590, [139-227]).

In the late 1540s he probably lived in the Neapolitan household of Alfonso Carafa (d. 1581), III Duke of Nocera, and in the early 1550s he came back to Cosenza. There, in 1553, he married Diana Sersale (d. 1561), a noblewoman belonging to the municipality of Cosenza. He soon became a leading figure in the city, laying foundations for the future creation of the “Accademia Cosentina” (Lupi 2011). In Cosenza, Telesio had such distinguished pupils as Sertorio Quattromani (1541-1603) and Iacopo di Gaeta (fl. 1550-1600); the philosopher and physician Agostino Doni (fl. 1545-1583); the orientalist Giovanni Battista Vecchietti (1552-1619); the future mayor of Cosenza, Giulio Cavalcanti (1591-1600); and Telesio’s first biographer, Giovan Paolo d’Aquino (d. 1612). In 1554 Telesio was elected mayor of Cosenza. Throughout the 1550s, he worked to improve the initial versions of his works, and, soon after the death of his wife (1561), he probably spent a second period of study at the Benedictine abbey of Santa Maria del Corazzo.

In the early 1560s Telesio became more familiar with the academic environment of Naples, where the works of Vesalius, Colombo, Cardano, Eustachius, Cesalpino, Fracastoro, and Jean Fernel featured prominently in the study of natural philosophy and medicine. There Telesio probably read the works of Giovanni Argenterio (1513-1572), professor of medicine in Naples from 1555 to 1560, one of the major contributors to the diffusion of new medical ideas in Southern Italy. Like Girolamo Fracastoro, Argenterio criticized the Galenic theories of contagion and diseases, contributing to the slow downfall of Galen’s authority. He also probably read the work of Giovanni Filippo Ingrassia (1510-1580), a Sicilian physician who received his scientific education at Padua—studying with Vesalius, Colombo, Eustachius, and Fracastoro—and who was also critical of Galen.

In 1563 Telesio went to Brescia, paying a visit to the Aristotelian Vincenzo Maggi (1498-1564). On that occasion, he submitted to Maggi the manuscript of the first edition of De natura iuxta propria principia. In 1565, Telesio’s masterpiece was published in Rome by the papal printer Antonio Blado. In the same period, he completed the draft of the most important of his medical writings—the aforementioned Quod animal universum ab unica animae substantia gubernatur. In the next year, the Neapolitan printer Matteo Cancer released a short treatise, Ad Felicem Moimonam iris, about the phenomenon of the rainbow (Telesio 1566 and 2011). These latter two works were an early testimony to the wide range of Telesio’s philosophical interests, as much as to the originality of his method in the quest for the causes of natural phenomena. In the same year as the publication of De natura, one of Telesio’s brothers, Tommaso (1523-1569), accepted the position of Archbishop of Cosenza, a title initially offered to Bernardino by Pius IV.

Toward the end of the 1560s and the beginning of the 1570s, Telesio’s philosophical reputation was becoming more and more widespread. In 1567 the humanist Giano Pelusio wrote a short poem, Ad Bernardinum Thylesium Philosophum, where the philosopher of Cosenza was compared to Pythagoras; during the next few years, Telesio was assisted by his disciple Antonio Persio in the publication of the second edition of De rerum natura (1570). In the same year three pamphlets were printed: De colorum generatione, De his quae in aere fiunt et de terraemotibus, and De mari liber unicus (Telesio 1981 and 2013). They were printed in Naples, where Telesio lived under the patronage of Alfonso and Ferrante Carafa (d. 1593).

Telesio’s fame grew in the 1570s: in 1572, Francesco Patrizi wrote an insightful review of Telesio’s De rerum natura (Objectiones), to which Telesio replied in a letter, Solutiones Tylesij; meanwhile, Antonio Persio wrote a reply entitled Apologia pro Telesio adversus Franciscum Patritium. Patrizi’s letter offered Telesio an occasion to point out some arguments of his cosmology and psychology: a) the universality of sensus rather than of the soul (anima, spiritus); b) the fiery and physical nature of the heavens, wherein the Sun is considered the source of motion as well as of the life of celestial bodies; c) the eternity of “celestial spheres” replacing the Platonic idea of creatio ex nihilo; d) the primacy of sense perception over the intellect in the cognitive process of animal understanding. Telesio’s understanding of nature championed the notion of universal sensibility (pansensism) over that of universal animation of things (panpsychism). What governs nature itself are just internal natural principles: there is no need for a divine intelligence in order to explain its inner processes and the variety of natural phenomena.

In the same years in which the correspondence between Telesio and Patrizi was published, the Florentine Francesco Martelli translated Telesio’s De rerum natura (Delle cose naturali libri due) and the treatises De mari and De his, quae in aere fiunt. Around the same period, the orientalist Giovan Battista Vecchietti made a brief journey to Pisa, defending Telesio’s doctrines against the Aristotelians of the Studium. The temper of that young Telesian caught the attention of the Duke of Tuscany, Cosimo de’ Medici. Between 1575 and 1576 Antonio Persio published three works: Liber Novarum positionum (1575), Disputationes Libri novarum positionum, and Trattato dell’ingegno dell’huomo (1576). By 1577, Patrizi had completed L’amorosa filosofia, a dialogue wherein the philosopher of Cherso mentions his acquaintance with Bernardino Telesio. In the late 1570s Telesio came back to Naples, and, at that time, the humanist Bonifacio Vannozzi, the rector of Pisa university, wrote a letter to Telesio, defining him as “our Socrates” (Artese 1998, 191).

Living in Naples in the first half of the 1580s, Telesio immersed himself in the production of the third edition of De rerum natura, printed in 1586. He dedicated the work to Ferrante Carafa, IV Duke of Nocera. In that edition, Telesio unpacked in nine books the earlier and later topics of his thought, from cosmology to psychology and moral philosophy. Meanwhile, his thought came to be renowned in England. During his Grand Tour of Italy (1581-1582), the mathematician Henry Savile bought a copy of the second edition of De rerum natura. Just a few decades later, Telesio’s works had spread in the cultural circles of early Jacobean England. James I Stuart and Francis Bacon owned copies of Telesio’s works, as did churchmen and royal physicians like John Rainolds ([Reynolds], 1549-1607; a translator of King James’s Bible) and Sir William Paddy (1554-1634, a fellow of the Royal College of Physicians of London), and aristocrats like Sir Henry Percy (1564-1632), IX Earl of Northumberland. Though with different views and motivations, they all read Telesio’s writings. Moreover, his thought attracted the attention of the most eminent men of the “Northumberland circle”: Sir Walter Raleigh (1552-1618), Walter Warner (ca.1557-1643), Thomas Harriot (1560-1621), and Nicholas Hill.

In 1587, a Neapolitan lawyer, Giacomo Antonio Marta (1559-1628), wrote a pamphlet against Telesio, titled Pugnaculum Aristotelis contra principia Bernardini Telesii (1587). A few years later, the young Campanella, in his Philosophia sensibus demonstrata (Naples, 1591)—the most remarkable manifesto of Telesian philosophy—launched a fierce attack against Marta’s book. In the first pages of his work, Campanella made a summary of Telesio’s epistemology, pointing out the need to clarify the first grounds of the new method before commencing the inquiry into the main issues of natural philosophy. By means of Telesio, Campanella contributed—in his own way—to the development of the early modern debate about the scientific method (Firpo 1949, 182-183). The empiricist approach adopted by Telesio and Campanella did not yet have the complexity and articulation of Galileo’s method, composed of the “manifest experiences” (sensate esperienze) and “necessary demonstrations” (certe dimostrazioni); nonetheless, a number of early 17th-century Italian writers did not hesitate to label the Calabrian thinkers as just as dangerous as the novatores of the Galenic circle.

On July 23rd, 1587, Telesio came back to Cosenza, and wrote his will, most likely because of ill health. He died a year later in October. Among the participants of the burial ceremony were Sertorio Quattromani, Giovan Paolo d’Aquino, the members of the “Accademia Cosentina” (Iacopo di Gaeta, Vincenzo Bombini, Giulio Cavalcanti and others), and the young Tommaso Campanella, at that time a friar of Ordo Praedicatorum, hosted in the convent of San Domenico in Cosenza. For the occasion, Campanella composed some verses dedicated to Bernardino (Al Telesio Cosentino, in Campanella 1622, n° 68).

2. Psychology and Theory of Knowledge

Telesio’s natural philosophy is based on a new methodological approach to the study of nature. This is exactly what he points out in the first pages of his De natura (1565)—a work rightly characterized by some scholars as “Telesio’s masterpiece”. Such an approach does not depend solely on his alleged modern “empiricism”. The main elements of Telesio’s “modernity” lie in his novel approach to psychology, animal physiology, and theory of science. On the one hand, Telesio offers a number of arguments for the similarity of animals and humans: for example, both animals and humans are able to perceive their own passions by way of the senses. On the other hand, the spiritus of humans is “purer” and “more abundant” than that of animals (Telesio 1586, VIII.14-15). Therefore, humans are better equipped than animals in the art of reasoning.

Telesio’s principal aim was to inquire into the causes of natural phenomena without viewing those natural phenomena through the lenses of Platonic and Aristotelian metaphysics. As he states in the incipit of book I of De rerum natura (1570), “the construction of the world and the nature of the bodies contained in it should be not inspected by reason, as the Ancients did, but must be perceived by sense, and grasped from things themselves”. He neither belittled nor underestimated the role of reason. Nonetheless, he prioritized the direct evidence that comes from the senses. The beginning of his natural philosophy lies in the experience deriving from sense perception, sensus being a cognitive power closer to natural things than reason itself. As Aristotle himself asserted, “there is nothing in the intellect that is not first in sense perception”. The first moves of Telesio’s thought were to develop this principle of classical empiricism in a new and more coherent way.

The perception of a physical object establishes a causal relation to the external world, and the first task of a scientist is to investigate the nature of that relation. In opposition to Aristotle, Telesio affirms that the ability of a sentient creature to reach knowledge of natural things is not “actualized” by the “form” of the perceived thing. He does not believe that all acts of sense perception simply mirror the natural beings of the external world. Rather, he thinks that knowledge of the world depends on the sensible data perceived by the sentient creature. That kind of affection (perceptio passionis) is the very starting point to reach knowledge of the world, as sense perception is the basic and most important property of all animals, while the act of understanding is nothing more than reckoning and recalling similarities and differences between previous sensations. In that perspective, Telesio abandoned the traditional doctrine of species, denying that natural things are the result of the combination of matter and form. According to him, the Peripatetic answer to the problem of human knowledge left unsolved the relation between causes and effects in the cognitive process. Once more, it is the concept of spiritus that lies at the core of Telesio’s psychology. As an imperceptible, thin, fiery body, it constitutes our sensible soul (Telesio 1586, VII.4, and V.3); as anima sentiens of human bodies, it is present mainly in the nervous system, in order to guarantee the unity of perception; consequently, it is the bearer of sensibility and movement (Telesio 1586, V.5, V.10, and V.12). In other words, Telesio provides a theory of mind where the spirit produces actual internal representations in response to external stimuli—which are considered as passions—and to internal stimuli, which are the affections or the motions of spirit itself. Mental representations, then, are simple reconstructions of the world.

Telesio held that the material soul grasps natural beings by means of a corporeal, physical interaction with them. Consequently, scientific knowledge is not the result of a hierarchical process, nor does it consist in the gradual abstraction of similitudes or species (Telesio 1586, VIII.15-28). In some way, Telesio’s psychology anticipated the empiricist approach of the 17th-century critics of Descartes’s doctrine of the cogito: in order to be aware of the knowledge of natural things, humans do not need the intellectual self-consciousness of the sensible data coming from sense perception. Further, the editing process of sense perception can be improved only by sense perception, supported—when necessary—by the corporeal principle of spiritus.

Reasoning is nothing but the outcome of the self-organization of the “material soul” (spiritus) in cooperation with the “ways” or “means” of sense perception and the principal functions (memory, imagination) of the brain activated by the same principle of material soul. In order to follow their natural aim, that is, self-preservation (conservatio sui), both humans and animals are ruled by the opposed sensations of pleasure and pain, with the key function of the spirit at the core of bodily functions (Telesio 1586, VII.3; IX.2). In his early writings, Telesio did not directly challenge the theory of intelligible species of the Scholastic tradition; however, his opposition to that theory is evident in the basic principles of his psychology. They may be unfolded in five topics:

(a) intellectual cognition is based on a perceived similitude, which does not consist in a mental representation of an external object resulting from the encounter between the active intellect and passive sensation; rather, sense perception is an active operation of the spiritus, the material soul (Telesio 1586, V.34-47);

(b) the sensible data resulting from a perceptive experience has a cognitive role (as Campanella and Hobbes later explained, sensus is already a kind of iudicium; whereas understanding or imagination are nothing but “decaying sense”);

(c) the material agents involved in the cognitive process, from the “ways of sense” to the spiritus (“anima sentiens”), are merely corporeal (Telesio 1586, V.3-5, 10-12);

(d) the spirit is able to perceive because it can be subject to sensible, bodily alterations;

(e) since spiritus is the bearer of motion, a human soul does move in virtue of its own nature; what is at stake is indeed the concept of motion, in some way close to Lucretius’s atomism, even though Telesio himself does not show any intention to claim such a linkage. At the same time, Telesio’s theory excludes any mechanical approach to the physiology of sense perception: motion of bodies has to be explained through the physics of contact, and yet his theory of motion is still far from the kind of explanation that modern authors such as Gassendi, Descartes, and Hobbes later tried to provide.

Telesio’s naturalistic program, then, took sensation as a material process involving only material agents: (corporeal and) sensible objects, the “ways of sense” and the spiritus. Stating that an animal is ruled by one substance residing in its brain, Telesio abandoned Aristotle’s psychology and his threefold partition of the soul (intellective, sensitive, vegetative), as well as Galen’s partition of “pneumata” (animal, vital, natural) and his theory of “temperaments” (Telesio 1570, II.15). According to the Galenic system, the pneuma as a “transmitter substance” had a tripartite structure: a) the spiritus naturalis (pneuma physei) or vegetative, having its seat in the liver, and responsible for digestion, metabolism, production of blood and semen; b) the spiritus vitalis, localized in the heart, active in all kinds of affections and motions; c) the spiritus animalis (psychei), situated in the brain, and responsible for the control and organization of the activity of the soul and of the intellect. Now, in the new system, both psychology and physiology, psyche and physis, were unified in one organic theory. Furthermore, the conception of the spiritus as a principle generated from the semen and diffused through the entire nervous system echoed some lines of Lucretius’s On the Nature of Things. Finally, locating the seat of the spirit in the brain, Telesio rejected Aristotle’s biological cardiocentrism (Telesio 1586, V.27).

In the 1586 edition of De rerum natura, Telesio introduced the notion of a divine soul (a deo immissa) to go along with the “material soul” (e semine educta) of his earlier thought (Telesio 1586, V.2-3). The idea of a divine soul capable of surviving the natural dissolution of the body is a conceptual device Telesio used with a twofold purpose. On the one hand, Telesio could not deny that the concept of soul was a theological matter: Sacred Scripture teaches us that humans have a divine origin, infused by the Creator himself. Therefore, it would be unjust if God did not give to humans the prospect of an afterlife, as a remuneration for virtue and for vice experienced during the “mundane” lifetime (in that passage, it is evident that the source of Telesio’s argument is Book XIV of Marsilio Ficino’s Theologia Platonica de immortalitate animarum). On the other hand, Telesio remained faithful to the methodology of his early works: in 1586 he just pointed out the existence of a strict separation between the specific subjects of the philosopher’s and the theologian’s work. As a forerunner of the modern scientist, Telesio thought that the role of the philosopher was solely to inquire into the secrets of nature “according to its own principles”.

Telesio goes on to reject Aristotle’s definition of the soul as forma corporis, that is, as the form and entelechy of an organic body (Aristotle, De anima II,1). According to Leen Spruit (1995), what matters here is the main topic of the formal mediation of sensible reality in intellectual knowledge. As is well known, Aristotle regarded the mind as capable of grasping forms detached from matter (materia sensibilis). Aristotelian medieval commentators grounded that theory in the mediating role of representational forms called “intelligible species”.

According to Telesio, on the other hand, scientific knowledge of the world has to be necessarily mediated through sensible knowledge, which has an active role, whereas according to Aristotle the “materials” of sense perception play a passive role, from which the intellect grasps the form of each substance or natural being. Here lies another echo of Lucretius’s On the Nature of Things, where (in book III, ll. 359-369) he vigorously criticizes those philosophers who consider the senses as passive “gates” used by the soul.

As stated in chapter V.2 of De rerum natura (1586), the spirit is what allows animals to perceive the external world, so it moves sometimes with the whole body, sometimes with single parts thereof. Probably inspired by the Aristotelian tradition of such authors as Alexander of Aphrodisias (on Aristotle’s Meteor. IV.8, for example), Telesio claimed that the “homeomerous” parts of the body (skin, flesh, tissues, blood, bones, and so forth) are the same for animals and humans: they differ in their appetites and needs, not in their calculations (logoi), and importantly, they all have the same kind of sense perceptions. Thus animal and human souls differ in degree, not in kind or quality.

Analogously, whereas Aristotle (in Meteor. book IV) asserted that all sensitive parts of the body must be homogeneous and be a direct composition of the four elements, in Telesio’s view, the variety of dispositions and functions of the different parts of the body had to be explained in the same way that the majority of the natural bodies are. In other words, the “homeomerous” mixtures cannot be considered as the “ultimate” parts of the “anhomeomerous” bodies (organs such as the eye, the heart, the liver, the lungs and so forth). Even though the spiritus is mostly present in the brain and in the nervous system, it is also spread throughout the entire body and, just like a brain, it is responsible for the motions, changes and the combination of different parts of the body. By way of the sensus, the dynamics of attraction and repulsion provide for the constant balance of the living body.

3. Cosmology

Telesio eschewed metaphysical speculation; in his view, the most important task of the natural philosopher is to give attention to the observable phenomena in the natural world, looking for the causes of “sensible beings” (Telesio 1586, II.3). Thus it was in the spirit of the natural philosopher that he theorized that all natural things are the result of the two active and mutually antagonistic forces, “heat” and “cold”, acting upon matter, and thereby making possible the creation of inanimate and animate beings. Heat is responsible for the phenomena of elasticity, warmth, dryness, combustion, and lightness, as well as rarefaction of matter and motion and velocity of bodies; cold is responsible for the slowness of bodies in motion, and for their condensation, freezing and hardness. All the other natural phenomena (such as humidity or fermentation) are the results of combinations of different degrees of heat and cold. The interaction of heat and cold affects the nature of matter itself, a notion that Telesio intentionally left unclear. Taken per se, the concept of matter cannot be directly sensed, and its existence can simply be postulated, just like the notion of spiritus.

In this way Telesio rejected Aristotle’s view, according to which the two pairs of opposed qualities (cold/heat, and dry/humid), acting on the matter, gave rise to one of the four primary elements of natural beings (earth, air, fire, and water). According to Telesio, matter, as a physical, corporeal subjectum, has a merely passive role. In fact, what is important according to Telesio are the modifications of the subjectum, that is, the results of interactions between heat and cold (Telesio 1586, II.2).

On Telesio’s view, all things act according to their own nature, starting from the primary forces of cold and heat, by means of the ability to perceive each other. In order to sustain themselves, these primary forces and all beings which arise through their antagonistic interaction must be able to perceive themselves as opposite forces. In other words, they have to sense what is convenient and what is inconvenient or damaging for their own survival. Living bodies do not constitute a specific realm, separated from inanimate beings. They are all determined by the solar heat and the terrestrial matter. Again, it is important to note that sensation is not only a property of animate beings. Telesio’s philosophy can thus be described as a kind of pansensism: all beings, both animate and inanimate, are said to have the power of sensation. In fact, in the third edition (1586) of De rerum natura, the motion of celestial bodies is explained by means of the principle of “self-preservation” (conservatio sui), in other words, the need to sustain the very life of those bodies.

At the heart of Telesio’s cosmology, then, is the idea that nature is ruled by its own—internal, not external—principles. Consequently, the natural world does not need to be taken care of by any kind of divine intelligence. Heat and cold share the same “desire” to preserve themselves. The celestial spheres are made of matter, heat and cold (ivi, I.11-12, 8-9). Regarding the Ptolemaic system, Telesio rejected it as unnatural, probably because of the growing suspicion in 16th-century cosmology that it provided a mathematical device to “save the appearances,” leaving unexplained the question of the actual natural causes of the planetary motions, as well as of other celestial phenomena. Beginning with the first edition of De rerum natura, Telesio’s objective was to replace Aristotle’s geocentrism with one of his own. At the cosmological level, the interplay between heat and cold involves the position of the Sun and of the Earth, being the seats and sources of heat and coldness, respectively. Because of its heat, the Sun is propelled into perpetual motion, whereas the Earth is immobile because of its coldness and its great weight. Consequently, the cosmic balance and harmony of the heavens depend on the struggle and equilibrium between the Sun and the Earth. Unlike Aristotle, Telesio upheld the fiery nature of the heavens. That moved the philosopher of Cosenza to deny the Aristotelian principle of a first-mover of the universe. Planetary motions are not the outcome of the patterns of motion between the several regions of the celestial spheres; rather, they are the consequence of a geocentric system ruled by thermal forces, wherein the ancient notions of densum and rarum are still valid.

Thus, Telesio chose heat and cold as the principal agents for knowledge of the world because together with prime matter (moles), they immediately affect bodies and their functions. As noted above, the two primary bodies, the Sun and the Earth, are the subjects of Telesio’s argument. The former is the seat of heat, and the latter is that of cold (Telesio 1565, I.1-4). That statement literally expelled the idea of a creatio ex nihilo. Electing the Sun and the Earth as the celestial seats of heat and cold, Telesio defines the boundaries of the universe as the edges of the corporeal world (extrema corpora universi). Life itself depends on the right balance of heat and cold, and they are finally called “forces of acting natures”, agentium naturarum vires (Telesio 1586 VII.9). The later Telesio, in fact, was firmly convinced that the world depended on the inner uniformity of nature and on its intrinsic virtue or “wisdom”.

Furthermore, against Aristotle, Telesio denied the theory of the locus as the limit of a body, taking into account the atomistic theory of space as an empty place filled by the bodies. By means of the two forces of heat and cold, and by affirming the idea of a space filled by matter, he abolished the Aristotelian theory of a cosmos divided into a sublunary world, in which generation and corruption take place, and a superlunary world with timeless regular movements. Moreover, he developed a critique of the Peripatetic theory of natural locus, pointing out that Aristotelians did not explain well the reason why the motion of heavy bodies becomes uniformly accelerated.

4. Influence and Legacy

With the publication of his early works (1565, 1566, 1570), Telesio established himself as a key figure in the intellectual milieu of late 16th- and early 17th-century Italy. Some of his theses were read, commented on, and debated by a number of Italian philosophers, physicians, and amateurs of science, such as Francesco Patrizi, Antonio Persio, Agostino Doni, Giordano Bruno, Giambattista Vecchietti, Latino Tancredi, Tommaso Campanella, Andrea Chiocco, Giulio Cortese, Francesco de’ Vieri, Alessandro Tassoni, and Marco Aurelio Severino. In the early 17th century his writings circulated around Europe, and were read by Francis Bacon, Marin Mersenne, René Descartes, Pierre Gassendi, Jean-Cécile Frey, Charles Sorel, Walter Warner, Thomas Hobbes, and others.

One of the first authors to officially criticize the philosophy of the “Telesians” was Francesco de’ Vieri (1524-1591), lecturer of Aristotelian philosophy at the University of Pisa. In 1573 he published in Florence a work in the vernacular, Trattato delle metheore, in three books. The same work, augmented with a huge fourth book of 200 pages and reissued in 1582, contains a rehearsal of the principal topics of the fourth book of Aristotle’s Meteorologica. With the purpose of showing his Platonic reading of Aristotle’s philosophy, de’ Vieri took occasion to attack the “Telesians”, with the aim of persuading them with “their own arguments” (p. 227). His critique of Telesio and of the Telesians is particularly significant because he offers a reassessment of the Aristotelian notion of sensus through the key reading of the Platonic concept of pneuma (a word belonging to the Stoic and pre-Aristotelian lexicon). As said above, Telesio translates the Greek word pneuma into the Latin expression of spiritus or anima sentiens. Some pages after the aforementioned quotation, Verino states that God created souls as eternal beings (ab aeterno), because a soul is not grasped from the alteration of matter (p. 247). This is a clear reference to the Telesian idea of a spiritus grasped from a material seed (spiritus ex semine educta). In a manuscript kept at the National Library of Florence (Magl. XII.11, f. 23), the same author attacked Telesio and his followers, who erroneously attribute to the sensus “all judgments about the natural things”. It is important to recall that when Francesco de’ Vieri published the 1582 edition of his book, Telesio’s philosophy had already reached, in Tuscany and across Italy, the apex of its fame.

In 1587, a year before Telesio’s death, the Spanish philosopher Oliva Sabuco de Nantes y Barrera published a book, Nueva filosofía de la naturaleza del hombre, where she elaborated on a psychophysiology of the human body deeply influenced by Telesio’s doctrines (Bidwell-Steiner 2012). Then, in 1588 Francesco Muti published a work entitled Disceptationum libri V contra calumnias Theodori Angelutii in maximum philosophum Franciscum Patritium, in quibus pene universa Aristotelis philosophia in examen adducitur, in which he defended Telesio, taking into consideration the quarrel that took place at Ferrara during 1584 and 1585 between Patrizi and Angelucci (Sergio 2013, 71-72, 74). In 1589, Sertorio Quattromani, the new founder of the “Accademia Cosentina,” published a summary of Telesio’s thought called La filosofia di Bernardino Telesio ristretta in brevità e scritta in lingua Toscana (Naples, 1591).

In the last decade of the 16th century, an important role was played by Antonio Persio (1542-1612). Among Telesio’s disciples, Persio was the one who worked on the Venetian edition of the Varii de naturalibus rebus libelli (Apud Felice Valgrisium, 1590). That edition included both the booklets already published in 1570 (De his, quae in aere fiunt et de terrae motibus; De colorum generatione; De Mari) and a number of writings Telesio had left unpublished (De cometis et lacteo circulo; De iride; Quod animal universum ab unica animae substantia gubernatur; De usu respirationis; De coloribus; De saporibus; De somno). Some years later, one of Telesio’s former disciples, Giovan Paolo d’Aquino, published Oratione in morte di Berardino [sic] Telesio Philosopho Eccellentissimo agli Academici Cosentini (Cosenza, per Leonardo Angrisano, 1596), the first biography of the philosopher of Cosenza.

As noted above, in 1591 Campanella wrote a stunning defense of Telesian philosophy against Giacomo Antonio Marta’s Pugnaculum Aristotelis. In his work, Campanella took occasion to unfold and reassess the principles of Telesio’s naturalism, somehow anticipating (in his Praefatio) the basic essentials of Galileo’s methodology (above all, the alliance between the “sensate esperienze” and the “certe dimostrazioni”). Another early modern thinker to note, Alessandro Tassoni, devoted a number of pages of his works to Telesio’s meteorology (Trabucco 2019).

In the first decades of the 17th century, in Italy, Telesio’s ideas entered a wider scientific context, a constellation populated by a number of scientists interested in the so-called “mathematization of the world”, such as Galileo and the network of his disciples and correspondents. However, the new mathematical trend of natural philosophy did not eclipse Telesio’s merits and the scientific value of his work. Authors such as Latino Tancredi, Colantonio Stigliola, Marco Aurelio Severino, and Tommaso Cornelio continued to spread his thought. Especially in Southern Italy, Telesio’s name became the distinctive mark of a philosophical tradition dating back to the greatest authors of the ancient, pre-Aristotelian period, such as Pythagoras, Empedocles, Philolaus, Alcmaeon, Timaeus of Locri, and so forth.

Meanwhile, in England, Francis Bacon devoted some pages of his writings to Telesio: firstly in his Advancement of Learning (1605), then in De principiis atque originibus, secundum fabulas Cupidinis et Coeli (1613), and finally in his Sylva sylvarum. Bacon’s reading of Telesio’s philosophy mainly focused on the portrayal of Telesio as the restorer of Parmenides’s philosophy, freezing the Calabrian thinker in the role of an innovator who took inspiration from Eleatic monism for the setting of his materialistic world-view (Rees 1977, De Mas 1989, Bondì 2001, Garber 2016). At the same time, Bacon himself expressed some concerns about the limits of Telesio’s theory of matter. According to Lord Verulam, Telesio’s concept of matter remains unexplained with regard to its specific function in the processes of generation and transformation of natural beings. However, Bacon admired such authors as Telesio, Cardano, and Della Porta with respect to the notion of spiritus, the power of imagination, and the sympathy between animate and inanimate objects (Gouk 1984). In that way, it is fair to say that Bacon contributed to the construction of the mythical conception of Telesio as a freethinker deeply indebted to the pre-Socratic tradition, which is not to say that the myth is altogether misleading (see Giglioni 2010: 70).

Back on the European continent, some 17th-century traces of Telesio’s legacy can be found in such authors as Marin Mersenne (Quaestiones celeberrimae in Genesim, Paris, 1623); Gabriel Naudé (Apologie pour tous les grands personnages qui ont esté faussement soupçonnez de magie, Paris, 1625; Advis pour dresser une bibliotheque, Paris, 1627); Jean-Cécile Frey (Cribrum philosophorum qui Aristotelem superiore et hac aetate oppugnarunt, in Opuscula varia nusquam edita, Paris, 1646); Charles Sorel (Le sommaire des opinions les plus estranges des novateurs modernes en la philosophie comme de Telesius, de Patritius, de Cardan, de Ramus, de Campanelle, de Descartes, et autres, in De la perfection de l’homme, où les vrays biens sont considérez et spécialement ceux de l’âme, Paris, 1655; reprinted in La science universelle, vol. 4, 1668); Guy Holland (The grand prerogative of human nature, namely, the soul’s natural or native immortality, and freedom from corruption, London, 1653); and Pierre Gassendi (Syntagma philosophicum, in Opera Omnia, vol. I, Paris, 1658).

Another testimony to the role of Telesio’s legacy in 17th-century Naples is contained in Tommaso Cornelio’s Progymnasmata physica (Venetiis, 1663): compare the Progymn. II. De initiis rerum naturalium; the Epistola de Platonica Circompulsione, and Epistola M. Aurelij Severini nomine conscripta (repr. Venetiis, 1683, pp. 41-42, 140, 144, 146, and 190-191).

In the French context, Pierre Gassendi was one of the most important authors to give attention to the Cosentine thinker. In his writings such novatores as Telesio and Campanella are mentioned with regard to the theories of space and time and the theory of sensory qualities, including heat and cold (Syntagma, in Opera, I, 245b), as well as in connection with Gassendi’s tripartite conception of void: the inane separatum, that is, the idea of an infinite void expanding beyond the atmosphere; the inane disseminatum, that is, the interparticle void between the basic corpuscles of bodies; and the inane coacervatum, that is, the interparticle void “cobbled” together by experimental means (Opera I, 185a-187a, 192a-196a, 196b-203a). On that subject, according to Gassendi’s notion of vacuum coacervatum, there is no way to explain how bodies may divide and separate at the level of basic particles without the supposition of that kind of void; here, Gassendi found Telesio’s explanation, according to which heat and cold are the active principles of matter, evidently insufficient (for further details, see Fisher 2005, and Henry 1979).

Finally, a specific debt towards Telesio is also identifiable in Thomas Hobbes’s works. In the first chapter of Leviathan (1651), Hobbes openly rejected the doctrine of species, and in successive chapters he asserted a cohesive relationship between sense, imagination and reasoning, consistent with the Telesian approach (a first trace of that influence dates back to the Elements of Law, Natural and Politic, written in 1640). What is more, the notion of “self-preservation” (conservatio sui) was reassessed in Hobbes’s anthropology. Telesio’s influence became more explicit in 1655 in Hobbes’s De corpore, sect. IV., chap. XXV. In the fifth article of that chapter, Hobbes sets out the basic properties of sensation and cognition in the simplest structures of organized matter in motion. In the same place he offers a suggestion which allows us to place his materialism close to the Renaissance pansensism advanced by Telesio and Campanella. After explaining in a general way his physiology of sensation and animal locomotion, he stated:

I know there have been philosophers, and those learned men, who have maintained that all bodies are endued with sense. Nor do I see how they can be refuted, if the nature of sense be placed in reaction only. And, though by reaction of bodies inanimate a phantasm might be made, it would nevertheless cease, as soon as ever the object were removed. For unless those bodies had organs, as living creatures have, fit for the retaining of such motion as is made in them, their sense would be such, as that they should never remember the same (Hobbes 1656, XXV.5, p. 226. On the subject, see Schuhmann 1988: 109-133; Sergio 2008: 298-315).

5. References and Further Reading

a. Primary Sources

Telesio, Bernardino, 1565, De natura iuxta propria principia liber primus, et secundus (Romae, Antonium Bladum, 1565) – Ad Felicem Moimonam iris (Rome, Mattheus Cancer, 1566), ed. by R. Bondì, Rome, Carocci, 2011.
Telesio, Bernardino, 1570, De rerum natura iuxta propria principia, liber primus, et secundus, denuo editi – Opuscula (Neapoli, Josephum Cacchium, 1570), ed. by R. Bondì, Rome, Carocci, 2013.
Telesio, Bernardino, 1572, Delle cose naturali libri due – Opuscoli – Polemiche telesiane (Biblioteca Nazionale Centrale, Florence, Ms. Pal. 844, cc. 12r-204r; Cod. Magl. XII B 39), ed. by A. L. Puliafito, Rome, Carocci, 2013.
Telesio, Bernardino, 1586, De rerum natura iuxta propria principia, libri IX (Naples, Horatius Salvianus, 1586), ed. by G. Giglioni, Rome, Carocci, 2013.
Telesio, Bernardino, 1590, Varii de naturalibus rebus libelli ab Antonio Persio editi (Venice, F. Valgrisius, 1590), ed. by Miguel A. Granada, Rome: Carocci, 2012.
Telesio, Bernardino, 1981, Varii de naturalibus rebus libelli, ed. by L. De Franco, Florence, La Nuova Italia.

b. Secondary Sources

d’Aquino, Giovan Paolo, 1596, Oratione in morte di Berardino Telesio, philosopho eccelentissimo, Cosenza, Leonardo Angrisano.
Artese, Luciano, 1991, “Il rapporto Parmenide-Telesio dal Persio al Maranta,” Giornale Critico della Filosofia Italiana, 70: 15-34.
Artese, Luciano, 1994, “Bernardino Telesio e la cultura napoletana,” Studi Filosofici, 17: 91-110.
Artese, Luciano, 1998, “Documenti inediti e testimonianze su Francesco Patrizi e la Toscana,” Bruniana & Campanelliana, 4: 167-191.
Bacon, Francis, 1613, De principiis atque originibus, secundum fabulas Cupidinis et Coeli, in The Works of Francis Bacon, ed. by R. L. Ellis, J. Spedding, D. D. Heath, London, Longmans, 1858, vol. 5: 289-346.
Barbero, Giliola, Paolini, Adriana, 2017, Le edizioni antiche di Bernardino Telesio: censimento e storia, Paris, Les Belles Lettres.
Bianchi, Lorenzo, 1992, “Des novateurs modernes en philosophie: Telesio tra eruditi e libertini nella Francia del Seicento,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples, Guida: 373-416.
Bidwell-Steiner, Marlen, 2010, “Metabolism of the Soul. The Psychology of Bernardino Telesio in Oliva Sabuco’s Nueva filosofía de la naturaleza del hombre (1587),” in Blood, Sweat and Tears. The Changing Concepts of Physiology from Antiquity into Early Modern Europe, ed. by M. Horstmanshoff, H. King, C. Zittel, Leiden, Brill: 662-684.
Boenke, Michaela, 2005, “Psychologie im System des naturphilosophischen Monismus; Bernardino Telesio,” in Körper, Spiritus, Geist: Psychologie vor Descartes, München, Paderborn: 120-142.
Boenke, Michaela, 2013, “Bernardino Telesio,” in Stanford Encyclopedia of Philosophy (http://plato.stanford.edu/entries/telesio/).
Bondì, Roberto, 2018a, Il primo dei moderni. Filosofia e scienza in Bernardino Telesio, Rome, Edizioni di Storia e Letteratura.
Bondì, Roberto, 2018b, “Dangerous Ideas: Telesio, Campanella and Galileo,” in Copernicus Banned. The Entangled Matter of the anti-Copernican Decree of 1616, ed. by N. Fabbri and F. Favino, Florence, Olschki, 1-27.
Campanella, Tommaso, 1622, Al Telesio Cosentino, in Scelta d’alcune poesie filosofiche (1622), n° 68 (available on-line in Archivio Tommaso Campanella, http://www.iliesi.cnr.it/ATC/testi.php?tp=1&iop=Scelta&pg=123).
De Franco, Luigi, 1995, Introduzione a Bernardino Telesio, Soveria Mannelli: Rubbettino.
De Frede, Carlo, 2001, Docenti di filosofia e medicina nella università di Napoli dal secolo XV al XVI, Naples, Lit. Editrice A. De Frede.
De Lucca, Jean-Paul, 2012, “Giano Pelusio: ammiratore di Telesio e poeta dell’«età aurea»,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore: 115-132.
De Miranda, Girolamo, 1993, “Una lettera inedita di Telesio al cardinale Flavio Orsini,” Giornale Critico della Filosofia Italiana 72: 361-375.
Ebbersmeyer, Sabrina, 2013, “Do Humans Feel Differently? Telesio on the Affective Nature of Men and Animals,” in The Animal Soul and the Human Mind. Renaissance Debates, ed. by C. Muratori, Pisa-Roma, Fabrizio Serra Editore, 97-111.
Ebbersmeyer, Sabrina, 2016, “Telesio’s Vitalistic Conception of the Passions,” in Sense, Affect and Self-Preservation in Bernardino Telesio (1509-1588), ed. by G. Giglioni and J. Kraye, Dordrecht, Springer.
Ebbersmeyer, Sabrina, 2018, “Renaissance Theories of the Passion. Embodied Minds,” in Philosophy of Mind in the late Middle Ages and Renaissance. The History of Philosophy of Mind, vol. 3, ed. by S. Schmid, London, Routledge, 185-206.
Firpo, Luigi, 1951, “Filosofia italiana e Controriforma. iv. La proibizione di Telesio,” Rivista di Filosofia, 42/1: 30-47 (see also 41, 1950: 150-173 e 390-401).
Fisher, Saul, 2005, Pierre Gassendi’s Philosophy and Science. Atomism for Empiricists, Leiden, Brill.
Fragale, Luca Irwin, 2016, “Bernardino Telesio in due inediti programmi giovanili,” in Microstoria e Araldica di Calabria Citeriore e di Cosenza. Da fonti documentarie inedite, Milan, The Writer, 11-32.
Garber, Daniel, 2016, “Telesio Among the Novatores: Telesio’s Reception in the Seventeenth Century,” in Early Modern Philosophers and the Renaissance Legacy, ed. by C. Muratori and G. Paganini, Dordrecht, Kluwer, 119-133.
Gaukroger, Stephen, 2001, Francis Bacon and the Transformation of Early-Modern Philosophy, Cambridge, Cambridge University Press.
Giglioni, Guido, 2010, “The First of the Moderns or the Last of the Ancients? Bernardino Telesio on Nature and Sentience,” Bruniana & Campanelliana 16: 69-87.
Gómez López, Susana, 2013, “Telesio y el debate sobre la naturaleza de la luz en el Renacimiento italiano,” in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 194-235.
Granada, Miguel Ángel, 2013, “Telesio y las novedades celestes: la teoría telesiana de los cometas,” and “Telesio y la Vía Láctea,” in Bernardino Telesio y la nueva imagen de la naturaleza en el Renacimiento italiano, ed. by Miguel Á. Granada, Siruela, Biblioteca de Ensayo, 116-149 and 150-193.
Hatfield, Gary, 1992, “Descartes’ physiology and its relation to his psychology,” in The Cambridge Companion to Descartes, ed. by J. Cottingham, Cambridge, Cambridge University Press, 335-370.
Henry, John, 1979, “Francesco Patrizi da Cherso’s Concept of Space and Its Later Influence,” Annals of Science, 36: 549-575.
Hirai, Hiro, 2012, “Il calore cosmico di Telesio fra il De generatione animalium di Aristotele e il De carnibus di Ippocrate,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 71-83.
Iovine, Maria Fiammetta, 1998, “Henry Savile lettore di Bernardino Telesio. L’esemplare 537.C.6 del De rerum natura 1570,” Nouvelles de la République des Lettres 17: 51-84.
Leijenhorst, Cees, 2010, “Bernardino Telesio (1509-1588): New fundamental principles of nature,” in Philosophers of the Renaissance, ed. by P. R. Blum, 168-180, Washington, The Catholic University of America Press.
Lattis, James M., 1994, Between Copernicus and Galileo: Christoph Clavius and the Collapse of Ptolemaic Cosmology, Chicago, University of Chicago Press.
Lerner, Michel-Pierre, 1986, “Aristote ‘oublieux de lui-même’ selon Telesio,” Les Études philosophiques, 3: 371-389.
Lerner, Michel-Pierre, 1992, “Le ‘parménidisme’ de Telesio: Origine et limites d’une hypothèse,” in Bernardino Telesio e la cultura napoletana, ed. by R. Sirri and M. Torrini, Naples: Guida, 79-105.
Lupi, Walter F., 2011, Alle origini della Accademia Telesiana, Cosenza: Brenner.
Mandressi, Rafael, 2009, “Preuve, expérience et témoignage dans les «sciences du corps»,” Communications 84: 103-118.
Margolin, Jean-Claude, 1990, “Bacon, lecteur critique d’Aristote et de Telesio,” in Convegno internazionale di studi su Bernardino Telesio, Cosenza, Accademia Cosentina, 135-166.
Mulsow, Martin, 1998, Frühneuzeitliche Selbsterhaltung. Telesio und die Naturphilosophie der Renaissance, Tübingen: Max Niemeyer Verlag.
Mulsow, Martin, 2002, “Reaktionärer Hermetismus vor 1600? Zum Kontext der venezianischen Debatte über die Datierung von Hermes Trismegistos,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance. Dokumentation und Analyse der Debatte um die Datierung der hermetischen Schriften von Genebrard bis Casaubon (1567-1614), ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 161-185.
Ottaviani, Alessandro, 2010, “Da Antonio Telesio a Marco Aurelio Severino: fra storia naturale e antiquaria,” Bruniana & Campanelliana, 16/1: 139-148.
Ottaviani, Alessandro, 2012, “Telesio, Bernardino,” in Il Contributo italiano alla storia del Pensiero – Filosofia (2012) (http://www.treccani.it/enciclopedia/bernardino-telesio_(Il-Contributo-italiano-alla-storia-del-Pensiero:-Filosofia)/).
Plastina, Sandra, 2012, “Bernardino Telesio nell’Inghilterra del Seicento,” in Bernardino Telesio tra filosofia naturale e scienza moderna, ed. by G. Mocchi, S. Plastina, E. Sergio, Pisa-Rome, Fabrizio Serra Editore, 133-143.
Puliafito, Anna Laura, 2013, Introduzione a Telesio 1572, xxxiii-xlv.
Pousseur, Jean-Marie, 1990, “Bacon, a Critic of Telesio,” in Francis Bacon’s Legacy of Texts: ‘The Art of Discovery Grows with Discovery’, ed. by W. Sessions, New York, AMS Press, 105-117.
Purnell, Fredrick, Jr., 2002, “A Contribution to Renaissance Anti-Hermeticism: The Angelucci-Persio Exchange,” in Das Ende des Hermetismus. Historische Kritik und neue Naturphilosophie in der Spätrenaissance, ed. by M. Mulsow, Tübingen, Max Niemeyer Verlag, 127-160.
Rees, Graham, 1977, “Matter Theory: A Unifying Factor in Bacon’s Natural Philosophy?,” Ambix 24: 110-125.
Schuhmann, Karl, 1988, “Hobbes and Telesio,” Hobbes Studies 1: 109-133.
Schuhmann, Karl, 2004, “Telesio’s Concept of Matter,” in Selected Papers on Renaissance Philosophy and on Thomas Hobbes, ed. by P. Steenbakkers and C. Leijenhorst, Dordrecht, Kluwer, 99-116.
Sciaccaluga, Nicoletta, 1997, “Movimento e materia in Bacone: uno sviluppo telesiano,” Annali della Scuola normale superiore di Pisa, classe di lettere e filosofia, Ser. 4, 2, 329-355.
Sergio, Emilio, 2007, “Campanella e Galileo in un «English Play» del circolo di Newcastle: «Wit’s Triumvirate, or the Philosopher» (1633-1635),” Giornale Critico della Filosofia Italiana, 86, 2, 298-315.
Sergio, Emilio, 2010, “Telesio e il suo tempo. Alcune considerazioni preliminari,” Bruniana & Campanelliana, 16, 1, 111-124.
Sergio, Emilio, 2013, Bernardino Telesio: una biografia, Naples: Guida.
Sergio, Emilio, 2014, “Bernardino Telesio (1509-1588),” in Galleria dell’Accademia Cosentina – Archivio dei filosofi del Rinascimento, vol. I, ed. by E. Sergio, Rome, CNR-ILIESI, 155-218.
Simonetta, Marcello, 2015, “Due lettere inedite del giovane Bernardino Telesio,” Bruniana & Campanelliana, 21, 2, 429-435.
Siraisi, Nancy G., 2011, “Giovanni Argenterio and Medical Innovation,” in Medicine and the Italian Universities, 1250-1600, Leiden, Brill, 329-355.
Spruit, Leen, 1995, “Bernardino Telesio,” in Species intelligibilis. From Perception to Knowledge, Leiden, Brill, vol. 2, 198-203.
Spruit, Leen, 1997, “Telesio’s reform of the philosophy of mind,” Bruniana & Campanelliana, 3: 123-143.
Spruit, Leen, 1998, Telesio’s Psychology and the Northumberland Circle, Durham Thomas Harriot Seminar, Occasional paper, Durham University, History of Education Project, 1-36.
Spruit, Leen, 2018, “Bernardino Telesio on Spirit, Sense, and Imagination,” in Image, Imagination, and Cognition. Medieval and Early Modern Theory and Practice, ed. by C. Lüthy, C. Swan, P. Bakker, C. Zittel, Brill, Leiden, 94-116.
Trabucco, Oreste, 2019, “Telesian Controversies on the Winds and Meteorology,” in Bernardino Telesio and the Natural Sciences in the Renaissance, ed. by P. D. Omodeo, Leiden, Brill.
Tutrone, Fabio, 2014, “The body of the soul. Lucretian echoes in the Renaissance theories on the psychic substance and its organic repartition,” Gesnerus, 71, 2, 204-236.

Author Information

Emilio Sergio
Email: es.disu@gmail.com
University of Calabria
Italy

Philosophy of Peace

Peace is notoriously difficult to define, and this poses a special challenge for articulating any comprehensive philosophy of peace. Any discussion on what might constitute a comprehensive philosophy of peace invariably overlaps with wider questions of the meaning and purpose of human existence. The definitional problem is, paradoxically, a key to understanding what is involved in articulating a philosophy of peace. In general terms, one may differentiate negative peace, that is, the relative absence of violence and war, from positive peace, that is, the presence of justice and harmonious relations. One may also refer to integrative peace, which sees peace as encompassing both social and personal dimensions.

Section 1 examines potential foundations for a philosophy of peace through what some of the world’s major religious traditions, broadly defined, have to say about peace. The logic for this is that throughout most of human history, people have viewed themselves and reality through the lens of religion. Sections 2 through 5 take an historical-philosophical approach, examining what key philosophers and thinkers have said about peace, or what might be ascertained for possible foundations for a philosophy of peace from their work. Section 6 examines some contemporary sources for a philosophy of peace.

Sections 7 through 15 are more exploratory in nature. Section 7 examines a philosophy of peace education, and the overlap between this and a philosophy of peace. Sections 8 through 15 examine a range of critical issues in thinking about and articulating a philosophy of peace, including the paradoxes and contradictions which emerge in doing so. Section 16 concludes with how engaging in the practice of philosophy may itself be a key to understanding a philosophy of peace, and indeed a key to establishing peace itself.

Religious Sources for a Philosophy of Peace
Classical Sources for a Philosophy of Peace
Medieval Sources for a Philosophy of Peace
Renaissance Sources for a Philosophy of Peace
Modern Sources for a Philosophy of Peace
Contemporary Sources for a Philosophy of Peace
The Philosophy of Peace Education
The Notion of a Culture of Peace
The Right to Peace
The Problem of Absolute Peace
Peace and the Nature of Truth
Peace as Eros
Peace, Empire and the State
An Existentialist Philosophy of Peace
Decolonizing Peace
Concluding Comments: Philosophy and Peace
References and Further Reading

1. Religious Sources for a Philosophy of Peace

It is logical that we should examine the theory of peace as set down in the teachings of some of the world’s major religious traditions, given that, for most of human history, people have viewed themselves and the world through the lens of religion. Indeed, the notion of religion as such may be viewed as a modern invention, in that throughout most of human history individuals have seen the spiritual dimension as integrated with the physical world. In discussing religion and peace, there is an obvious problem of the divergence between precept and practice, in that many of those professing religion have often been warlike and violent. Some writers, such as James Aho and René Girard, go further, and see religion at the heart of violence, through the devaluation of the present and through the notion of sacrifice. For the moment, however, we are interested in the teachings of the major world religions concerning peace.

If we examine world religious traditions and peace, it is appropriate that we examine Indigenous spirituality. There are a number of ways that such spirituality may provide grounds for a philosophy of peace, such as the notion of connectedness with the environment, the emphasis on a caring and sharing society, gratitude for creation and the importance of peace within the individual. This is not to deny that Indigenous societies, as with all societies, may be extremely violent at times. This is also not to deny that elements of Indigenous spirituality may be identifiable within other major world religious traditions. Yet many peace theorists look to Indigenous societies and Indigenous spirituality as a reference point for understanding peace.

Judaism enjoys prominence not merely as a world religion in its own right, and arguably the most ancient monotheistic religion in the world, but also as a predecessor faith for Christianity and Islam. Much of the contribution of Judaism towards theorizing on peace comes from the idea of an absolute deity, and the consequential need for radical ethical commitment. Within the Tanakh (Hebrew Scriptures), the Torah (Law) describes peace as an ultimate goal and a divine gift, although at times brutal warfare is authorized; the Nevi’im (Prophetic Literature) develops the notion of the messianic future era of peace, when there will be no more war, war-making or suffering; and the Ketuvim (Wisdom Literature) incorporates notions of inner peace into Judaism, such as the idea that a person can experience peace in the midst of adversity, and the notion that peace comes through experience and reflection.

Hinduism is a group of religious traditions geographically centered on the Indian sub-continent, which rely upon the sacred texts known as the Vedas, the Upanishads, and the Bhagavad Gita. There are a number of aspects of Hinduism which intersect with peace theory. Karma is a view of moral causality incorporated into Hinduism, wherein good deeds are rewarded either within this lifetime or the next, and by contrast bad deeds are punished in this lifetime or the next. Karma presents a strong motivation to moral conduct, that is, one should act in accordance with the dharma, or moral code of the universe. A further element within Hinduism relevant to a peace theory is the notion of the family of humankind, and accordingly there is a strong element of tolerance within Hinduism, in that the religion tolerates and indeed envelops a range of seemingly conflicting beliefs. Hinduism also regards ahimsa, strictly speaking the ethic of doing no harm towards others, and by extension compassion to all living things, as a virtue, and this virtue became central to the Gandhian philosophy of nonviolence.

Buddhism is a set of religious traditions geographically centered in Eastern and Central Asia, and based upon the teachings of Siddhartha Gautama Buddha, although the absence of any specific deity leads some to question whether Buddhism ought to be considered a religion. The significance of Buddhism for peace is the elevation of ahimsa, that is, doing no harm to others, as a central ethical virtue for human conduct. It can be argued that the Buddhist ideal of avoidance of desire is also an important peaceful attribute, given that desire of all descriptions is often cited as a cause of war and conflict, as well as being a cause of the accumulation of wealth, which itself arguably runs counter to the creation of a genuinely peaceful and harmonious society.

Christianity is a set of monotheistic religious traditions, arising out of Judaism, and centered on the life and teachings of Jesus of Nazareth. The relationship of Christianity to a philosophy of peace is complex. Christianity has often emerged as a proselytizing and militaristic religion, and thus one often linked with violence. Yet there is also a countervailing undercurrent of peace within Christianity, linked to the teachings of its founder and also linked to the fact that its founder exemplified nonviolence in his own life and death. Forgiveness and reconciliation are also dominant themes in Christian teaching. Some Christian theologians have begun to reclaim the nonviolent element of Christianity, emphasizing the nonviolence in the teaching and life of Jesus.

Islam constitutes a further set of monotheistic religious traditions arising out of Judaism, stressing submission to the will of the creator, Allah, in accordance with the teachings of the Prophet Muhammed, as recorded in the sacred text of the Holy Qur’an. As with Christianity, the relationship of Islam to a philosophy of peace is complex, given that Islam also has a history of sometimes violent proselytization. Yet the word Islam is itself cognate with the Arabic word for peace, and Islamic teaching in the Qur’an extols forgiveness, reconciliation, and non-compulsion in matters of faith. Moreover, one of the Five Pillars of Islam, Zakat, is an important marker of social justice, emphasizing giving to the poor.

There is an established scholarly tradition that interprets communism, the theory and system of social organization based upon the writings of Karl Marx and Friedrich Engels, as a form of nontheistic religion. Communist theory promises a peaceful future through the elimination of inequality and the emergence of an ideal classless society, with a just distribution of resources, no class warfare and no international wars, given that war in communist theory is often viewed as the result of capitalist imperialism. Communism envisages an end to what Engels described as social murder, that is, premature deaths within a social class due to exposure to preventable yet lethal conditions.

Yet scholars such as Rudolph Rummel have suggested that communist societies have been the most violent and genocidal in human history. Idealism can be lethal. Others point to examples of peaceful communist societies. Importantly, scholars such as Noam Chomsky argue that, far from reflecting the ideals of Marx and Engels, communist societies of the twentieth century, in practice, betrayed those original ideals. Irrespective of this, the example of mass violence in communist societies suggests that a proper theory of peace must encompass not merely a goal or aspiration, but a way of life.

It is useful to enquire what commonalities we might discern in religious traditions regarding peace, and it seems fair to say that peace is usually viewed as the ultimate goal of human existence. For some religions, this is phrased in eschatological notions such as heaven or paradise, and in other religions this is phrased in terms of an ecstatic state of being. Even in communism, there is an eschatological element, through the creation of a future classless society. There is also an ethical commonality in traditions, in that peaceful existence and actions are set forth as an ethical norm, notwithstanding that there are exceptions to this.

It is in defining and understanding the exceptions that there is a degree of complexity. There is also a common conflict between universalism and particularism within religious traditions, with particularistic emphases, such as in the notion of the Chosen People, arguably embodying the potential for exclusion and violence.

2. Classical Sources for a Philosophy of Peace

The writings of Plato (428/7-348/7 B.C.E.) would not normally be thought of as presenting a source for a philosophy of peace. Yet there are aspects of Plato’s work, based upon the teaching of Socrates, which may constitute such a source. Within his major work Politeia (Republic), Plato focuses on what makes for justice, an important element in any broad concept of peace. Plato, in effect, presents a peace plan based upon his city-state. This ideal society is essentially static, involving three distinct classes, although it is, nevertheless, a society which provides for at least an internally peaceful polis or state. Plato also develops a theory of forms or ideals, and it is not too difficult to see peace as one of those forms or ideals, and, in contributing to the polis or state, we contribute to the development of that form or ideal. In his work Nomoi (Laws), Plato enunciates the view that the establishment of peace and friendship constitutes the highest duty of both the citizen and the legislator, and in the work Symposium, Plato articulates the idea that it is love which brings peace among individuals.

The writings of Aristotle (384-322 B.C.E.) similarly do not present an obvious reference point for a philosophy of peace. Yet there may be such a reference point in his development of virtue ethics, notably in Ethica Nicomachea (Nicomachean Ethics). Virtue ethics may legitimately be linked to a philosophy or ethics of peace. The mean of each of the virtues described by Aristotle may be viewed as qualities conducive to peace. In particular, the mean of the virtue of andreia, usually translated as courage or fortitude, may be seen as similar to the notion of assertiveness, a quality which many writers see as important within nonviolence. Aristotle also identifies justice as a virtue, and many peace theorists emphasize the inter-relationship between peace and justice. Further, some writers have specifically identified peace or peacefulness as a virtue in itself. Interestingly, Aristotle sees the telos or goal of life as eudaimonia, or human flourishing, a concept similar to the ideals set forth in writing on a culture of peace.

3. Medieval Sources for a Philosophy of Peace

Saint Augustine of Hippo (354-430 C.E.) was both a bishop and theologian, and he is widely recognized as capably integrating classical philosophy into Christian thought. His thought is often categorized as late Roman or early medieval. One element of Augustinian thought relevant to a philosophy of peace is his adaptation of the neo-Platonic notion of privation, that evil can be seen as the absence of good. It is an idea which resonates with notions of positive and negative peace. Negative peace can be seen as the absence of positive peace. The notion of privation also suggests that peace ought to be seen as a specific good, and that war is the absence or privation of that good.

The best-known contribution of Augustine to a philosophy of peace, however, is his major work De civitate Dei (The City of God). Within this, Augustine contrasts the temporal human city, which is marked by violent conflict, and the eternal divine city, which is marked by peace. As with many religious writers, the ideal is peace. Augustine is also noteworthy for articulating the notion of just war, wherein Christians may be morally obliged to take up arms to protect the innocent from slaughter. However, this concession is by way of a lament for Augustine, as a mark that Christians are living in a temporal and fallen world. That is a concession which contrasts with the way that others have used just war theory, and in particular the work of Augustine, to justify and glorify war.

Saint Thomas Aquinas (ca.1225-1274) is perhaps best known for his attempt to synthesize faith with reason, for his popularization of Aristotelian thought, and for his focus on virtues. The significant contribution of Aquinas to a philosophy of peace is his major work Summa Theologica (Summary of Theology), and in particular the discussion on ethics and virtues in Part 2 of the work. At Question 29 of Part 2, Aquinas examines the nature of peace, and whether peace itself may be considered a virtue. Aquinas concludes that peace is not a virtue, and further concludes that peace is a work of charity (love). An important qualification, however, is that peace is also described as being, indirectly, a work of justice. We see here the inter-relationship of peace and justice, something taken up by contemporary peace theorists. Aquinas also refined the just war theory, including articulating the requirements of proper authority, just purpose, and just intent when resorting to war.

4. Renaissance Sources for a Philosophy of Peace

The Renaissance was a period of a revival of learning in Europe, and it is often identified as a period of transition from the medieval to the modern. The Renaissance is also known for the growth of humanism, that is, an era involving the rediscovery of classical literature, an outlook focusing on human needs and on rational means to solve social problems, and a belief that humankind can shape its own destiny. One central human problem for humanists, and indeed for many thinkers, was and is the phenomenon of war, and Renaissance humanists refused to see war as inevitable and unchangeable. This in itself is an important contribution to a philosophy of peace. Renaissance humanism was not necessarily anti-religious, and indeed most of the humanist writers from this time worked from specifically religious assumptions. It can be argued that in the 21st century we are still part of this humanist project, and an important part of the humanist project is to solve the problem of war and social injustice.

Erasmus of Rotterdam (ca.1466-1536), otherwise known as Desiderius Erasmus, is perhaps the foremost humanist writer of the Renaissance, and arguably also one of the foremost philosophers of peace. In numerous works, Erasmus advocated compromise and arbitration as alternatives to war. The connection between humanism and peace is perhaps best discernable in Erasmus’ 1524 work De libero arbitrio diatribe sive collatio (The Freedom of the Will), where Erasmus points out that if all that we do is predetermined, there is no motivation for improvement. The principle can apply to social dimensions as well. If everything is predetermined, then there is little point in attempting to work for peace. If we say that war and social injustice are inevitable, then there is little motivation to change. Further, saying that war and social injustice are inevitable serves as a self-fulfilling statement, and individuals will tend not to do anything to challenge war and social injustice.

De libero arbitrio is also useful for pondering a philosophy of peace in that the work presents an example of the idea that peace is a means or method, and not merely a goal. Although Erasmus wrote the work in debate with Martin Luther, Erasmus avoids polemics, is reluctant to make assertions, strives for moderation, and is anxious to recognize the limitations of his argument. He points out in the Epilogue that parties to disputes will often exaggerate their own arguments, and it is from the conflict of exaggerated views that violent conflict arises. This statement was prophetic, given the religious wars which engulfed Europe following the Protestant Reformation.

However, the best-known peace tract from Erasmus is perhaps the adagium Dulce bellum inexpertis (War is Sweet to Those Who Have Not Experienced It). Erasmus is quoting from the Greek poet Pindar, and in this adagium he is, in effect, presenting a cultural view of war, namely that war is at least superficially attractive. The implication, although Erasmus does not develop this, is that there is an element to peace which lacks the emotive appeal of war. This is an insight which explains much of the complex relationship between war and peace. Later writers would explore this idea to advocate for a vision of peace which would embrace some of the moral challenges associated with war.

Sir Thomas More (1478-1535) was another leading humanist writer of the Renaissance, and a friend and correspondent of Erasmus. In his 1516 book De optimae rei publicae statu deque nova insula utopia (On the Best Government and on the New Island Utopia), More outlines an ideal society based upon reason and equality. In Book One of Utopia, More articulates his concerns about both internal and external violence. Within Europe, and England in particular, there is senseless capital punishment, for instance in circumstances where individuals are only stealing to find something to eat and thus keep themselves alive. Further, there is a world-wide epidemic of war between monarchs, which debases the countries monarchs seek to lead. Book Two of Utopia provides the solution, with a description of an agrarian equalitarian society where there is no private property; where the young are educated into pacifism; where war itself is only resorted to for defensive reasons or to liberate the oppressed from tyranny; where psychological warfare is preferred to battle; and where there are no massacres nor destruction of cities. This utopian society suggested by More reflects a broad theory of peace. One of the interesting questions raised by More’s vision is whether such a peaceful society, and indeed peace, is ever attainable. The common meaning of the word “utopian” connotes a thing or state which is not attainable, although it seems unlikely More would have written his work if he, in common with other humanists of his era and since, did not have at least some belief that the principles he was putting forth were in some way attainable.

5. Modern Sources for a Philosophy of Peace

Thomas Hobbes (1588-1679) was both a writer and a politician, whose writing was motivated by an overarching concern about how to avoid civil war, and the carnage and suffering resulting from this. He had observed this first-hand in England, and he famously articulated a statist view of peace as a contrast to the anarchy and violence of nature. In his two most noted works, De Cive (The Citizen) and Leviathan, Hobbes articulates a view that human nature is essentially self-interested, and thus the natural state of humankind is one of chaos. Hobbes also sees the essence of war as not merely the action of fighting, but a disposition to fight, and this exists only because there is a dearth of an overarching law-enforcing authority. The only way to introduce a measure of peace is therefore through submission of citizens to a sovereign, or, in more contemporary terminology, the state. Thus, a Hobbesian worldview is often taken to be pessimistic: it holds that the natural condition of humankind is one of violence, and that this violence inevitably predominates where there is no humanizing and civilizing impact of the state. Hobbes raises the important issue of how important it is to have an overarching external authority for lasting peace to exist. If we accept that such an external authority is necessary for peace, then arguably we have the capacity to invent mechanisms to set in place such an external authority.

Baruch or Benedictus de Spinoza (1632-1677) was a Dutch philosopher, of Jewish background, who wrote extensively on a range of philosophical topics. His relevance for a philosophy of peace in general may be found in his advocacy of tolerance in matters of religious doctrine. It is notable also that in his Tractatus Politicus (Political Treatise), written 1675-6 and published after his death, Spinoza asserts: “For peace is not mere absence of war but is a virtue that springs from force of character”. This is a definition of peace that anticipates later expositions, especially those that see peace as a virtue, but also twenty-first century peace theory that differentiates positive from negative peace.

John Locke (1632-1704) is arguably one of the most influential contributors to modern philosophy. Like other philosophers of the time, Locke is important for advancing the notion of tolerance, most clearly in his 1689 Letter Concerning Toleration. The background of this had been the destructive religious wars of the time, and Locke logically suggests that this violence can be avoided through religious tolerance. Within the work of Locke one can also discern elements of the idea of the right to peace. Around 1680, Locke composed his Two Treatises of Government, and, in the second of these at Chapter 2, Locke argues that each individual has a right not to be harmed by another person, that is, a right to life, and it is the role of political authority to protect this right. The right to life and the right not to be harmed arguably anticipate the later notion of the right to peace.

Jean-Jacques Rousseau (1712-1778) was a Genevan philosopher of history, and was both a leader and critic of the European Enlightenment. The idea of the noble savage, who lives at peace with his/her fellows and with nature, can be found in many ancient philosophers, although the noble savage is most often associated with the work of Rousseau. In his 1750 Discours sur les sciences et les arts (Discourse on the Sciences and the Arts), Rousseau posited that human morality had been corrupted due to culture; in his 1755 Discours sur l’origine et les fondements de l’inégalité parmi les hommes (Origins of the Inequality of Man), he posits that social and economic developments, especially private property, had corrupted humanity; in his 1762 work Du contrat social (The Social Contract), he posits that authority ultimately rests with the people and not the monarch; and in his 1770 Les Confessions (Confessions), Rousseau extols the peace which comes from being at one with nature. Rousseau anticipates common themes in much peace theory, and especially the counter-cultural and alternative peace movements of the 1960s and 1970s, namely that peace involves a conscious rejection of a corrupting and violent society, a return to a more naturalistic and peaceful existence, and a respect for and affinity with nature. In short, Rousseau suggests that the way to peace is through a more peaceful society, rather than through systems of peace.

Immanuel Kant (1724-1804) is often seen as the modern philosopher who, in his universal ethics and cosmopolitan outlook, has provided what many argue is the most extensive basis for a philosophy of peace. The starting point for the ethics of Kant is the philosophy of duty and an ethics based on duty, and, in particular, the duty to act so that what one does is consistent with what are reasonably desired universal results, what Kant called the categorical imperative. Kant introduced this notion in his 1785 work Grundlegung zur Metaphysik der Sitten (Foundation of the Metaphysics of Morals), and developed this in his 1788 Kritik der praktischen Vernunft (Critique of Practical Reason). It has been argued by many, including Kant himself, that we have a duty to peace and that we have a duty to act in a peaceful manner, in that we can only universalize ethics if we consider others, and this at the very least implies a commitment to peace.

A second important Kantian notion is that of das Reich der Zwecke, often translated as the realm or kingdom of ends. In Grundlegung zur Metaphysik der Sitten (Foundation of the Metaphysics of Morals), Kant suggests an ethical system wherein persons are ends-in-themselves, and each person is a moral legislator. It is a notion which has important implications for peace, in that the notion implies that each person has an obligation to regard others as ends-in-themselves and thus not engage in violence towards others. In other words, the notion implies that each person has a responsibility to act in a peaceful manner. If all persons acted in this way, it would also mean that the phenomenon of war, wherein moral responsibility is surrendered to the state, would become impossible.

Finally, Kant’s 1795 essay Zum ewigen Frieden (On Perpetual Peace) is the work most often cited in discussing Kant and peace, and this work puts forward what some call the Kantian peace theory. Significantly, in this work Kant suggests more explicitly than elsewhere that there is a moral obligation to peace. For instance, Kant argues in the Second Definitive Article of the work that we have an “immediate duty” to peace. Accordingly, there is also a duty for nation-states to co-operate for peace, and indeed Kant suggests a range of ways that this can be achieved, including republicanism and a league of nations. Importantly, Kant also suggests that the public dimension of actions, which can be understood as transparency, is important for international peace.

The work of Georg Wilhelm Friedrich Hegel (1770-1831) is contentious from the perspective of a philosophy of peace, as he holds what might be called a statist view of morality. Hegel sees human history as a struggle of opposites, from which new entities arise. Hegel sees the state, and by this he means the nation-state, as the highest evolution of human society. Critics, such as John Dewey and Karl Popper, have seen in Hegel a philosophical rationalization of the authoritarian and even totalitarian state. Yet the reliance on the state as an object of stability and peace does not necessarily mean acceptance of bellicose national policies. Further, just as human organization is evolving, one could equally argue that evolution towards a supra-national state with the object of world peace may also be consistent with the organic philosophy of Hegel. It is possible to view Hegel as a source for a philosophy of peace.

6. Contemporary Sources for a Philosophy of Peace

William James (1842-1910) was a noted American pragmatist philosopher, and his 1906 essay ‘The Moral Equivalent of War’, originally an oration, was produced at a time when many who had experienced the destruction and loss of life of the American Civil War were still alive. James provides an interesting potential source for a pragmatist philosophy of peace. James argues that it is natural that humans should pursue war, as the exigencies of war provide a unique moral challenge and a unique motivating force for human endeavor. By implication, there is little value in moralizing about war, and moralizing about the need for peace. Rather, what is needed is a challenge which will be seen as an equivalent or counterpoint to war – in other words a moral equivalent of war. The approach of James is consistent with the notion of positive peace, in that peace is seen to be something which embodies, or should embody, cultural challenges.

Mohandas Karamchand Gandhi (1869-1948) is widely regarded as the leading philosopher of nonviolence and intrapersonal peace. Through his life and teaching, Gandhi continually emphasized the importance of nonviolence, based upon the inner commitment of the individual to truth. Thus, Gandhi describes the struggle for nonviolence as truth-force, or satyagraha. Peace is not so much an entity or commodity to be obtained, nor even a set of actions or state of affairs, but a way of life. In Gandhism, peaceful means become united with and indistinguishable from peaceful ends, and thus the call for peace by peaceful means. The thought of Gandhi has been influential in the development of the intrapersonal notion of peace, that peace consists not so much in a set of conditions between those in power, but rather in the inner state of a person. Gandhi is also noteworthy in that he linked nonviolence with economic self-reliance.

The philosopher Martin Buber (1878-1965) is well known for emphasizing the importance of authentic dialogue, which comes about when individuals recognize others as persons rather than entities. In his influential 1923 book Ich und Du (I and Thou), Buber suggests that we only exist in relationship, and those relationships are necessarily of two types: personal relationships involving trust and reciprocity, which Buber characterized as Ich-Du, or I-Thou relationships; and instrumental relationships, involving things, which Buber characterized as Ich-Es, or I-It relationships. The book was commenced during the carnage of World War One, and it is not too difficult to see the book as a philosophical reflection on the true nature of peace, in that peace involves dialogue with the other, with war constituting the absence of such dialogue.

There are commonalities between the philosophy of Buber and the ethics of care. Both indicate that we need to see the other as an individual and as a person, that is, we need to see the face of the other. If we recognize the other as human, and engage with them in dialogue, then we are less likely to engage in violence against others, and are more likely to seek for social justice for others. It is also noteworthy that Buber emphasized that all authentic life involves encounter. Thus, if we are not engaging in dialogue with others, then we ourselves do not have peace, at least not in the positive and full construction of the concept.

Martin Luther King Jr. (1929-1968) is perhaps best known as a civil rights campaigner, although he also wrote and spoke extensively on peace and nonviolence. These ideals were also exemplified in his life. One could argue that King did not develop any new philosophy as such, but rather expressed ideas of peace and nonviolence in a uniquely powerful way. Some of the key themes articulated by King were the importance of loving one’s enemies, the duty of nonconformity, universal altruism, inner transformation, the power of assertiveness, the interrelatedness of all reality, the counterproductive nature of hate, the insanity of war, the moral urgency of the now, the necessity of nonviolence in seeking for peace, the importance of a holistic approach to social change, and the notion of evil, especially as evidenced in racism, extreme materialism and militarism.

Gene Sharp (1928-2018) was also an important theorist of nonviolence and nonviolent action, and his work has been widely used by nonviolent activists. Central to his thought are his insights into the power of the state, notably that this power is contingent upon compliance by the subjects of a state. This compliance works through state institutions and through culture. From this, Sharp developed a program of nonviolent action, which works through subverting state power. Critics of Sharp argue that he was in effect a supporter of an American-led world order, especially as his program of nonviolent struggle was generally applied to countries not complying with US geostrategic priorities or with corporate interests.

Johan Galtung (1930 -) is widely recognized as the leading contemporary theorist on peace, and he is often described as the founder of contemporary peace theory. Galtung has approached the challenge of categorizing peace through describing violence, and specifically through differentiating direct violence from indirect or structural violence. From this distinction, Galtung has developed an integrated typology of peace, comprising: direct peace, where persons or groups are engaged in no or minimal direct violence against another person or group; structural peace, involving just and equitable relationships in and between societies; and cultural peace, where there is a shared commitment to mutual support and encouragement. More recently, a further dimension has been developed, namely, environmental peace, that is, the state of being in harmony with the environment.

The notions of positive and negative peace derive largely from the work of Galtung. Direct peace may be seen as similar to negative peace, in that this involves the absence of direct violence. Structural and cultural peace are similar notions to positive peace, in that these notions invite reflection on wider ideas of what we look for in a peaceful society and in peaceful interactions between individuals and groups. Similarly, an integrated notion of peace, involving personal and social dimensions of peace, derives substantially from Galtung, in that Galtung sees the notions of peace and war as involving more than an absence of violence between nation-states, which is what people often think of when we speak of a time of peace or a time of war.

The value of the various Galtungian paradigms is that these encourage thinking about the complex nature of peace and violence. Yet a problem with the Galtungian approach is that it can be argued as being too all-encompassing, and thus too diffuse. Peace researcher Kenneth Boulding summed up this problem by suggesting, famously, that the notion of structural violence, as developed by Galtung, is, in effect, anything that Galtung did not like. By implication, Galtung’s notion of peace too can be argued to be too general and too diffuse. Interestingly, Galtung has suggested that defining peace is a never-ending task, and indeed articulating a philosophy of peace might similarly be regarded as a never-ending exercise.

7. The Philosophy of Peace Education

In investigating a philosophy of peace, it is useful to examine writing on what might reasonably constitute a philosophy of peace education. The reason is that when defining peace education, we are in effect defining peace, as the encouragement and attainment of peace is the ultimate goal of peace education. Just as peace is increasingly seen as a human right, so too peace education may be thought of as a human right. Thus any philosophy of peace education is very closely linked with what might be seen as a philosophy of peace. For convenience, we can divide approaches to a philosophy of peace education into the deontological and non-deontological.

James Calleja has argued that the philosophical basis for peace education may be found in deontological ethics, that is, we have a duty to peace and a duty to teach peace. Calleja relies strongly on the work of Immanuel Kant in developing this argument, and, in particular, on the Kantian notion of the categorical imperative, and on the subsequent categorical imperative of peace. The first formulation of the categorical imperative from Kant is that one should act in accordance with a maxim that is universal, that is, one should wish for others what one wishes for oneself. In effect, this can be seen as a philosophical basis for nonviolence and for universal justice, in that as we would wish for security and justice for ourselves, so too we ought to desire this for others.

James Page has developed an alternative philosophical approach to peace education, identifying virtue ethics, consequentialist ethics, conservative political ethics, aesthetic ethics and care ethics as potential bases for peace education. Equally, however, each of the above may also be argued as providing an ethical and philosophical basis for a general theory of peace. For instance, peace may be seen as a settled disposition on the part of the individual, that is, a virtue; peace may be seen as the avoidance of the destruction of war and social inequality; peace may be seen as the presence of just and stable social structures, that is, a social phenomenon; peace may be seen as love for the world and the future, that is, an aesthetic disposition; and peace may be seen as caring for individuals, that is, moral action.

8. The Notion of a Culture of Peace

The realization that peace is more than the absence of conflict lies at the heart of the emergence of the notion of a culture of peace, a notion which has been gaining greater attention within peace research in the late twentieth and early twenty-first centuries. The notion was implicit within the UNESCO mandate, with the acknowledgment that since wars begin in human minds, it follows that the defense against war needs to be established in the minds of individuals. An extensive expression of this notion was set forth in the United Nations General Assembly resolution 53/243, the Declaration and Programme of Action on a Culture of Peace, adopted unanimously on 13 September 1999, which describes a culture of peace as a set of values, attitudes, traditions and modes of behavior and ways of life. Article 1 of the document indicates that these are based upon a respect for life, ending of violence and promotion and practice of nonviolence through education, dialogue and cooperation.

Any attempt at a philosophy of a culture of peace is complex. One of the challenges is that conflict is a necessary part of human experience and an important element in the emergence of culture. Even if we differentiate violent conflict from mere social conflict, this does not solve the problem entirely, as human culture has still been very much dependent upon the phenomenon of war. A more thorough solution is to admit that war and violence are indeed important factors in human experience and in the formation of human culture, and, rather than denying this, to attempt to seek and foster alternatives to war as a crucial motivating cultural factor for human endeavor, such as William James suggested in his famous essay on a moral equivalent of war.

9. The Right to Peace

Another emerging theme in peace theory has been the notion of peace as a human right. There is some logic to the notion of peace as a human right. The modern human rights movement arose very much out of the chaos of global war and the emerging consensus that the recognition of human rights was the best way to establish and maintain peace. The right to peace may arguably be found in Article 3 of the Universal Declaration of Human Rights, which posits the right to life, sometimes called the supreme right. The right to peace arguably flows from the right to life. This right to peace has been further codified with United Nations General Assembly resolution 33/73, the Declaration on the Preparation of Societies for Life in Peace, adopted on 15 December 1978; with the United Nations General Assembly resolution 39/11, the Declaration of the Right of the Peoples of the World to Peace, adopted on 12 November 1984; and most recently with the United Nations General Assembly resolution 71/189, the Declaration on the Right to Peace, adopted on 19 December 2016.

In a lecture to the International Institute of Human Rights in 1970, Karel Vasak famously suggested categorizing human rights in terms of the motto of the French Revolution, namely, “liberté, égalité, fraternité.” Following this analysis, first generation rights are concerned with freedoms, second generation rights are concerned with equality, and third generation rights are concerned with solidarity. The right to peace is often characterized as a solidarity or third generation right. Yet one can take a wider interpretation of peace, for instance, that peace implies the right to development and the enjoyment of individual human rights. In this light, peace can be seen as an overarching human right. It is noticeable that there seems to have been such an evolution in thinking about the human right to peace, in that this is gradually being interpreted to include other rights, such as the right to development.

In examining the philosophical foundations for a human right to peace it is useful to examine some of the philosophical bases for human rights generally, namely, interest theory, will theory, and pragmatic theory. Interest theory suggests that the function of human rights is to promote and protect fundamental human interests, and securing these interests is what justifies human rights. What are fundamental human interests? Security is generally identified as being a basic human interest. For instance, John Finnis refers to “life and its capacity for development” as a fundamental human interest, and writes that “A first basic value, corresponding to the drive for self-preservation, is the value of life” (1980, p. 86). The best chance for self-preservation is that there be a norm for non-harm, which is an important element within a culture of peace. The right to peace therefore serves that basic need for life, both in the sense of protection from violence and in serving the interests of a good life.

Will theory focuses on the capacity of individuals for freedom of action and the related notion of personal autonomy. For instance, those such as Herbert Hart have argued that all rights stem from the equal right of all individuals to be free. Any right to personal freedom, however, contains an inherent limitation, in that one cannot logically exercise one’s own freedom to impinge upon another person’s freedom. This is captured in the adage that my right to swing my fist ends at another person’s nose. Why is that adage correct? One answer is that within the notion of will theory there is an implicit endorsement of a right to peace, that is, not to harm or do damage to others.

The pragmatic theory of human rights posits that such rights simply constitute a practical way that we can arrive at a peaceful society. For instance, John Rawls suggests that the law of peoples, as opposed to the law of states, is a set of ideals and principles by which people from different backgrounds can agree on how their actions towards each other should be governed and judged, and through which people can establish the conditions of peace. This is not to deny those critics who point out that human rights can function as a rationale for the powerful to engage in collective violence, and that there can be a tension between human rights and national sovereignty. Thus, paradoxically, national sovereignty can sometimes serve to promote and provide peace, and human rights can sometimes be used to underscore violence.

The importance of the human right to peace is perhaps best summed up by William Peterfi, who has described peace as a corollary to all human rights, such that “without the human right to peace no other human right can be securely guaranteed to any individual in any country no matter the ideological system under which the individual may live” (1979, p.23). The notion of the human right to peace also changes the nature of discourse about peace, from something to which individuals and groups might aspire, to something which individuals and groups can reasonably demand. The notion of the human right to peace also changes the nature of the responsibility of those in positions of power, from a vague aspiration that those in power need to provide for peace, to the expectation and duty that those in power will provide peace.

10. The Problem of Absolute Peace

Given the challenges of defining peace, the philosophical problem of peace may be phrased in terms of a question: is there any such thing as absolute peace? Or ought we be satisfied with an imperfect peace? For instance, can there ever be a complete elimination of all forms of armed conflict, or at least the elimination of reliance on armed force as the ultimate means of enforcement of will? Similarly, one may ask: is there any such thing as absolute co-operation and harmony between individuals and groups, an absolute sense of well-being within individuals, and an absolute oneness with the external environment?

The philosophical solution to this problem may be to point out that there is always an open-ended dimension to peace, that is, if we take a broad interpretation of peace, we will always be moving towards such a goal. Some might articulate this as the eschatological dimension of peace, suggesting that the contradictions which are raised in any discussion on peace can only be resolved, ultimately, at the end of time. It is relevant to note, however, that peace theorists have pointed out that if we assert that a certain outcome, such as peace, is not attainable, our actions will serve to make this a self-fulfilling prophecy. In other words, if we assert that peace, relative or absolute, is not attainable, then there will be a reduced expectation of this, and a reduced commitment to making this happen.

11. Peace and the Nature of Truth

It is worthwhile looking at the relationship of the theory of peace to the theory of truth. The relationship can be seen to operate at a number of levels. For instance, Mohandas Gandhi described his theory of nonviolence as satyagraha, often translated as truth force. Similarly, Gandhi entitled his autobiography ‘The Story of My Experiments with Truth’. Gandhi saw nonviolence, or ahimsa, as the noblest expression of truth, or sat, and argued there is no way to find truth except through nonviolence. For Gandhi, peace was not merely an ideal, rather it was based on what he saw as the truth of the innate nonviolence of individuals, which the institutions of war and imperialism distorted. Further, peace involves authenticity, a notion related to truth, in that the person involved in advocating peace ought to themselves be peaceful. We thus arrive at the Gandhian dictum that there is no way to peace as such, rather peace is the way, that is, peace is an internal life-style commitment on the part of the individual.

Conversely, war arguably operates as a form of untruth. This was summed up succinctly by Erasmus, in his dictum that war is sweet to those who have not experienced it. In 1985, Elaine Scarry wrote that the mythology of war obscures what war is actually about, namely, the body in pain. Similarly, Morgan Scott Peck has written about a lack of truthfulness, especially in war, as being the essence of evil. Typically, those advocating war will concede that the recourse to war is not a good option, but suggest that there is no other option, or that war is the least bad option. The empirical history of nonviolence suggests that this is not the case, and that there are almost always alternatives to violence.

If peace is about establishing societies with harmonious and cooperative relationships, then a key component in establishing such societies is arguably knowledge about ourselves, or accepting the truth about ourselves. Without this, it is unlikely that we will be able to establish peaceful societies, as we will not have resolved the inclinations to violence within ourselves. The notion of what constitutes the true self, or the truth about one’s self, is a complex one. Carl Gustav Jung usefully wrote about the shadow or the normally unrecognized side of one’s character. The extent to which the shadow side of our personality can result in participation in and support for violence can be shocking to us. This is not to say that human nature is irretrievably attracted to violence or cruelty. For instance, the Seville Statement on Violence, sponsored by UNESCO, argues that war is a human invention. Yet there is a strong argument that peace involves recognition of the potential within one’s self for violence. Put another way, peace involves peace with one’s self.

12. Peace as Eros

In the work of Sigmund Freud, and especially in his 1930 work Das Unbehagen in der Kultur (Civilization and its Discontents), Eros is the life instinct, which includes sexual instincts and the will to live and survive. The nominal opposite of the life instinct is the death instinct, which is the will to death. Later theorists described this as Thanatos. Freud developed his theory of competing drives in his therapeutic dealings with soldiers from World War One, many of whom were suffering from psychological trauma as a result of their war experiences. It is not too difficult to see Eros as a synonym for peace, in that peace involves all that Eros represents. Psychoanalyst and peace activist Erich Fromm developed this theme further, writing of biophilia as the love of life, from which all peace comes, and necrophilia, as the love of death and destruction, which is the basis of war.

Even if we acknowledge a link between the death instinct and war, the relationship between the life instinct and the death instinct is not simple. Freud wrote of the basic desire for death seemingly competing with the desire for life. Yet the two instincts may also be viewed as complementary. It is because we are all aware, at least subconsciously, of our impending mortality, that we are driven to risk death, especially in the enterprise we call war. Many writers have explored this complexity. For instance, the psychiatrist Elisabeth Kübler-Ross writes: “Is war perhaps nothing else but a need to face death, to conquer and master it, to come out of it alive—a peculiar form of denial of our own mortality?” (2014, p.13).

If we think of Eros as peace, then a logical extension is to think of human sexuality and the expression of human sexuality as one embodiment of peace. The post-Freudians Herbert Marcuse and Wilhelm Reich both developed this theme, arguing that the origins of war and unjust social organization rested in repressed sexual desire, and that conversely peace implies sexual freedom. This idea was neatly summed up in the 1960s radical slogan, “Make love not war”. An important qualification to the peace-as-sexuality theory is that this always involves consensual sexual relationships. Many writers have identified rape and other exploitative sexual relationships as important components of war and social injustice.

13. Peace, Empire and the State

In considering a philosophy of peace, the phenomenon of empire presents a paradox for peace theory. The establishment of an empire may be seen as establishing a form of peace. It is thus common to refer to Pax Romana, the form of peace established by virtue of the Roman Empire, and to Pax Britannica, Pax Sovietica, and Pax Americana, referring to later periods of empire. It can indeed be argued that within empires there is no war, at least not in the conventional sense. Critics of imperialism, however, point out that violence is moved to the periphery of the empire, that inter-imperial rivalry remains a problem, and that empires frequently engage in the violent suppression of minorities within their borders.

Similarly, the phenomenon of the state presents a paradox for peace theory. The establishment of a stable state generally means that citizens can live and work free from violence, and ideally, at least in democratic states, within a framework of social justice. Yet, as sociologist Max Weber famously pointed out, it is in the very nature of the state that it claims a monopoly over the legitimate use of violence. The legitimate use of violence finds its ultimate expression in the phenomenon of war. Thus, anarcho-pacifists argue that if one wants to eliminate war, then one needs to eliminate the state, at least in its current nation-state form.

14. An Existentialist Philosophy of Peace

Existentialism may be defined in philosophical terms as the view that truth cannot be objectified, but rather it can only be experienced. This is not to deny the objective reality of an entity, but rather to say that the limitations of language are such that this cannot be objectified. We can apply this to a philosophical analysis of peace, and suggest that ultimately peace cannot be objectified, but rather it can be experienced. Thus, attempts to specify what peace is are likely to be problematic. Rather we can represent peace by way of illustration, to say that peace involves a set of behaviors and attitudes, and we can represent peace by way of negation, to say that peace is not deliberate violence to other persons. Or we can say, in true existentialist fashion, that we can only know peace through encounter or relationship.

Another way of articulating the idea of existentialist peace is by referring to the metaphysics of peace. The existentialist theologian John Macquarrie writes: “By a metaphysical concept, I mean one the boundaries of which cannot be precisely determined, not because we lack information but because the concept itself turns out to have such depth and inexhaustibility that the more we explore it, the more we see that something further remains to be explored” (1973, p.63), and further: “If peace … is fundamentally wholeness, and if metaphysics seeks to maximize our perception of wholeness and inter-relatedness, then peace and metaphysics may be more closely linked than is sometimes supposed; while, conversely, the fragmented understanding of life may well be connected with the actual fracturing of life itself, a fracturing which is the opposite of peace. But the true metaphysical dimensions of peace emerge because even to seek a wholeness for human life drives us to ask questions which take us to the very boundaries of understanding. What is finally of value? What is real and what is illusory? What conditions would one need to postulate as making possible the realization of true peace?” (1973, p.64).

15. Decolonizing Peace

Postcolonial theory posits, in general terms, that not only has global colonial history determined the shape of the world as we know it today, but the power relationships implicit in colonialism have determined contemporary thinking. Thus, the powerless tend to be marginalized in contemporary thinking. Some writers, such as Victoria Fontan, have suggested there is a need to decolonize peace theory, including taking into account the everyday experience of ordinary people, transcending liberal peace theory which tends to assume the legitimacy of power, and transcending the view that the Global North needs to come to the rescue of the Global South. Thus the discourse on peace, so it is argued, needs to be less Eurocentric. The argument is that the narrative of peace needs to change.

Postcolonial peace theory intersects with much feminist peace theory, represented by writers such as Elise Boulding, Cynthia Enloe, Nel Noddings, and Betty Reardon. The suggestion is often made by such theorists that a feminine or maternal perspective is uniquely personal, caring and peace-oriented. The corollary to this is that a male perspective tends to be less personal, less caring, and more war-centric. Feminist peace theorists have also pointed out that war and militarism work on patriarchal assumptions, such as that women need protecting and it is the duty of men to protect women, and that there is no alternative to the current system of security through power and domination. The argument is also made that war and patriarchy are part of the same system.

Postcolonial and feminist peace theory are highly contested. For instance, it can be argued that, as current philosophical discourse has evolved from European origins, articulating peace in terms of concepts articulated by European authors is merely a matter of utilizing this global language. Similarly, one can argue that, since most influential philosophers in history have hitherto been male, the existing narrative will naturally tend to have more male sources and male voices. One can arguably apply a quota system to some areas such as contemporary politics, but it is more difficult to argue that a quota system ought to be applied to narrative and to discourse. Critics of postcolonial peace theory also allege that it tends to avoid universalist statements on human rights, an important omission given the key role of human rights in peace and the emerging human right to peace itself.

16. Concluding Comments: Philosophy and Peace

One interesting way to address the issue of a philosophy of peace is to think of war as representing the absence of philosophy, in that war is prosecuted on the assumption that one person or group itself possesses truth, and that the views of that individual or group ought to be imposed, if necessary, by violent force. War may also be seen as the absence of philosophy in that war represents an absence of the love of wisdom. This is not to deny there are philosophies and philosophers who justify war and injustice. Ultimately, however, these philosophies are not sustainable, as war is an institution which involves destruction of both the self and societies. Similarly, social injustice is not sustainable, as within social injustice we find the seeds of war and destruction.

Conversely, it can be argued that philosophy itself represents the presence of peace, in that philosophy generally does not or should not involve assumptions that one person or group by itself uniquely possesses truth, but rather the way to truth is through a process of questioning, sometimes called dialectic. Therefore, philosophy by its essence is or should be a tolerant enterprise, and it is also an enterprise which involves or should involve debate and discussion. Philosophy thus presents a template for a peaceful society, wherein differing viewpoints are considered and explored, and which, through the love of wisdom, encourages thinking and exploring about positive and life-enhancing futures. This means that engaging in philosophy may well be a useful start to a peaceful future.

17. References and Further Reading

Aho, J. (1981) Religious Mythology and the Art of War. Westport: Greenwood.
Aquinas (1964-1981) Summa Theologiae: Latin Text and English Translation. (T. Gilbey and others, Eds.) Cambridge: Blackfriars, and New York: McGraw-Hill.
Aristotle (1984) The Complete Works of Aristotle. The Revised Oxford Translation. (J.Barnes, Ed.) Princeton: Princeton University Press.
Aron, R. (1966) Peace and War: A Theory of International Relations (R.Howard and A.B. Fox, Transl.) London: Weidenfeld and Nicolson.
Augustine (1972) Concerning the City of God against the Pagans. (H. Bettenson, Transl.) Harmondsworth: Penguin.
Boulding, E. (1988) Building a Global Civic Culture: Education for an Interdependent World. San Francisco: Jossey-Bass.
Boulding, E. (2000) Cultures of Peace: The Hidden Side of History. Syracuse: Syracuse University Press.
Boulding, K. (1977) Twelve friendly quarrels with Johan Galtung. Journal of Peace Research. 14(1): 75-86.
Buber, M. (1984) I and Thou. (R. Gregor-Smith, Transl.) New York: Scribner.
Calleja, J.J. (1991) A Kantian Epistemology of Education and Peace: An Evaluation of Concepts and Values. PhD Thesis. Bradford: Department of Peace Studies, University of Bradford.
Chomsky, N. (2002) Understanding Power: The Indispensable Chomsky. (P.R. Mitchell and J.Schoeffel, Eds.). New York: The New Press.
Ehrenreich, B. (1999) Men Hate War, Too. Foreign Affairs 78 (1): 118–22.
Enloe, C. (2007) Globalization and Militarism: Feminists Make the Link. Lanham: Rowman and Littlefield.
Erasmus, D. (1974) Collected Works of Erasmus. Toronto: University of Toronto Press.
Finnis, J. (1980) Natural Law and Natural Rights. Oxford: Clarendon Press; New York: Oxford University Press.
Fontan, V.C. (2012) Decolonizing Peace. Lake Oswego: Dignity Press.
Galtung, J. (2010) Peace, Negative and Positive. In: N.J. Young (Ed.). The Oxford Encyclopedia of Peace. (pp. 352-356). Oxford and New York: Oxford University Press.
Galtung, J. (1996) Peace by Peaceful Means. London: SAGE Publications.
Gandhi, M.K. (1966) An Autobiography: The Story of my Experiments with Truth. London: Jonathan Cape.
Girard, R. (1977) Violence and the Sacred. (P. Gregory, Transl.) Baltimore: Johns Hopkins University Press.
Hobbes, T. (1998) On the Citizen (R.Tuck and M.Silverthorne, Eds.) Cambridge: Cambridge University Press.
Hobbes, T. (1994) Leviathan (E. Curley, Ed.) Indianapolis: Hackett.
Kant, I. (1992-) The Cambridge Edition of the Works of Immanuel Kant. (P.Guyer and A. Wood, Eds.) Cambridge: Cambridge University Press.
King, M.L. (1963) Strength to Love. Glasgow: Collins.
Kübler-Ross, E. (2014) On Death and Dying. New York: Scribner.
Locke, J. (1988) Two Treatises of Government. (P. Laslett, Ed.) Cambridge: Cambridge University Press.
Locke, J. (2010) A Letter Concerning Toleration and Other Writings. (M. Goldie, Ed.) Indianapolis: Liberty Fund.
Macquarrie, J. (1973) The Concept of Peace. London: SCM.
More, T. (1999) Utopia. (D. Wootton, Ed.) Cambridge: Hackett Publishing.
Noddings, N. (1984) Caring: A Feminine Approach to Ethics and Moral Education. Berkeley: University of California Press.
Page, J.S. (2008) Peace Education: Exploring Ethical and Philosophical Foundations. Charlotte: Information Age Publishing.
Page, J.S. (2010) Peace Education. In: E. Baker, B. McGaw, and P. Peterson (Eds.) International Encyclopedia of Education. (Volume 1, pp. 850–854). Oxford: Elsevier.
Page, J.S. (2014) Peace Education. In: D. Phillips (Ed.) Encyclopedia of Educational Theory and Philosophy. (Volume 2, pp. 596-598). Thousand Oaks: Sage Publications.
Peck, M.S. (1983) People of the Lie. New York: Simon and Schuster.
Peterfi, W. (1979) The Missing Human Right: The Right to Peace. Peace Research, 11(1): 19-25.
Plato (1987) Plato: Complete Works. (J.Cooper and D.Hutchinson, Eds.) Indianapolis: Hackett.
Rawls, J. (1999) The Law of Peoples. Cambridge: Harvard University Press.
Reardon, B. (1993) Women and Peace: Feminist Visions of Global Security. Albany: State University of New York Press.
Roche, D. (2003) The Human Right to Peace. Toronto: Novalis.
Rousseau, J. (1990-2010) Collected Writings. (R. Masters and C. Kelly, Eds.) 13 volumes. Dartmouth: University Press of New England.
Rummel, R. (1994) Death by Government. New Brunswick: Transaction Press.
Scarry, E. (1985) The Body in Pain. New York and London: Oxford University Press.
Spinoza, B. (2002) Baruch Spinoza: The Complete Works. (M.L. Morgan, Ed., S. Shirley, Transl.) Indianapolis: Hackett.
Watson, P.S. and Rupp. E.G. (Eds.) (1969) Luther and Erasmus: Free Will and Salvation. London: SCM Press.

Author Information

James Page
Email: jpage8@une.edu.au
University of New England
Australia

David Lewis (1941–2001)

David Lewis was an American philosopher and one of the last generalists, in the sense that he was one of the last philosophers who contributed to the great majority of sub-fields of the discipline. He made central contributions in metaphysics, the philosophy of language, and the philosophy of mind. He also made important contributions in probabilistic and practical reasoning, epistemology, the philosophy of mathematics, logic, the philosophy of religion, and ethics, including metaethics and applied ethics. He published four monographs and over one hundred articles.

Lewis’s contributions in metaphysics include foundational work in the metaphysics of modality, in particular his peculiar view of concrete modal realism. He also developed influential views about properties, dispositions, time, persistence, and causation. In the philosophy of language, he made important contributions to our understanding of conditionals—counterfactuals in particular. He also developed an influential account of what it is for a group of individuals to use a language, based on his similarly influential account of what it is for a group of individuals to adopt a convention. In the philosophy of mind, Lewis gave an important defense of mind-brain identity theory, and also developed an account of mental content that was based on his metaphysics of properties and modality.

This article discusses in detail only Lewis’s most popularized and influential views and arguments in metaphysics, the philosophy of language, and the philosophy of mind. His views on metaphysics are discussed first, but his views on language and mind are no less influential. The focus is on representative examples of his most important views and arguments concerning particular issues. The article begins with a few short remarks about his biography, and it ends with a discussion of some of his other philosophical contributions.

Life
Modality
Properties
Time and Persistence
Humean Supervenience
Causation
Counterfactuals
Convention
Mind
Other Work and Legacy
References and Further Reading

1. Life

David Kellogg Lewis was born in 1941 in Oberlin, Ohio. He did his undergraduate studies at Swarthmore College in Pennsylvania. He studied abroad for a year in Oxford, where he was tutored by Iris Murdoch, and where he had the opportunity to attend lectures by J. L. Austin. These experiences inspired him to major in philosophy when he returned to Swarthmore. He did his Ph.D. at Harvard, studying under W. V. O. Quine, who supervised his dissertation, which was the basis of his first book, Convention (1969). There he met his wife Stephanie, with whom he ultimately co-authored three papers. He worked at UCLA from 1966 to 1970, moving from there to Princeton, where he remained until his death in 2001. He spent a lot of time visiting and working in Australia from 1971 onward. As a result, his work was deeply influenced by a number of Australian philosophers, and, in turn, his work has made an indelible mark on analytic philosophy in Australia.

2. Modality

If you are looking for what Lewis had to say about modality, you most likely want to learn about his well-known but rather idiosyncratic view, concrete modal realism. The study of modality is the study of the meanings of expressions like ‘necessarily’ and ‘possibly’. One can assert that Socrates was a blacksmith, which is, of course, false. But one can also assert something weaker, that, possibly, Socrates was a blacksmith (that is, Socrates could have been a blacksmith). Or one can assert something stronger, that, necessarily, he was a blacksmith (that is, he could not have failed to be a blacksmith). There are different senses of the words ‘necessarily’ and ‘possibly’. One is related to what someone knows. Perhaps you are unsure whether Socrates was a philosopher. You might say that Socrates could have been a philosopher, meaning that, for all you know, Socrates was a philosopher (that is, nothing you know contradicts it). Or perhaps you are certain that he was a philosopher, in which case you might simply say that Socrates was a philosopher. Or you might say something stronger—that Socrates must have been a philosopher (that is, what you know contradicts his not having been a philosopher). This sort of modality is epistemic modality. The sort of modality Lewis was most concerned with in his development of concrete modal realism is alethic modality, and concerns how things might have been, or how things must be, regardless of what anyone thinks or knows about it.

One of the central questions in the study of (alethic) modality is what ‘necessarily’ and ‘possibly’ mean. Most discussions of modality are framed in terms of modal logic, which is a formal language that is an extension of propositional or first-order logic, generated by adding the modal operators ‘necessarily’ and ‘possibly’, abbreviated by ‘ $\Box$ ’ (the box) and ‘ $\Diamond$ ’ (the diamond). One approach to the question of what the modal operators mean is simply not to answer it, and to take them as primitive, that is, to take their meanings to be unanalyzable. But, the reader might think, this is not all that satisfying an approach to take. And Lewis would agree. One of the first things he does in his seminal work on concrete modal realism, On the Plurality of Worlds (1986b), hereafter ‘Plurality’, is to argue that the modal operators should not be taken to be primitive, but instead should be given some sort of analysis in non-modal terms. In the mid-20th century, logicians developed semantics for a variety of systems of modal logic. These semantics provide truth conditions for the box and diamond in terms of mathematical objects which came to be called ‘possible worlds’, since they were naturally interpretable as ways that the world could have been. Trump won the 2016 U.S. presidential election. But it could have been otherwise. He could have lost. Imagine that Trump lost the 2016 election, and that as few other facts as possible are different in order for that to have happened. What you are imagining is a possible world. The basic idea behind any possible-worlds-based analysis of the modal operators is rather simple. One can state the conditions in which sentences involving the modal operators are true in terms of possible worlds, by quantifying over them with quantifiers that behave exactly like the universal and existential quantifiers of standard first-order logic, as follows:

$\Box p =_{df}$ for every possible world w, p is true at w.

$\Diamond p =_{df}$ for some possible world w, p is true at w.

So a statement is necessarily true if it is true at every possible world and false otherwise. And it is possibly true if it is true at at least one possible world and false otherwise. It is actually true if it is true at the actual world (that is, the possible world which we inhabit).
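The truth conditions just stated can be illustrated with a small sketch in Python. It is only a toy model under stated assumptions: the three worlds, the atomic sentences, and the helper names (`atom`, `neg`, `box`, `diamond`, `actually`) are all invented for illustration, and each world is represented simply as the set of atomic sentences true at it.

```python
# Toy model (all worlds and sentences are invented for illustration).
# Each world is represented by the set of atomic sentences true at it.
WORLDS = {
    "actual": {"trump_wins_2016", "two_plus_two_is_four"},
    "w1": {"two_plus_two_is_four"},  # a world at which Trump lost
    "w2": {"trump_wins_2016", "two_plus_two_is_four"},
}


def atom(name):
    """An atomic proposition: true at a world iff the world contains it."""
    return lambda facts: name in facts


def neg(p):
    """The negation of proposition p."""
    return lambda facts: not p(facts)


def box(p):
    """Necessarily p: p is true at every possible world."""
    return all(p(facts) for facts in WORLDS.values())


def diamond(p):
    """Possibly p: p is true at some possible world."""
    return any(p(facts) for facts in WORLDS.values())


def actually(p):
    """Actually p: p is true at the actual world."""
    return p(WORLDS["actual"])


trump_wins = atom("trump_wins_2016")
arithmetic = atom("two_plus_two_is_four")

print(actually(trump_wins))      # True
print(box(trump_wins))           # False: he loses at w1
print(diamond(neg(trump_wins)))  # True: he could have lost
print(box(arithmetic))           # True in this toy model
```

The two operators come out as duals in the sketch, as they do in standard modal logic: `box(p)` agrees with `not diamond(neg(p))` for any proposition `p`.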

Lewis was not the first to interpret the objects quantified over in these analyses, and at which propositions are true (and false), as possible worlds. Thus he was not the first to admit possible worlds into his ontology. What sets him apart from many of those who came before was how he conceived of possible worlds. Typically, worlds were thought of as abstract objects, for example, as maximal consistent sets of sentences of some interpreted language (1986b: 142 ff.). A maximal set of sentences is one that contains, for every sentence p, either p or its negation. A consistent set of sentences is one which does not imply a contradiction. So, {grass is green, grass is not green} is not consistent. Nor is {grass is green, if grass is green then snow is white, snow is not white}. However, {grass is green, snow is white}, is consistent, though not maximal. For Lewis, a possible world is not some abstract object like a set of sentences. Instead, it is something akin to our own world—a continuum of spacetime filled with objects of various sorts, like the ones we ourselves are surrounded by—galaxies, stars, mountains, people, chairs, atoms, and so forth. Possible worlds, for Lewis, are concrete, just like this world in which we find ourselves. Strictly speaking, modal realism is just the view that possible worlds exist (whether one thinks they are abstract or concrete). Concrete modal realism is the view that they exist and are concrete objects. It is this latter, more controversial thesis that Lewis is famous for defending.
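The three example sets above can be checked mechanically. The sketch below makes two simplifying assumptions that are not part of the text: sentences are encoded as nested tuples, and consistency is approximated by truth-functional satisfiability (some assignment of truth values makes every member true). Maximality is checked only relative to a finite list of candidate sentences, since a genuinely maximal set would be infinite. All function names are hypothetical.

```python
from itertools import product

# Sentence encoding (an assumption of this sketch):
#   ("atom", "g")   the atomic sentence g
#   ("not", s)      the negation of s
#   ("if", s, t)    if s then t


def atoms(s):
    """Collect the names of the atomic sentences occurring in s."""
    if s[0] == "atom":
        return {s[1]}
    return set().union(*(atoms(part) for part in s[1:]))


def value(s, assignment):
    """Truth value of s under an assignment of truth values to atoms."""
    if s[0] == "atom":
        return assignment[s[1]]
    if s[0] == "not":
        return not value(s[1], assignment)
    if s[0] == "if":
        return (not value(s[1], assignment)) or value(s[2], assignment)
    raise ValueError(f"unknown connective: {s[0]}")


def consistent(sentences):
    """True iff some truth assignment makes every sentence true."""
    names = sorted(set().union(*(atoms(s) for s in sentences)))
    for bits in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, bits))
        if all(value(s, assignment) for s in sentences):
            return True
    return False


def maximal_over(sentences, candidates):
    """True iff the set contains each candidate or its negation."""
    return all(c in sentences or ("not", c) in sentences for c in candidates)


G = ("atom", "grass is green")
W = ("atom", "snow is white")

print(consistent([G, ("not", G)]))                 # False
print(consistent([G, ("if", G, W), ("not", W)]))   # False
print(consistent([G, W]))                          # True
print(maximal_over([G, W], [G, W, ("atom", "the sky is blue")]))  # False
```

The last line shows why {grass is green, snow is white} is not maximal: it is silent about a further sentence, containing neither that sentence nor its negation.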

Lewis’s argument for concrete modal realism has two main parts. The first part consists in arguing for the ‘realist’ part of concrete modal realism, thereby providing reasons against the alternative of taking the modal operators as primitive. His argument for this consists in showing what possible worlds are good for. He highlights four things that can be done, or can more easily be done, if possible worlds are available. The first concerns certain modal locutions of natural language (English) that do not appear to be translatable into sentences with just the box and diamond. One sort of such locution involves modal comparisons. The example Lewis gives is: “a red thing could resemble an orange thing more closely than a red thing could resemble a blue thing” (1986b: 13). Lewis’s analysis involves quantification over possible individuals:

For some x and y (x is red and y is orange and for all u and v (if u is red and v is blue, then x resembles y more than u resembles v)). (1986b: 13)

But, he points out, one would not be able to translate the original sentence with just boxes and diamonds, since “formulas [of modal logic] get evaluated relative to a world, which leaves no room for cross-world comparisons” (1986b: 13). A realist about modality like Lewis, according to whom possible worlds, including the things in them, are as real as our own world and the things in it, is able to make these cross-world comparisons, and thus do justice to modal locutions of natural language that the modal primitivist cannot. He points out that this problem extends past natural language and into philosophical quasi-technical language. The basic idea behind supervenience, the philosophical workhorse of Lewis’s day, used to formulate various theses about dependence, is that the Fs supervene on the Gs if and only if there could be no difference in the Fs without a difference in the Gs. But, he notes (1986b: 14 ff.), attempts to capture this basic notion strictly in terms of the modal operators have failed, resulting in something either too weak or too strong.
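To see what quantifying over possible individuals buys, consider a rough sketch of Lewis’s analysis of the colour comparison. Everything in it is an assumption made for illustration: the individuals, the worlds they inhabit, the hue values, and the use of distance around a colour wheel as a stand-in for resemblance. The point is only that the quantifiers range over individuals drawn from different worlds, so no single world of evaluation is needed.

```python
# Hypothetical possible individuals: (name, world, colour, hue in degrees).
INDIVIDUALS = [
    ("a", "w1", "red", 10),
    ("b", "w2", "orange", 30),
    ("c", "w3", "red", 0),
    ("d", "w4", "blue", 240),
    ("e", "w1", "blue", 220),
]


def distance(x, y):
    """Hue distance around the colour wheel; smaller means more alike."""
    diff = abs(x[3] - y[3]) % 360
    return min(diff, 360 - diff)


def of_colour(individuals, colour):
    """All individuals, from any world, that have the given colour."""
    return [i for i in individuals if i[2] == colour]


def red_orange_closer_than_red_blue(individuals):
    """For some red x and orange y, and for all red u and blue v,
    x resembles y more than u resembles v."""
    reds = of_colour(individuals, "red")
    oranges = of_colour(individuals, "orange")
    blues = of_colour(individuals, "blue")
    return any(
        all(distance(x, y) < distance(u, v) for u in reds for v in blues)
        for x in reds
        for y in oranges
    )


print(red_orange_closer_than_red_blue(INDIVIDUALS))  # True
```

In this toy domain the witnesses are `a` and `b`, which inhabit different worlds (`w1` and `w2`), which is exactly the kind of cross-world comparison that a formula evaluated at one world at a time cannot express.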

The other jobs that Lewis thinks possible worlds can do are briefly outlined as follows. The second job is that talk of possible worlds allows us to make sense of the idea that some possibilities are closer to actuality than others (for example, Hillary Clinton’s having won the 2016 election is a closer possibility to actuality than her being in command of a colonial expedition to the Andromeda galaxy). Such comparisons are useful in making sense of counterfactual claims, that is, claims of the form ‘if it were the case that p then it would be the case that q’. Discussion of Lewis’s account of counterfactuals, and the role possible worlds play in it, occurs in section 7. The third job Lewis thinks that possible worlds can do is that they provide us with the resources to formulate what he takes to be the best theory of mental content, that is, the best theory about what our thoughts are about. He thinks such a theory will construe such contents as sets of possibilities, that is, as sets of possible worlds or possible individuals. The fourth job is that Lewis thinks that sets of possible individuals can play the role of properties, a discussion of which occurs in detail in the next section (section 3). One who takes the modal operators as primitive will not be able to accomplish these things—at least not as easily. This is already clear in the case of jobs three and four; a primitivist about modality will simply not have the worlds and individuals hanging around which they can collect up into sets to act as properties or the contents of our thoughts. While some may balk at some of the consequences of modal realism (such as that there exist infinitely many talking donkeys in other possible worlds, in virtue of it being possible that infinitely many talking donkeys exist), Lewis thinks that these theoretical benefits nonetheless provide reason to prefer modal realism to the primitivist alternative.

The second part of Lewis’s argument for concrete modal realism consists in arguing for the ‘concrete’ component of the view, and comprises a number of arguments against various forms of modal realism which regard possible worlds as abstract entities of one sort or another, which he calls ‘ersatz realism’. Often, Lewis’s strategy is to argue that concrete modal realism does a better job solving certain problems as compared to these ersatzist alternatives. These arguments can be found in chapter three of Plurality. Just one example, conveniently connected to issues already discussed, is Lewis’s first argument against what he calls ‘linguistic ersatzism’, the view, already introduced, that possible worlds are maximal consistent sets of sentences. Lewis’s complaint is that linguistic ersatzism is committed to a primitive conception of modality, something which Lewis has already argued against, and something to which his own view is not similarly committed. Lewis provides two reasons to think linguistic ersatzism is committed to primitive modality, of which only the first is discussed here. The notion of consistency, in part in terms of which the linguistic ersatzist characterizes possible worlds, appears to be a modal notion: “a set of sentences is consistent iff those sentences, as interpreted, could all be true together” (1986b: 151 ital. orig.). Since Lewis’s own view is not committed to primitive modality, he is able to give a complete analysis of modality in terms of his particular brand of possible worlds, while the linguistic ersatzist is not.

Lewis’s view about modality is distinctive not only in that he takes possible worlds to be concrete. It is also distinctive in the way it analyzes possibility and necessity claims about individuals. Consider possibility claims. One might think that, for something to possibly be some way, there is a possible world at which that very thing is that way. So, for example, one might think that, for it to be true that Hubert Humphrey could have won the 1968 United States presidential election, there is a possible world at which Humphrey—the very same person who lost the 1968 election in the actual world—won the 1968 election. This is a very natural way to think about the analysis of possibility claims. The thesis that objects exist in more than one possible world is known as ‘transworld identity’. When worlds are taken to be concrete, transworld identity amounts to the claim that worlds share constituents, and, for this reason, Lewis calls it ‘(concrete) modal realism with overlap’. It is typically understood as the idea that a thing in this world which could have been qualitatively different than it actually is itself inhabits another possible world as well, in which it is qualitatively different. Instead of taking this approach, Lewis elects to reject any overlap among possible worlds, and to analyze possibility and necessity claims about individuals in terms of counterparts. In particular:

$\Box Fa =_{df}$ for every possible world w at which a counterpart of a exists, $Fa$ is true at w.

$\Diamond Fa =_{df}$ for some possible world w at which a counterpart of a exists, $Fa$ is true at w.

Lewis’s analysis of modality in terms of counterparts is known as ‘counterpart theory’. His complete view about modality, then, is what could be called ‘concrete modal realism with counterpart theory’.
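A minimal sketch may help fix the idea. It reads ‘$Fa$ is true at w’ as ‘a’s counterpart at w is F’, which is one natural way to take the definitions given that a itself inhabits only one world. The worlds, their inhabitants, the properties, and the stipulated counterpart relation are all hypothetical.

```python
# Hypothetical worlds, each with its own inhabitants. No individual belongs
# to more than one world, reflecting Lewis's rejection of overlap.
WORLDS = {
    "actual": {"humphrey": {"won_1968": False, "human": True}},
    "w1": {"humphrey_w1": {"won_1968": True, "human": True}},
    "w2": {"humphrey_w2": {"won_1968": False, "human": True}},
    "w3": {"nixon_w3": {"won_1968": True, "human": True}},  # no counterpart here
}

# Stipulated counterpart relation (an assumption of the sketch): each entry
# pairs a world with the inhabitant of it that is a counterpart of the key.
# An individual counts as its own counterpart at its own world.
COUNTERPARTS = {
    "humphrey": [
        ("actual", "humphrey"),
        ("w1", "humphrey_w1"),
        ("w2", "humphrey_w2"),
    ],
}


def possibly(prop, a):
    """Diamond Fa: at some world, a counterpart of a has the property."""
    return any(WORLDS[w][x][prop] for w, x in COUNTERPARTS[a])


def necessarily(prop, a):
    """Box Fa: at every world containing a counterpart of a,
    that counterpart has the property."""
    return all(WORLDS[w][x][prop] for w, x in COUNTERPARTS[a])


print(possibly("won_1968", "humphrey"))     # True: humphrey_w1 won
print(necessarily("won_1968", "humphrey"))  # False
print(necessarily("human", "humphrey"))     # True in this toy model
```

Note that `w3` plays no role in evaluating claims about Humphrey, since it contains no counterpart of him; what makes it true that Humphrey could have won is a distinct individual, `humphrey_w1`, rather than Humphrey himself.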

Lewis discusses counterpart theory in chapter four of Plurality, in ‘Counterpart Theory and Quantified Modal Logic’ (1968), and in ‘Counterparts of Persons and Their Bodies’ (1971). When do x and y stand in the counterpart relation? Lewis thinks an object’s counterparts will track intrinsic similarity to some extent. But the notions come apart. This is mainly because the counterpart relation is context-sensitive. This is connected to a factor that Lewis thinks constitutes an advantage of counterpart theory over concrete modal realism with overlap, namely, it can help us make sense of variability in our judgments about what properties are essential to an object (1986b: 252–53). Consider a statue of a human being made of clay standing in a grotto. Many are inclined to say that it is essential to the statue that it has the shape it has. Were it another shape (for example, the shape of a horse), it would be a different statue. The lump of clay, however, would have been the same object even if it were shaped differently than it is. One solution to this problem is to say that there are actually two objects in the grotto: the statue, with a certain set of essential properties, and the lump of clay, with a different set. But Lewis took it to be a cost to be saddled with the possibility of multiple objects that occupy exactly the same spatial region. Lewis’s solution was to note that there can be a single object in the grotto but, when we are describing it as a statue (context 1), we are particularly interested in a certain set of the object’s properties, while, when we are thinking of it as a lump of clay (context 2), we are interested in a different set. In context 1, a lump of clay that was sourced from exactly the same place as the lump of clay in our world was sourced will not count as a counterpart of the object in the grotto if it has a different shape. But it will count as a counterpart of the object in the grotto in context 2. This allows Lewis to explain why, in context 1 but not context 2, we are inclined to say that the object has its shape essentially. In every possible world in which the object has a counterpart (described as a statue), that counterpart will have the same shape that it does.

Lewis’s key arguments against concrete modal realism with overlap appear in chapter four of Plurality. Another important argument is based on what Lewis calls ‘the problem of accidental intrinsics’. If possible worlds share parts (like Humphrey), it is not clear, given modal realism with overlap, how Humphrey could have different intrinsic properties at each world. He presumably does so, since, for at least some of the intrinsic properties he actually has, he could have lacked them, and for at least some of those he actually lacks, he could have had them. Lewis’s example concerns Humphrey’s shape. He actually has five fingers on his left hand. But he could have had six. It will not do, Lewis thinks, for the proponent of overlap to relativize Humphrey’s property instantiation to worlds, saying, for example, that he has five fingers on his left hand relative to the actual world, but that the world relative to which he has six fingers on his left hand is a distinct world. This might work for a tower having different cross-sectional shapes on different levels, Lewis says, for example, being square on the third floor but circular on the fourth. But, he points out, it is only a part of the tower that has the shape at each level. According to modal realism with overlap, the whole of Humphrey exists at each world at which Humphrey exists. Similarly, the relativization strategy might work when Humphrey is honest according to one media source and dishonest according to another. The sources represent Humphrey in different ways. This might work for the ersatzist, whose ersatz individuals merely represent actual objects (as would, for example, a collection of predicates which are sufficient to represent Humphrey and no one else). According to the concrete modal realist, however, possible individuals are individuals, not representations of individuals. Finally, the relativization strategy might work with extrinsic relations like being a father of.
A man might be father of Ed and son of Fred, that is, he might be father relative to Ed but not to Fred. But Humphrey’s five-fingeredness concerns his shape, and, as Lewis points out, “If we know what shape is, we know that it is a property, not a relation” (1986b: 204).

Counterpart theory is not without its detractors. Saul Kripke (1980: 45, fn. 13), for example, complains that, on Lewis’s view, possibility claims about an individual are not actually about that individual him-, her-, or itself, but, rather, about one of his, her, or its counterparts. When one says, for example, ‘Humphrey could have won the 1968 election’, the complaint goes, one is not saying something about the Humphrey we are acquainted with—that is, one is not strictly saying something about that very individual who, in our actual world, lost the 1968 election. Instead, one is saying something about an individual that exists in some other possible world, who is similar to our actual Humphrey in certain relevant respects and to sufficient degrees, who won the 1968 election in that world. Lewis is unimpressed with this objection (see, for example, Plurality: 196). He thinks that ‘Humphrey could have won the 1968 election’ is about our Humphrey—the Humphrey in the actual world. Granted, the analysis of this claim involves invoking a distinct entity—one of Humphrey’s counterparts. But it is the actual Humphrey who has the modal property of possibly winning. His counterpart, in contrast, has the property of winning (simpliciter).

3. Properties

Lewis was a realist about properties. That is, he thought that properties exist. Properties can be intuitively understood as ways that things can be. Beyond that very general conception, disagreement arises. One major point of disagreement is about whether properties are repeatable, that is, whether distinct things that can truly be described as similar in some respect literally share something in common. This sort of property is usually termed a ‘universal’. Those who endorse this view are realists about universals. According to realists, greenness, for example, is a sui generis entity, distinct from any particular green thing, that is had, or instantiated, by each green thing. Realists typically seek to explain the similarity among similar things (such as green things) by appealing to the fact that each instantiates the same universal (so each green thing instantiates greenness). Those who deny the claim that properties are repeatable are nominalists about universals. (This form of nominalism is less strict than that most commonly at issue in the philosophy of mathematics, which denies the existence of all abstract entities, including sets.) Nominalists about universals come in many flavors. David Armstrong (1978a) provides a relatively comprehensive taxonomy of them. Of particular relevance to Lewis’s views on the matter are class nominalists, who identify properties with the sets of the individuals that can be truly described as having them. On such a view, the property of greenness, for example, is identified with the set of green things S_G, that is, as that set which contains frogs, grass, the Statue of Liberty, and so forth. To instantiate the property of greenness is just, according to the class nominalist, to belong to the set S_G.

Lewis is officially a nominalist. He elected to identify properties with sets, and thus his view was a form of class nominalism. (Lewis had perfectly analogous views about relations.) As such, Lewis’s view faces challenges similar to those class nominalists face. Chief among them is the problem of coextensive properties, which is the concern that class nominalism must identify any properties which have the same extension (that is, apply to the same individuals), whether those properties are intuitively the same or not. The set of those organisms which have hearts, for example, is, as it happens, the same as the set of those which have kidneys. As such, the class nominalist is forced to identify the property of being a cordate with that of being a renate. This seems wrong, however. The former property seems to concern one sort of organ, the latter a completely different sort of organ. These properties seem to be distinct.

Lewis’s solution to this problem is made possible by his views on modality. Lewis identifies each property not with the set of individuals in the actual world to which it can be truly ascribed. Rather, he identifies it with the set of individuals in all possible worlds to which it can be truly ascribed. Due to his views about modality, such individuals exist, and are thus available to be members of sets. The result is a class nominalism that is immune to the aforementioned problem. While it is actually true that every cordate is a renate and vice versa, this is an accident—the result of a long and complex series of events in the evolutionary history of life on Earth. But this history could have unfolded differently. Thus there are possible worlds, according to Lewis, which contain organisms which have hearts but which filter toxins in a different way. And there are worlds which contain organisms which have kidneys but deliver oxygen to cells in a different way. The existence of organisms of either sort ensures that the set of cordates is distinct from the set of renates, and so ensures that these properties are distinct. Of course, one might raise the concern that Lewis’s view has a perfectly analogous problem with properties whose extensions are identical in every possible world, as that of being a triangular polygon and being a trilateral (three-sided) polygon presumably are. For more on this issue, see section 2 of Sophie Allen’s article ‘Properties.’
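The contrast between the two versions of class nominalism can be illustrated with a small toy model. The following is only a sketch under stated assumptions: the world labels (`w_actual`, `w1`, `w2`), the organisms, and the helper `extension_at` are all invented for illustration and are not drawn from Lewis. Since Lewis takes individuals to be world-bound, each individual is listed as an inhabitant of exactly one world, and a property is modeled simply as a set of individuals.

```python
# Hypothetical worlds and their (world-bound) inhabitants; all names are invented.
inhabitants = {
    "w_actual": {"frog", "human"},
    "w1": {"heart_only_creature"},   # has a heart, filters toxins some other way
    "w2": {"kidney_only_creature"},  # has kidneys, delivers oxygen some other way
}

# On the Lewis-style view sketched here, a property is the set of ALL possible
# individuals that have it, not just the actual ones.
cordate = {"frog", "human", "heart_only_creature"}
renate = {"frog", "human", "kidney_only_creature"}


def extension_at(prop, world):
    """Return the members of the property that inhabit the given world."""
    return prop & inhabitants[world]


# Restricted to the actual world, the two properties are coextensive...
same_actual_extension = extension_at(cordate, "w_actual") == extension_at(renate, "w_actual")
# ...but as sets of possible individuals they are distinct.
same_property = cordate == renate

print(same_actual_extension)  # True
print(same_property)          # False
```

The same model also displays the residual worry mentioned above: two properties that agree at every world (as triangularity and trilaterality presumably do) would be represented by one and the same set.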

So far, Lewis looks to be nothing more than a class nominalist, if a relatively sophisticated one, owing to the tricks he can draw from his concrete modal realist bag. But he recognizes that universals do important philosophical work. He enumerates the jobs that universals can do in ‘New Work for a Theory of Universals’ (1983a). To take just one example, Lewis admits that universals can serve to distinguish laws of nature from mere accidental regularities. Armstrong (1978b and 1983) employs universals in this way in his theory of lawhood. According to Armstrong, what ensures, for example, that:

(G1) All uranium spheres are less than one mile in diameter

is a law of nature, while:

(G2) All gold spheres are less than one mile in diameter

is not, is that (G1) is made true not just by the contingent fact that there are no uranium spheres that are one mile in diameter or larger. It is made true by a certain fact that holds at certain worlds about the universals being a uranium sphere and being less than one mile in diameter. These universals jointly instantiate a second-order universal (second-order because it relates universals rather than particulars), which relates these two universals in such a way that it guarantees, at any world at which these universals stand in this relationship, that there will never be a uranium sphere with a diameter of one mile or more (since the relationship between the universals will ensure that any such sphere will explode). There is no such fact concerning the universals being a gold sphere and being less than one mile in diameter. What makes (G2) true is a fact that has nothing to do with these universals. Instead, it has to do only with certain historical contingencies about our world that suffice to explain why, in fact, no gold spheres one mile in diameter or larger ever naturally developed or were artificially constructed. With just his properties, Lewis does not have the resources to explain this difference. Lewis’s properties are abundant. Any old collection of things counts as a property. Thus Lewis would have no basis on which to say that the property of being a uranium sphere is related to the property of being less than one mile in diameter in any way that is more (or less) significant than the relation between being a gold sphere and being less than one mile in diameter. He can say that the set-theoretic intersection of each pair is empty, that is, the properties do not share any members (remember, for Lewis, properties are sets). But the similarity of being a uranium sphere and being a gold sphere in this respect would provide him with no basis on which to say that the first figures into a law of nature while the second does not.

Lewis rejects Armstrong’s approach to lawhood (along with his commitment to the existence of universals), and instead characterizes a law as a statement of a regularity that belongs to a suitable deductive system, which (i) is true, (ii) is closed under strict implication (that is, whatever is necessarily implied by any set of statements in the system is also in the system), and (iii) is balanced with respect to simplicity and empirical informativeness. In particular, the system must be as simple as it can be without being informationally too impoverished to do justice to the empirical facts about the world, but, to the extent that it does not sacrifice a sufficient degree of simplicity, it must be as informative as it can be. Nonetheless, Lewis recognizes a problem with his view, and, while he does not need to endorse universals to solve it, he requires something more than his ontology of properties. The problem is that there is a way for a deductive system to meet Lewis’s criteria (i)–(iii) that is clearly undesirable. Suppose we have discovered the best system S for describing the actual world. The way scientists have currently formulated it is rather complicated. But some wiseacre comes up with the idea to introduce a new predicate F into our language and stipulate that F is satisfied by all and only those things at the worlds at which S is true. But suppose further that this wiseacre refuses to provide an analysis of F. S can then be axiomatized with the single axiom ‘ $\forall x Fx$ ’. This theory is very simple, and it is, in a sense, as informationally enriched as it can be, since it perfectly selects the worlds at which S is true. Nonetheless, the theory is useless to the curious inhabitant. It tells them nothing about what their world is like.

The first step of Lewis’s solution to this problem is to adopt some primitive distinctions among properties. There are those that are perfectly natural, those which are natural to some degree (though not perfectly natural), and those which are unnatural. Lewis (1983a: 346 ff.) imagines that the perfectly natural properties will be those properties that would correspond to universals in Armstrong’s metaphysics (for example, being made of uranium), which is sparse enough to enable him to distinguish laws from mere accidental regularities. Less natural (but still comparatively natural) properties would correspond to families of suitably related universals (for example, being metallic). The spectrum would continue until wholly unnatural, gerrymandered properties are reached (for example, being either the Eiffel Tower or a part of the moon). Lewis notes that admitting universals into one’s ontology can provide the basis for a distinction between more and less natural properties, in the way just gestured at in the comparison with Armstrong’s metaphysics. But he notes that the distinction can be taken to be a primitive one between properties (classes) instead. This is Lewis’s preference; it allows him to avoid realism about universals and thus remain a nominalist. Lewis then solves the problem of the true but useless theory ‘ $\forall x Fx$ ’ by imposing a further criterion that the most suitable deductive system which sets the laws apart from the non-laws is one whose axioms are stated in a way that refers only to perfectly natural properties.

4. Time and Persistence

Lewis’s most well-known writings about time have to do with the persistence of objects. Lewis was a four-dimensionalist. That is, he believed that there exist four-dimensional objects, extended not just in space, but in time as well. Four-dimensionalism is to be contrasted with three-dimensionalism, according to which the only objects which exist are extended in space only (if they are extended at all that is, so as not to rule out the existence of non-extended points of space). Lewis’s commitment to four-dimensionalism was a result of his endorsement of two theses: (1) unrestricted composition, and (2) eternalism. Unrestricted composition is the thesis that any objects compose some object. So not only do my head, torso, arms, and legs compose an object (me), my head and the near side of the moon compose an object as well. Eternalism is a view about the ontology of time, according to which past, present, and future times, objects, and events are equally real. Eternalism is to be contrasted with presentism, the view that only the present time and present objects and events are real, and with the growing block theory, the view that past and present times, objects, and events are real, but future ones are not. Committing oneself to unrestricted composition and eternalism requires one to countenance four-dimensional objects. Not only do any presently existing objects compose an object, past ones do too. And, crucially, objects which exist at different times compose objects as well, such as the object that is composed of George Washington’s first wig and the sandwich someone just made for lunch.

As strange a view as four-dimensionalism might seem, Lewis has good reasons for adopting it. These reasons concern issues connected to the persistence of objects through time. Lewis is a perdurantist, and as such believes that for an object to persist through an interval of time is for it to perdure, that is, to have proper parts, one of which is wholly present at each moment of that interval. Perdurantism is to be contrasted with endurantism, according to which an object’s persistence through an interval of time amounts to the whole object being wholly present at each moment of that interval. Perdurantism, obviously, requires the truth of four-dimensionalism, at least assuming that some objects do in fact persist through time. This is because any such object must have parts which exist at different times. According to perdurantism (at least Lewis’s version—Theodore Sider develops another version of it in 1996 and 2001), the objects that we refer to with our names and definite descriptions are actually four-dimensional worm-like objects. We are acquainted with them by being acquainted with some of their parts at various times. So, for example, the Taj Mahal is a spacetime worm that extends back to about 1653. I am acquainted with it only insofar as I am acquainted with one of its parts, which extends through time for about two hours, which I toured on November 28, 2015. Even human beings, according to Lewis, are actually spacetime worms. They are not themselves shaped like those objects depicted in anatomy textbooks. Instead, those diagrams depict certain parts of human beings that exist at instants of time.

Lewis’s perdurantism might seem like an odd view, but, he thinks, it solves an important problem. Its competitor endurantism faces an important problem which Lewis calls the ‘problem of temporary intrinsics’ (1986b: 202–04 and 2002), which is analogous to the problem of accidental intrinsics which faces concrete modal realism with overlap (see the discussion in section 2). Everyone agrees that objects change over time. A person may previously have been standing and currently be sitting. The endurantist must say that the very same object has both the property of standing and sitting. This looks, at least at first glance, to be a contradiction. Endurantists typically say that the contradiction is only apparent, and they explain it away in various ways. But Lewis does not think any of those strategies succeed. One strategy endurantists use is to say that what we thought were properties, instantiated by a single object, are actually relations, instantiated by an object and a time. There is no contradiction involved in one’s both standing and sitting, since one is standing in relation to one (past) time and sitting in relation to another (the present time). But Lewis thinks that if an intrinsic property like shape (that is, a property having only to do with an object, and nothing to do with how it is related to other objects) is anything, it is not a relation (see the Lewis quotation at the end of section 2). Another strategy endurantists use to explain away the apparent contradiction resulting from temporary intrinsics is to adopt presentism. Since only the present is real, the person has the property of sitting. They do not have the property of standing. (They did have the property of standing when that moment was present. But it is present no longer, and thus is not real.) But, Lewis thinks, presentism comes at a high cost. 
The presentist must reject the idea that a person has a past and (typically) a future as well, since, according to presentism, neither the past nor future exists. Lewis points out that perdurantism solves the problem nicely. There is something that has the property of standing—a part of the person that is wholly present at a certain moment in the past. And there is something that has the property of sitting—a part of the person that is wholly present at the present moment. But there is no contradiction since these are distinct parts of this person. Lewis’s perdurantist solution appeals to the same consideration which allows us to say that there is no contradiction in my left hand currently being fist-shaped and my right hand currently being open-palmed. They are different parts of me, and so are distinct objects. There is no contradiction in distinct objects having incompatible properties.

5. Humean Supervenience

Lewis believes that everything in the actual world is material. He also defends a thesis he calls ‘Humean supervenience’. Humean supervenience is the thesis that, in Lewis’s words, “all there is to the world is a vast mosaic of local matters of particular fact, just one little thing and another” (1986c: ix). Hume was known for rejecting the idea that there were hidden connections behind conjoined phenomena which necessitate their conjunction. He was not against there being regularities in the world. His objection was to these regularities being explained by necessary connections (such as Armstrong’s second-order states of affairs relating universals—see section 3). Lewis is sympathetic to this view, and also likes the idea that macroscopic phenomena are reducible to certain basic microscopic phenomena. These microscopic phenomena Lewis takes to be just the geometrical arrangement of the world’s spacetime points, and the instantiation of certain perfectly natural properties at each of those points. Lewis takes this to mean that fundamental entities are point-sized, or, perhaps, that the fundamental entities are the spacetime points themselves.

Lewis is willing to admit that other possible worlds sufficiently different from our own might be different in this last respect. In particular, he thinks that it might take more than just the point-wise distribution of instantiations of perfectly natural properties to determine all of the phenomena in the world. Now the scientifically informed reader might object that our current physical theories show that this is not true even at our world. Some of our most promising physical theories, for example, posit spatially extended fields as being among the fundamental constituents of reality, rather than point-like entities. As Daniel Nolan (2005: 29 ff.) and Brian Weatherson (2016: sec. 5) point out, Lewis is concerned more with illustrating the defensibility of this latter thesis than with its truth. It could be regarded as an idealization or simplification, suitable for philosophical purposes, in terms of which Lewis formulates his thesis of Humean supervenience. If it turns out that the fundamental furniture of the world actually consists of spatially extended entities, rather than point-like entities, Lewis will be content to backpedal a bit, and formulate Humean supervenience in a way that is consistent with that, such as, for example, claiming that what is true at a given world is determined by the geometrical arrangement of its spacetime points and where perfectly natural properties are instantiated at the spacetime regions occupied by the fundamental entities. But, as Lewis suggests in ‘Humean Supervenience Debugged’ (1994a: 474), he expects that, even once we have settled on the nature of the physical world, we will find that the profusion of phenomena at our world can be explained by a comparatively sparse base of simple entities instantiating comparatively basic properties and perhaps also standing in comparatively basic relations.

6. Causation

Lewis is known for his counterfactual analysis of causation. Lewis made significant contributions to the semantics of counterfactuals, which will be discussed in the next section. The following is perhaps the most straightforward way to provide an analysis of causation in terms of counterfactuals, though, as we will see, it is importantly different from Lewis’s account:

x causes y iff x and y occur, and if x had not occurred, then y would not have occurred.

Counterfactual analyses of causation are to be contrasted with productive accounts, according to which x causes y iff x produces some change in properties in y, where the notion of production is typically taken to be primitive. Both sorts of analysis face their own characteristic set of problems. This article discusses only what is the most well-known problem for the above counterfactual account, the problem of causal preemption (or causal redundancy) since it will help the reader understand why Lewis develops his own counterfactual analysis of causation in the way that he does. Suppose that Alice and Bob are throwing rocks at bottles and Alice throws her rock at one of the bottles and hits it, shattering it. Intuitively, Alice’s throw caused the bottle to shatter. But suppose also that Bob was ready to throw his rock at the same bottle just in case Alice did not throw, and, moreover, he has perfect aim. Thus Bob’s rock would have struck the bottle, causing it to shatter, had Alice not thrown. Due to this fact, the right side of the above counterfactual analysis of causation is not satisfied in this case. It is not the case that, had Alice not thrown, the bottle would not have shattered. This is because, given the way the case was set up, Bob’s throw would have ensured that the bottle would shatter. Yet, intuitively, Alice’s throw caused the bottle to shatter. Something seems to be wrong with the above counterfactual analysis of causation.
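The failure of the simple analysis in the Alice and Bob case can be made concrete with a toy model. This is only an illustrative sketch: the function `shatters` and the variable names are invented here, and evaluating the function on a changed input is a crude stand-in for evaluating a counterfactual, not an implementation of any semantics for counterfactuals.

```python
def shatters(alice_throws: bool) -> bool:
    """Toy model of the case: Bob is a backup who throws exactly when Alice does not."""
    bob_throws = not alice_throws
    # Both throwers have perfect aim, so either throw shatters the bottle.
    return alice_throws or bob_throws


# What actually happens: Alice throws and the bottle shatters.
actual_alice_throws = True
both_occur = actual_alice_throws and shatters(actual_alice_throws)

# The counterfactual test of the simple analysis: had Alice not thrown,
# would the bottle have failed to shatter?
effect_without_cause = shatters(False)

# Simple analysis: x causes y iff both occur and, without x, no y.
naive_verdict = both_occur and not effect_without_cause

print(effect_without_cause)  # True: Bob's throw would have shattered the bottle
print(naive_verdict)         # False: the simple analysis denies that Alice's throw was a cause
```

The verdict `False` conflicts with the intuition that Alice's throw caused the shattering, which is the problem of preemption in miniature.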

In order to avoid this problem, in ‘Causation’ (1973a), Lewis distinguishes between causation and causal dependence. The above analysis is actually the analysis Lewis provides of causal dependence. He defines causation in terms of chains of causal dependence (where a chain might, but typically will not, have only two nodes). So, for example, if y causally depends on x, and z causally depends on y, then x causes z, even if z might have occurred even if x had not. Lewis thinks there is independent motivation for this move, as he thinks there are often cases in which it is natural to say that x causes z even when z does not counterfactually depend on x. Lewis explains this by positing a chain of causal dependence. In general, counterfactual dependence is not transitive. The light would not have come on if I had not flicked the switch. I would not have flicked the switch if I had been out running errands. But the light may well have come on just then even if I had been out running errands. Another member of my family might have walked into the room and flicked the switch. Lewis deals with cases of causal preemption, like the one involving Alice and Bob, by pointing out that, in such cases, there will nonetheless be a chain of counterfactual (and thus causal) dependence which we can invoke to secure the truth of the causal claims we think are true. Lewis grants that it is not the case that, if Alice had not thrown her rock, then the bottle would not have shattered (since Bob would have thrown). But he thinks this establishes only that the bottle’s shattering does not causally depend on Alice’s throw. Since causes need only be linked by chains of causal dependence to their effects, Lewis can still say that Alice’s throw caused the bottle to shatter. He would note first that:

(CF1) the bottle would not have shattered if Alice’s rock had not been speeding toward it.

This is true because, by the time the rock was speeding toward the bottle, Bob had seen that Alice had thrown her rock, and so had refrained from throwing his own rock. Lewis would note second that:

(CF2) Alice’s rock would not have been speeding toward the bottle if Alice had not thrown it.

This sets up a chain of causal dependence between Alice’s throw and the bottle’s shattering, which is enough, on Lewis’s account, to secure the desired conclusion that Alice’s throw caused the bottle to shatter.
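The chain of dependence running through (CF1) and (CF2) can be sketched in a toy model. Everything here is an assumption made for illustration: the variable names, the equations, and the helpers `run`, `depends`, and `causes_via_chain` are invented, and counterfactuals are evaluated by overriding one variable while leaving earlier variables at their actual values. That is a simplified stand-in for, not an implementation of, Lewis's similarity-based semantics.

```python
# Each variable is computed from earlier ones, in this order (all names hypothetical).
ORDER = ["alice_throws", "bob_throws", "rock_in_flight", "shatters"]
EQUATIONS = {
    "alice_throws": lambda v: True,                        # Alice in fact throws
    "bob_throws": lambda v: not v["alice_throws"],         # Bob throws only if Alice does not
    "rock_in_flight": lambda v: v["alice_throws"],         # Alice's rock speeds toward the bottle
    "shatters": lambda v: v["rock_in_flight"] or v["bob_throws"],
}


def run(overrides=None):
    """Compute all variables, forcing any overridden variable to the given value."""
    overrides = overrides or {}
    values = {}
    for name in ORDER:
        values[name] = overrides[name] if name in overrides else EQUATIONS[name](values)
    return values


def depends(effect, cause):
    """Causal dependence: both occur, and had the cause not occurred, neither would the effect."""
    actual = run()
    without_cause = run({cause: False})
    return actual[cause] and actual[effect] and not without_cause[effect]


def causes_via_chain(chain):
    """Causation as a chain of stepwise causal dependence."""
    return all(depends(later, earlier) for earlier, later in zip(chain, chain[1:]))


print(depends("shatters", "alice_throws"))    # False: Bob would have thrown instead
print(depends("shatters", "rock_in_flight"))  # True: this is (CF1)
print(depends("rock_in_flight", "alice_throws"))  # True: this is (CF2)
print(causes_via_chain(["alice_throws", "rock_in_flight", "shatters"]))  # True
```

In the model, (CF1) comes out true because overriding `rock_in_flight` leaves `bob_throws` at its actual value: Bob has already refrained by then.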

Lewis’s counterfactual account of causation, as just explicated, still has a problem with preemption. This is the problem of late preemption, in which one causal process is preempted by the effect rather than by an event earlier in the process. So, for example, rather than Bob’s throw being preempted by Alice’s throwing her rock, suppose Bob threw his rock a split second after Alice threw hers, and that his rock did not hit the bottle only because the bottle had shattered a split second before Bob’s rock reached the bottle’s former position. In this case (adapted from Hall 2004), (CF1) would be false, and so Lewis would be unable to set up a chain of counterfactual dependence on which he could base a determination that Alice’s throw caused the bottle to shatter. This problem led Lewis to revise his view significantly in ‘Causation as Influence’ (2000a and 2004), wherein he analyzes causation in terms of the notion of influence. Lewis characterizes influence as follows:

C influences E iff there is a substantial range C1, C2,… of different not-too-distant alterations of C (including the actual alteration of C) and there is a range E1, E2,… of alterations of E, at least some of which differ, such that if C1 had occurred, E1 would have occurred, and if C2 had occurred, E2 would have occurred, and so on. Thus we have a pattern of counterfactual dependence of whether, when, and how on whether, when, and how. (2000a: 190 and 2004: 91)

The precise circumstances in which an event occurs, including the exact time at which it occurs, and the manner in which it occurs, are relevant to whether one event influences another. On this characterization, Alice’s throw influenced the bottle’s shattering, since it made a difference, for example, to the exact manner in which it occurred. Let’s say, for example, that her rock hit the right side of the bottle, and that it shattered to the left. But if she had thrown a bit to the left, the bottle would have shattered towards the right. The same is not true of Bob’s throw. If he had thrown a bit to the left, the bottle still would have shattered in the way that it did, since Alice’s rock would still have hit it in the way that it did. This allows Lewis to say that Alice’s throw caused the bottle to shatter, despite the fact that Bob’s rock was on its way to ensure that it shatters in case Alice’s aim happened to be off.

Another sort of problem that gives Lewis trouble involves absences. It is not clear how Lewis’s view can deal with cases like when an absence of light causes a plant to die. There is no event in terms of which we can formulate any counterfactuals of the form ‘if x had not occurred, then y would not have occurred’ in such cases. Lewis (for example, 2000a, sec. X) deals with absences by admitting that there are some instances of causation that do not have causes (understood as events). Instead, he thinks that it is true to say that the absence of light caused the plant to die as long as the right sorts of counterfactuals are true, for example, ‘if there had been more light over the past few weeks, the plant would have survived’.

7. Counterfactuals

Lewis makes use of some of the tools of his theory of modality in his contributions to the literature on the semantics of counterfactuals. A counterfactual is a certain type of conditional. A conditional is a sentence synonymous with one of the form ‘if…, then…’. An indicative conditional is a conditional whose verbs are in the indicative mood, for example:

(1) If Tom is skiing, then he is not in his office.

Other conditionals are in the subjunctive mood, for example:

(2) If Tom were a skiing instructor, then he would be in great shape.

Many of the subjunctive conditionals that we use on a day-to-day basis, such as (2), are counterfactual conditionals, that is, conditionals whose antecedents express statements that are contrary to what is actually the case. (Suppose Tom is in fact an accountant.) The material conditional ‘→’ from propositional logic can be used to adequately translate many natural language conditionals. Recall that, as an operator that is truth-functional, all there is to the meaning of ‘ $p \rightarrow q$ ’ is its truth conditions as given by its truth table, according to which it is true if either p is false or q is true, and it is false otherwise (that is, when p is true and q is false).

\begin{tabular}{c c | c}
$p$ & $q$ & $p \rightarrow q$ \\
\hline
T & T & T \\
T & F & F \\
F & T & T \\
F & F & T \\
\end{tabular}

But there are many other natural language conditionals which cannot be adequately translated with the material conditional. Counterfactuals form an important class of such conditionals.

Before Lewis, the most well-worked-out accounts of counterfactuals construed them as strict conditionals meeting certain conditions (in particular, see Goodman 1947 and 1955). A strict conditional is just a material conditional that holds of necessity, that is, a statement of the form ‘ $\Box (p \rightarrow q)$ ’. The simplest strict-conditional account of counterfactuals (which is admittedly simpler than Goodman’s, but will be sufficient to motivate Lewis’s account) analyzes each counterfactual in terms of the corresponding strict conditional, that is,

‘ $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ ’ is true iff $\Box (p \rightarrow q)$ .

(Following Lewis in Counterfactuals (1973b: 1–2), ‘if it had been the case that p then it would have been the case that q’ is abbreviated as ‘ $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ ’.) This account is inadequate because a strict conditional is like a material conditional insofar as strengthening its antecedent cannot take the entire conditional from being true to being false, whereas this is not so for counterfactuals (see Lewis 1973b: ch. 1, Nolan 2005: 74 ff., and Weatherson 2016: sec. 3.1). Recall from propositional logic that the following inference pattern is valid.

$ \begin{array}{l} p \rightarrow q \\ \hline (p \land r) \rightarrow q \end{array} $
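The validity of this pattern can be checked mechanically by running through every assignment of truth values. The following is a minimal Python sketch; the helper names `implies` and `is_valid` are illustrative assumptions and do not come from Lewis's text.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def is_valid(premise, conclusion, num_vars: int) -> bool:
    """Valid iff no assignment makes the premise true and the conclusion false."""
    return all(
        implies(premise(*vals), conclusion(*vals))
        for vals in product([True, False], repeat=num_vars)
    )

# Antecedent strengthening: from p -> q infer (p and r) -> q.
strengthening_valid = is_valid(
    lambda p, q, r: implies(p, q),
    lambda p, q, r: implies(p and r, q),
    3,
)

# The reverse inference fails (take p true, r false, q false).
weakening_valid = is_valid(
    lambda p, q, r: implies(p and r, q),
    lambda p, q, r: implies(p, q),
    3,
)

print(strengthening_valid, weakening_valid)  # True False
```

Brute-force enumeration suffices here because the material conditional is truth-functional, so eight assignments exhaust the cases for three sentence letters.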

The analogous inference pattern involving the strict conditional is also valid:

$ \begin{array}{l} \Box (p \rightarrow q) \\ \hline \Box [(p \land r) \rightarrow q] \end{array} $

But the analogous inference for the counterfactual conditional is not valid:

$ \begin{array}{l} p \ensuremath{\mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}}} q \\ \hline (p \land r) \ensuremath{\mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}}} q \end{array} $

Suppose that the counterfactual (2) above is true, and consider the following strengthening of it:

(3) If Tom were a skiing instructor and he always wore a robotic exoskeleton so that he did not ever expend any energy, then he would be in great shape.

(3) appears to be false. If he never expended any energy, he would not be in great shape. But (3) follows from (2) on the strict conditional account because of the validity of the above inference pattern involving the strict conditional. It does not, however, follow on Lewis’s account.

Lewis analyzes counterfactuals in terms of possible worlds, and the basic idea behind his analysis is similar to that of Robert Stalnaker (1968). Stalnaker proposed the following analysis of counterfactuals in terms of the similarity of worlds:

‘ $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ ’ is true iff the most similar p-world to the actual world is also a q-world, where a p-world is just a world at which p is true.

(Technically this only specifies the truth conditions for counterfactuals that are non-vacuously true, that is, when there is at least one p-world most similar to the actual world. But we can ignore vacuously true counterfactuals.) Lewis has a helpful metaphor which he employs when thinking about the similarity between worlds. He thinks about possible worlds as if they were arranged in a space, with the actual world at the center, with larger and smaller degrees of similarity to the actual world being represented by larger and smaller distances from (closeness to) the actual world. Counterfactual (2) above, for example, is true, on Stalnaker’s account, because the most similar (closest) world to the actual world at which Tom is a skiing instructor is one at which he is in great shape. A world in which Tom wears a robotic exoskeleton while teaching people to ski (thus keeping him in poor shape) is plausibly less similar to (farther away from) the actual world than one in which he teaches people to ski using his own muscles. (3), however, requires one to look at the closest world at which both Tom is a skiing instructor and Tom wears a robotic exoskeleton. And in that world, plausibly, Tom is not in great shape. It would require even more changes in the actual facts to ensure that Tom would be in great shape in such a world (for example, Tom has taken a pill—the result of a medical breakthrough that has not occurred at the actual world—that keeps his body in great shape even if he does not exercise).
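The Tom example can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the four worlds, their integer distances from the actual world, and the names `World`, `closest`, and `would`. The sketch follows the Stalnaker-style truth condition, on which there is a unique closest antecedent-world.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class World:
    name: str
    distance: int        # hypothetical distance from the actual world
    instructor: bool
    exoskeleton: bool
    great_shape: bool

# Hypothetical worlds; the distances are stipulated, not derived.
WORLDS = [
    World("actual (Tom is an accountant)", 0, False, False, False),
    World("instructor, skis unaided", 1, True, False, True),
    World("instructor, wears exoskeleton", 2, True, True, False),
    World("instructor, exoskeleton, miracle pill", 3, True, True, True),
]

def closest(antecedent):
    """Return the closest world at which the antecedent holds, or None."""
    candidates = [w for w in WORLDS if antecedent(w)]
    return min(candidates, key=lambda w: w.distance) if candidates else None

def would(antecedent, consequent) -> bool:
    """Stalnaker-style truth condition; vacuous cases count as true."""
    w = closest(antecedent)
    return True if w is None else consequent(w)

cf2 = would(lambda w: w.instructor, lambda w: w.great_shape)
cf3 = would(lambda w: w.instructor and w.exoskeleton, lambda w: w.great_shape)
print(cf2, cf3)  # True False
```

On these stipulated distances, (2) comes out true and (3) false, so strengthening the antecedent changes the truth value, as the discussion above requires.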

There are important differences between the analysis Lewis ultimately settles on and Stalnaker’s. For one, Lewis rejects Stalnaker’s assumption that there will always be a unique p-world that is most similar to the actual world. As a result, the analysis that Lewis adopts is closer to the following:

‘ $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ ’ is true iff all p-worlds that are most similar to the actual world are also q-worlds.

Lewis also challenges the tempting assumption that there is a closest “sphere” of p-worlds to the actual world (this is the Limit Assumption—see 1973b: 19 ff.). Without it, counterfactuals are best analyzed as follows:

‘ $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ ’ is true iff there is a $(p \land q)$ -world that is more similar to the actual world than any $(p \land \neg q)$ -world.
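The two Lewis-style truth conditions can be compared on small finite models. The Python sketch below is illustrative only: the dictionary representation of worlds, the numeric distances, and the function names are all assumptions. In any finite model a set of closest p-worlds always exists, so the two conditions agree there; they can come apart only in models with infinitely many worlds that approach the actual world ever more closely.

```python
def closest_worlds_analysis(worlds, p, q):
    """True iff every p-world at the minimal distance is a q-world."""
    p_worlds = [w for w in worlds if p(w)]
    if not p_worlds:
        return True  # vacuous case
    nearest = min(w["dist"] for w in p_worlds)
    return all(q(w) for w in p_worlds if w["dist"] == nearest)

def comparative_analysis(worlds, p, q):
    """True iff some (p and q)-world is strictly closer than every (p and not q)-world."""
    pq = [w["dist"] for w in worlds if p(w) and q(w)]
    p_not_q = [w["dist"] for w in worlds if p(w) and not q(w)]
    if not pq and not p_not_q:
        return True  # vacuous case, treated as true here by assumption
    return any(all(d < e for e in p_not_q) for d in pq)

p = lambda w: w["p"]
q = lambda w: w["q"]

# Hypothetical model with a tie: two equally close p-worlds, only one a q-world.
tied = [
    {"dist": 0, "p": False, "q": False},  # the actual world
    {"dist": 1, "p": True, "q": True},
    {"dist": 1, "p": True, "q": False},
]
# Hypothetical model without a tie: the closest p-world is a q-world.
untied = [
    {"dist": 0, "p": False, "q": False},
    {"dist": 1, "p": True, "q": True},
    {"dist": 2, "p": True, "q": False},
]

results = {
    "tied": (closest_worlds_analysis(tied, p, q), comparative_analysis(tied, p, q)),
    "untied": (closest_worlds_analysis(untied, p, q), comparative_analysis(untied, p, q)),
}
print(results)  # {'tied': (False, False), 'untied': (True, True)}
```

The tied model is the kind of case that Stalnaker's uniqueness assumption rules out: no single p-world is closest, and the counterfactual comes out false on both formulations.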

Finally, Lewis questions the tempting assumption that each world is more similar to itself than any other world (1973b: 28 ff.). Making this assumption results in $p \land q$ entailing $p \mathrel{\Box\kern-1.5pt\raise1pt\hbox{$\rightarrow$}} q$ . So, for instance, ‘Tom is a skiing instructor and Tom is in great shape’ would entail (2). But it would seem odd for this counterfactual to be true if its antecedent were not in fact false. In the end, Lewis sticks with this assumption for technical reasons (cf. Weatherson 2016: sec. 3.2).

Lewis’s analysis of counterfactuals is not without problems. Kit Fine (1975), for instance, argues that Lewis’s account, as it stands, makes the following counterfactual false, though it is presumably true:

(4) If Nixon had pressed the button, there would have been nuclear war.

It seems that any of the worlds in which Nixon pressed the button that are most similar to the actual world are ones in which there was no nuclear war, but in which instead some relatively minor miracle occurred—some violation of the natural laws of our world, perhaps specific to the exact location of the button and the specific time at which Nixon pressed it—which renders the button momentarily useless. To surmount this problem, Lewis says more about similarity in ‘Counterfactual Dependence and Time’s Arrow’ (1979b). He had already noted that similarity would be context-sensitive in his book Counterfactuals. That is, he had already noted that the “distance” that possible worlds are from the actual world might be different for the same counterfactual when it is uttered in different contexts. If, for example, (2) were uttered in a context in which it had already been established that Tom owned a robotic exoskeleton and was considering using it, the closest worlds to the actual world would include those in which he wore it and thus maintained a poor physique, thus rendering the counterfactual false instead of true. But Lewis says little else about similarity there.

To deal with Fine’s challenge, Lewis outlines a number of rules which one should abide by while measuring similarity given a context:

(1) It is of the first importance to avoid big, widespread diverse violations of law.

(2) It is of the second importance to maximize the spatiotemporal region throughout which a perfect match of particular fact prevails.

(3) It is of the third importance to avoid even small, localized, simple violations of law.

(4) It is of little or no importance to secure approximate similarity of particular fact, even in matters that concern us greatly. (1979b: 472)

Lewis assumes determinism throughout his discussion. That is, he assumes that everything that occurs is necessitated by the events which occurred earlier together with the laws of nature. Lewis thinks that determinism better explains, in comparison to indeterminism, the fact that counterfactuals which concern events which occur at different times exhibit an asymmetry which encodes the fixedness of the past and the openness of the future (1979b: 460). Given the assumption of determinism, and the assumption that Nixon did not press the button in the actual world, any world in which Nixon did press the button must either (i) be a world in which a small miracle occurred to enable Nixon to press the button despite having the same history as the actual world or (ii) be a world that has a completely different history than our own world, to enable Nixon’s pressing of the button to be necessitated by that history. By Lewis’s rules above, type (i) worlds are more similar to the actual world than type (ii) worlds, since the latter violate the more important rule (2). Type (i) worlds are identical to the actual world up to the point at which Nixon is considering pressing the button. Type (ii) worlds have completely different histories. Type (i) worlds violate only the less important rule (3), since they feature a small miracle. Lewis grants that there will be worlds with the same history as the actual world in which Nixon presses the button but no nuclear war ensues because another miracle causes a malfunction in the button, preventing the warheads from launching. But these worlds will have to involve miracles in addition to the one which enables Nixon to press the button. This is a further violation of rule (3). In contrast, a world in which Nixon presses the button and nuclear war ensues will violate the less important rule (4). As a result, Lewis concludes, the most similar worlds to the actual world are worlds in which Nixon presses the button and nuclear war ensues. Lewis’s account, therefore, makes the above counterfactual (4) true, as it should be.
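One way to picture how the four rules interact is as a lexicographic ordering, in which a higher-priority rule always outweighs any lower-priority one. The Python sketch below rests on that simplifying assumption, which is stricter than anything the rules themselves state. The numeric scores are stipulated for illustration, and the names `Candidate` and `dissimilarity_key` are hypothetical.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    label: str
    big_violations: int        # rule (1): big, widespread violations of law
    match_region: float        # rule (2): share of spacetime in perfect match (0 to 1)
    small_violations: int      # rule (3): small, localized violations of law
    approx_similarity: float   # rule (4): approximate similarity of particular fact
    nuclear_war: bool

def dissimilarity_key(w: Candidate):
    """Lexicographic key: earlier entries dominate later ones; smaller means closer."""
    return (w.big_violations, -w.match_region, w.small_violations, -w.approx_similarity)

# Stipulated scores for three kinds of world in which Nixon presses the button.
candidates = [
    Candidate("type (i): one small miracle, war follows", 0, 0.5, 1, 0.1, True),
    Candidate("type (i) plus a second miracle disabling the button", 0, 0.5, 2, 0.9, False),
    # Whether war follows in a type (ii) world is an assumption; it does not affect the result.
    Candidate("type (ii): entirely different history", 0, 0.0, 0, 0.2, True),
]

closest = min(candidates, key=dissimilarity_key)
print(closest.label, closest.nuclear_war)
```

With these scores, the type (ii) world loses on rule (2), and the second-miracle world loses on rule (3) despite scoring best on rule (4), so the closest world is one in which nuclear war ensues.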

8. Convention

Lewis’s earliest work is devoted to developing an account of what it is for a group of individuals to use a language. The lion’s share of his work on this issue can be found in his first book, Convention (1969) (see also ‘Languages and Language’ (1975)). Lewis makes use of the notion of a convention in his analysis of language use, and a significant part of the importance of this book is due to the account of conventions that he offers. Conventions about language use are by no means the only ones around. It is, for example, a convention in the United States to drive on the right-hand side of the road. An initial picture of convention that one might have is one of convention as the result of agreement. That is, one might think that a convention among some individuals is the result of an agreement they make with one another. However, individuals appear able to make an agreement only in a language. Thus one cannot give an analysis of what it is for a group of individuals to speak a language in terms of convention, understood in terms of agreement, since it would be circular; it would presuppose that these individuals speak a language (cf. Weatherson 2016: sec. 2). Lewis’s analysis of conventions avoids this problem.

What motivates the implementation of conventions are coordination problems. Roughly, a coordination problem is a problem facing two or more people where the best outcome for each person can result only by the coordination of their actions. Suppose, for example, that each member of a group of people is trying to decide which side of the road to drive on. Consider one such individual, Carol. Carol might have her own basic unconditioned preference about which side to drive on. She might, for instance, prefer to drive on the right-hand side of the road because the steering wheel of her car is situated on the right-hand side, and she would like to place herself as far from oncoming traffic as possible. Still, she has a conditional preference concerning driving on the left-hand side of the road. She would prefer to drive on the left-hand side of the road on the condition that everyone else drives on the left-hand side of the road. This is rooted in Carol’s desire to minimize the chances she is hit by oncoming traffic. We can suppose that everyone (or at least almost everyone) in the group has the same conditional preferences: each prefers to drive on the left (right) side of the road on the condition that everyone else drives on the left (right) side of the road. Notice that there are two ways to solve these individuals’ coordination problem: (1) they might adopt the convention that everyone drive on the left side of the road, and (2) they might adopt the convention that everyone drive on the right side of the road. When everyone in the group settles on one of these options, what results is a coordination equilibrium.
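The two solutions to the driving problem can be found by brute force in a two-driver version of the case. The Python sketch below makes several assumptions: the payoffs are stipulated (1 for matching sides, 0 otherwise), unconditioned preferences such as Carol's are ignored, and a coordination equilibrium is taken to be a combination of actions in which no one would be better off had any single agent, herself or another, acted otherwise.

```python
from itertools import product

ACTIONS = ("left", "right")

def payoff(own: str, other: str) -> int:
    """Hypothetical payoffs: matching sides is safe (1), mismatching risks a crash (0)."""
    return 1 if own == other else 0

def payoffs(profile):
    """Payoffs for (Carol, Diane) given the side each drives on."""
    carol, diane = profile
    return (payoff(carol, diane), payoff(diane, carol))

def is_coordination_equilibrium(profile) -> bool:
    """No agent would be better off had any single agent acted otherwise."""
    base = payoffs(profile)
    for deviator in range(2):
        for alt in ACTIONS:
            if alt == profile[deviator]:
                continue
            changed = list(profile)
            changed[deviator] = alt
            new = payoffs(tuple(changed))
            if any(new[i] > base[i] for i in range(2)):
                return False
    return True

equilibria = [p for p in product(ACTIONS, repeat=2) if is_coordination_equilibrium(p)]
print(equilibria)  # [('left', 'left'), ('right', 'right')]
```

The check finds exactly two equilibria, both driving on the left and both driving on the right, which is why nothing in the payoffs alone settles which one the group will adopt.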

It is important to note that there is more than one equilibrium which the members of the group can adopt to create the best outcome for all of them. It is in such circumstances that a convention must be adopted. In other words, some coordination problems will have only a single solution, in which case there is no need for a convention. People will act in such a way just because it creates the best outcome for them (and for everyone else). Suppose, for example, that there is a group of farmers that sell a certain product, say, coffee, to a population. We can suppose that there is a certain price p below which each farmer will fail to make an adequate profit on each item, which would ultimately drive them out of business. And we can suppose that there is a certain price p′ above which consumers will forgo the product, substituting it with another less expensive product, like chicory or tea, available from others, or changing their habits altogether to eliminate a bitter morning drink from their diet. Assuming that p′ > p, we can expect these farmers (each of whom, we are supposing, is acting in her own self-interest) to offer their product somewhere within the price range bounded by p and p′. This outcome is not the result of the adoption of a convention among these farmers. It is instead a result of each farmer acting in her own self-interest, of there being only one way for each farmer to achieve the best outcome for herself, and of her accurately observing the character of her market. Solving other coordination problems, however, such as the question of which side of the road everyone should drive on, requires a convention, since there are two possible ways to achieve the best outcome for everyone involved.

Of course, everyone in Carol’s group could get together and have a vote to decide which side of the road everyone in their group should drive on, in effect making an explicit agreement with one another. Perhaps the majority of car owners have an unconditioned preference like Carol’s, and prefer, for whatever reason, to drive on the right-hand side of the road. In this case, the result will be that everyone agrees to drive on the right-hand side of the road. But, importantly, agreement is not the only way to establish a convention (1969: 33–34). It might be that, as a matter of pure chance, the first handful of people on the road with their cars happened to share Carol’s unconditioned preference to drive on the right, and this effectively forced the latecomers to drive on the right in order to avoid the preexisting oncoming traffic.

In the spirit of the above considerations, Lewis ultimately settles on the following analysis of a convention:

A regularity R in the behavior of members of a population P when they are agents in a recurrent situation S is a convention if and only if it is true that, and it is common knowledge in P that, in almost any instance of S among members of P,

1. almost everyone conforms to R;
2. almost everyone expects almost everyone else to conform to R;
3. almost everyone has approximately the same preferences regarding all possible combinations of actions;
4. almost everyone prefers that any one conform to R, on condition that almost everyone conform to R;
5. almost everyone would prefer that any one conform to R′, on condition that almost everyone conform to R′,

where R′ is some possible regularity in the behavior of members of P in S, such that no one in almost any instance of S among members of P could conform both to R and to R′. (1969: 78)

One aspect of this analysis worth noting immediately is its tolerance for a certain number of exceptions (embodied by the consistent appearance of occurrences of ‘almost’). This is to prevent the analysis from failing to count as a convention what we would think should be counted as one. Of course, from time to time, there are, unfortunately, those who drive on the wrong side of the road. But these isolated incidents should not preclude the existence of a convention in the population to which these individuals belong, even if it did not come about as a result of an agreement. Suppose that the convention to drive on the right side of the road in Carol’s group arose by chance as described above, with all later drivers conforming to the preference of the first few drivers to drive on the right-hand side of the road. After weeks of this, we would not expect a single individual driving a single time on the left side of the road, for whatever the reason (whether the result of negligence or an intentional act of rebellion), to prevent the regularity that had emerged in the behavior of drivers in the group from being a convention. The convention is still there. It is just that this individual has failed, on this occasion, to act in accordance with it.

Another thing worth noting about Lewis’s analysis of convention is that, by ‘common knowledge that p’, Lewis does not require that p be true (1969: 52 ff.). Instead, it is enough that everyone has reason to believe that p, everyone has reason to believe that everyone has reason to believe that p, and so on. Whether or not anyone in fact believes that p, or in fact believes that everyone has reason to believe that p, and so on, is inconsequential to the analysis. This is why Lewis must specify separately that it is true that conditions (1)–(5) hold. Lewis adopts this characterization of common knowledge because he does not want to require, effectively, that, for a convention to hold, everyone believes that it holds. While he expects many people to be adept enough reasoners that they will come to believe the things they have reason to believe, he wants to allow for exceptions—individuals who never explicitly represent to themselves all of the various conditions which must hold for a convention to be present. But the presence of such individuals, of course, should not prevent a convention from being present (1969: 60 ff.).

Conditions (1) and (2) of Lewis’s analysis of convention are relatively straightforward, and they have been discussed above. Condition (4) is relatively straightforward as well. It requires, for example, that the vast majority of Carol’s group prefers that everyone in the group drives on the right-hand side of the road on the condition that almost everyone drives on the right-hand side of the road. If a substantial portion of the population did not desire that a convention be observed, the convention could easily collapse at any time, even if almost everyone had been observing it up to that time. This sort of situation is often exactly what is present just before a convention is abandoned. Consider public order—the tendency for people in many societies to act in an orderly and organized way while out in public. It is not implausible to say that public order is a convention which exists in these societies. And when it does, it is often, at least in part, the result of people wanting to live in a peaceful and orderly environment. But a sufficient number of grievances can develop within a population to the point where their preference for those grievances to be addressed trumps their preference for a peaceful and orderly environment. In such circumstances, the convention of public order can disappear. Condition (5) is what distinguishes conventions from cases where only one coordination equilibrium is possible, as in the example with the farmers selling their coffee. In that case, there existed no other regularity in the behavior of the farmers other than selling their coffee in the price range between p and p′ that would have resulted in the best outcome for each of them.

Condition (3) is a bit trickier to understand. It is connected to formal issues of game theory, particularly the question of whether a coordination equilibrium is possible. The basic idea behind it can be illustrated with an example. For simplicity, suppose that Carol and Diane are the only people in the group. There are four possible combinations of actions in the coordination problem of which side of the road to drive on:

(a) Carol drives on the left and Diane drives on the left.

(b) Carol drives on the left and Diane drives on the right.

(c) Carol drives on the right and Diane drives on the left.

(d) Carol drives on the right and Diane drives on the right.

And there are, in principle, twenty-four possible ways for each of Carol and Diane to order these actions according to her preference. By adopting condition (3), Lewis aims to ensure that there is enough agreement between the preferences of Carol and Diane to make a coordination equilibrium possible. If, for example, Carol prefers (d) to (a), and (a) to either (b) or (c), then an equilibrium will be unreachable if, for example, Diane prefers either of (b) and (c) to either of (a) or (d). (This is in part because Diane represents a significant portion of the group.)
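Both the count of twenty-four orderings and the clash described above can be checked with a short Python sketch. It assumes strict preference orderings converted into numeric utilities, and it tests for a weaker notion than a coordination equilibrium: a combination from which neither driver gains by switching sides alone. The function names are hypothetical, and the orderings for Carol and Diane are one way of filling in the example.

```python
from itertools import permutations, product

SIDES = ("left", "right")
# (Carol's side, Diane's side), labelled (a) to (d) as in the text.
COMBOS = dict(zip("abcd", product(SIDES, repeat=2)))
LABEL_OF = {combo: label for label, combo in COMBOS.items()}

orderings = list(permutations(COMBOS))  # every strict best-to-worst ordering

def other(side: str) -> str:
    return "right" if side == "left" else "left"

def rank_to_utility(ordering):
    """Map a best-to-worst ordering to numeric utilities (higher is better)."""
    return {label: len(ordering) - i for i, label in enumerate(ordering)}

def pure_equilibria(carol_order, diane_order):
    """Combinations from which neither driver gains by switching sides alone."""
    u_c, u_d = rank_to_utility(carol_order), rank_to_utility(diane_order)
    result = []
    for label, (c, d) in COMBOS.items():
        carol_switches = LABEL_OF[(other(c), d)]
        diane_switches = LABEL_OF[(c, other(d))]
        if u_c[label] >= u_c[carol_switches] and u_d[label] >= u_d[diane_switches]:
            result.append(label)
    return result

carol = ("d", "a", "b", "c")   # prefers (d) to (a), and (a) to (b) and (c)
diane = ("b", "c", "a", "d")   # prefers (b) and (c) to (a) and (d)

print(len(orderings))                 # 24
print(pure_equilibria(carol, diane))  # []
print(pure_equilibria(carol, carol))  # ['a', 'd']
```

When Diane's ordering conflicts with Carol's in this way, no combination is stable. When the two share an ordering, both (a) and (d) are stable, which is the situation condition (3) is meant to secure.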

Now that Lewis’s analysis of convention has been introduced, one can appreciate how he employs it in his account of what it is for a group of individuals to speak a language. Lewis provides an in-depth discussion of what he takes a language to be (1969: 160 ff.). But it should be noted that, for Lewis, a language is not just a collection of basic vocabulary items (a lexicon) and a set of rules for arranging them into more complex elements of the language, including sentences of arbitrary complexity (a grammar). It also includes an interpretation, that is, a function which assigns to each sentence of the language a set of conditions under which that sentence is true (and false). (Technically, the function assigns truth conditions to each possible utterance of each sentence, since Lewis wants to accommodate the possibility of ambiguous sentences, which are standard features of natural languages. Lewis also makes allowance for imperative sentences, which are “true” just in case they are obeyed.) So, a language that is just like English except that ‘p or q’ is true iff p is true and q is true and ‘p and q’ is true iff p is true or q is true would not be English, but some other language. Though it consists of the same basic vocabulary items and grammar as English, and thus the same sentences, it supplies interpretations of some of those sentences that are different from those that English supplies. In particular, it switches the truth conditions of ‘and’ and ‘or’ in English. As a result of this conception of languages, a sentence can be true or false only relative to a language. Another language could also have that same sentence as one of its elements, but it could supply different truth conditions for it.
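The point that truth is relative to a language can be sketched in Python by pairing the same connective words with different interpretations. This is a deliberately tiny model, restricted to sentences of the form ‘p and q’ and ‘p or q’. The `Language` class and its method are hypothetical names introduced here, not Lewis's formal apparatus.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Connective = Callable[[bool, bool], bool]

@dataclass
class Language:
    name: str
    interpretation: Dict[str, Connective]  # truth conditions for each connective word

    def is_true(self, left: bool, word: str, right: bool) -> bool:
        """Evaluate a sentence of the form 'p <word> q' relative to this language."""
        return self.interpretation[word](left, right)

english = Language("English", {
    "and": lambda p, q: p and q,
    "or": lambda p, q: p or q,
})

# Same vocabulary and grammar, but with the truth conditions of 'and' and 'or' switched.
swapped = Language("Swapped English", {
    "and": lambda p, q: p or q,
    "or": lambda p, q: p and q,
})

# The same sentence, 'p or q', with p true and q false:
print(english.is_true(True, "or", False))  # True
print(swapped.is_true(True, "or", False))  # False
```

The two languages share every sentence, yet the same sentence receives different truth values in each.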

For Lewis, what it is for a population P to use a language L is for there to be a convention in P to be truthful in L, that is, a convention whereby almost all individuals almost always utter sentences of L only if they believe them to be true (1969: 177, cf. 1975: 7). More precisely, it is true that, and common knowledge in P that, in almost any instance of verbal communication among members of P:

1. almost everyone is truthful in L;
2. almost everyone expects almost everyone else to be truthful in L;
3. almost everyone has approximately the same preferences regarding all possible combinations of utterances of L;
4. almost everyone prefers that any one person is truthful in L, given that everyone else is truthful in L; and
5. there is some other possible language L′ which almost everyone would prefer that any one be truthful in, on condition that almost everyone is truthful in L′.

But Lewis is careful to note that a person must occasionally use or respond appropriately to utterances of sentences of L in order to be a member of a population that uses L. If, at some point, she stops using and responding appropriately to such utterances, she will eventually not belong to any population that uses L (1969: 178).

9. Mind

There are two major respects in which Lewis contributes to the philosophy of mind. The first concerns his theory of mind, which is a version of the identity theory. The second is his theory of mental content, that is, an account of the contents of certain mental states like what is believed when one has a belief, and what is desired when one has a desire. This article discusses only the former (aside from the brief discussion of the latter included in section 2). As indicated in section 4, Lewis is a materialist insofar as he believes that everything in the actual world is material. As a result, he rejects idealism, that is, the view that everything is mental, and dualism, the view that there are fundamentally two different types of entity, mental and physical. Thus, he is a physicalist, and, as mentioned above, an identity theorist. He is a type-type identity theorist, and as such, identifies each type of mental state (each type of experience we can have) with a type of neurophysiological state. So, for example, for Lewis, pain is identical to, say, c-fiber firing. (C-fibers are nerve fibers in the human nervous system, activation of which is responsible for certain types of pain.) Such views are typically contrasted with token-token identity theories, which say only that each token mental state is identical to some token physical state. A token-token identity theorist will reject the rather general identity between pain and c-fiber firing, though they will recognize an identity between, say, the specific token of pain that Ronald Reagan felt when he was struck by John Hinckley Jr.’s bullet on March 30, 1981 and the appropriate token neurophysiological event which occurred in Reagan’s brain and which was caused by his nerves firing as a result of the bullet strike.

Lewis’s commitment to his theory of mind can be found in his earliest published work, in ‘An Argument for the Identity Theory’ (1966). Given the title, the reader will not be surprised that his main argument for it can be found there too. He argues that because mental states are defined in terms of their causal roles, being caused by certain stimuli and causing certain behaviors, and because every physical phenomenon’s occurrence can be explained by appeal only to physical phenomena, the phenomena to which we appeal to explain our behaviors, which are usually rendered in the vocabulary of folk psychology (for example, Alice felt/believed x, so she did y), must themselves be physical phenomena. Folk psychology is the largely unscientific theory that each of us uses in order to explain and predict the behavior of others, by appealing to such things as pleasure, pain, beliefs, and desires. We are using folk psychology, for example, when we say that Alice screamed because she was in pain.

Concerning his first premise, Lewis thinks that, for instance, pain is defined by a set of pairs of causal inputs and behavioral outputs that is characteristic of it alone. That set might include, for example, the causal input of a live electrode being put into contact with a human being, and the behavioral output of that human being vocalizing loudly. If this sounds behaviorist, that is because the view has its roots in behaviorism. But, unlike the behaviorist, Lewis does not think that that is all there is to say about mentality. He thinks that each mental state must still be a physical entity. While each is definable in terms of causal roles, each is a neurophysiological state. Furthermore, Lewis thinks that the mental concepts afforded to us by folk psychology pick out real mental states, at least for the most part. Thus Lewis expects that, by and large at least, each mental state that is part of our folk psychological theory will be definable in terms of a unique set of causal inputs and outputs. This sets Lewis (and other reductionists about the mind) apart from eliminativists, who expect no such accuracy in our folk psychological theory, and, indeed, often argue against its adequacy (as in, for example, Churchland 1981).

Lewis’s second premise is that the physical world is explanatorily closed. For any (explicable) physical phenomenon, there are some phenomena in terms of which it can be explained that are themselves physical. (Lewis leaves room for physical phenomena that have no explanations because they depend on chance, such as why a particular atom of uranium-235 decayed at a particular time t.) What is important for Lewis’s project is that this means we will never have to appeal to any non-physical (read: mental) entity in order to explain any physical phenomenon. And, because the causes and effects in the characteristic set that defines any given mental state are always physical (things like the placement of live electrodes and vocalizations), we will never need to invoke mental phenomena in order to explain any of these phenomena. We will be able to find some physical phenomena in terms of which to do so.

Very often, token-token identity theorists are role functionalists, who identify each type of mental state with a type of functional role. This role can, in principle, be realized by more than one type of physical state. And hence each type of mental state can, in principle, be realized by more than one type of physical state. But, according to role functionalists, a mental state itself is not identical to any physical state. So, for example, a role functionalist might identify pain with the functional state of bodily damage detection. That functional state is (we are supposing) realized in humans by c-fiber firings. As a result, pain is realized in humans by c-fiber firings. But it is something more abstract than just c-fiber firings; it is just whatever plays the role of bodily damage detection. It just so happens that what plays that role in humans is (we are supposing) c-fiber firings. Lewis was not a role functionalist. As stated, he identified each type of mental state with some type of physical state. So he identified pain with c-fiber firings, rather than saying that the former is realized by the latter.

This opens Lewis’s view up to the problem of the multiple realizability of the mental. This is the idea that human beings (or, more generally, organisms in which the role of bodily damage detection is played by c-fibers) are presumably not the only sorts of creatures that can be in pain. There may be animals on earth which lack c-fibers but which, when subjected to an electric shock, behave in the sort of way human beings behave, vocalizing loudly, moving away from the source of the shock, and so on. And even if there are not, we can imagine beings, perhaps Martians, that meet these conditions. What of them? Presumably, they can be in pain. But if they do not have c-fibers, then Lewis is forced to say that they, in fact, cannot be in pain.

In ‘Mad Pain and Martian Pain’ (1980a), Lewis deals with this problem by essentially biting the bullet. He recognizes that there will be distinct mental states associated with similar causal roles, such as human pain, jellyfish pain, Martian pain, and so forth. But he does not think this is too big a bullet to bite. The debate is, ultimately, just one about which state, realizer or role, we refer to when we use our folk psychological terminology to refer to mental states (such as ‘pleasure’, ‘pain’, ‘belief’, ‘desire’, and so on). But Lewis also thinks there is good reason to prefer his view. Remember that he identifies mental states by their causal roles. Pain is whatever both is caused by certain sorts of stimuli (electric shocks, pricks with a needle, and so forth) and causes certain sorts of behavior (vocalizing loudly, moving away from the stimulus, and so forth). But an abstract functional role is not apt to play this causal role. There must be something physical that does so, something that is actually involved in the push-and-pull of each causal chain of physical events. On Lewis’s account, according to which each type of mental state is a type of physical state, and in which each token mental state is a token physical state, there is always a physical state to play the needed causal role, and, moreover, to play that role while keeping the world at large completely material. One cannot help but appreciate how neatly this reply is connected to the argument he originally gives for his identity theory in his 1966 paper.

Another problem Lewis addresses in ‘Mad Pain and Martian Pain’ is, in a certain sense, the reverse of the problem of the multiple realizability of the mental. He calls it ‘the problem of mad pain.’ The basic idea is that it is possible for there to be individual human beings (and as such, individuals we want to count as being capable of being in human pain) who lack the behavioral outputs that are typically associated with certain environmental inputs among humans, or have atypical behavioral outputs associated with certain environmental inputs. So, for example, when subjected to an electric shock, rather than screaming or moving away from its source, such an individual might sigh, relax her posture, and smile pleasantly. And when eating a piece of cake, she might scream and move away from it. Call such an individual a madman.

Even as early as his 1966 paper, Lewis is careful to specify the characteristic causal role of a mental state as a set of typical associated environmental stimuli and behaviors (1966: 19–20). So the existence of a madman here or there does not cause problems for Lewis’s view. But, of course, one immediately wonders relative to what group these stimuli and behaviors are typically associated. He says, of the group relative to which we should characterize ‘pain’:

Perhaps (1) it should be us; after all, it’s our concept and our word. On the other hand, if it’s X we’re talking about, perhaps (2) it should be a population that X himself belongs to, and (3) it should preferably be one in which X is not exceptional. Either way, (4) an appropriate population should be a natural kind—a species, perhaps. (1980a: 219–20)

In the case of representative individuals of a population, all four criteria pull together. In the case of the Martian, criterion (1) is outweighed by the other three (whether the characteristic set for pain in Martians is exactly the same as it is in humans or whether there are some differences between them). And in the case of the madman, it is criterion (3) that is outweighed by the other three. There will be certain cases with which Lewis’s account will have difficulties, to be sure. If a lightning strike hits a swamp and produces a one-off creature that is a member of no population apart from that consisting of just itself, Lewis’s account would provide no direction about how to regard a set of associated stimuli and behaviors which are correlated in the creature. That is, it would not tell us which mental state the set is associated with. But Lewis is prepared to live with such difficult cases, as he thinks our intuitions would not be reliable in such a situation anyway. As a result, he thinks that the fact that his theory provides no definitive answers in such cases is not a drawback of it, but, in fact, is in line with our pre-theoretic estimation of such cases.

A final issue worth mentioning is qualia, that is, the subjective nature of an experience: for example, what it feels like to be in the sort of pain caused by a live electrode being put into contact with one’s left thumb. Identity theorists, and physicalists in general, often face the problem of qualia, that is, the allegation that their theory cannot make sense of the idea that there is something that it feels like to be in a particular mental state. One of the most famous statements of this problem is by Frank Jackson, in his paper, ‘Epiphenomenal Qualia’ (1982). He asks us to consider an individual, Mary, who has spent her entire life in a black and white room, never seeing any color other than black and white. Nonetheless, she has devoted herself to learning everything she can about color from (black and white) textbooks, television programs, and so forth, and is, at this point, perfectly knowledgeable about the subject. We can suppose she knows every piece of physical information there is to know about electromagnetism, optics, physiology, neuroscience, and so forth, that is related to color and its perception. Jackson then asks us to imagine that one day, Mary steps outside for the first time, and sees a red rose. He maintains that she learns something upon doing so that she did not know before, namely, what it is like to see red. Thus, Jackson concludes, not all information is physical information. This poses a problem for the physicalist because, according to the physicalist, this should not be possible. There is nothing to know about color and its perception outside of the complete collection of physical information associated with color and its perception.

Lewis’s response to the qualia problem can be found in his Postscript to ‘Mad Pain and Martian Pain’ (1983b: 130–32), ‘What Experience Teaches’ (1988c), ‘Reduction of Mind’ (1994b), and ‘Should a Materialist Believe in Qualia?’ (1995). He credits it to Laurence Nemirow (1979, 1980, and 1990), and, in short, it is the idea that when Mary exits the room and sees a rose, she does not learn a new piece of information; instead, she gains a new ability. In particular, she gains the ability to make certain comparisons and to imagine certain sorts of objects, which she was not able to do before. Now that she has seen the rose, she can go further out into the world and distinguish between things that are the same color as the rose and those which are not. And she can imagine what a red car would look like, even if she has not seen one. These are things she was not able to do before. But they are not propositional knowledge, in the sense that they are not things that can be expressed by a sentence of a language.

10. Other Work and Legacy

There are numerous aspects of Lewis’s work which this article has not discussed. He has influential views about the nature of dispositions, a discussion of which can be found in ‘Finkish Dispositions’ (1997b). He writes on free will in ‘Are We Free to Break the Laws?’ (1981a). And his discussions of his theory of mental content can be found in, for example, ‘Attitudes De Dicto and De Se’ (1979a) and ‘Reduction of Mind’ (1994b: 421 ff.). In addition to metaphysics, the philosophy of language, and the philosophy of mind, Lewis contributed to other subfields, including epistemology and philosophy of mathematics. The reader can find what Lewis has to say about knowledge in ‘Elusive Knowledge’ (1996b). His main focus in the philosophy of mathematics is on squaring his materialistic commitments with his liberal use of set theory (in, for example, his theory of properties). After all, sets are, prima facie, abstract objects. Lewis’s strategy is to provide an analysis of set theory in mereological terms. The parthood relation does much of the work that the membership relation does in set theory. A set of some objects is, for him, just their mereological sum. With this idea in place, Lewis is able to make sense of set-theoretic talk in terms of concrete objects which stand in parthood relationships to one another. The interested reader can find discussions of this issue in his book Parts of Classes (1991) and his articles ‘Nominalistic Set Theory’ (1970c) and ‘Mathematics is Megethology’ (1993b).

Lewis discusses central issues in the philosophy of religion, including the ontological argument in ‘Anselm and Actuality’ (1970a), and the problem of evil in ‘Evil for Freedom’s Sake’ (1993a) and the posthumous ‘Divine Evil’ (2007). In the philosophy of science, he discusses inter-theoretic reduction in ‘How to Define Theoretical Terms’ (1970b) and verificationism in ‘Statements Partly About Observation’ (1988b). Lewis also writes extensively on chance and probabilistic reasoning in, for example, ‘Prisoners’ Dilemma Is a Newcomb Problem’ (1979c), ‘A Subjectivist’s Guide to Objective Chance’ (1980b), ‘Causal Decision Theory’ (1981b), ‘Why Ain’cha Rich?’ (1981c), ‘Probabilities of Conditionals and Conditional Probabilities’ (1976a), ‘Probabilities of Conditionals and Conditional Probabilities II’ (1986d), ‘Humean Supervenience Debugged’ (1994a), and ‘Why Conditionalize?’ (1999b). And he discusses certain issues that fall at the intersection of probabilistic and practical reasoning in ‘Desire as Belief’ (1988a) and ‘Desire as Belief II’ (1996a).

Lewis makes contributions to deontic logic, which is a formal modal language used to express claims of obligation and permission, whose operators are interpreted to mean ‘it is obligatory that’ and ‘it is permissible that’, in, for example, ‘Semantic Analyses for Dyadic Deontic Logic’ (1974). Lewis also has well-developed views about ethics, metaethics, and applied ethics. In ‘Dispositional Theories of Value’ (1989b), Lewis develops a materialism-friendly theory of value in terms of things’ dispositions to affect us in appropriate ways (or to generate appropriate attitudes in us) in ideal conditions. These attitudes are certain (intrinsic, as opposed to instrumental) second-order desires. That is, one values something only if she desires that she desires it. As a result, Lewis is officially a subjectivist about value. But he thinks (or at least hopes) that there is enough commonality among moral agents that a more-or-less fixed set of values can be discerned. Lewis does not develop a systematic ethical system. But he delivers critiques of consequentialist ethical theories (according to which what makes an action right or wrong is determined by the nature of its consequences) like utilitarianism (according to which what makes an action right/wrong is that it maximizes/fails to maximize the benefit to the largest number of people). See, for example, ‘Reply to McMichael’ (1978), ‘Devil’s Bargains and the Real World’ (1984), and Plurality (1986b: 128). One general constraint Lewis does make explicit about his positive view is that an ethical theory should be compatible with there being multiple, potentially conflicting, moral values. Similarly, he thinks it might be impossible to provide a binary evaluation of someone’s character as good or bad, overall. It might be that we can only point to respects in which an individual has good or bad character. 
Nolan (2005: 189) takes it to be likely that Lewis’s positive ethical theory, to the extent it can be discerned in his writings, is a version of virtue ethics, and thus that he bases the rightness or wrongness of a particular act on whether a moral agent with appropriate virtues and in appropriate circumstances would perform it (see, for example, Lewis 1986b: 127). Lewis focuses on several issues in applied ethics, including punishment in ‘The Punishment that Leaves Something to Chance’ (1987) and ‘Do We Believe in Penal Substitution?’ (1997a), tolerance in ‘Academic Appointments: Why Ignore the Advantage of Being Right?’ (1989a) and ‘Mill and Milquetoast’ (1989c), and nuclear deterrence in ‘Devil’s Bargains and the Real World’ (1984), ‘Buy Like a MADman, Use Like a NUT’ (1986a), and ‘Finite Counterforce’ (1989b).

Truly, then, Lewis’s contributions to philosophy range much more widely than his best-known work. It is difficult to summarize Lewis’s legacy. He makes important contributions to understanding probability and probabilistic reasoning, and his work on conditionals, counterfactuals in particular, can only be described as foundational. His work on causation is very important as well. In particular, his move from a simpler counterfactual analysis of causation to one invoking the notion of influence is reflected in more recent interventionist accounts of causation, according to which the cause of an event E is something that can be manipulated in some way (for example, by slightly changing the time at which it occurs or the manner in which it occurs) so as to modify E. And, as Woodward (2016, sec. 9) notes, interventionist accounts are ultimately counterfactual accounts, and so they are also in this way indebted to Lewis’s earlier work on causation as well as to his work on counterfactuals. While dualism about the mind is much more popular in the first two decades of the twenty-first century than it was in Lewis’s day, his argument for his identity theory, which appeals to the explanatory closure of the physical world, was an important foil for the dualists who emerged in the 1980s and 90s. And his and Nemirow’s response to the problem of qualia was also something those dualists had to address.

Lewis’s discussion of time and perdurance in Plurality generated a large debate in that area, and to a great extent set its parameters. Recall (see section 4) that he sets out three ways of solving the problem of temporary intrinsics: regarding intrinsic properties like shape to be relations to times, presentism, and his own worm theory. A lot of work was done exploring the tenability of each of these options, and exploring other nearby options. In addition, Lewis’s paper ‘The Paradoxes of Time Travel’ (1976b) is arguably responsible for an entire sub-literature on that topic.

Lewis’s metaphysics is, by and large, nominalist. But realism about universals is much more popular today than it was in the mid-20th century. As nominalistic as his views are, Lewis makes important moves away from the ideas which formed the environment in which his philosophical development took place. Quine, of course, believed that there is “no entity without identity” (for example, 1969: 23). What he intended by this is that we must have clear identity conditions for any entity whose existence we posit. This is one of the reasons why Quine was happy to recognize the existence of sets, which are individuated extensionally, that is, according to which members they have, but was skeptical of such things as properties. Lewis makes properties extensional by identifying them with sets, but goes a step further by allowing their extensions to range across all possibilia, rather than just actual entities. Lewis then goes even further in conceding, in ‘New Work for a Theory of Universals’ (1983a), that universals can do things which properties, as conceived by Lewis, cannot do. His basic distinction between properties which are perfectly natural and those which are not is rather anti-nominalistic, and this position can be understood as a bridge connecting the Quinean extensional picture of the world with the new hyperintensional picture of it, which allows for distinctions amongst entities, such as properties or propositions, that are not only extensionally equivalent, in that they apply to the same things or are all true or false at the actual world, but also intensionally equivalent, that is, they do so or are so at every possible world. An example is the pair of properties, mentioned in section 3, being a triangular polygon and being a trilateral (three-sided) polygon.
Sider (2011) generalizes Lewis’s idea from properties, which are the worldly correlates of predicates, to other sorts of entities, including the worldly correlates of predicate modifiers, sentential connectives, and quantifiers. He ends up with a very general notion of joint-carving–ness, which is a feature of certain of our linguistic expressions, and he uses the notion to characterize the notion of fundamentality, as Lewis does with naturalness (for Lewis, the perfectly natural properties are the fundamental properties, all other properties being definable in terms of them; see, for example, 1994a: 474). It is hard to say exactly what the philosophical world today would be like without Lewis. But we can be sure that it would be very different from what it is.

11. References and Further Reading

Note: Many of the papers below have been reprinted, sometimes with postscripts, in one of the collections Lewis 1983b, 1986c, 1998, 1999a, and 2000b; below, only the first appearance is cited.

a. Primary Sources

Lewis, David K. 1966. An Argument for the Identity Theory. Journal of Philosophy 63, 17–25.
Lewis, David K. 1968. Counterpart Theory and Quantified Modal Logic. Journal of Philosophy 65, 113–26.
Lewis, David K. 1969. Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lewis, David K. 1970a. Anselm and Actuality. Noûs 4, 175–88.
Lewis, David K. 1970b. How to Define Theoretical Terms. Journal of Philosophy 67, 427–46.
Lewis, David K. 1970c. Nominalistic Set Theory. Noûs 4, 225–40. Reprinted in Lewis 1998, 186–202.
Lewis, David K. 1971. Counterparts of Persons and Their Bodies. Journal of Philosophy 68, 203–11.
Lewis, David K. 1973a. Causation. Journal of Philosophy 70, 556–67.
Lewis, David K. 1973b. Counterfactuals. Oxford: Blackwell.
Lewis, David K. 1974. Semantic Analyses for Dyadic Deontic Logic. In Sören Stenlund (ed.), Logical Theory and Semantic Analysis: Essays Dedicated to Stig Kanger on His Fiftieth Birthday. Dordrecht: Reidel.
Lewis, David K. 1975. Languages and Language. In Keith Gunderson (ed.), Minnesota Studies in the Philosophy of Science. University of Minnesota Press, 3–35.
Lewis, David K. 1976a. Probabilities of Conditionals and Conditional Probabilities. Philosophical Review 85, 297–315.
Lewis, David K. 1976b. The Paradoxes of Time Travel. American Philosophical Quarterly 13, 145–52.
Lewis, David K. 1978. Reply to McMichael. Analysis 38, 85–86.
Lewis, David K. 1979a. Attitudes De Dicto and De Se. The Philosophical Review 88, 513–43.
Lewis, David K. 1979b. Counterfactual Dependence and Time’s Arrow. Noûs 13, 455–76.
Lewis, David K. 1979c. Prisoners’ Dilemma Is a Newcomb Problem. Philosophy and Public Affairs 8, 235–40.
Lewis, David K. 1980a. Mad Pain and Martian Pain. In Ned Block (ed.), Readings in Philosophy of Psychology, Vol. 1. Cambridge, MA: Harvard University Press, 216–22.
Lewis, David K. 1980b. A Subjectivist’s Guide to Objective Chance. In Richard C. Jeffrey (ed.), Studies in Inductive Logic and Probability, Vol. II. Berkeley, CA: University of California Press, 263–93.
Lewis, David K. 1981a. Are We Free to Break the Laws? Theoria 47, 113–21.
Lewis, David K. 1981b. Causal Decision Theory. Australasian Journal of Philosophy 59, 5–30.
Lewis, David K. 1981c. Why Ain’cha Rich? Noûs 15, 377–80.
Lewis, David K. 1983a. New Work for a Theory of Universals. Australasian Journal of Philosophy 61, 343–77.
Lewis, David K. 1983b. Philosophical Papers, Vol. I. Oxford: Oxford University Press.
Lewis, David K. 1984. Devil’s Bargains and the Real World. In Douglas MacLean (ed.), The Security Gamble: Deterrence in the Nuclear Age. Totowa, NJ: Rowman and Allanheld, 141–54.
Lewis, David K. 1986a. Buy Like a MADman, Use Like a NUT. QQ 6, 5–8.
Lewis, David K. 1986b. On the Plurality of Worlds. Oxford: Blackwell.
Lewis, David K. 1986c. Philosophical Papers, Vol. II. Oxford: Oxford University Press.
Lewis, David K. 1986d. Probabilities of Conditionals and Conditional Probabilities II. Philosophical Review 95, 581–89.
Lewis, David K. 1987. The Punishment that Leaves Something to Chance. In Proceedings of the Russellian Society (University of Sydney) 12, 81–97. Also in Philosophy and Public Affairs 18, 53–67.
Lewis, David K. 1988a. Desire as Belief. Mind 97, 323–32.
Lewis, David K. 1988b. Statements Partly About Observation. Philosophical Papers 17, 1–31.
Lewis, David K. 1988c. What Experience Teaches. Proceedings of the Russellian Society (University of Sydney) 13, 29–57.
Lewis, David K. 1989a. Academic Appointments: Why Ignore the Advantage of Being Right? In Ormond Papers, Ormond College, University of Melbourne.
Lewis, David K. 1989b. Finite Counterforce. In Henry Shue (ed.), Nuclear Deterrence and Moral Restraint. Cambridge: Cambridge University Press, 51–114.
Lewis, David K. 1989c. Mill and Milquetoast. Australasian Journal of Philosophy 67, 152–71.
Lewis, David K. 1991. Parts of Classes. Oxford: Blackwell.
Lewis, David K. 1993a. Evil for Freedom’s Sake. Philosophical Papers 22, 149–72.
Lewis, David K. 1993b. Mathematics is Megethology. Philosophia Mathematica 3, 3–23.
Lewis, David K. 1994a. Humean Supervenience Debugged. Mind 103, 473–90.
Lewis, David K. 1994b. Reduction of Mind. In Samuel Guttenplan (ed.), A Companion to the Philosophy of Mind. Oxford: Blackwell, 412–31.
Lewis, David K. 1995. Should a Materialist Believe in Qualia? Australasian Journal of Philosophy 73, 140–44.
Lewis, David K. 1996a. Desire as Belief II. Mind 105, 303–13.
Lewis, David K. 1996b. Elusive Knowledge. Australasian Journal of Philosophy 74, 549–67.
Lewis, David K. 1997a. Do We Believe in Penal Substitution? Philosophical Papers 26, 203–09.
Lewis, David K. 1997b. Finkish Dispositions. The Philosophical Quarterly 47, 143–58.
Lewis, David K. 1998. Papers in Philosophical Logic. Cambridge: Cambridge University Press.
Lewis, David K. 1999a. Papers on Metaphysics and Epistemology. Cambridge: Cambridge University Press.
Lewis, David K. 1999b. Why Conditionalize? In Lewis 1999a. (Written in 1972.)
Lewis, David K. 2000a. Causation as Influence. Journal of Philosophy 97, 182–97.
Lewis, David K. 2000b. Papers in Ethics and Social Philosophy. Cambridge: Cambridge University Press.
Lewis, David K. 2002. Tensing the Copula. Mind 111, 1–13.
Lewis, David K. 2004. Causation as Influence (extended version). In John Collins, Ned Hall, and L. A. Paul (eds), Causation and Counterfactuals. Cambridge, MA: MIT Press, 75–106.
Lewis, David K. 2007. Divine Evil. In Louise M. Antony (ed.), Philosophers without Gods: Meditations on Atheism and the Secular Life. Oxford: Oxford University Press.

b. Secondary Sources

Armstrong, David M. 1978a. Universals and Scientific Realism, Vol. I: Nominalism and Realism. Cambridge: Cambridge University Press.
Armstrong, David M. 1978b. Universals and Scientific Realism, Vol. II: A Theory of Universals. Cambridge: Cambridge University Press.
Armstrong, David M. 1983. What Is a Law of Nature? Cambridge: Cambridge University Press.
Churchland, Paul. 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy 78, 67–90.
Fine, Kit. 1975. Critical Notice of Counterfactuals. Mind 84, 451–58.
Goodman, Nelson. 1947. The Problem of Counterfactual Conditionals. Journal of Philosophy 44, 113–28.
Goodman, Nelson. 1955. Fact, Fiction, and Forecast. Cambridge, MA: Harvard University Press.
Hall, Ned. 2004. Two Concepts of Causation. In John Collins, Ned Hall, and L.A. Paul (eds), Causation and Counterfactuals. Cambridge, MA: The MIT Press, 225–76.
Kripke, Saul A. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
Nemirow, Laurence. 1979. Functionalism and the Subjective Quality of Experience. Doctoral Dissertation, Stanford University.
Nemirow, Laurence. 1980. Review of Thomas Nagel, Mortal Questions. Philosophical Review 89, 475–76.
Nemirow, Laurence. 1990. Physicalism and the Cognitive Role of Acquaintance. In William G. Lycan (ed.), Mind and Cognition. Oxford: Blackwell.
Nolan, Daniel. 2005. David Lewis. Chesham: Acumen.
Quine, William Van Orman. 1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
Sider, Theodore. 1996. All the World’s a Stage. Australasian Journal of Philosophy 74, 433–53.
Sider, Theodore. 2001. Four-Dimensionalism: An Ontology of Persistence and Time. Oxford: Oxford University Press.
Sider, Theodore. 2011. Writing the Book of the World. Oxford: Oxford University Press.
Stalnaker, Robert C. 1968. A Theory of Conditionals. In Nicolas Rescher (ed.), Studies in Logical Theory, American Philosophical Quarterly Monograph Series, Vol. 2. Oxford: Blackwell, 98–112.
van Inwagen, Peter. 1990. Material Beings. Ithaca, NY: Cornell University Press.
Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.
Woodward, James. 2016. Causation and Manipulability. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.

c. Further Reading

Nolan, Daniel. 2005. David Lewis. Chesham: Acumen.
Jackson, Frank and Graham Priest (eds). 2004. Lewisian Themes: The Philosophy of David K. Lewis. Oxford: Oxford University Press.
Loewer, Barry and Jonathan Schaffer (eds). 2015. A Companion to David Lewis. Oxford: Blackwell.
Weatherson, Brian. 2016. David Lewis. In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy.

Author Information

Scott Dixon
Email: ts.dixon@ashoka.edu.in
Ashoka University
India

Meaning and Context-Sensitivity

Truth-conditional semantics explains meaning in terms of truth-conditions. The meaning of a sentence is given by the conditions that must obtain in order for the sentence to be true. The meaning of a word is given by its contribution to the truth-conditions of the sentences in which it occurs.

What a speaker says by the utterance of a sentence depends on the meaning of the uttered sentence. Call what a speaker says by the utterance of a sentence the content of the utterance. Natural languages contain many words whose contribution to the content of utterances varies depending on the contexts in which they are uttered. The typical example of words of this kind is the pronoun ‘I’. Utterances of the sentence ‘I am hungry’ change their contents depending on who the speaker is. If John is speaking, the content of his utterance is that John is hungry, but if Mary is speaking, the content of her utterance is that Mary is hungry.

The words whose contribution to the contents of utterances depends on the context in which the words are uttered are called context-sensitive. Their meanings guide speakers in using language in particular contexts to express particular contents.

This article presents the main theories in philosophy of language that address context-sensitivity. Section 1 presents the orthodox view in truth-conditional semantics. Section 2 presents linguistic pragmatism, also known as ‘contextualism’, which comprises a family of theories that converge on the claim that the orthodox view is inadequate to account for the complexity of the relations between meanings and contexts. Sections 3 and 4 present indexicalism and minimalism, which from two different perspectives try to resist the objections raised by linguistic pragmatism against the orthodox view. Section 5 presents relativism, which provides a newer conceptualization of the relations between meanings and contexts.

1. The Orthodox View in Truth-Conditional Semantics
2. Departing from the Orthodox View: Linguistic Pragmatism
3. Defending the Orthodox View: Indexicalism
4. Defending the Autonomy of Semantics: Minimalism
5. Defending Invariant Semantic Contents: Relativism
6. References and Further Reading
a. References
b. Further Reading

1. The Orthodox View in Truth-Conditional Semantics

a. Context-Sensitive Expressions and the Basic Set

The orthodox view in truth-conditional semantics maintains that the content (proposition, truth-condition) of an utterance of a sentence is the result of assigning contents, or semantic values, to the elements of the sentence uttered in accord with their meanings and combining them in accord with the syntactic structure of the sentence. The content of the utterance is determined by the conventional meanings of the words that occur in the sentence.

Conventional meanings are divided into two kinds. Meanings of the first kind determine semantic values that remain constant in all contexts of utterance. Meanings of the second kind provide guidance for the speaker to exploit information from the context of utterance to express semantic values. Linguistic expressions governed by meanings of the second kind are context-sensitive and can be used to express different semantic values in different contexts of utterance. The following is a list of some context-sensitive expressions (Donaldson and Lepore 2012):

Personal pronouns: I, you, she

Demonstratives: this, that

Adjectives: present, local, foreign

Adverbs: here, now, today

Nouns: enemy, foreigner, native

Cappelen and Lepore (2005) call the set of expressions that exhibit context-sensitivity in their conventional meaning the Basic Set. Compare the following pair of utterances:

(1) I am hungry (uttered by John).

(2) John is hungry (uttered by Mary).

Utterance (1) and utterance (2) have the same truth-conditional content. Both are true if and only if John is hungry. Yet, the sentence ‘I am hungry’ and the sentence ‘John is hungry’ have different meanings. The meaning of the first-person pronoun ‘I’ prescribes that a speaker can utter it only to refer to herself. Only John can utter the sentence ‘I am hungry’ to say that John is hungry. In a context where the speaker is not John, the sentence ‘I am hungry’ cannot be uttered to say that John is hungry. The meaning of the proper name ‘John’, instead, allows for speakers in different contexts of utterance to refer to John. In all contexts of utterance the sentence ‘John is hungry’ can be uttered to say that John is hungry.

b. Following Kaplan: Indexes and Characters

Since David Kaplan’s works (1989a, 1989b) in formal semantics, the conventional meaning of a word has been modeled as a function from an index, which represents features of the context of utterance, to a semantic value. The features of the context of utterance include who is speaking, when, where, the object the speaker refers to with a demonstrative, and the possible world where the utterance takes place. Adopting Kaplan’s terminology, philosophers call the function from indexes to semantic values a character. The semantic values of the words in a sentence relative to an index are composed into a function that distributes truth-values at points of evaluation, that is, pairs of possible worlds and times. The formal semantic machinery determines the condition under which a sentence relative to a given index is true at a world and a time. For example, John’s utterance (1) is represented as the pair formed of the sentence ‘I am hungry’ and an index i that contains John as speaker. The semantic machinery determines the truth-condition of this pair so that the sentence ‘I am hungry’ at the index i is true at a possible world w and a time t if and only if the speaker of i is hungry in w at t; that is, if and only if John is hungry in w at time t. If Mary uttered the sentence ‘I am hungry’, another index i* with Mary as speaker would be needed to represent her utterance. The semantic machinery would ascribe to the sentence ‘I am hungry’ at the index i* the content that is true at a possible world w and a time t if and only if Mary is hungry in w at time t.
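The two-step picture just described (index to semantic value, then points of evaluation to truth-values) can be sketched in code. What follows is only a toy illustration under stated assumptions, not Kaplan’s formal system: the names Index, HUNGRY, character_i and content_i_am_hungry are hypothetical, and the facts about who is hungry are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Index:
    """Simplified, hypothetical record of the features of a context of utterance."""
    speaker: str
    time: str
    world: str


# Invented toy facts: who is hungry at each (world, time) point of evaluation.
HUNGRY = {("w0", "noon"): {"John"}, ("w1", "noon"): {"Mary"}}


def character_i(index: Index) -> str:
    """Character of 'I': a function from an index to its speaker."""
    return index.speaker


def content_i_am_hungry(index: Index) -> Callable[[str, str], bool]:
    """Content of 'I am hungry' at an index: a function from
    points of evaluation (world, time) to truth-values."""
    referent = character_i(index)  # the index settles the referent once

    def content(world: str, time: str) -> bool:
        return referent in HUNGRY.get((world, time), set())

    return content


i = Index(speaker="John", time="noon", world="w0")       # represents John's utterance
i_star = Index(speaker="Mary", time="noon", world="w0")  # represents Mary's utterance

johns_content = content_i_am_hungry(i)
marys_content = content_i_am_hungry(i_star)

print(johns_content("w0", "noon"))  # True: John is hungry in w0 at noon
print(johns_content("w1", "noon"))  # False: John is not hungry in w1 at noon
print(marys_content("w1", "noon"))  # True: Mary is hungry in w1 at noon
```

The same sentence yields two different contents because the index changes, while each content, once fixed, can be evaluated at any world and time.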

In formal semantics, then, context-sensitive meanings are characters whose values vary with the indexes that represent features of the contexts of utterance, where indexes are tuples of slots, or parameters, to be filled in in order for sentences at indexes to have a truth-conditional content. The meanings of context-insensitive expressions, instead, are characters that remain constant in all indexes. For example, the meaning of the proper name ‘John’ is a constant character that returns John as semantic value in all indexes. No matter who is speaking, when, or where, John is the semantic value of the proper name ‘John’, and the sentence ‘John is hungry’, relative to all indexes, is true at a world w and time t if and only if John is hungry in w at time t.
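The contrast between constant and variable characters can be made concrete with a small sketch. It assumes a simplified index with only three slots, and the helper is_constant checks constancy only over a finite sample of indexes; all names are hypothetical and chosen for illustration.

```python
from typing import Callable, NamedTuple, Sequence


class Index(NamedTuple):
    """A tuple of slots (parameters) representing a context of utterance."""
    speaker: str
    time: str
    place: str


Character = Callable[[Index], str]


def character_john(index: Index) -> str:
    """Constant character: returns John whatever the index is."""
    return "John"


def character_i(index: Index) -> str:
    """Variable character: returns whoever fills the speaker slot."""
    return index.speaker


def is_constant(character: Character, indexes: Sequence[Index]) -> bool:
    """True if the character returns the same value at every sampled index."""
    return len({character(index) for index in indexes}) <= 1


SAMPLE = [
    Index("John", "noon", "Rome"),
    Index("Mary", "noon", "Rome"),
    Index("Mary", "midnight", "Oslo"),
]

print(is_constant(character_john, SAMPLE))  # True
print(is_constant(character_i, SAMPLE))     # False
```

A finite check of this kind cannot establish that a character is constant over all indexes; it only displays the difference in behavior that the definitions describe.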

It is convenient here to introduce an aspect relevant to section 5. Since the indexes that are used to represent features of contexts of utterance contain possible worlds and times, the semantic machinery distributes unrelativised truth-values to index-sentence pairs. A sentence S at index i is true (simpliciter) if and only if S is true in i_w at i_t, where i_w and i_t are the possible world and the time of the index i; see Predelli (2005: 22). For example, if John utters the sentence ‘I am hungry’ at noon on 23 November 2019, the index that represents the features of John’s context of utterance contains the time noon on 23 November 2019 and the actual world. John’s utterance is true (simpliciter) if and only if John is hungry at noon on 23 November 2019 in our actual world.
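Truth simpliciter amounts to evaluating a sentence at the world and time that its own index supplies. The sketch below shows that step under simplifying assumptions: the time label, the set of facts, and the names Index, HUNGRY_AT, i_am_hungry and true_simpliciter are all hypothetical, and the facts are stipulated only so that the example has something to evaluate.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Index:
    """Hypothetical index with the slots needed for this example."""
    speaker: str
    world: str  # i_w
    time: str   # i_t


# Stipulated facts: who is hungry at each (world, time) point.
HUNGRY_AT = {("actual", "2019-11-23 12:00"): {"John"}}

Sentence = Callable[[Index, str, str], bool]


def i_am_hungry(index: Index, world: str, time: str) -> bool:
    """'I am hungry' at an index, evaluated at a world and a time."""
    return index.speaker in HUNGRY_AT.get((world, time), set())


def true_simpliciter(sentence: Sentence, index: Index) -> bool:
    """S at i is true simpliciter iff S at i is true in i_w at i_t."""
    return sentence(index, index.world, index.time)


johns_index = Index(speaker="John", world="actual", time="2019-11-23 12:00")

print(true_simpliciter(i_am_hungry, johns_index))              # True
print(i_am_hungry(johns_index, "actual", "2019-11-24 12:00"))  # False
```

The second call shows the relativised notion still at work: the same sentence-index pair can be evaluated at a point other than the one the index contains.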

c. Context-Sensitivity and Saturation

The orthodox truth-conditional view in semantics draws the distinction between the meaning of an expression type and the content of an utterance of the expression. The meaning of the expression type is the linguistic rule that governs the use of the expression. Context-insensitive expressions are governed by linguistic rules that determine their contents (semantic values), which remain invariant in all contexts of utterance. Context-sensitive expressions, instead, are governed by linguistic rules that prescribe how the speaker can use them to express contents in contexts of utterance.

The meanings of context-sensitive expressions specify what kinds of contextual factors play certain roles with respect to utterances. More precisely, the meanings of context-sensitive expressions fix the parameters that have to be filled in in order for utterances to have contents. Philosophers and linguists use the technical term saturation for what the speaker does by filling in the demanded parameters with values taken from contextual factors. Indexicals are typical examples of context-sensitive expressions. For example, the meaning of the pronoun ‘I’ establishes that an utterance of it refers to the agent that produces it. The meaning of the demonstrative ‘that’ establishes that an utterance of it refers to the object that plays the role of demonstratum in the context of utterance. Thus, the meaning of ‘I’ demands that the speaker fill in an individual, typically herself, as the value of the parameter speaker of the utterance. And the meaning of ‘that’ demands that the speaker fill in a particular object she has in mind as the value of the parameter demonstratum.

In formal semantics the parameters that are filled in with values are represented with indexes, and the meanings of expressions are functions, called characters, from indexes to contents. The meanings of context-insensitive expressions are constant characters, while the meanings of context-sensitive expressions are variable characters. If a sentence contains no context-sensitive expressions, it can be uttered to express the same content in all contexts of utterance. By contrast, if a sentence contains context-sensitive expressions, it can be used to express different contents in different contexts of utterance.
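
The contrast between constant and variable characters can be sketched in Python as follows. This is a minimal illustration, assuming a toy index with only two parameters; the names `Index`, `is_constant`, and the sample indexes are invented for the example.

```python
from typing import Callable, Iterable, NamedTuple


class Index(NamedTuple):
    """A toy index with two parameters to be filled in by saturation."""
    agent: str
    demonstratum: str


Character = Callable[[Index], str]


def john(index: Index) -> str:
    # Context-insensitive: the same content at every index.
    return "John"


def i(index: Index) -> str:
    # 'I' refers to the agent of the utterance.
    return index.agent


def that(index: Index) -> str:
    # 'that' refers to the demonstratum of the context.
    return index.demonstratum


def is_constant(character: Character, indexes: Iterable[Index]) -> bool:
    """True if the character yields one and the same content across the given indexes."""
    return len({character(ix) for ix in indexes}) <= 1


INDEXES = [Index("Anna", "the cup"), Index("George", "the door")]
print(is_constant(john, INDEXES))  # True
print(is_constant(i, INDEXES))     # False
```

Checking constancy over a finite sample of indexes is of course weaker than constancy over all indexes; the sketch only shows the shape of the distinction.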

d. Grice on What is Said and the Syntactic Constraint

One of the main tenets of the orthodox truth-conditional view is that all context-sensitivity is linguistically triggered in sentences or in their logical forms. The presence of each component of the truth-conditional content of an utterance of a sentence is mandatorily governed by a linguistic element occurring in the uttered sentence or in its logical form. For this reason, some philosophers equate the distinction between the meanings of expression types and the contents of utterances with Paul Grice’s (1989) distinction between sentence meaning and what is said by an utterance of a sentence. The sentence meaning is given by the composition of the meanings of the words that occur in the sentence. What is said corresponds to the truth-conditional content that the speaker expresses by undertaking the processes of disambiguation, reference assignment, and saturation that are required by her linguistic and communicative intentions and by the meanings of the uttered words.

Grice held that what is said is part of the speaker’s meaning. The speaker’s meaning is the content that the speaker intends to communicate by an utterance of a sentence. In Grice’s view, the speaker’s meaning comprises two parts: What is said and what is implicated. What is said is the content that the speaker explicitly and directly communicates by the utterance of a sentence. What is implicated is the content the speaker intends to convey indirectly. Grice called the contents that are indirectly conveyed implicatures. Implicatures can be inferred from what is said and general principles governing communication: the cooperative principle and its maxims. To illustrate Grice’s distinctions, suppose that at a party A, pointing to Bob and speaking to B, utters the following sentence:

(3) That guest is thirsty.

Following Grice, the utterance of (3) can be analysed at three distinct levels. (i) The level of sentence meaning is given by the linguistic conventions that govern the use of the words in the sentence. Due to linguistic competence alone, the hearer B understands that A’s utterance is true if and only if the individual, to whom A refers with the complex demonstrative ‘that guest’, is thirsty. (ii) The second level is given by what A says, that is, the truth-conditional content A’s utterance expresses. What is said—the content of A’s utterance—is that Bob is thirsty. To understand this content, B must consider A’s expressive and communicative intentions. B must understand that A has Bob in mind and wants to refer to him. To do so, B needs to rely on his pragmatic competence and contextual information. Mere linguistic competence is not enough. (iii) Finally, there is the level of what is meant through a conversational implicature. A intends that B offer Bob some champagne. Grice’s idea is that to understand what A intended to communicate, B must first understand the content of what A said—that Bob is thirsty—and then understand the implicature that it would be nice to offer Bob some champagne.

One very important aspect of Grice’s view is that each element that enters the content of what is said corresponds to some linguistic expression in the sentence. Grice maintained that what is said is “closely related to conventional meanings of words” (1989: 25). Grice imposed a syntactic constraint on what is said, according to which each element of what is said must correspond to an element of the sentence uttered. Carston (2009) speaks of the ‘Isomorphism Principle’, which states that if an utterance of a sentence S expresses the propositional content P, then the constituents of P must correspond to the semantic values of some constituents of S or of its logical form.

e. Semantic Contents of Utterances

Some philosophers reject the equation of the notion of content of an utterance with Grice’s notion of what is said. For example, Korta and Perry (2007) maintain that the content of an utterance is determined by the conventional meanings of the words the speaker utters and by the fact that the speaker undertakes all the semantic burdens that are demanded by those meanings, in particular disambiguation, reference assignment, and saturation of context-sensitive expressions. Korta and Perry call the content of an utterance so determined locutionary content (see also Bach 2001) and argue that there are clear cases in which the locutionary content does not coincide with Grice’s what is said, which is always part of what the speaker intends to communicate, that is, the speaker’s meaning. Irony is a typical example of this distinction. When, pointing to X, a speaker utters the sentence:

(4) He is a fine friend

ironically, the speaker does not intend to communicate that X is a fine friend, but the opposite. Nonetheless, without identifying the referent of ‘he’ and the literal content of ‘is a fine friend’, that is, without understanding the locutionary content of (4), the hearer is not able to understand the speaker’s ironic comment.

To illustrate in detail the debate on Grice’s notion of what is said goes beyond the purpose of this article. It is important to remark here that, according to the orthodox truth-conditional view—at least when speakers use language literally—what is said by an utterance of a sentence corresponds to the content that is determined by the conventional meanings of the words in the uttered sentence: The speaker undertakes all the semantic burdens that are demanded by those meanings, such as disambiguation, reference assignment, and saturation of context-sensitive expressions. When a speaker uses language literally, the content of an utterance of a sentence is what one gets by composing the semantic values of the expressions that occur in accord with their conventional meanings and the syntactic structure of the sentence. This content is a fully propositional one with a determinate truth-condition. This picture, which underlies the orthodox truth-conditional view in semantics, has been challenged by philosophers who call for a new theoretical approach. This new approach is called linguistic pragmatism and it expands the truth-conditional roles of pragmatics. The following section presents it.

2. Departing from the Orthodox View: Linguistic Pragmatism

a. Underdetermination of Semantic Contents

Neale (2004) coined the term ‘linguistic pragmatism’, though some philosophers and linguists prefer the term ‘contextualism’. Linguistic pragmatism comprises a family of theories (Bach 1994, 2001, Carston 2002, Recanati 2004, 2001, Sperber and Wilson 1986) that converge on one main thesis, that of semantic underdetermination. Linguistic pragmatists maintain that the meanings of most expressions—perhaps all, according to radical versions of linguistic pragmatism—underdetermine their contents in contexts, and pragmatic processes that are not linguistically governed are required to determine them. The main point of linguistic pragmatism is the distinction between semantic underdetermination and indexicality.

The orthodox view accepts that context-sensitivity is codified in the meanings of indexical expressions, which demand saturation processes. Linguistic pragmatists too accept this form of context-sensitivity, but according to them indexicality does not exhaust context-sensitivity. Linguistic pragmatists say that the variability of contents in contexts of many expressions is not codified in linguistic conventions. Rather, the variability of contents in contexts is due to the fact that the meanings of the expressions underdetermine their contents. Speakers must complete the meanings of the expressions with contents that are not determined by linguistic conventions codified in those meanings. The pragmatic operations that intervene in the process of completing the contents in context are not governed by conventions of the language, that is, by linguistic information, but work on more general contextual information.

Linguistic pragmatists make use of three kinds of arguments to support their view:

(i) Context-shifting arguments test people’s intuitions about the content of sentences in actual or hypothetical contexts of utterance. If people have the intuition that a sentence S expresses differing contents in different contexts, despite the fact that no overt context-sensitive expression occurs in S, it is evidence that some expression that occurs in S is semantically underdetermined. Consider the following example. Mark is 185 cm tall, and George utters the sentence:

(5) Mark is short

in a conversation about the average height of basketball players and then in a conversation about the average height of American citizens. People have the intuition that what George said in the first context is true while what he said in the second context is false. Linguistic pragmatists draw the conclusion that the content of (5) varies through contexts of utterance, despite the fact that the adjective ‘short’ is not an overt context-sensitive expression. They argue that the content of ‘short’ is underdetermined by its conventional meaning and explain the variation in content from context to context as a result of pragmatic processes that are not linguistically governed but nonetheless complete the meaning of ‘short’.

(ii) Incompleteness arguments too test people’s intuitions about the contents of sentences in context, pointing at people’s inability to evaluate the truth-value of a sentence without taking into account contextual information. Suppose George utters the sentence:

(6) Anna is ready.

People cannot say whether George’s utterance is true or false without considering what Anna is said to be ready for. The conclusion now is that (6) does not express a full propositional content with determinate truth-conditions. There is no such thing as Anna’s being ready simpliciter. The explanation is semantic underdetermination: The adjective ‘ready’ does not provide an invariant contribution to a full propositional content and it does not provide guidance to determine such a contribution either, because it is not an overt context-sensitive expression. The enrichment that is required to determine a full truth-conditional content is the result of a pragmatic process that is not governed by the meaning of ‘ready’.

(iii) Inappropriateness arguments spot the difference between the content that is strictly encoded in a sentence and the contents that are expressed by utterances of that sentence in different contexts. Suppose a math teacher utters the following sentence in the course of a conversation about her class:

(7) There are no French girls.

People usually understand the math teacher to say that there are no French girls attending the math class. Some philosophers say that in this case there is an invariant semantic content composed out of the meanings of the words in the sentence: French girls do not exist. However, it seems awkward both to claim that in uttering (7) the speaker says that French girls do not exist and to claim that hearers understand (7) as denying the existence of French girls in general. On the contrary, it seems convenient to suppose that both speakers and hearers restrict the interpretation of (7) to a particular domain, such as the students attending the math class.

b. Completions and Expansions

The claim on which all versions of linguistic pragmatism agree is that very often the content of an utterance is richer than the content obtained by composing the semantic values of the expressions in the uttered sentence. Adopting a terminology from Bach (1994), it is common to distinguish two cases of pragmatic enrichments: completions and expansions.

With completions, the content determined by the meanings of the expressions that occur in a sentence is incomplete because it lacks full truth-conditions. These cases often recur in context-shifting arguments and incompleteness arguments:

(5) Mark is short.

(6) Anna is ready.

People do not know what conditions a person must meet to be short or ready simpliciter, so it appears there are no determinate conditions making a person so. To obtain a truth-conditional content it is necessary to add elements that do not correspond to any expression in (5) and (6). Linguistic pragmatists maintain that what is said is a completion of the content that is obtained by composing the meanings of the expressions in the sentence with some completion taken from the context. For instance, the contents of (5) and (6) could be completed in ways that might be expressed as follows:

(5*) Mark is short with respect to the average height of basketball players.

(6*) Anna is ready to climb Eiger’s North Face.
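
A completion can be pictured as supplying an argument that the sentence itself leaves open. The Python sketch below is a simplification under loud assumptions: the average heights are invented figures, and treating ‘short’ as ‘below the average of the comparison class’ is a modelling choice made only for illustration.

```python
from typing import Optional

# Illustrative figures only; they are assumptions, not data from this article.
AVERAGE_HEIGHT_CM = {
    "basketball players": 200.0,
    "American citizens": 170.0,
}
HEIGHT_CM = {"Mark": 185.0}


def is_short(person: str, comparison_class: Optional[str] = None) -> Optional[bool]:
    """Evaluate '<person> is short'.

    Without a comparison class there is no determinate truth-value,
    which the sketch represents by returning None.
    """
    if comparison_class is None:
        return None
    return HEIGHT_CM[person] < AVERAGE_HEIGHT_CM[comparison_class]


print(is_short("Mark"))                        # None
print(is_short("Mark", "basketball players"))  # True
print(is_short("Mark", "American citizens"))   # False
```

The three calls mirror the situation described above: (5) alone yields no truth-value, while the completed content (5*) and its counterpart for American citizens receive opposite truth-values.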

With expansions, the content of an utterance of a sentence is an enrichment of the literal content obtained by composing the semantic values of the expressions in the sentence. Some interesting cases of expansions are employed in inappropriateness arguments. Consider the following examples:

(8) All the students got an A.

(9) Anna has nothing to wear.

In these cases, there is a complete content that does not correspond to the content of the utterance. (8) expresses the content that all students in existence got an A, and (9) expresses the content that Anna has no clothes to wear at all. However, these sentences are usually used to express different contents. For example, (8) can be used by the logic professor to say that all students in her class got an A, and (9) can be used to say that Anna has no appropriate dress for a particular occasion.
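
The difference between the literal content of (8) and its expanded content can be sketched as quantification over an unrestricted versus a restricted domain. The students and grades in this Python example are hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical grades for every student in the toy model.
GRADES = {"Ann": "A", "Ben": "A", "Carl": "B", "Dina": "C"}
LOGIC_CLASS = {"Ann", "Ben"}


def all_got_an_a(domain: Optional[Iterable[str]] = None) -> bool:
    """Evaluate 'All the students got an A'.

    With no domain supplied, the quantifier ranges over every student
    in the model (the literal content). With a domain, it ranges only
    over that domain (the expanded content).
    """
    students = GRADES.keys() if domain is None else domain
    return all(GRADES[s] == "A" for s in students)


print(all_got_an_a())             # False
print(all_got_an_a(LOGIC_CLASS))  # True
```

Unlike the completion cases, the unrestricted call already returns a truth-value; the expansion merely replaces it with the content the speaker is ordinarily taken to express.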

c. Saturation and Modulation

Linguistic pragmatists maintain that completions and expansions are obtained through pragmatic processes that are not linguistically driven by conventional meanings. Recanati draws a distinction between saturation and modulation: Processes of saturation are mandatory pragmatic processes required to determine the semantic contents of linguistic expressions (bottom-up or linguistically driven processes). Processes of modulation are optional pragmatic processes that yield completions and expansions (top-down or ‘free’ processes).

Pragmatic processes of saturation are directed and governed by the linguistic meanings of context-sensitive expressions. For instance, the linguistic meaning of the demonstrative ‘that’ demands the selection of a salient object in the context of utterance to determine the referent of the demonstrative. In contrast, pragmatic processes of modulation are optional because they are not activated by linguistic meanings. They are not activated for the simple reason that the elements that form completions and expansions do not match any linguistic expression in the sentence. Recanati distinguishes three types of pragmatic processes of modulation:

(i) Free enrichment is a process that narrows the conditions of application of linguistic expressions. Some of the above examples are cases of free enrichment. In (8) the domain of the quantifier ‘all students’ is restricted to the logic class and in (9) the domain of ‘nothing to wear’ is restricted to appropriate dresses for a given occasion. In (5) the conditions of application of the adjective ‘short’ are restricted to people whose height is lower than that of the average basketball player. In (6) the conditions of application of the adjective ‘ready’ are restricted to people who have acquired the technical and physical ability to climb Eiger’s North Face.

(ii) Loosening is a process that widens the conditions of application of words specifying the degree of approximation. Here is one example used by Recanati:

(10) The ATM swallowed my credit card.

Literally speaking, an ATM cannot swallow anything because it does not have a digestive system. In this case, the conditions of application of the verb ‘swallow’ are made loose so as to include a wider range of actions. Another example of loosening is the following:

(11) France is hexagonal.

This sentence does not say that the borders of France draw a perfect hexagon, but that it does so approximately.

(iii) Semantic transfer is a process that maps the meaning of an expression onto another meaning. The following is an example of semantic transfer. Suppose a waiter in a bar says to his boss:

(12) The ham sandwich left without paying.

Through a process of modulation, the meaning of the phrase ‘the ham sandwich’ is mapped onto the meaning of the phrase ‘the customer who ordered the ham sandwich’.
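
The three types of modulation just listed can be sketched as operations on meanings. The Python example below is a toy model, not Recanati’s own formalism: predicates are represented as functions from names to truth-values, and all data and function names are assumptions made for illustration.

```python
from typing import Callable, Dict

Predicate = Callable[[str], bool]


def free_enrichment(predicate: Predicate, restriction: Predicate) -> Predicate:
    """Narrow the conditions of application: both tests must pass."""
    return lambda x: predicate(x) and restriction(x)


def loosening(predicate: Predicate, close_enough: Predicate) -> Predicate:
    """Widen the conditions of application: approximate cases also pass."""
    return lambda x: predicate(x) or close_enough(x)


def semantic_transfer(meaning: str, mapping: Dict[str, str]) -> str:
    """Map one meaning onto another; leave it unchanged if no mapping applies."""
    return mapping.get(meaning, meaning)


# Toy data, invented for illustration only.
STUDENTS = {"Ann", "Ben", "Carl"}
LOGIC_CLASS = {"Ann", "Ben"}
PERFECT_HEXAGONS = {"regular hexagon"}
ROUGH_HEXAGONS = {"France"}
ORDERS = {"the ham sandwich": "the customer who ordered the ham sandwich"}


def is_student(x: str) -> bool:
    return x in STUDENTS


def is_hexagonal(x: str) -> bool:
    return x in PERFECT_HEXAGONS


student_in_logic_class = free_enrichment(is_student, lambda x: x in LOGIC_CLASS)
roughly_hexagonal = loosening(is_hexagonal, lambda x: x in ROUGH_HEXAGONS)

print(student_in_logic_class("Carl"))  # False
print(roughly_hexagonal("France"))     # True
print(semantic_transfer("the ham sandwich", ORDERS))
```

In this picture none of the three operations is called for by the expression itself: each takes a meaning as input and returns a modulated meaning, which is one way of representing the claim that modulation is optional rather than linguistically mandated.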

d. Core Ideas and Differences among Linguistic Pragmatists

The orthodox truth-conditional view distinguishes two kinds of pragmatic processes, primary ones and secondary ones. Primary pragmatic processes contribute to determining the contents of utterances for context-sensitive expressions. Secondary pragmatic processes contribute to conversational implicatures and are activated after the composition of the contents of utterances has been accomplished. The fundamental aspect of the orthodox view that linguistic pragmatists reject is the idea that primary pragmatic processes are only processes of saturation, which are activated and driven by conventional meanings of words. Linguistic pragmatists affirm that primary pragmatic processes also include processes of modulation that are not encoded in linguistic meanings. According to linguistic pragmatism, the process of truth-conditional composition that gives the contents of utterances is systematically underdetermined by linguistic meanings.

The different versions of linguistic pragmatism are all unified by the criticism of the orthodox view. Recanati calls the content of an utterance in the pragmatist conception ‘pragmatic truth-conditions’, Bach speaks of ‘implicitures’, Carston of ‘explicatures’. There are important and substantive differences among these notions. For Bach an impliciture is a pragmatic enrichment of the strict semantic content that is determined by linguistic meanings alone and can be truth-conditionally incomplete. The strict semantic content is like a template that needs to be filled. Recanati argues that Bach’s strict semantic content is only a theoretical abstraction that does not perform any proper role in the computation of what is said. Carston and relevance theorists like Sperber and Wilson adopt a similar view, but—in contrast with Recanati—they affirm that primary and secondary pragmatic processes are, from a cognitive point of view, processes of the same kind that are explained by the principle of relevance, according to which one accepts the interpretation that satisfies the expectation of relevance with the least effort.

However, there is something on which Bach, Recanati, Carston, Sperber and Wilson all agree: Very often, semantic interpretation alone gives at most semantic schemata, and only with the help of pragmatic processes of modulation can a complete propositional content be obtained.

Finally, the most radical views of Searle (1978), Travis (2008), and Unnsteinsson (2014) claim that conventional meanings do not exist. Speakers rely upon models of past applications of words and any new interpretation of a word arises from a process of modulation from one of its past applications. The latest works by Carston (2019) tend to develop a similar view. Radical linguistic pragmatists reject even the idea that semantics provides schemata to be pragmatically enriched by modulation processes. In their view, the difficulty is to explain what such an incomplete semantic content might be for many expressions. Think, for example, of ‘red’. It is difficult to individuate a semantic content, no matter how incomplete, that is shared in ‘red car’, ‘red hair’, ‘red foliage’, ‘red rashes’, ‘red light’, ‘red apple’, etc. It is even more difficult to explain how this alleged incomplete content could be enriched into the contents that people convey with those expressions.

The next section is devoted to indexicalism, a family of theories that react against linguistic pragmatism.

3. Defending the Orthodox View: Indexicalism

a. Extending Indexicality and Polysemy

Indexicalists attempt to recover the orthodox truth-conditional approach in semantics from the charge of semantic underdetermination raised by linguistic pragmatists. Indexicalists reject the thesis of semantic underdetermination and explain the variability of utterances’ contents in contexts with the resources of the orthodox truth-conditional view, mainly by enlarging the range of indexicality and the range of polysemy. The typical examples of variability of contents in contexts invoked by linguistic pragmatists are the following:

(13) John is tall.

(14) Mary is ready.

(15) It is raining.

(16) Everybody got an A.

(17) Mary and John got married and had a child.

In the course of a conversation about basketball players, an utterance of (13) might express the content that John is tall with respect to the average height of basketball players. In the course of a conversation about the next logic exam, an utterance of (14) might express the content that Mary is ready to take the logic exam. If Mary utters (15) while in Rome, her utterance might express the content that it is raining in Rome at the time of the utterance. If the professor of logic utters (16), her utterance might express the content that all the students in her class got an A. Typically, if a speaker utters (17), she expresses the content that Mary and John got married before having a child.

Linguistic pragmatists argue that, in order for utterances of sentences like (13)-(17) to express those contents, the conventional meanings encoded in the sentences are not sufficient. Linguistic pragmatists hold that the presence in the content expressed of a comparison class for ‘tall’, of a course of action for ‘ready’, of a location for weather reports, of a restricted domain for quantified noun phrases, and of the temporal/causal order for ‘and’ is not the result of a process that is governed by a semantic convention. Linguistic pragmatists generalize this claim and argue that what is true of expressions like ‘tall’, ‘ready’, ‘it rains’, ‘everybody’, and ‘and’, is true of nearly all expressions in natural languages. According to linguistic pragmatists, semantic conventions provide at most propositional schemata—propositional radicals—that lack determinate truth-conditions.

The indexicalists’ strategy for resisting the call for a new theoretical approach raised by linguistic pragmatists is to enlarge both the range of indexicality, thought of as the result of linguistically governed processes of saturation, and the range of polysemy. As Michael Devitt puts it, there is more linguistically governed context-sensitivity and polysemy in our language than linguistic pragmatists think. Indexicalists try to explain examples like (13)-(16) by conventions of saturation: It is by linguistic conventions codified in language that people use ‘tall’ having in mind a class of comparison, ‘ready’ a course of action, ‘it rains’ a location, and ‘everybody’ a domain of quantification. Some indexicalists explain examples like (17) by polysemy: ‘And’ is a polysemous word having multiple meanings, one for the truth-functional conjunction and one for the temporally/causally ordered conjunction.
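
The polysemy proposal for ‘and’ can be sketched as two distinct functions between which the speaker must select. The Python example below is a hypothetical illustration: the `Event` type, the function names, and the years are all invented.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    description: str
    year: Optional[int]  # None means the event never took place


def and_truth_functional(a: Event, b: Event) -> bool:
    """True iff both events took place, in whatever order."""
    return a.year is not None and b.year is not None


def and_then(a: Event, b: Event) -> bool:
    """True iff both events took place and the first preceded the second."""
    return and_truth_functional(a, b) and a.year < b.year


# Invented dates: here the child was born before the wedding.
married = Event("Mary and John got married", 2015)
had_child = Event("Mary and John had a child", 2012)

print(and_truth_functional(married, had_child))  # True
print(and_then(married, had_child))              # False
```

On the polysemy account, an utterance of (17) is evaluated with one function or the other depending on which convention the speaker selects, so the two readings can come apart in truth-value as they do in this scenario.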

Indexicalism too comprises a family of theories, and there are deep and fundamental differences among them. As noted above, on an orthodox semantic theory the meaning of context-sensitive expressions sets up the parameters, or slots, that must be loaded with contextual values. Sometimes the parameters are explicitly expressed in the sentence, as with indexicals. Sometimes, instead, the parameters do not figure at the level of surface syntax. Philosophers and linguists disagree on where the parameters, which do not show up at the level of surface syntax, are hidden. Some (Stanley 2005a, Stanley and Williamson 1995, Szabo 2001, Szabo 2006) hold that such parameters are associated with elements that occur in the logical form. Taylor (2003) advances a different theory and argues that hidden parameters are represented in the syntactic basement of the lexicon. They are constituents not of sentences but of words. On Taylor’s view, the lexical representations of words specify the parameters that must be filled in with contextual values in order for utterances of sentences to have determinate truth-conditions. In a different version of indexicalism, some authors (Rothschild and Segal 2009) argue that the expressions that are regularly used to express different contents in different contexts ought to be treated as ordinary context-sensitive expressions and added to the Basic Set.

What all indexicalist theories have in common is the view that the variability of contents in contexts is always linguistically governed by conventional meanings of expressions. In all versions of indexicalism the phenomenon of semantic underdetermination is denied: The presence of each component of the content of an utterance of a sentence is mandatorily governed by a linguistic element occurring in the sentence either at the level of surface syntax or at the level of logical form.

b. Two Objections to Linguistic Pragmatism: Overgeneration and Normativity

There are two connected motivations that underlie the indexicalists’ defence of the orthodox view. One is a problem with overgeneration, the other is a problem with the normativity of meaning.

Linguistic pragmatists aim at keeping in place the distinctions among the level of linguistic meaning, the level of the contents of utterances, and the level of what speakers communicate indirectly by means of implicatures. To this end, linguistic pragmatists need a principled way to distinguish the contents of utterances (Sperber and Wilson’s and Carston’s explicatures, Bach’s implicitures, Recanati’s pragmatic truth-conditions) from implicatures. The canonical definition of explicature—and from now on this article adopts the term ‘explicature’ for pragmatically enriched contents of utterances—is the following:

An explicature is a pragmatically inferred development of logical form, where implicatures are held to be purely pragmatically inferred—that is, unconstrained by logical form.

The difficulty arises because explicatures are taken to be pragmatic developments of logical forms but not all pragmatic developments of logical forms count as explicatures. Linguistic pragmatists need to keep developments of logical forms that are explicatures apart from developments of logical forms that are not. Explicatures result from pragmatic processes that are not linguistically driven. There is a problem of overgeneration. As Stanley points out, if explicatures are linguistically unconstrained, then there is no explanation of why an utterance of sentence (18) can never have the same content as an utterance of sentence (19), or why an utterance of sentence (20) can have the same content as an utterance of sentence (21) but never the same content as an utterance of sentence (22):

(18) Everyone likes Sally.

(19) Everyone likes Sally and her mother.

(20) Every Frenchman is seated.

(21) Every Frenchman in the classroom is seated.

(22) Every Frenchman or Dutchman in the classroom is seated.

Carston and Hall (2012) try to answer Stanley’s objection of overgeneration from within the camp of linguistic pragmatists. For an assessment and criticism of their attempts, see Borg (2016). However, the point of Stanley’s objection of overgeneration is clear: Once pragmatic processes are allowed to contribute to direct contents of utterances in ways that are not linguistically governed by conventional meanings, it is difficult to draw the distinction between what speakers directly say and what they indirectly convey, so that the distinction between explicatures and implicatures collapses.

The other objection against linguistic pragmatism concerns the normativity of meaning. According to indexicalists, the explanation of contents of utterances supplied by semantics in the orthodox approach is superior to the explanation supplied by linguistic pragmatism because the former accounts for the normative aspect of meaning while the latter does not. Normativity is constitutive of the notion of meaning. If there are meanings, there must be such things as going right and going wrong with the use of language. The use of an expression is right if it conforms with its meaning, and wrong otherwise. If literal contents of speech acts are thought of in truth-conditional terms, conformity with meaning amounts to constraints on truth-conditions. In cases of expressions with one meaning the speaker undertakes the semantic burden of using them for expressing their conventional semantic values. In cases of polysemy the speaker undertakes the semantic burden of selecting a convention that fixes a determinate contribution to the truth-conditional contents expressed by utterances of sentences. In cases of expressions governed by conventions of saturation, the speaker undertakes the semantic burden of loading the demanded parameters with contextual values. Whenever the speaker fulfils these semantic burdens, she goes right with her use of language, otherwise she goes wrong, unless the speaker is speaking figuratively. As said above, the speaker who utters sentences (13)-(16) undertakes the semantic burden of loading a comparison class for ‘tall’, a course of action for ‘ready’, a location for ‘it rains’, a restricted domain of quantification for ‘everybody’. And a speaker who utters (17) undertakes the semantic burden of selecting the convention for ‘and’ that fixes the truth-functional conjunction or the convention that fixes the temporal/causal ordered conjunction.

Indexicalists say that the problem for linguistic pragmatism is to provide an account of how the meanings of expressions constrain truth-conditional contents of utterances, if the composition of truth-conditions is not governed by linguistic conventions, and how, lacking such an explanation, linguistic pragmatism can preserve the distinction between going right and going wrong with the use of language.

The remainder of this section gives a short illustration of the version of indexicalism that tries to explain the variability of contents in contexts by adding hidden variables in the logical form of sentences. The next two sections introduce some technicalities, and the reader who is content with a general introduction to context-sensitivity can skip to section 4.

c. Hidden Variables and the Binding Argument

Some indexicalists (Stanley, Szabo, Williamson) reinstate the Gricean syntactic constraint, rejected by linguistic pragmatists, at the level of logical form. They maintain that every pragmatic process that contributes to the determination of the truth-conditional content of a speech act is a process of saturation that is always activated by the linguistic meaning of an expression. If there is no trace of such expression in the surface syntactic structure, then there must be an element in the logical form that triggers a saturation process. The variables in the logical form work as indexicals that require contextual assignments of values. The pragmatic processes that assign the values of those variables are processes that are governed by linguistic rules; they are not optional.

Here are some examples, with some simplifications, given that a correct rendering of the logical form would require more technicalities. Suppose that, while on the phone to Mary on 25 November 2019, answering a question about the weather in London, George says:

(15) It’s raining.

People tend to agree that George said that it is raining in London on that date. Linguistic pragmatists concede that the reference to the day is due to the present tense of the verb, which works as an indexical expression that refers to the time of the utterance. However, the reference to the place, the city of London, is given by free enrichment. For linguistic pragmatists (15) can be represented as follows:

(15*) It’s raining _(t).

The variable ‘t’ corresponds to the present tense of the verb. In the logical form there is no variable taking London as value. By contrast, indexicalists claim that (15) can be represented as follows:

(15**) It’s raining _{(t, l)}.

In (15**) the variable ‘l’ takes London as a value. The process that assigns London to the variable ‘l’ is of the same kind as the process that assigns a referent to the indexical ‘here’ and it is linguistically driven because it is activated by an element of the logical form.

The variables that indexicalists insert in logical forms have a more complex structure. In (15**) the variable ‘t’ has the structure ‘ƒ(x)’ and the variable ‘l’ has the structure ‘ƒ*(y)’. ‘x’ is a variable that takes contextually salient entities as values and ‘ƒ’ is a variable that ranges over functions from entities to temporal intervals. The variable ‘y’ also takes contextually salient entities as values, and ‘ƒ*’ ranges over functions from entities to locations. The reason for this complexity will be explained in the next section. For now, it suffices to note that in simple cases like (15**), ‘x’ takes instants as values and ‘ƒ’ takes the identity function, so that ƒ(x) = x. Likewise, ‘y’ takes locations as values and ‘ƒ*’ takes the identity function, so that ƒ*(y) = y.

Here is another example. Consider Mark, the player whose coach makes the following assertion:

(5) Mark is short.

The coach said that Mark is short with respect to the average height of basketball players. Indexicalists explain this case by inserting a variable in (5):

(5*) Mark is short _(h).

‘h’ is a variable that takes standards of height as values. The variable ‘h’ too has a structure of the kind ‘ƒ(x)’, where ‘x’ ranges over contextually salient entities (for example, the set of basketball players) and ‘ƒ’ over functions that map the salient entities to other entities (for instance, the subset of the basketball players that are shorter than the average height of basketball players).

Here is an example with quantifiers. Consider the following sentence, asserted by the professor of logic:

(8) All students got an A.

The professor said that all students that took the logic class got an A. Indexicalists claim that in the quantifier ‘all students’ there is a variable that assigns domains of quantification:

(8*) [all x: student x]_ƒ(y) (got an A x).

In this example the value of the variable ‘y’ is the professor of logic and the value of ‘ƒ’ is a function that maps y onto the set of students who took the logic class taught by y. This set becomes the domain of the quantifier ‘all students’.
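
The role of the structured variable ‘ƒ(y)’ in (8*) can be illustrated with a small sketch. It is only a toy model under assumptions that are not in the text: the student names, the grades, and the enrolment table are invented, and sets stand in for domains of quantification.

```python
# Toy model of (8*): [all x: student x]_f(y) (got an A x).
# All names and data below are hypothetical illustrations.

students = {"Ann", "Ben", "Cleo", "Dan"}  # extension of 'student'
grades = {"Ann": "A", "Ben": "A", "Cleo": "B", "Dan": "C"}
taught_by = {"the professor of logic": {"Ann", "Ben"}}  # who took whose class


def f(y):
    """Contextual value of 'f': maps y to the set of students in y's class."""
    return taught_by[y]


def all_students_got_an_a(domain):
    """'All students got an A', with the quantifier restricted to `domain`."""
    return all(grades[x] == "A" for x in students & domain)


y = "the professor of logic"                    # contextual value of 'y'
restricted = all_students_got_an_a(f(y))        # domain supplied by f(y)
unrestricted = all_students_got_an_a(students)  # no contextual restriction
```

On these invented data the restricted reading comes out true and the unrestricted one false, which is the contrast the hidden variable is meant to capture.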

Stanley and Szabo present a strategy for justifying the insertion of hidden variables in logical forms, the so-called binding argument: to show that an element of the truth-conditional content of an utterance of a sentence is the result of a process of saturation, it is enough to show that it can vary in accordance with the values of a variable bound by a quantifier.

Consider the following sentence:

(23) Whenever Bob lights a cigarette, it rains.

An interpretation of (23) is the following: Whenever Bob lights a cigarette, it rains where Bob lights it. In this interpretation, the location where it rains varies in relation to the time when Bob lights a cigarette. Therefore, the value of the variable ‘l’ in ‘it rains _{(t, l)}’ depends on the value of the variable ‘t’ that is bound by a quantifier that ranges over times. This interpretation can be obtained if (23) is represented as follows:

(23*) [every t: temporal interval t ∧ Bob lights a cigarette at t](it rains _{(ƒ(t), ƒ*(t))}).

The value of ‘ƒ’ is the identity function so that ƒ(t) = t, and the value of ‘ƒ*’ is a function that assigns to the time that is the value of ‘t’ the location where Bob lights a cigarette at that time.
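
The bound reading of (23*) can be made concrete with a toy evaluation. The sketch below is not Stanley and Szabo’s formalism; it assumes an invented record of when and where Bob lights a cigarette and an invented set of rain facts, and it treats times as integers.

```python
# Toy model of (23*). All facts below are hypothetical.

# Times at which Bob lights a cigarette, paired with where he does so.
lightings = {1: "London", 2: "Paris", 3: "Rome"}
# Pairs (time, location) at which it rains.
rain = {(1, "London"), (2, "Paris"), (3, "Rome"), (3, "Oslo")}


def f(t):
    """Value of 'f': the identity function on times."""
    return t


def f_star(t):
    """Value of 'f*': the location where Bob lights a cigarette at t."""
    return lightings[t]


def it_rains(t, l):
    """'it rains_(t, l)': true iff it rains at time t in location l."""
    return (t, l) in rain


def sentence_23(times, location_of):
    """[every t: Bob lights a cigarette at t] (it rains_(f(t), location_of(t)))."""
    return all(it_rains(f(t), location_of(t)) for t in times if t in lightings)


# Location co-varies with the bound time variable.
bound_reading = sentence_23(range(5), f_star)
# Location fixed once by context; no co-variation with t.
fixed_reading = sentence_23(range(5), lambda t: "London")
```

With these data the bound reading is true, while the reading that fixes the location to London for every time is false. The difference comes entirely from letting the location depend on the quantified variable ‘t’.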

d. Objections to the Binding Argument

Some philosophers (Cappelen and Lepore 2002, Breheny 2004) raise an objection of overgeneration against the binding argument. In their view, the binding argument forces the introduction of too many hidden variables, even when there is no need for them. The strongest objection against the binding argument has been raised by Recanati (2004: 110), who argues that the binding argument is fallacious. Recanati summarizes the binding argument as follows:

1. Linguistic pragmatism maintains that in ‘it rains’ the implicit reference to the location is the result of a process of modulation that does not require any covert variable.
2. In the sentence ‘whenever Bob lights a cigarette, it rains’, the reference to the location varies according to the value of the variable bound by the quantifier ‘whenever Bob lights a cigarette’.
3. There can be no binding without a variable in the logical form.
4. In the logical form of ‘it rains’ there is a variable for locations, although phonologically not realized.

Therefore:

5. Linguistic pragmatism is wrong: In ‘it rains’, the reference to the location is mandatory, because it is articulated in the logical form.

Recanati argues that this argument is fallacious because of an ambiguity in premise 4, where the sentence ‘it rains’ can be intended either in isolation or as a part of compound phrases. According to Recanati, the sentence ‘it rains’ contains a covert variable when it occurs as a part of the compound sentence ‘whenever Bob lights a cigarette, it rains’, but it does not contain any variable when it occurs alone.

Recanati proposes a theory that admits that binding requires variables in the logical form, but at the same time it rejects indexicalism. Recanati makes use of expressions that modify predicates. Given an n-place predicate, a modifier can form an n+1 place or an n-1 place predicate. A modifier expresses a function from properties/relations to other properties/relations. For example, Recanati says that ‘it rains’ expresses the property of raining, which is predicated of temporal intervals. Expressions like ‘at’, ‘in’, and so forth, transform the predicate ‘it rains _(t)’ from a one-place predicate to a two-place predicate: ‘it rains _{(t, l)}’. Expressions like ‘here’ or ‘in London’ are special modifiers that transform the predicate ‘it rains’ from a one-place predicate to a two-place predicate but also provide a value for the new argument place. Recanati argues that expressions like ‘whenever Bob lights a cigarette’ are modifiers of the same kind as ‘here’ and ‘in London’. They change the number of predicate places and provide a value to the new argument through the value of the variable they bind. Recanati’s conclusion is that although binding requires variables in the logical form of compound sentences, there is no need to insert covert variables in sub-sentential expressions or sentences in isolation.

The next section presents a different approach to semantics, one that distinguishes between semantic contents and speech act contents.

4. Defending the Autonomy of Semantics: Minimalism

a. Distinguishing Semantic Content from Speech Act Content

Indexicalists and linguistic pragmatists share the view that the goal of semantics is to explain the explicit contents of speech acts performed by utterances of sentences. They both agree that there must be a close explanatory connection between the meaning encoded in a sentence S and the contents of speech acts performed by utterances of S. One important corollary of this conception is that if a sentence S is systematically uttered for performing speech acts with different contents at different contexts, this phenomenon calls for an explanation on behalf of semantics. The point of disagreement between indexicalists and linguistic pragmatists is that the former think that semantics can provide such an explanation while the latter think that semantics alone is not sufficient and a new theoretical model is needed, one in which pragmatic processes, semantically unconstrained, contribute to determine the contents of speech acts. As said above, indexicalists explain the variability of contents in contexts in terms of context-sensitivity by enlarging the range of indexicality and polysemy, whereas linguistic pragmatists explain it in terms of semantic underdetermination. The debate between indexicalists and linguistic pragmatists starts taking for granted the explanatory connection between semantics and contents of speech acts.

Minimalism in semantics is a family of theories that reject the explanatory connection between semantics and contents of speech acts. Minimalists (Borg 2004, 2012, Cappelen and Lepore 2005, Soames 2002) maintain that semantics is not in the business of explaining the contents of speech acts performed by utterances of sentences. Minimalists work with a notion of semantic content that does not play the role of speech act content. According to them, the semantic content of a sentence is a full truth-conditional content that is obtained compositionally from the syntactic structure of the sentence and the semantic values of the expressions in the sentence that are fixed by conventional meanings. Moreover, they claim that the Basic Set of genuinely context-sensitive expressions, which are governed by conventions of saturation, comprises only overt indexicals (pronouns, demonstratives, tense markers, and a few other words). Minimalists call the semantic content of a sentence its minimal content.

The above statement that minimal contents are not contents of speech acts requires qualification. Cappelen and Lepore argue indeed for speech act pluralism. They argue that speech acts have a plurality of contents and the minimal content of a sentence is always one of many contents that its utterances express. In order to protect speech act pluralism from the objection that very often speakers are not aware of having made an assertion with the minimal content, and, if asked, they would deny having made an assertion with the minimal content, Cappelen and Lepore argue that speakers can sincerely assert a content without believing it and without knowing they have asserted it. For example, if Mary looks into the refrigerator and says ‘there are no beers’, Mary herself would deny that she asserted that there are no beers in existence and deny that she believes that there are no beers in existence, although that there are no beers in existence is the minimal content that the sentence ‘there are no beers’ semantically expresses.

The main line of the minimalists’ attack on indexicalism and linguistic pragmatism is methodological. Minimalists argue that both indexicalists and linguistic pragmatists adhere to the methodological principle that says that a semantic theory is adequate just in case it accounts for the intuitions people have about what speakers say, assert, claim, and state by uttering sentences. Minimalists claim that this principle is mistaken just because it conflates semantic contents and contents of speech acts. Semantics is the study of the semantic values of the lexical items and their contribution to the semantic contents of complex expressions. Contents of speech acts, instead, are contents that can be used to describe what people say by uttering sentences in particular contexts of utterance.

b. Rebutting the Arguments for Linguistic Pragmatism

Minimalists dismiss context-shifting arguments and inappropriateness arguments just on the grounds that they conflate intuitions about semantic contents of sentences and intuitions about contents of speech acts. Incompleteness arguments are a subtler matter and require more articulated responses. Cappelen and Lepore’s (2005) response and Borg’s (2012) response are presented in the following. An incompleteness argument aims at showing that there is no invariant content that a sentence S expresses in all contexts of utterance. For example, with respect to:

(14) Mary is ready,

an incompleteness argument starts from the observation that if (14) is taken separately from contextual information specifying what Mary is ready for, people are unable to evaluate it as true or false. This evidence leads to the conclusion that there is no minimal content—that Mary is ready (simpliciter)—that is invariant and semantically expressed by (14) in all contexts of utterance. In general, then, the conclusions of incompleteness arguments are that minimal contents do not exist: without pragmatic processes, many sentences in our language do not express full propositional contents with determinate truth-conditions.

Cappelen and Lepore accept the premises of incompleteness arguments, that is, that people are unable to truth-evaluate certain sentences, but they argue that from these premises it does not follow that minimal contents do not exist. Borg adopts a different strategy. Borg tries to block incompleteness arguments by rejecting their premises and explaining away people’s inability to truth-evaluate certain sentences.

Cappelen and Lepore raise the objection that incompleteness arguments try to establish metaphysical conclusions, for example about the existence of the property of being ready (simpliciter) as a building block of the minimal content that Mary is ready (simpliciter), from premises that concern psychological facts regarding people’s ability to evaluate sentences as true or false. They point out that psychological data are not relevant in metaphysical matters. The data about people’s dispositions to evaluate sentences might reveal important facts about psychology and communication but have no weight at all in metaphysics. Cappelen and Lepore say that people’s inability to evaluate sentences like (14) as true or false independently of contextual information does not provide evidence against the claim that the property of being ready exists and is the semantic content of the adjective ‘ready’. On the one hand, they acknowledge that the problem of giving the analysis of the property of being ready is a difficult one, but it is for metaphysicians, not for semanticists. On the other hand, they argue that semanticists have no difficulty at all in stating what invariant minimal content is semantically encoded in (14). Sentence (14) semantically expresses the minimal content that Mary is ready. There is no difficulty in determining its truth-conditions either: ‘Mary is ready’ is true if and only if Mary is ready.

Cappelen and Lepore address the immediate objection that if the truth-condition of (14) is represented by a disquotational principle like the one reported above, then nobody is able to verify whether such truth-condition is satisfied or not. This fact is witnessed by people’s inability to evaluate (14) as true or false independently of information specifying what Mary is ready for. Cappelen and Lepore respond that it is not a task for semantics to ascertain how things are in the world. For example, it is not a task for semantics to say whether (14) is true or false. That a semantic theory for a language L does not provide speakers with a method of verifying sentences of L is not a defect of that semantic theory. Cappelen and Lepore say that those theorists who think otherwise indulge in verificationism. For an objection to Cappelen and Lepore see Recanati (2004), Clapp (2007), Penco and Vignolo (2019).

In Pursuing Meaning Borg offers a different strategy for blocking incompleteness arguments. Borg’s strategy is to explain away the intuitions of incompleteness. Borg agrees that speakers have intuitions of incompleteness with respect to sentences like ‘Mary is ready’, but she argues that intuitions of incompleteness emerge from some overlooked covert and context-insensitive syntactic structure. Borg says that ‘ready’ is lexically marked as an expression with two argument places. On Borg’s view ‘ready’ always denotes the same relation, the relation of readiness, which holds between a subject and the thing for which they are held to be ready. When only one argument place is filled at the surface level, the other is marked by an existentially bound variable in the logical form. Thereby ‘ready’ makes exactly the same contribution in any context of utterance to any propositional content literally expressed. For example, Borg says that in a context where what is salient is the property of being ready to join the fire service, the sentence ‘Mary is ready’ literally expresses the minimal content that Mary is ready for something, not that Mary is ready to join the fire service. As Borg points out, the minimal content that Mary is ready for something is almost trivially true. Yet, Borg warns not to conflate intuitions about the informativeness of a propositional content with intuitions about its semantic completeness.

Borg’s explanation of the intuitions of incompleteness is that speakers are aware of the need for the two arguments, which is in tension with the phonetic delivery of only one argument. Speakers are reluctant to truth-evaluate sentences like ‘Mary is ready’ not because the sentence is semantically incomplete and lacks a truth-condition, but because their expectation for the second argument to be expressed is frustrated and the minimal content that is semantically expressed, when the argument role corresponding to the direct object is not filled at the surface level, is barely informative. For a critical assessment of Borg’s strategy, see Clapp (2007) and Penco and Vignolo (2019).
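
The contrast between the existentially closed minimal content and a pragmatically enriched content can be sketched as follows. This is an illustration only, not Borg’s own formalism: the extension of the relation ‘ready’ is invented, and the function names are hypothetical.

```python
# Hypothetical extension of the two-place relation 'ready':
# pairs (subject, thing the subject is ready for).
ready = {
    ("Mary", "joining the fire service"),
    ("Anna", "sitting the logic exam"),
}


def minimal_content(subject):
    """Minimal content on this sketch: there is some y such that ready(subject, y)."""
    return any(s == subject for (s, _) in ready)


def enriched_content(subject, salient_activity):
    """Speech act content: subject is ready for the contextually salient activity."""
    return (subject, salient_activity) in ready


# The minimal content is evaluated without consulting the context.
mary_minimal = minimal_content("Mary")
# The enriched contents vary with what is salient in the context.
mary_fire_service = enriched_content("Mary", "joining the fire service")
mary_logic_exam = enriched_content("Mary", "sitting the logic exam")
```

On these data the minimal content is true in every context, whereas the enriched content is true in the fire-service context and false in the logic-exam context. This mirrors the point that the minimal content is complete but barely informative.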

The following subsection illustrates the tenets that characterise minimalism and the central motivation for it.

c. Motivation and Tenets of Minimalism

Minimalism is characterised by four main theses (Borg 2007) and one main motivation. The first thesis is propositionalism. Propositionalism states that sentence types, relativized to indexes representing contexts of utterance, express full propositional contents with determinate truth-conditions. These semantic contents are the minimal ones, which are invariant through contexts of utterance when sentence types do not contain overt context-sensitive expressions. Propositionalism distinguishes minimalism from radical minimalism, which is a philosophical view sustained by Bach (2007). Bach acknowledges the existence of semantic contents of sentence types, but he rejects the view that such contents are always fully propositional with determinate truth-conditions. According to Bach, most semantic contents are propositional radicals. As Borg points out, despite the fact that Bach insists on avowing that he is not a linguistic pragmatist, it is not easy to spot substantial differences between Bach’s view and linguistic pragmatism. Although Bach’s semantically incomplete sentences are not context-sensitive unless they contain overt context-sensitive expressions, linguistic pragmatists need not deny that semantic theories are possible. They simply maintain that in most cases semantic theories deal with sub-propositional contents. Bach and linguistic pragmatists agree that, in many if not most cases, in order to reach full propositional contents theorists need to focus on speech acts and not on sentence types.

The second important thesis of minimalism is the Basic Set assumption. The Basic Set assumption states that the only genuine context-sensitive expressions that trigger and drive pragmatic processes for the determination of semantic values are those that are listed in the Basic Set, that is, overt indexicals like ‘I’, ‘here’, ‘now’, ‘that’, plus or minus a bit. Expressions like ‘ready’, ‘tall’, ‘green’, quantified noun phrases, and so on, are not context-sensitive.

The third tenet of minimalism is the distinction between semantic contents and speech act contents: Semantic contents are not what speakers intend to explicitly and directly communicate. The contents explicitly communicated are pragmatic developments of semantic contents. As said, this move serves to disarm batteries of arguments advanced by indexicalists and linguistic pragmatists. Even if in almost all cases semantic contents are not the contents of speech acts, they nonetheless play an important theoretical role in communication. Semantic contents are fallback contents that people are able to understand on the sole basis of their linguistic competence when they ignore or mistake the intentions of the speakers and the contextual information needed for understanding what speakers are trying to communicate. Minimal contents can play this role in communication just because they can be grasped in virtue of linguistic competence alone.

The fourth and last thesis of minimalism is a commitment to formalism. Formalism is the view that the processes that compute the truth-conditional contents of sentence types are entirely formal and computational. There is an algorithmic route to the semantic content of each sentence (relative to an index representing contextual features), and all contextual contributions to semantic contents are formally tractable. More precisely, all those contextual contributions that depend on speakers’ intentions must be kept apart from semantic contents. This last claim puts a further constraint on context-sensitive expressions, which ought to be responsive only to objective aspects of contexts of utterance, like who is speaking, when, and where. These are the features that Bach (1994, 1999, 2001) and Perry (2001) termed narrow features of contexts, which play a semantic role, as opposed to wide features, which depend on speakers’ intentions and play a pragmatic role. It is also a claim that relates to Kaplan’s distinction between pure (automatic) indexicals, which refer semantically by picking out objective features of the context of utterance, and intentional indexicals, which refer pragmatically in virtue of intentions of speakers (Kaplan 1989a, Perry 2001).

Formalism is related to one of the main motivations for minimalism. Minimalism is compatible with a modular account of meaning understanding. The modularity theory of mind is the view that the mind is constituted of discrete and relatively autonomous modules, each of which is dedicated to the computation of particular cognitive functions. A module possesses a specific body of information and specific rules working computationally on that body of information. Among such modules there is one, the module of the faculty of language, which is dedicated to the comprehension of literal contents of sentences. This module includes phonetic/orthographic information and related rules, syntactic information and related rules, and semantic information and related rules.

A minimalist semantics fits well as part of the language module since it derives the truth-conditional contents of sentences, relative to indexes, in a computational way operating on representations of semantic properties of the lexicon and with formal rules working on such representations. Thus, if linguistic comprehension is modular, minimalism offers a theory that is consistent with the existence of the language module.

The following data are often invoked as evidence to justify the claim that linguistic comprehension is modular. The understanding of literal meanings of sentences seems to be the result of domain-specific and encapsulated processes. Linguistically competent people understand the literal meaning of a sentence even when they ignore salient aspects of the context of utterance and the communicative intentions of the speaker. Moreover, the understanding of literal meaning is carried out independently of any sort of encyclopaedic information. The processes that yield literal truth-conditional contents of sentences are mandatory, very fast, and mostly unavailable to consciousness. People cannot help reading certain signs and hearing certain sounds as utterances of sentences in languages they understand. Competent speakers interpret straightforwardly and very quickly those signs and sounds as sentences with literal contents without being aware of the information and the rules operating on it that yield such an understanding. Finally, linguistic understanding is associated with localized neuronal structures that undergo regularities in acquisition and development processes, and regularities of breakdown due to neuronal damage. In conclusion, for those who believe that this is good evidence that comprehension of literal meaning is modular, minimalism offers a semantic theory that can be coherently taken to be part of the language faculty module.

The presentation of minimalism closes with the discussion of the tests that Cappelen and Lepore propose in order to select the only context-sensitive expressions that go into the Basic Set. The following subsection contains some technicalities. The reader who is mainly interested in an overview on context-sensitivity can skip to section 5.

d. Testing Context-Sensitivity

Cappelen and Lepore propose different tests for distinguishing the expressions in the Basic Set that are genuinely context-sensitive from those that are not. Here only one of their tests is illustrated, but it is sufficient to give a hint of their work.

Test of inter-contextual disquotational indirect reports: Suppose that Anna, who had planned to climb Eiger’s North Face on July 1 but cancelled, utters the following sentence on July 2:

(24) Yesterday I was not ready.

Suppose that on July 3 Mary indirectly reports what Anna said on July 2. Mary cannot use the same words as Anna used. If she did, she would make the following report:

(25) Anna said that yesterday I was not ready.

From this example it is clear that context-sensitive expressions like ‘I’ and ‘yesterday’ generate inter-contextual disquotational indirect reports that are false or inappropriate.
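
Why the disquotational report (25) fails can be sketched by treating ‘I’ and ‘yesterday’ as functions from contexts of utterance to referents. The sketch is a simplification under stated assumptions: a context is reduced to a speaker and a date, and the year is arbitrary because the example does not give one.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Context:
    """A context of utterance, reduced to speaker and day (a simplification)."""
    speaker: str
    day: date


def referent_of_I(c):
    """'I' refers to the speaker of the context."""
    return c.speaker


def referent_of_yesterday(c):
    """'yesterday' refers to the day before the day of the context."""
    return c.day - timedelta(days=1)


YEAR = 2020  # arbitrary assumption; the example gives no year

anna_context = Context("Anna", date(YEAR, 7, 2))  # Anna utters (24)
mary_context = Context("Mary", date(YEAR, 7, 3))  # Mary utters (25)

# Referents of ('I', 'yesterday') in the original utterance and in the report.
original = (referent_of_I(anna_context), referent_of_yesterday(anna_context))
reported = (referent_of_I(mary_context), referent_of_yesterday(mary_context))
report_preserves_referents = original == reported
```

In Anna’s context the two expressions pick out Anna and July 1, while in Mary’s context the same words pick out Mary and July 2. The words are preserved in (25), but the referents are not.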

Cappelen and Lepore say that it is possible to make inter-contextual disquotational indirect reports with the adjective ‘ready’, and this fact provides evidence that ‘ready’ is not context-sensitive. Assume that on July 5 Mary utters the following sentence:

(26) On July 1 Anna was not ready.

Then, on July 6 George might report what Mary said with the utterance of the following sentence:

(27) Mary said that on July 1 Anna was not ready.

These results generalize to all expressions that do not belong to the Basic Set.

Another case is the following. Suppose Mary utters ‘Anna is ready’ in a context C₁ to say that Anna is ready to climb Eiger’s North Face and makes a second utterance of it in a context C₂ to say that Anna is ready to sit her logic exam. Cappelen and Lepore argue that in a context C₃ the following reports are true:

(28) Mary said that Anna is ready (with respect to the utterance in C₁).

(29) Mary said that Anna is ready (with respect to the utterance in C₂).

(30) In C₁ and C₂ Mary said that Anna is ready.

Cappelen and Lepore say that linguistic pragmatism and indexicalism have difficulty explaining the truth of the above inter-contextual disquotational indirect reports. It is not obvious, however, that the difficulty Cappelen and Lepore raise is insurmountable. The context C₃ might differ from C₁ and C₂ because the speaker, the time, and the place of the utterance are different, but the same contextual information might be available in C₃ and be relevant for the interpretation of the utterance in C₁ or C₂. In C₃ the speaker (and the audience too) might be aware that Mary was talking about alpinism in C₁ and about logic exams in C₂.

According to a suggestion by Stanley (2005b), and Cappelen and Hawthorne (2009), sentence (30) might be represented as follows:

(30*) C₁ and C₂ λx (in x Mary said that Anna is ready_ƒ(x)).

Here the variable ‘x’ takes contexts as values and the variable ‘ƒ’ takes a function that maps contexts to kinds of actions or activities salient in those contexts. This analysis yields the interpretation that the report (30) is true if and only if in C₁ Mary said that Anna is ready to climb Eiger’s North Face and in C₂ Mary said that Anna is ready to take her logic exam. On the other hand, if one supposes that the speaker in C₃ has the erroneous belief that Mary was talking about Anna’s readiness to go out with friends, linguistic pragmatists and indexicalists will doubt the truth of the reports (28)-(30) and reduce the debate to a conflict of intuitions.

The test of inter-contextual disquotational indirect reports and the other tests that Cappelen and Lepore present, such as the test of inter-contextual disquotation and the test of collective descriptions, raised an intense debate. For critical assessments of these tests, see Leslie (2007) and Taylor (2007). Cappelen and Hawthorne (2009) present the test of agreement, while Donaldson and Lepore (2012) add the test of collective reports. Limits of space prevent a more detailed account of the debate on tests for context-sensitivity. The foregoing suffices to give an idea of the kind of arguments that philosophers involved in that debate deal with.

While minimalism is a strong alternative to linguistic pragmatism and indexicalism, another approach develops in a new way the idea of invariant semantic contents: relativism. The next section presents the view of relativism, which reconceptualises the relations between meaning and context.

5. Defending Invariant Semantic Contents: Relativism

a. Indexicality, Context-Sensitivity, and Assessment-Sensitivity

Relativism in semantics provides a new conceptualization of context dependence. Relativists (Kölbel 2002, MacFarlane 2014, Richard 2008) recover invariant semantic contents and explain some forms of context dependence not in terms of variability of contents in contexts of utterance but in terms of variability of extensions in contexts of assessment. A context of utterance is a possible situation in which a sentence might be uttered and a context of assessment is a possible situation in which a sentence might be evaluated as true or false.

As said in section 1b, Kaplan represents meanings as functions that return contents in contexts of utterances. Contents are functions that distribute extensions in circumstances of evaluation. The content of a sentence in a context of utterance is a function that returns truth-values at standard circumstances of evaluation composed of a possible world and a time. MacFarlane shows that the technical machinery of Kaplan’s semantics is apt to draw conceptual distinctions among what he calls indexicality, context-sensitivity, and assessment-sensitivity. MacFarlane’s notion of indexicality covers the standard variability of contents in contexts. His notions of context-sensitivity and assessment-sensitivity cover new semantic phenomena, according to which expressions might change extensions while maintaining the same contents. MacFarlane’s notions are defined as follows:

Indexicality:

An expression E is indexical if and only if its content at a context of utterance depends on features of the context of utterance.

Context-sensitivity:

An expression E is context-sensitive if and only if its extension at a context of utterance depends on features of the context of utterance.

Assessment-sensitivity:

An expression E is assessment-sensitive if and only if its extension at a context of utterance depends on features of a context of assessment.

For example, consider two utterances of (5): a true utterance in a conversation about basketball players and a false utterance in a conversation about the general population.

(5) Mark is short.

Indexicality: The standard account in terms of indexicality affirms that the two utterances have different contents because the adjective ‘short’ is treated as an expression that expresses different contents in different contexts of utterance. According to indexicalism, the meaning of ‘short’ demands that the speaker fill in a standard of height that is operative in the context of utterance in order to determine the content of the utterance. Thus, the speaker in the first conversation expresses a different content than that expressed in the second conversation. Since the difference in truth-values between the two utterances is explained in terms of a difference in contents, the context of utterance (in our example, the speakers’ intentions, which refer to different standards of height) has a content-determinative role.

Context-sensitivity: Context-sensitivity, in MacFarlane’s sense, explains the difference in truth-values in terms of a difference in the circumstance of evaluation. The circumstance of evaluation is enriched with non-standard parameters. In our example, the circumstance of evaluation is enriched with a parameter concerning the standard of height. The meaning of ‘short’ returns the same content in all contexts of utterance. The content of ‘short’ is invariant across contexts of utterance, but it returns different extensions in circumstances of evaluation that comprise a possible world, a time, and a standard of height. The standard of height that is operative in the first conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does belong to the extension of ‘short’. The standard of height that is operative in the second conversation enters the circumstance of evaluation with respect to which ‘short’ has an extension in that context of utterance. According to that standard of height, Mark does not belong to the extension of ‘short’. With context-sensitivity (in MacFarlane’s sense) the context of utterance has a circumstance-determinative role, since it fixes the non-standard parameters that enter the circumstance of evaluation with respect to which expressions have extensions at the context of utterance.
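The contrast between the two accounts of (5) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names Context, Circumstance, indexical_content, invariant_content, and circumstance_of are hypothetical, and the heights in centimetres are invented numbers, not data from the literature. On the indexical account the context of utterance selects a content; on the context-sensitive account there is a single content and the context of utterance only fixes the circumstance of evaluation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical figure, chosen only for illustration.
MARK_HEIGHT_CM = 185


@dataclass(frozen=True)
class Context:
    """A context of utterance, reduced to the one feature that matters here."""
    height_standard_cm: float  # standard of height operative in the conversation


@dataclass(frozen=True)
class Circumstance:
    """A circumstance of evaluation: world, time, and an optional extra parameter."""
    world: str
    time: str
    height_standard_cm: Optional[float] = None  # non-standard parameter


# A content maps a circumstance of evaluation to a truth-value.
Content = Callable[[Circumstance], bool]


def indexical_content(c: Context) -> Content:
    """Indexicality: the context of utterance fixes the content itself."""
    threshold = c.height_standard_cm
    return lambda circ: MARK_HEIGHT_CM < threshold


def invariant_content(circ: Circumstance) -> bool:
    """Context-sensitivity: one content, evaluated at an enriched circumstance."""
    return MARK_HEIGHT_CM < circ.height_standard_cm


def circumstance_of(c: Context) -> Circumstance:
    """The context of utterance plays a circumstance-determinative role."""
    return Circumstance("@", "now", c.height_standard_cm)


basketball = Context(height_standard_cm=200)  # conversation about basketball players
general = Context(height_standard_cm=175)     # conversation about the general population
plain = Circumstance("@", "now")              # standard circumstance: world and time only

# Indexicality: two contents, evaluated at the same standard circumstance.
print(indexical_content(basketball)(plain), indexical_content(general)(plain))
# Context-sensitivity: one content, evaluated at two enriched circumstances.
print(invariant_content(circumstance_of(basketball)),
      invariant_content(circumstance_of(general)))
```

Both lines print True followed by False. The two accounts agree on the truth-values of the two utterances and differ only in where the standard of height enters the computation.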

Context-sensitivity so defined is not relativism. For any context of utterance, expressions have just one, if any, extension at that context. In particular, sentences in contexts have absolute truth-values. Truth for sentences in contexts is defined as follows:

A sentence S at a context of utterance i is true if and only if S is true in i_w at i_t and with respect to i_h1…i_hn, where i_w and i_t are the world and the time of the context of utterance i, and i_h1…i_hn are all the non-standard parameters, demanded by the expressions in S, which are operative in i (in the above example the standard of height demanded by ‘short’, that is, the average height of basketball players in the first context and the average height of American citizens in the second context).

On the contrary, relativism holds that the extensions of expressions at contexts of utterance are relative to contexts of assessment. So, if contexts of assessment change, extensions too might change. In particular, sentences are true or false at contexts of utterance relative to contexts of assessment. Relative truth is defined as follows:

A sentence S at a context of utterance i is true relative to a context of assessment a if and only if S is true in i_w at i_t and with respect to a_h1…a_hn, where i_w and i_t are the world and the time of the context of utterance i, and a_h1…a_hn are all the non-standard parameters, demanded by the expressions in S, that are operative in the context of assessment a.
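The difference between the two truth definitions is small enough to be shown in a few lines. The following is a minimal sketch, not MacFarlane’s own formalism: the names Context, true_at, and true_relative, the representation of non-standard parameters as a dictionary, and the numerical values are all assumptions made for illustration. The only change from the first definition to the second is the context from which the non-standard parameters are read.

```python
from dataclasses import dataclass, field
from typing import Callable, Mapping


@dataclass(frozen=True)
class Context:
    """A context with a world, a time, and its operative non-standard parameters."""
    world: str
    time: int
    params: Mapping[str, float] = field(default_factory=dict)


# A sentence is modelled as a function from world, time, and parameters to a truth-value.
Sentence = Callable[[str, int, Mapping[str, float]], bool]


def true_at(s: Sentence, i: Context) -> bool:
    """Truth at a context of utterance i: every index is taken from i."""
    return s(i.world, i.time, i.params)


def true_relative(s: Sentence, i: Context, a: Context) -> bool:
    """Relative truth: world and time come from i, non-standard parameters from a."""
    return s(i.world, i.time, a.params)


def mark_is_short(world: str, time: int, params: Mapping[str, float]) -> bool:
    # Mark's height (185 cm) is a hypothetical value.
    return 185 < params["height_standard_cm"]


i = Context("@", 2019, {"height_standard_cm": 200})  # context of utterance
a = Context("@", 2019, {"height_standard_cm": 175})  # context of assessment

print(true_at(mark_is_short, i))           # True
print(true_relative(mark_is_short, i, a))  # False
print(true_relative(mark_is_short, i, i))  # True
```

When the context of assessment is the context of utterance itself, true_relative returns the same value as true_at.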

Relativism requires small revisions of the technical machinery of standard truth-conditional semantics in order to define the notion of relative truth, but it provides a radical reconceptualization of the ways in which meaning, contents, and extensions are context-dependent. Different authors apply relativism to different parts of language. MacFarlane (2014) presents a relativistic semantics for predicates of taste, knowledge attributions, epistemic modals, deontic modals, and future contingents. Kompa (2002) and Richard (2008) offer a relativist treatment of comparative adjectives like ‘short’. Predelli (2005) suggests a view close to relativism for colour words like ‘green’.

The major difficulty for relativists is not technical but conceptual. Relativism must explain what it is for a sentence at a context of utterance to be true relative to a context of assessment. The next subsection presents MacFarlane’s attempt to answer this conceptual difficulty. The final subsection discusses the case of faultless disagreement, which many advocates of relativism employ to show it superior to rival theories in semantics.

b. The Intelligibility of Assessment-Sensitivity

Many philosophers, following Dummett, say that the conceptual grasp of the notion of truth is due to a clarification of its role in the overall theory of language. In particular, the notion of truth has been clarified by its connection with the notion of assertion. One way to get this explication is to take the norm of truth as constitutive of assertion. The norm of truth can be stated as follows:

Norm of truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S is true. (Remember that a sentence S at a context of utterance C is true if and only if S is true in the world of C at the time of C.)

Relativism needs to provide the explication of what it is for a sentence at a context of utterance to be true relative to a context of assessment. If the clarification of the notion of relative truth is to proceed along with its connection to the notion of assertion, what is needed is a norm of relative truth that relates the notion of assertion to the notion of relative truth. It would seem intuitive to employ the following norm of relative truth that privileges the context of utterance and selects it as the context of assessment:

Norm of relative truth: Given a context of utterance C and a sentence S, an agent is permitted to assert that S at C only if S at context C is true as assessed from context C itself.

The problem, as MacFarlane points out, is that if the adoption of the norm of relative truth is all that can be said in order to explicate the notion of relative truth, then assessment-sensitivity is an idle wheel with no substantive theoretical role. Relativism becomes a notational variant of standard truth-conditional semantics. The point is that when the definition of relative truth is combined with the norm of relative truth, which picks out the context of utterance and makes it the context of assessment, relativism has the same prescriptions for the correctness of assertions as standard truth-conditional semantics, which works with the definition of truth (simpliciter) combined with the norm of truth.

MacFarlane argues that in order to clarify the notion of relative truth, the norm of relative truth is necessary but not sufficient. In order to gain a full explication of the notion of relative truth, a norm for retraction of assertions must be added to the norm of relative truth. MacFarlane presents the norm for retraction as follows:

Norm for retraction: An agent at a context of assessment C2 must retract an assertion of the sentence S, uttered at a context of utterance C1, if S uttered at C1 is not true as assessed from C2.

Relativism together with the norm of relative truth and the norm for retraction predicts cases of retraction of assertions that other semantic theories are not able to predict. Consider the following example: Let C1 be the context of utterance consisting of John, a time t in the year 1982, and the actual world @; let C2 be the context of utterance consisting of John, a time t´ in 2019, and the actual world @. Let C3 be the context of assessment in which John’s taste in 1982 is operative and C4 the context of assessment in which John’s taste in 2019 is operative.

Suppose John did not like green tea in 1982, when he was ten years old, but he likes green tea a lot in 2019, when he is forty-seven years old. Green tea is not in the extension of ‘tasty’ at C1 as assessed from C3 but it is in the extension of ‘tasty’ at C2 as assessed from C4. Suppose John utters:

(31) ‘Green tea is not tasty’ at C1

and

(32) ‘Green tea is tasty’ at C2.

Relativism predicts that both assertions are correct. John does not violate the norm of relative truth. However, relativism also predicts that in 2019 John must retract the assertion he made in 1982, because in 1982 John uttered a sentence that is false as assessed from C4.
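The interaction of the two norms in John’s case can be sketched as follows. This is an illustrative model only. It simplifies the example by letting a single hypothetical Context object serve both as a context of utterance and as a context of assessment, so that the 1982 object stands in for C1 and C3 and the 2019 object for C2 and C4. The function names are likewise assumptions. Since both norms are stated as one-directional conditionals, the first function checks a necessary condition for permissible assertion and the second a sufficient condition for obligatory retraction.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Context:
    """Serves as context of utterance and of assessment (a simplification)."""
    agent: str
    year: int
    likes_green_tea: bool  # the standard of taste operative in this context


# A sentence is evaluated at a context of utterance relative to a context of assessment.
Sentence = Callable[[Context, Context], bool]


def green_tea_is_tasty(utterance: Context, assessment: Context) -> bool:
    # Only the taste operative in the context of assessment matters here.
    return assessment.likes_green_tea


def green_tea_is_not_tasty(utterance: Context, assessment: Context) -> bool:
    return not assessment.likes_green_tea


def satisfies_norm_of_relative_truth(s: Sentence, c: Context) -> bool:
    """Necessary condition for asserting s at c: s at c is true as assessed from c."""
    return s(c, c)


def must_retract(s: Sentence, c1: Context, c2: Context) -> bool:
    """Norm for retraction: retract if s, uttered at c1, is not true as assessed from c2."""
    return not s(c1, c2)


john_1982 = Context("John", 1982, likes_green_tea=False)
john_2019 = Context("John", 2019, likes_green_tea=True)

# Both assertions respect the norm of relative truth.
print(satisfies_norm_of_relative_truth(green_tea_is_not_tasty, john_1982))  # True
print(satisfies_norm_of_relative_truth(green_tea_is_tasty, john_2019))      # True
# In 2019 John must retract what he asserted in 1982.
print(must_retract(green_tea_is_not_tasty, john_1982, john_2019))           # True
```

The sketch returns a retraction obligation only because the second argument of must_retract can differ from the first. If the context of assessment were always identified with the context of utterance, must_retract would never apply to an assertion that satisfied the norm of relative truth.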

Notice that John’s retraction of his assertion made in 1982 is predicted only by relativism, which treats the adjective ‘tasty’ as assessment-sensitive. If ‘tasty’ is treated as an indexical expression, then John’s assertions in 1982 and in 2019 have two distinct contents, and there is no reason why in 2019 John ought to retract his assertion made in 1982, because his assertion made in 1982 is true. There is no reason why John ought to retract his assertion if ‘tasty’ is treated as a context-sensitive expression. In this case too John’s assertion made in 1982 is true, because the circumstance of evaluation of his 1982 assertion contains the taste that is operative for John in 1982. Retraction is made possible only if ‘tasty’ is assessment-sensitive, making it possible to assess an assertion made in a context of utterance with respect to parameters that are operative in another context (the context of assessment).

c. Faultless Disagreement

Even if one accepts MacFarlane’s explanation of the intelligibility of relativism, it remains an open question whether languages contain assessment-sensitive expressions. It is important, then, to clarify whether there are linguistic phenomena that relativism explains better than linguistic pragmatism, indexicalism, or minimalism. Relativists address a representative phenomenon: faultless disagreement. In a pre-theoretic sense there is faultless disagreement between two parties when they disagree about a speech act or an attitude and neither of them violates any epistemic or constitutive norm governing speech acts or attitudes.

Faultless disagreement is very helpful to model disputes about non-objective matters, for instance, disputes on aesthetic values like tastes. Such disputes show the distinctive linguistic traits of genuine disagreement when the parties involved say ‘No, that is false’, ‘What you are saying is false’, ‘You are wrong, I disagree with you’, and so on. However, many philosophers feel compelled to avoid the account of disagreement that characterizes matters of objective fact, which in subjective areas of discourse would impute implausible cognitive errors and chauvinism to the parties in disagreement.

First, it is important to identify what kinds of disagreement are made intelligible in different semantic theories. Then, given an area of discourse, one must ask which of these kinds of disagreement can be found in it. Thus, semantic theories can be assessed on the basis of which of them predicts the kind of disagreement that is present in that area of discourse.

By employing the notion of relative truth, MacFarlane defines the following notion of accuracy for attitudes and speech acts:

Assessment-sensitive accuracy: An attitude or speech act occurring at a context of utterance C1 is accurate, as assessed from a context of assessment C2, if and only if its content at C1 is true as assessed from C2.

Based on the notion of assessment-sensitive accuracy, MacFarlane defines the following notion of disagreement:

Preclusion of joint accuracy: Agent A disagrees with agent B if and only if the accuracy of the attitudes or speech acts of A, as assessed from a given context, precludes the accuracy of the attitudes or speech acts of B, as assessed from the same context.

There are also different senses in which an attitude or speech act can be faultless. One of them is the absence of violation of constitutive norms governing attitudes or speech acts. According to MacFarlane, the kind of faultless disagreement given by preclusion of joint accuracy together with absence of violation of constitutive norms of attitudes or speech acts is typical of disputes in non-objective matters like taste.

Consider the sentence ‘Green tea is tasty’. Relativism accommodates the idea that its truth depends on the subjective taste of the assessor. Whether green tea is tasty is not an objective state of affairs. Suppose John utters the sentence ‘Green tea is tasty’ and George utters the sentence ‘Green tea is not tasty’. John and George disagree to the extent that there is no context of assessment from which both John’s and George’s assertions are accurate, but neither of them violates the norm of relative truth or the norm for retraction. John’s assertion is accurate if assessed from John’s context of assessment, where John’s standard of taste is operative. George’s assertion is accurate if assessed from George’s context of assessment, where George’s standard of taste is operative. They are both faultless. Moreover, George will acknowledge that ‘Green tea is tasty’ is true if assessed from John’s standard of taste and vice versa. Finally, suppose that after trying green tea several times, George starts appreciating it. George now says:

(33) Green tea is tasty.

George must retract his previous assertion and say:

(34) What I said (about green tea) is false.

Relativism predicts this pattern of linguistic uses of the adjective ‘tasty’. On the contrary, other semantic theories cannot describe the dispute between John and George as a case of faultless disagreement defined as preclusion of joint accuracy and absence of violation of constitutive norms governing attitudes/speech acts.
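The pattern just described can be checked mechanically in a toy model. The sketch below is an assumption-laden illustration: the classes Context and Assertion, the reduction of a standard of taste to a single boolean, and the function names are all hypothetical. It tests preclusion of joint accuracy over a given collection of contexts of assessment, tests faultlessness as conformity to the norm of relative truth, and then applies the norm for retraction to George’s change of taste.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Context:
    """A context whose operative standard of taste is reduced to one boolean."""
    agent: str
    likes_green_tea: bool


@dataclass(frozen=True)
class Assertion:
    """An assertion about green tea made at a context of utterance."""
    made_at: Context
    says_tasty: bool  # True for 'Green tea is tasty', False for its negation


def accurate(x: Assertion, assessment: Context) -> bool:
    """Assessment-sensitive accuracy: the content is true as assessed from the context."""
    return x.says_tasty == assessment.likes_green_tea


def joint_accuracy_precluded(x: Assertion, y: Assertion,
                             contexts: Iterable[Context]) -> bool:
    """No listed context of assessment makes both assertions accurate."""
    return not any(accurate(x, c) and accurate(y, c) for c in contexts)


def faultless(x: Assertion) -> bool:
    """The assertion is accurate as assessed from its own context of utterance."""
    return accurate(x, x.made_at)


john_ctx = Context("John", likes_green_tea=True)
george_ctx = Context("George", likes_green_tea=False)
john_says = Assertion(john_ctx, says_tasty=True)
george_says = Assertion(george_ctx, says_tasty=False)

print(joint_accuracy_precluded(john_says, george_says, [john_ctx, george_ctx]))  # True
print(faultless(john_says), faultless(george_says))                              # True True

# George acquires the taste; his earlier assertion is now inaccurate for him.
george_later = Context("George", likes_green_tea=True)
print(not accurate(george_says, george_later))  # True: retraction is required
```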

Linguistic pragmatism and indexicalism affirm that John’s and George’s tastes have a content-determinative role. Uttered by John, ‘tasty’ means tasty in relation to John’s standard of taste, and uttered by George it means tasty in relation to George’s standard of taste. Therefore, the sentence ‘green tea is tasty’ has a different content in John’s context of utterance than in George’s, with the consequence that disagreement is lost.

Minimalism says that the content of ‘tasty’, the objective property of tastiness, is invariant through all contexts of utterance and its extension in a given possible world is invariant through all contexts of assessment. Therefore, either green tea is in the extension of ‘tasty’ or is not. In this case, John and George are in disagreement but at least one of them is at fault.

6. References and Further Reading

a. References

Bach, Kent, 1994. ‘Conversational Impliciture’, Mind and Language, 9: 124-162.
Bach, Kent, 1999. ‘The Semantics-Pragmatics Distinction: What It Is and Why It Matters’, in K. Turner (ed.), The Semantics-Pragmatics Interface from Different Points of View, Oxford: Elsevier, pp. 65-84.
Bach, Kent, 2001. ‘You Don’t Say?’, Synthese, 128: 15-44.
Bach, Kent, 2007. ‘The Excluded Middle: Minimal Semantics without Minimal Propositions’, Philosophy and Phenomenological Research, 73: 435-442.
Borg, Emma, 2004. Minimal Semantics, Oxford: Oxford University Press.
Borg, Emma, 2007. ‘Minimalism versus Contextualism in Semantics’, in Preyer & Peter (2007), pp. 339-359.
Borg, Emma, 2012. Pursuing Meaning, Oxford: Oxford University Press.
Borg, Emma, 2016. ‘Exploding Explicatures’, Mind and Language, 31(3): 335-355.
Breheny, Richard, 2004. ‘A Lexical Account of Implicit (Bound) Contextual Dependence’, in R. Young, and Y. Zhou (eds.), Semantics and Linguistic Theory (SALT) 13, pp. 55-72.
Cappelen, Herman, and Lepore, Ernie, 2002. ‘Indexicality, Binding, Anaphora and A Priori Truth’, Analysis, 62, 4: 271-81.
Cappelen, Herman, and Lepore, Ernest, 2005. Insensitive Semantics: A Defence of Semantic Minimalism and Speech Act Pluralism, Oxford: Blackwell.
Cappelen, Herman, and Hawthorne, John, 2009. Relativism and Monadic Truth, Oxford: Oxford University Press.
Carston, Robyn, 2002. Thoughts and Utterances: The Pragmatics of Explicit Communication, Oxford: Blackwell.
Carston, Robyn, 2009. ‘Relevance Theory: Contextualism or Pragmaticism?’, UCL Working Papers in Linguistics 21: 19-26.
Carston, Robyn, 2019. ‘Ad Hoc Concepts, Polysemy and the Lexicon’, in K. Scott, R. Carston, and B. Clark (eds.), Relevance, Pragmatics and Interpretation, Cambridge: Cambridge University Press, pp. 150-162.
Carston, Robyn, and Hall, Alison, 2012. ‘Implicature and Explicature’, in H. J. Schmid and D. Geeraerts (eds.), Cognitive Pragmatics, Vol. 4, Berlin: Mouton de Gruyter, pp. 47-84.
Clapp, Lenny, 2007. ‘Minimal (Disagreement about) Semantics’, in Preyer & Peter (2007) below, pp. 251-277.
Donaldson, Tom, and Lepore, Ernie, 2012. ‘Context-Sensitivity’, in D. G. Fara, and G. Russell (eds.), 2012, pp. 116-131.
Grice, Herbert Paul, 1989. Studies in the Way of Words, Cambridge, MA: Harvard University Press.
Kaplan, David, 1989a. ‘Demonstratives’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 481-563.
Kaplan, David, 1989b. ‘Afterthoughts’, in J. Almog, J. Perry, and H. Wettstein (eds.), Themes from Kaplan, Oxford: Oxford University Press, pp. 565-614.
Korta, Kepa, and Perry, John, 2007. ‘Radical Minimalism, Moderate Contextualism’, in Preyer & Peter (2007), pp. 94-111.
Kolbel, Max, 2002. Truth without Objectivity, London: Routledge.
Kompa, Nikola, 2002. ‘The Context-Sensitivity of Knowledge Ascriptions’, Grazer Philosophische Studien, 64: 79-96.
Leslie, Sarah-Jane, 2007. ‘How and Why to be a Moderate Contextualist’, in Preyer & Peter (2007), pp. 133-168.
MacFarlane, John, 2014. Assessment Sensitivity, Oxford: Oxford University Press.
Neale, Stephen, 2004. ‘This, That, and the Other’, in A. Bezuidenhout, and M. Reimer (eds.), Descriptions and Beyond, Oxford: Oxford University Press, pp. 68-182.
Penco, Carlo, and Vignolo, Massimiliano, 2019. ‘Some Reflexions on Conventions’, Croatian Journal of Philosophy, Vol. XIX, No. 57: 375-402.
Perry, John, 2001. Reference and Reflexivity, Stanford, CA: CSLI Publications.
Predelli, Stefano, 2005. Contexts: Meaning, Truth, and the Use of Language, Oxford: Oxford University Press.
Recanati, Francois, 2004. Literal Meaning, New York: Cambridge University Press.
Recanati, Francois, 2010. Truth-Conditional Pragmatics, Oxford: Clarendon Press.
Richard, Mark, 2008. When Truth Gives Out, Oxford: Oxford University Press.
Rothschild, Daniel, and Segal, Gabriel, 2009. ‘Indexical Predicates’, Mind and Language, 24, 4: 467-493.
Searle, John, 1978. ‘Literal Meaning’, Erkenntnis, 13: 207-224.
Soames, Scott, 2002. Beyond Rigidity: The Unfinished Semantic Agenda of Naming and Necessity, Oxford: Oxford University Press.
Sperber, Dan, and Wilson, Deirdre, 1986. Relevance: Communication and Cognition, Oxford: Blackwell.
Stanley, Jason, 2005a. Language in Context, Oxford: Oxford University Press.
Stanley, Jason, 2005b. Knowledge and Practical Interests, Oxford: Oxford University Press.
Stanley, Jason, and Williamson, Timothy, 1995. ‘Quantifiers and Context-Dependence’, Analysis, 55: 291-295.
Szabo, Zoltan Gendler, 2001. ‘Adjectives in Context’, in I. Kenesi, and R. Harnish (eds.), Perspectives on Semantics, Pragmatics, and Discourse, Amsterdam: John Benjamins, pp. 119-146.
Szabo, Zoltan Gendler, 2006. ‘Sensitivity Training’, Mind and Language, 21: 31-38.
Taylor, Kenneth, 2003. Reference and the Rational Mind, Stanford, CA: CSLI Publications.
Taylor, Kenneth, 2007. ‘A Little Sensitivity Goes a Long Way’, in Preyer & Peter (2007), pp. 63-92.
Travis, Charles, 2008. Occasion-Sensitivity: Selected Essays, Oxford: Oxford University Press.
Unnsteinsson, Elmar Geir, 2014. ‘Compositionality and Sandbag Semantics’, Synthese, 191: 3329–3350.

b. Further Reading

Bianchi, Claudia (ed.), 2004. The Semantic/Pragmatic Distinction, Stanford: CSLI.
- A collection on context-sensitivity.
Domaneschi, Filippo, and Penco, Carlo (eds.), 2013. What is Said and What is Not, Stanford: CSLI.
- A collection on context-sensitivity.
Fara, Delia Graff, and Russell, Gillian (eds.), 2012. The Routledge Companion to Philosophy of Language, New York: Routledge.
- A companion to the philosophy of language that covers many of the topics that are discussed in this encyclopedia article.
Garcia-Carpintero, Manuel, and Kolbel, Max (eds.), 2008. Relative Truth, Oxford: Oxford University Press.
- A collection on relativism.
Preyer, Gerhard, and Peter, George (eds.), 2007. Context-Sensitivity and Semantic Minimalism: New Essays on Semantics and Pragmatics, Oxford: Oxford University Press.
- A collection on minimalism.
Recanati, Francois, Stojanovic, Isidora, and Villanueva, Neftali (eds.), 2010. Context Dependence, Perspective, and Relativity, Berlin: De Gruyter.
- A collection on context-sensitivity.
Szabo, Zoltan Gendler (ed.), 2004. Semantics versus Pragmatics, Oxford: Oxford University Press.
- A collection on context-sensitivity.

Author Information

Carlo Penco
Email: penco@unige.it
University of Genoa
Italy

and

Massimiliano Vignolo
Email: massimiliano.vignolo@unige.it
University of Genoa
Italy

Constructivism in Metaphysics

Although there is no canonical view of “Constructivism” within analytic metaphysics, here is a good starting definition:

Constructivism: Some existing entities are constructed by us in that they depend substantively on us.

Constructivism is a broad view with many, more specific, iterations. Versions of Constructivism vary depending on who does the constructing (for example, all humans, an ideal subject, certain groups), on what is constructed (for example, concrete objects, abstract objects, facts), and on what the constructed entity is constructed out of (for example, natural objects, nonmodal stuff, concepts). Most Constructivists take the constructing relation to be constitutive, that is, it is part of the very nature of constructed objects that they depend substantively on humans. Some, however, take the constructing relation to be merely causal. Some versions of Constructivism are relativistic; others are not. Another key difference between versions of Constructivism concerns whether they take the constructing relation to be global in scope (so everything, or at least every object we have epistemic access to, is a constructed object) or local (so there are unconstructed objects as well as constructed ones).

Given the many dimensions along which versions of Constructivism differ, one might wonder what unites them—what, that is, do all versions of Constructivism have in common that marks them out as versions of Constructivism? Constructivists are united first in their opposition to certain forms of Realism—namely, those that claim that x exists and is suitably independent of us. Constructivists about x agree that x exists, but they deny that it is suitably independent of us. Constructivism is distinguished from other versions of anti-Realism by the emphasis it places on the constructing relation. Constructivists are united by all being anti-Realists about x and by believing this is due to x’s being, in some way, constructed by us.

What Is Constructivism?
20th-Century Global Constructivism in Analytic Metaphysics
21st-Century Local Constructivism in Analytic Metaphysics
Criticisms of Constructivism in Analytic Metaphysics
1. Coherence Criticisms
2. Substantive Criticisms
Evaluating Constructivism within Analytic Metaphysics
Timeline of Constructivism in Analytic Metaphysics
References and Further Reading

1. What Is Constructivism?

There is no canonical definition of “Constructivism” within philosophy. The following, however, can serve as a good starting point definition for understanding constructivism:

Constructivism: Some extant entities are constructed by us in that they depend substantively on us. (Exactly what it is for an entity to “depend substantively on us” varies between views.)

Constructivism can be further elucidated by noting that constructing is a three-place relation Cxyz (x constructs y out of z) which involves a constructor x (generally taken to be humans), a constructed entity y, and a more basic entity z which serves as a building block for the constructed entity. (Some would take constructing to be a four-place relation Cxyzt: x constructs y out of z at time t. To simplify, the time variable is left out of the relation. It is straightforward to add it in.) Each of the related terms is examined below before the examination of the constructing relation itself.

Regarding x, who does the constructing? There is no orthodox view regarding which humans do the constructing; different constructivists give different answers. Constructivists frequently (though not always) emphasize the role language and concepts play in constructing entities. Since language and concepts both arise at the level of the group, rather than the level of the individual, it is generally the group (for example, of language speakers or concept users) rather than the individual which is taken to be the constructor. (Lynne Rudder Baker, for example, is typical of Constructivists when she argues that constructed objects rely on our societal conventions as a whole, rather than on the views of any lone individual: “I would not have brought into existence a new thing, a bojangle; our conventions and practices do not have a place for bojangles. It is not just thinking that brings things into existence” (Baker 2007, 44). See also Thomasson (2003, 2007) and Remhof (2017).) Some Constructivists (for example, Kant) take the constructor to be all human beings; other Constructivists (for example, Goodman, Putnam) take the constructor to be a subset of all human beings (for example, society A, society B). There are some versions of Constructivism which take it to be individuals, rather than groups, which do the constructing. (See Goswick (2018a, 2018b).) These views are more likely to rely on overt responses (for example, how Sally responds when presented with some atoms arranged rockwise) than on language and concepts.

Regarding y, what is constructed? Versions of Constructivism within analytic philosophy can be distinguished based on which entities they focus on. Constructivism in the philosophy of science, for instance, tends to focus on the construction of scientific knowledge. (Scientific “Constructivists maintain that … scientific knowledge is ‘produced’ primarily by scientists and only to a lesser extent determined by fixed structures in the world” (Downes 1-2). See also Kuhn (1996) and Feyerabend (2010).) Constructivism in aesthetics focuses on the construction of an artwork’s meaning and/or on the construction of aesthetic properties more generally. (Aesthetic Constructivists argue that “rather than uncovering the meaning or representational properties of an artwork, an interpretation instead generates an artwork’s meaning” (Alward 247). See also Werner (2015).) Constructivism in the philosophy of mathematics focuses on mathematical objects. (Mathematical Constructivists argue that, when we claim a mathematical object exists, we mean that we can construct a proof of its existence (Bridges and Palmgren 2018).) Constructivism within ethics concerns the origin and nature of our ethical judgments and of ethical properties. (Ethical Constructivists argue that “the correctness of our judgments about what we ought to do is determined by facts about what we believe, or desire, or choose and not, as Realism would have it, by facts about a prior and independent normative reality” (Jezzi 1). Ethical Constructivism has been defended by Korsgaard, Scanlon, and Rawls. For an explication of their views, see Jezzi (2019) and Street (2008, 2010).) Social Constructivism focuses on the construction of distinctly social categories such as race, gender, and sexuality. (See Hacking (1986, 1992, 1999) and Haslanger (1995, 2003, 2012).) Constructivism in metaphysics focuses on the construction of physical objects. (See, for example, Baker (2004, 2007), Goodman (1978, 1980), Putnam (1982, 1987), Thomasson (2003, 2007).)

Regarding z, what is the more basic entity that serves as a building block for the constructed entity? There is no general answer to this question, as different versions of Constructivism give different answers. Some Constructivists (for example, Goswick) take physical stuff to be the basic building blocks of constructed entities. Goswick argues that modal objects are composite objects which have physical stuff and sort-properties as their parts (Goswick 2018b). Some Constructivists (for example, Goodman) take worlds to be the basic building blocks of constructed entities. Goodman argues that it is constructivism all the way down, so each world we construct is itself built out of other worlds.

Regarding C, what is the relation of constructing? Constructivists vary widely regarding the exact details of the constructing relation. In particular, versions of Constructivism vary with regard to whether the constructing relation is (1) global or local, (2) causal or constitutive, (3) temporally and counterfactually robust or not, and (4) relative or absolute. Each of these dimensions of difference is examined in turn.

Regarding 1, is the constructing relation global or local? Historically, the term “constructivism” has been associated with the global claim that every entity to which we have epistemic access is constructed. (Ant Eagle (personal correspondence) points out that there could be an even more global form of Constructivism which claims that all entities, even those to which we do not have epistemic access, are constructed. This is an intriguing view. However, since it has not yet been defended in analytic metaphysics, it is not discussed here.) Kant held this view, as did the main 20th-century proponents of Constructivism (Goodman and Putnam). In the 21st century, philosophers have explored a more local constructing relation in which only some of the entities we have epistemic access to are constructed. Searle, for instance, argues that social objects (for example, money, bathtubs) are constructed but natural objects (for example, trees, rocks) are not. Einheuser argues that modal objects are constructed but nonmodal stuff is not.

Regarding 2, is the constructing relation causal or constitutive? For example, when an author claims that we construct money does she mean that we bear a causal relation to money (that is, we play a causal role in bringing about the existence of money or in money’s having the nature it has) or does she mean that we bear a constitutive relation to money (that is, part of what it is for money to exist or for money to have the nature it has is for us to bear the constitutor-of relation to it)? We can define the distinction as follows: (See also Haslanger (2003, pp. 317-318) and Mallon (2019, p. 4).)

y is causally constructed by x iff x caused y to exist or to have the nature it has.

For example, we caused that $20 bill to come into existence when we printed it at the National Mint and we caused that $20 bill to have the nature it has when we embedded it in the American currency system.

y is constitutively constructed by x iff what it is for y to exist is for x to F or what it is for y to have the nature it has is for x to F.

For example, what it is for a stop sign to exist is for something with physical features P₁–Pₙ to play role r in a human society, and what it is for a y to have stop-sign-nature is, in part, for humans to treat y as a stop sign.

Some Constructivists (for example, Goodman, Putnam) do not discuss whether they intend their constructing to be causal or constitutive. (Presumably this is because the central aims they intend to accomplish by endorsing Constructivism can be satisfied via either a causal or a constitutive version. We can easily modify their views to be explicitly causal or explicitly constitutive. For a Constructivism that is causal, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken causally. For a Constructivism that is constitutive, endorse the standard Goodman/Putnam line and add to it that the constructing is to be taken constitutively.) Other Constructivists are explicit about whether the constructing relation they utilize is causal or constitutive. Thomasson, for example, notes that

The sort of dependence relevant to [Constructivism] is logical dependence, i.e. dependence which is knowable a priori by analyzing the relevant concepts, not a mere causal or nomological dependence. The very idea of an object’s being money presupposes collective agreement about what counts as money. The very idea of something being an artifact requires that it have been produced by a subject with certain intentions. (Thomasson 2003, 580)

Remhof argues that an object is constructed “iff the identity conditions of the object essentially depend on (i.e., are partly constituted by) our intentional activities” (Remhof 2014, 2). And Searle notes that “part of being a cocktail party is being thought to be a cocktail party; part of being a war is being thought to be a war. This is a remarkable feature of social facts; it has no analogue among physical facts” (Searle 33-34). (For more on constitutive versions of Constructivism, see Haslanger (2003) and Baker (2007, p. 12). For examples of Constructivisms which are causal, see Hacking (1999) and Goswick (2018b). Regarding Hacking, Haslanger notes: “The basis of Hacking’s social constructivism is the historical [constructivist] who claims that, ‘Contrary to what is usually believed, x is the contingent result of historical events and forces, therefore x need not have existed, is not determined by the nature of things, etc.’ … He says explicitly that construction stories are histories and the point, as he sees it, is to argue for the contingency or alterability of the phenomenon by noting its social or historical origins” (Haslanger 2003, 303).)

Regarding 3, is the constructing relation temporally and counterfactually robust or not? Temporal robustness concerns whether constructed entity e exists and has the nature it has prior to and posterior to our constructing it. If yes, then e is temporally robust; otherwise, e is not temporally robust. Counterfactual robustness concerns whether constructed entity e would exist and have the nature it has if certain things were different than they actually are, for example, if we had never existed or had had different conventions/responses/intentions/systems of classification than we actually have. If it would, then the constructing relation is counterfactually robust; otherwise, it is not. Some Constructivists (for example, Putnam, Goodman) deny that the constructing relation is temporally/counterfactually robust. They believe that before we existed there were no stars and that, if we employed different systems of classification, there would be no stars. Other Constructivists take the constructing relation to be temporally/counterfactually robust. Remhof, for instance, argues that even “if there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3). Schwartz adds that:

In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order. (Schwartz 1986, 436)

An advantage of taking the constructing relation to be temporally/counterfactually robust is that many find it difficult to believe that, for example, there were no stars before there were people or that there would not have been stars had people employed different systems of classification. A disadvantage of endorsing a temporally/counterfactually robust Constructivism is that it is difficult to give an account which is temporally/counterfactually robust but still respects the genuine role Constructivists take humans to play in constructing. After all, if the stars would have been there even if we never existed, why think we play any substantial role in constructing them? At the very least, any role we do play must be non-essential.

Regarding 4, is the constructing relation relative or absolute? Some philosophers (for example, Kant) take the constructing relation to be absolute. Kant thought that all humans, by virtue of being human, employed the same categories and thus created the same phenomena. Other philosophers (for example, Goodman and Putnam) take the constructing relation to be relative. Both argued that worlds exist only relative to a conceptual scheme. Although relativism is often associated with Constructivism (presumably because the most prominent Constructivists of the 20th century also happened to be relativists), the two views are orthogonal. There are relativist and absolutist versions of Constructivism. Moreover, it is easy to slightly tweak relativist views to make them absolutist, or to slightly tweak absolutist views to make them relativist.

At this point, four ways in which constructing relations can differ from one another have been examined: with regard to whether they are (i) global or local, (ii) causal or constitutive, (iii) temporally/counterfactually robust or not, and (iv) relativistic or absolute. The starting point definition of Constructivism is:

Constructivism: Some extant entities are constructed by us in that they depend substantively on us.

Exactly what it is for an entity to “depend substantively on us” varies between views. This definition holds up well to scrutiny. It captures the commonalities one finds across a wide swath of views across sub-disciplines of philosophy (for example, the philosophy of mathematics, aesthetics, metaphysics) and is general enough to accommodate the many differences between views (for example, some Constructivists take constructing to be constitutive, others take it to be merely causal; some Constructivists take the scope of Constructivism to be global, others take it to be very limited in scope and claim there are very few constructed entities). There is some worry, however, that—being so general—the given definition is too broad: are there any views that do not fall under the Constructivist umbrella?

Constructivism has historically been developed in opposition to Realism, and examining the tension between Constructivism and Realism can help us further understand Constructivism. Although the word “realism” is used widely within philosophy and different philosophers take it to mean different things, several fairly canonical uses have evolved: (i) the linguistic understanding of Realism advocated by Dummett which sees the question of Realism as concerning whether sentences have evidence-transcendent truth conditions or verificationist truth conditions, (ii) an understanding of Realism developed within the philosophy of science which centers on whether the aim of scientific theories is truth understood as correspondence to an external world, and (iii) an understanding of Realism developed within metaphysics which centers on whether x exists and is suitably independent of humans. The understanding of Realism relevant to elucidating Constructivism is this final one:

Ontological Realism (about x): x exists and is suitably independent of us.

Constructivism (about x) stands in opposition to Ontological Realism (about x). The Ontological Realist takes x to be “suitably independent of us,” whereas the Constructivist takes x to “depend substantively on us for either its existence or its nature.” Whatever suitable independence is, it rules out depending substantively on us. Although one does still hear philosophers talk simply of “Realism,” it has become far more common, within analytic metaphysics, to talk of “Realism about x” and to take Realism to be a first-order metaphysical view concerning the existence and/or human independence of specific types of entities (for example, properties, social objects, numbers, ordinary objects) rather than a general stance one has (concerning, for example, the purpose of philosophical investigation). Following this trend in the literature on Realism (that is, the move away from talking about Realism and anti-Realism in general to talking specifically of Realism about x) can help us make more precise the definition of Constructivism.

Constructivism (about x): x exists and depends substantively on us for either its existence or its nature.

This definition of Constructivism is still very general (that is, because it does not spell out what “depends substantively on” entails/requires). However, given that it is standard within the literature on Realism to give a definition which is general enough to encompass many different understandings of “suitably independent of” and that Constructivism has historically been developed in opposition to Realism, it makes sense to mimic this level of generality in defining Constructivism.

One last precisification is in order before we move on to discussing the details of specific versions of Constructivism. A wide array of differences track whether the constructing relation is taken to be global or local. Global and local versions of Constructivism differ with regard to when they were/are endorsed (global: in the 20th century versus local: subsequently), why they are endorsed (global: thinks Realism itself is somehow defective versus local: likes Realism in general but thinks there is at least one sort of object it can’t account for), and what the best objections to the view are (global: general objections to constructing versus local: specific objections regarding whether some x really is constructed). Given this, it is useful to separate our discussion of Constructivism into Global Constructivism and Local Constructivism.

Global Constructivism: For all existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.

Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.

2. 20th-Century Global Constructivism in Analytic Metaphysics

Who are the global constructivists? Who is it, that is, who argues that

[All physical objects we have epistemic access to are] constructed in a way that reflects our contingent needs and interests. [Global Constructivists think that we] can only make sense of there being a fact of the matter about the world after we have agreed to employ some descriptions of it as opposed to others, that prior to the use of those descriptions, there can be no sense to the idea that there is a fact of the matter “out there” constraining which of our descriptions are true and which false. (Boghossian 25, 32)

The number of Global Constructivists within analytic metaphysics is small. (Constructivism has a long and healthy history within Continental philosophy and is still much more widely discussed within contemporary Continental metaphysics than it is within contemporary analytic metaphysics. See Kant (1965), Foucault (1970), and Remhof (2017).) Scouring the literature will yield only a handful. The best-known proponents are Goodman and Putnam. Schwartz supported Goodman’s view in the 1980s and most recently wrote an article supporting the view in 2000. Kant (late 1700s) and James (early 1900s) were early proponents of the view. Rorty and Dummett each endorse the view in passing. These seven authors exhaust the list of analytic Global Constructivists. (Al Wilson (personal communication) suggests this list might be expanded to include Rudolf Carnap, Simon Blackburn, and Huw Price.) Their motivation for endorsing Global Constructivism lies in worries they have about the cogency of Realism. They think that, if Realism were true, we would have no way to denote objects or to know about them. Since we can denote objects and do have knowledge of them, Realism must not be the correct account of them. The correct account is, rather, Constructivism. Although their number is small, their influence, especially that of Goodman and Putnam, has reverberated within analytic metaphysics. The remainder of this section examines the views of each of the central defenders of Global Constructivism.

Goodman defended Global Constructivism in a series of articles and books clustering around the 1980s: Ways of Worldmaking (1978), “On Starmaking” (1980), “Notes on the Well-Made World” (1983), “On Some Worldly Worries” (1993). Goodman, himself, described his view as “a radical relativism under rigorous restraints, that eventuates in something akin to irrealism” (1978 x). He believed that there were many right worlds, that these worlds exist only relative to a set of concepts, and that the building blocks of constructed objects are other constructed objects: “Worldmaking as we know it always starts from worlds already on hand; the making is a remaking” (1978 6-7). Goodman thought that there is “no sharp line to be drawn between the character of the experience and the description given by the subject” (Putnam 1979, 604). Goodman is perhaps the most earnest and sincere defender of the global scope of Constructivism. Whereas others tend to find the idea that we construct, for example, stars nearly incoherent, Goodman finds the idea that we did not construct the stars nearly incoherent:

Scheffler contends that we cannot have made the stars. I ask him which features of the stars we did not make, and challenge him to state how these differ from features clearly dependent on discourse. … We make a star as we make a constellation, by putting its parts together and marking off its boundaries. … The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally. … That we can make the stars dance, as Galileo and Bruno made the earth move and the sun stop, not by physical force but by verbal invention, is plain enough. (Goodman 1980 213 and 1983 103)

Goodman takes the constructors of reality to be societies (rather than lone individuals). He takes constructing to be relative, so, for example, society A constructs books and plants, whereas, faced with the same circumstances, society B constructs food and fuel (Goodman 1983, 103). He does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Goodman’s motivation for endorsing Global Constructivism is that he thinks it is clear that we can denote and know about, for example, stars and he thinks we would not be able to do this were Realism true.

Schwartz defends Goodmanian Global Constructivism in two articles: “I’m Going to Make You a Star” (1986) and “Starting from Scratch: Making Worlds” (2000). Since Goodman’s writings on constructivism can often be difficult to understand, examining Schwartz’s writings can serve to give us further insight into Goodman’s view. Schwartz writes that:

In shaping the concepts and classification schemes we employ in describing our world, we do take part in constituting what that reality is. Whether there are stars, and what they are like, … are facts that are carved out in the very process of devising perspicuous theories to aid in understanding our world. … Until we fashion star concepts and related categories, and integrate them into ongoing theories and speculations, there is no interesting sense in which the facts about stars are really one way rather than another. (Schwartz 1986, 429)

Schwartz emphasizes the role we play in making it the case that certain properties are instantiated and, thus, in drawing out ordinary objects from the mass of undifferentiated stuff which exists independently of people:

In natura rerum there are no inherent facts about the properties [x] has. It is no more a star, than it is a Big Dipper star and belongs to a constellation. … From the worldmaker’s perspective, the unmade world is a world without determinate qualities and shape. Pure substance, thisness, or Being may abound, but there is nothing to give IT specific character. (Schwartz 2000, 156)

Schwartz notes that, “no argument is needed to show that we do have some power to create by conceptualization and symbolic activity. Poems, promises, and predictions are a few obvious examples” (Schwartz 1986, 428). For example, it is uncontroversial that part of what it is to be a Scrabble joker (one of those blank pieces of wood that you can use as any letter when playing the game of Scrabble) is to be embedded in a certain human context: “These bits of wooden reality could no more be Scrabble jokers without the cognitive carving out of the features and dimensions of the concept, than they could be Scrabble jokers had they never been carved from the tree” (Schwartz 1986, 430-431). Schwartz, and Global Constructivists in general, differ from non-constructivists in that they think all ordinary objects (and, in fact, all the objects we have epistemic access to) are like Scrabble jokers. Of course, there is something that exists independently of us. But this something is amorphous, undefined, and plays no role in our epistemic lives. What we are aware of is the objects we create out of this mass by the (often unconscious) imposition of our concepts.

The other key defender of Global Constructivism is Putnam. Like Goodman, Putnam defended Global Constructivism in a series of articles and books which cluster around the 1980s; see, for example, “Reflections on Goodman’s Ways of Worldmaking” (1979), Reason, Truth, and History (1981), “Why There Isn’t a Ready-Made World” (1982), and The Many Faces of Realism (1987). Putnam thinks philosophy should look to science, and he shares the Positivists’ skepticism about traditional metaphysics:

There is … nothing in the history of science to suggest that it either aims at or should aim at one single absolute version of “the world”. On the contrary, such an aim, which would require science itself to decide which of the empirically equivalent successful theories in any given context was “really true”, is contrary to the whole spirit of an enterprise whose strategy from the first has been to confine itself to claims with clear empirical significance. … Metaphysics, or the enterprise of describing the “furniture of the world”, the “things in themselves” apart from our conceptual imposition, has been rejected by many analytic philosophers. … apart from relics, it is virtually only materialists [i.e. physicalists] who continue the traditional enterprise. (Putnam 1982 144 and 164)

Contrary to Putnam’s hopes, in the twenty-first century the materialists have won, and most metaphysicians recognize the sharp subject/object divide that Putnam rejected. Putnam argues that objects “do not exist independently of conceptual schemes. We cut up the world into objects when we introduce one scheme or another” (Cortens 41). Putnam takes the constructors of reality to be societies, the constructing to be relative, and does not comment on whether the constructing relation is causal or constitutive. Like all relativistic versions of Constructivism, his view is not temporally/counterfactually robust. Putnam’s motivation for endorsing Global Constructivism is that he rejects the sharp division between object and subject which Realism presupposes. He thinks analytic philosophy erred when it responded to 17th-century science by introducing a distinction between primary and secondary qualities (Putnam 1987). He argues that we should instead have taken everything that exists to be a muddled combination of the objective and subjective; there is no way to neatly separate out the two. By recognizing the role we play in constructing objects, Global Constructivism pays homage to this lack of separation; Realism does not. Thus, Putnam prefers Global Constructivism to Realism. (See Hale and Wright (2017) for further discussion of Putnam’s rejection of Realism.)

Other adherents of Global Constructivism include Kant, James, Rorty, and Dummett. (See Kant (1965), James (1907), Rorty (1972), and Dummett (1993).) In “The World Well Lost” (1972), Rorty argues that “the realist true believer’s notion of the world is an obsession rather than an intuition” (Rorty 661). He endorses an account of alternative conceptual frameworks which draws heavily on continental philosophers (Hegel, Kant, Heidegger), as well as on Dewey. Ultimately, he concludes that we should stop focusing on trying to find an independent world that is not there and should recognize the role we play in constructing the world. In Frege: Philosopher of Language (1993), Dummett argues that the “picture of reality as an amorphous lump, not yet articulated into discrete objects, thus proves to be a correct one. [The world does not present] itself to us as already dissected into discrete objects” (Dummett 577). Rather, in the process of developing language, we develop the criterion of identity associated with each term and then, with this in place, the world is individuated into distinct objects.

The heyday of analytic Global Constructivism was the 1980s. No one in analytic metaphysics has defended the view since Schwartz’s defense in 2000. The view has now more or less been abandoned. Remhof discussed the view in 2014, but he did not endorse it. However, Global Constructivism continues to be influential in discussions, where it serves primarily as a rallying point for the Realists who argue against it (see, for example, Devitt (1997) and Boghossian (2006)). Although there are no contemporary Global Constructivists, Local Constructivism, which is an heir to Global Constructivism, is alive and well. The next section examines the many versions of Local Constructivism which proliferate in the twenty-first century.

3. 21st-Century Local Constructivism in Analytic Metaphysics

You will not find the term “constructivism” bandied about within contemporary analytic metaphysics with anything approaching the frequency with which the term is used in other sub-disciplines of analytic philosophy or within Continental philosophy. (Why is the term “constructivism” not used more frequently in contemporary analytic metaphysics? The reluctance to use the term probably stems from the current sociology of analytic metaphysics. Realism has a strong grip on analytic metaphysics. Moreover, many anti-Realist metaphysics writings are strikingly bad, and most philosophers currently working within analytic philosophy can easily recall the criticism that was directed toward Global Constructivism: “Barring a kind of anti-realism that none of us should tolerate” (Hawthorne 2006, 109). “[Constructivism] is such a bizarre view that it is hard to believe that anyone actually endorses it” (Boghossian 25). “We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine” (Devitt 2010, 105). These factors conspire to make contemporary analytic metaphysics a particularly unappealing place to launch any theory which might smell of anti-Realism, and to be a Constructivist about x is to be an anti-Realist about x.) However, if one looks at the content of views within analytic metaphysics rather than at what the views are labeled, it quickly becomes apparent that many of them meet the definition of Local Constructivism.

Local Constructivism: For only some existing xs to which we have epistemic access, x depends substantively on us for either its existence or its nature.

Although they may be Realists about many kinds of entities (and may self-identify as “Realists”), many metaphysicians of the twenty-first century are Constructivists about at least some kinds of entities. (See, for example, Baker (2004 and 2007), Einheuser (2011), Evnine (2016), Goswick (2018a), Kriegel (2008), Searle (1995), Sidelle (1989), Thomasson (2003 and 2007), Varzi (2011).) Let’s consider the views of several of these metaphysicians. In particular, let’s look at Local Constructivism with regard to vague objects (Heller), modal objects (Sidelle, Einheuser, Goswick), composite objects (Kriegel), artifacts (Searle, Thomasson, Baker, Devitt), and objects with conventional boundaries (Varzi).

Although not himself a Constructivist, in The Ontology of Physical Objects (1990) Heller presents a view which is a close ancestor of contemporary Local Constructivism. Since a minor tweak turns his view into Local Constructivism, since he was one of the first in the general field of Local Constructivism, and since his work has been so influential on contemporary Local Constructivists, it is worth taking a quick look at exactly what Heller says and why Local Constructivists have found inspiration in his book. Heller distinguishes between what he calls “real objects” and what he calls “conventional objects.” Real objects are four-dimensional hunks of matter which have precise spatiotemporal boundaries; we generally do not talk or think about real objects (since we tend not to individuate so finely as to denote objects with precise spatiotemporal boundaries). “Conventional object” is the name Heller gives to objects which we think exist, but do not really (due to the fact that, if they did exist they would have vague spatiotemporal boundaries and nothing that exists has vague spatiotemporal boundaries) (Heller 47). For example, Heller thinks there is no statue and no lump of clay:

The [purported] difference [between the statue and the clay] is a matter of convention. … This difference cannot reflect a real difference in the objects. There is only one object in the spatiotemporal region claimed to be occupied by both the statue and the lump of clay. There … are no coincident entities; there are just … different conventions applicable to a single physical object. (Heller 32)

What really exists (in the rough vicinity we intuitively think contains the statue) are many precise hunks of matter. None of these hunks is a statue or a lump of clay (because “statue” and “lump of clay” are both ordinary language terms which are not precise enough to distinguish between, for example, two hunks of matter which differ only with regard to the fact that one includes, and the other excludes, atom a), but we mistakenly think there is a statue (where really there are just these various hunks of matter). Heller is an Eliminativist about conventional objects: there are none. However, it is a short step from Heller’s Eliminativism about vague objects to Constructivism about vague objects. The framework is in place; Heller has already provided a thorough account of the difference between nonconventional objects (hunks of matter) and conventional objects (objects—such as rocks, dogs, mountains, and necklaces—which have vague spatiotemporal boundaries) and of how our causal interaction with nonconventional objects gives rise to our belief that there are conventional objects. To be a Constructivist rather than an Eliminativist about Heller’s conventional objects, one need only argue, contra Heller, that our conventions in fact bring new objects—objects which are constructed out of hunks of matter and our conventions—into existence. (Just to re-iterate, Heller is opposed to this: “There are other alternatives that can be quickly discounted. For instance, the claim that we somehow create a new physical object by passing legislation involves the absurd idea that without manipulating or creating any matter we can create a physical object” (Heller 36). However, by so thoroughly examining nonconventional objects, conventional objects, and the relationship between them, he laid the groundwork for the Local Constructivists that would come after him.)

Local Constructivists about modal objects share Heller’s skepticism about the ability of Realism to account for ordinary objects. However, whereas Heller worries that ordinary objects have vague spatiotemporal boundaries but that all objects that really exist have precise spatiotemporal boundaries and resolves this worry by being an Eliminativist about ordinary objects, Local Constructivists about modal objects worry that ordinary objects have “deep” modal properties but that all objects that Realism is true of have at most “shallow” modal properties. (Where a “deep” modal property is any de re necessity or de re possibility which is non-trivial and a “shallow” modal property is any modal property which is not “deep.” See Goswick (2018b) for a more detailed discussion.) Rather than being Eliminativists about ordinary objects, they resolve this worry by endorsing Local Constructivism about objects which have at least one “deep” modal property (henceforth, such objects will be referred to as “modal objects”).

Sidelle and Einheuser both defend Local Constructivism about modal objects. Sidelle’s goal in his (1989) is to defend a conventionalist account of modality. He argues that conventionalism about modality requires Constructivism about modal objects (1989 77). He relies on (nonmodal) stuff as the basic building block out of which modal objects are constructed: “[The] conventionalist should … say that what is primitively ostended is ‘stuff’, stuff looking, of course, just as the world looks, but devoid of modal properties, identity conditions, and all that imports. For a slogan, one might say that stuff is preobjectual” (1989 54-55). Modal objects come to exist when humans provide individuating conditions. It is because we respond to stuff s as if it is a chair and apply the label “chair” to it that there is a chair with persistence conditions c rather than just some stuff. Einheuser’s goal in her (2011) is to ground modality. She argues that the best way to do this is to endorse a conceptualist account of modality and that so doing requires endorsing Constructivism about modal objects. Like Sidelle, she endorses preobjectual stuff: “the content of the spatio-temporal region of the world occupied by an object [is] the stuff of the object” (Einheuser 303). She argues that this stuff “does not contain … built-in persistence criteria. … It is ‘objectually inarticulate’” (Einheuser 303). Modal objects are created out of such mere stuff by the imposition of our concepts:

Concepts like statue and piece of alloy impose persistence criteria on portions of material stuff and thereby “configure” objects. That is, they induce objects governed by these persistence criteria. Our concept statue is associated with one set of persistence criteria. Applied to a suitable portion of stuff, the concept statue configures an object governed by these criteria. (Einheuser 302)

Einheuser emphasizes the fact that what we are doing is creating a new object (a piece of alloy) rather than adding modal properties to pre-existing stuff. (Einheuser on why we must be Local Constructivists about modal objects rather than Local Constructivists about only modal properties: “There is the view that our concepts project modal properties onto otherwise modally unvested objects. This view appears to imply that objects have their modal properties merely contingently. [The piece of alloy may be necessarily physical] but that is just a contingent fact about [it] for our concepts might have projected a different modal property [on to it]. That seems tantamount to giving up on the idea of de re necessity. … The conceptualist considered here maintains conceptualism not merely about modal properties but about objects: Concepts don’t project modal properties onto objects. Objects themselves are, in a sense to be clarified, projections of concepts” (302).)

Kriegel endorses Local Constructivism about composite objects. He takes Realism to be true of non-composite objects and uses them as the basic building blocks of his composite objects. He worries that, given Realism, there is simply no fact of the matter regarding whether the xs compose an o (Kriegel 2008). He argues that we should be conventionalists about composition: “the xs compose an o iff the xs are such as to produce the response that the xs compose an o in normal intuiters under normal forced-choice conditions” (Kriegel 10). A side effect of this conventionalism about composition is Local Constructivism about composite objects: Kriegel is a Realist about some physical entities r (the non-composite objects) to which we have epistemic access, and he thinks that by acting in some specified way (having the composition intuition) with regard to these physical entities we thereby bring new physical objects (the composite ones) into existence.

Local Constructivism about artifacts is the most widespread form of Local Constructivism. It is endorsed by Searle, Thomasson, Baker, and Devitt, among others. (See also Evnine (2016).) Searle is a Realist about natural objects such as Mt. Everest, bits of metal, land, stones, water, and trees (Searle 153, 191, 4). He is a Constructivist about artifactual objects such as money, cars, bathtubs, restaurants, and schools (Searle xi, 4). He takes the natural objects to be the basic building blocks of the artifactual ones:

[The] ontological subjectivity of the socially constructed reality requires an ontological objective reality out of which it is constructed, because there has to be something for the construction to be constructed out of. To construct money, property, and language, for example, there have to be the raw materials of bits of metal, paper, land, sounds, and marks. And the raw materials cannot in turn be socially constructed without presupposing some even rawer materials out of which they are constructed, until eventually we reach a bedrock of brute physical phenomena independent of all representations. (Searle 191)

Thomasson’s Local Constructivism about artifacts arises from her easy ontology. She claims that terms have application and co-application conditions and that, when these conditions are satisfied, the term denotes an object of kind k (Thomasson 2007). Although humans set the application and co-application conditions for natural kind terms such as “rock,” humans play no role in making it the case that these conditions are satisfied. Thus, Realism about natural objects is true. However, with regard to artifactual kind terms such as “money,” humans both set the application and co-application conditions for the term and play a role in making it the case that these conditions are satisfied: “The very idea of something being an artifact requires that it have been produced by a subject with certain intentions” (Thomasson 2003, 580). Intentions, alone, however are not enough:

Although artifacts depend on human beliefs and intentions regarding their nature and their existence, the way they are also partially depends on real acts, e.g. of manipulating things in the environment. Many of the properties of artifacts are determined by physical aspects of the artifacts without regard for our beliefs about them. (Thomasson 2003, 581)

Every concrete artifact includes unconstructed properties which serve as the basis for the object’s constructed properties.

Baker distinguishes between what she calls “ID objects” and non-ID objects. ID objects are objects—such as stop signs, tables, houses, driver’s licenses, and hammocks—that could not exist in a world lacking beings with beliefs, desires, and intentions (Baker 2007, 12). Non-ID objects are objects which could exist in a world which lacked such beliefs, desires, and intentions, for example, dinosaurs, planets, rocks, trees, dogs. Artifacts are ID objects. They are constructed out of our doing certain things to and having certain attitudes toward non-ID objects.

When a thing of one primary kind is in certain circumstances, a thing of another primary kind—a new thing, with new causal powers—comes to exist. [Sometimes this new thing is an ID object.] For example, when an octagonal piece of metal is in circumstances of being painted red with white marks of the shape S-T-O-P, and is in an environment that has certain conventions and laws, a new thing—a traffic sign—comes into existence. (Baker 2007, 13)

Baker advocates a constitution theory according to which coinciding objects stand in a hierarchical relation of constitution. Aggregates are fundamental, non-ID objects, and serve as the ground-level building blocks out of which all ID objects, including artifacts, are built: “Although … thought and talk make an essential contribution to the existence of certain objects [e.g., artifacts], … thought and talk alone [do not] bring into existence any physical objects: conventions, practices, and pre-existing materials [i.e., non-ID aggregates] are also required” (Baker 2007, 46). (Unlike nearly all the other advocates of Local Constructivism about artifacts, Baker does not take constructed objects to be inferior to non-constructed ones: “An artifact has as great a claim as a natural object to be a genuine substance. This is so because artifactual kinds are primary kinds. Their functions are their essences” (Baker 2004, 104).)

Devitt is another defender of Local Constructivism about artifacts. He distinguishes between artifactual objects whose “natures are functions that involve the purposes of agents” (Devitt 1997, 247) and natural objects whose nature is not such a function: “A hammer is a hammer in virtue of its function for hammerers. A tree is not a tree in virtue of its function” (Devitt 1997, 247). Devitt argues that every constructed artifact can also be described as a natural object which is not constructed: “Everything that is [an artifact] is also a [natural object]; thus, a fence may also be a row of trees” (Devitt 1997, 248). He is at pains to distance his Local Constructivism from Global Constructivism and emphasizes the role unconstructed objects play in bringing about the existence of constructed objects:

No amount of thinking about something as, say, a hammer is enough to make it a hammer. … Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer. [Only] things of certain physical types could be [hammers]. In this way [artifacts] are directly dependent on the [unconstructed] world. (Devitt 1997, 248-249)

The final version of Local Constructivism to be examined is Varzi’s Local Constructivism about objects with conventional boundaries. Varzi distinguishes between objects with natural boundaries and those with conventional boundaries. He argues that, “If a certain entity enjoys natural boundaries, it is reasonable to suppose that its identity and survival conditions do not depend on us; it is a bona fide entity of its own” (Varzi 137). On the other hand, if an entity’s “boundaries are artificial—if they reflect the articulation of reality that is effected through human cognition and social practices—then the entity itself is to some degree a fiat entity, a product of our world-making” (Varzi 137). Varzi is quick to point to the role objects with natural boundaries play in our construction of objects with conventional boundaries: “the parts of the dough [the objects with natural boundaries] provide the appropriate real basis for our fiat acts. [They] are whatever they are [independently of us] and the relevant mereology is a genuine piece of metaphysics” (Varzi 145). Varzi also emphasizes the compatibility of Local Constructivism with a generally Realist picture:

It is worth emphasizing that even a radical [constructivist] stance need not yield the nihilist apocalypse heralded by postmodern propaganda. [Constructed objects] lack autonomous metaphysical thickness. But other individuals may present themselves. For instance, on a Quinean metaphysics, there is an individual corresponding to “the material content, however heterogeneous, of some portion of space-time, however disconnected and gerrymandered”. … Such individuals are perfectly nonconventional, yet the overall [Quinean] picture is one that a [constructivist] is free to endorse. (Varzi 147-148)

Having examined five versions of Local Constructivism—constructivism about vague objects, modal objects, composite objects, artifacts, and objects with conventional boundaries—I turn now to describing what all these views have in common that marks them out as constructivist views. Taking note of what each view takes to be unconstructed and what each view takes to be constructed can provide insight into what all the views have in common:

Author	Unconstructed Entities	Constructed Entities
neo-Hellerian	4D hunks of matter	vague objects
Sidelle/Einheuser/Goswick	nonmodal stuff	modal objects
Kriegel	simple objects	composite objects
Searle/Thomasson/Baker/Devitt	natural objects	artifactual objects
Varzi	natural boundaries	conventional boundaries

The definitive thing that each version of Local Constructivism has in common that makes it a Local Constructivist view is that (i) each takes there to be something unconstructed to which we have epistemic access, and (ii) each thinks that by acting in some specified way with regard to these unconstructed entities we thereby bring new physical objects (the constructed ones) into existence. The views differ with regard to what they think the unconstructed entities are and with regard to what they think we have to do in order to utilize these unconstructed entities to construct new entities, but they are all alike in endorsing (i) and (ii). This is what marks them out as local and constructivist. They are local—rather than global—in scope because they all think only some of the entities that we have epistemic access to are constructed. They are Constructivist—rather than Realist—about vague objects or modal objects or … objects because they take these entities to depend substantially (either causally or constitutively) on us for either their existence or nature.

Broadly speaking, all Local Constructivists share the same motivation for endorsing Constructivism—namely, they think that although Realism is generally a good theory there are little bits of the world that it cannot account for. Although Local Constructivists tend to be fond of Realism, they are even fonder of certain entities which they take Realism to be unable to accommodate. They resolve this tension (that is, between the desire to be Realists and the desire to have entities e in their ontology) by endorsing Local Constructivism about entities e. The appeal of Local Constructivism springs from an inherent tension between naturalism and Realism. Most analytic metaphysicians of the twenty-first century are naturalists: they think that metaphysics should be compatible with our best science, that philosophy has much to learn from studying the methods used in science, and that, at root, the basic entities philosophy puts in its ontology had better be ones that are scientifically respectable (quarks, leptons, and forces are in; God, dormative powers, and Berkeleyan ideas are out). It is not obvious, however, that there is a place within our best science for the ordinary objects we know and love. (“We have already seen that ordinary material objects tend to dissolve as soon as we acknowledge their microscopic structure: this apple is just a smudgy bunch of hadrons and leptons whose exact shape and properties are no more settled than those of a school of fish” (Varzi 140).) Metaphysicians’ naturalism inclines them to be Realists only about those entities our best science countenances. (Searle, for example, wonders how there can “be an objective world of money, property, marriage, governments, elections, football games, cocktail parties, and law courts in a world that consists entirely of physical particles in fields of force” (Searle xi).) 
They worry that there is no room within this naturalistic picture of the world for, for example, modal objects, composite objects, or artifacts. This places them in a bind: they do not want to abandon naturalism or Realism, but they also do not want to exclude entities e (whose existence/nature is not countenanced by naturalistic Realism) from their ontology. Given this underlying situation, analytic metaphysicians often end up endorsing Local Constructivism for some entities, because doing so allows them to include such objects in their ontology whilst recognizing that they are defective in a way many other objects included in their ontology are not (that is, because their existence or nature depends on us in some way the existence/nature of other objects does not). (This discussion of Local Constructivism has focused on concrete objects. There is also a literature concerning the construction of abstract objects. See, for example, Levinson (1980), Thomasson (1999), Irmak (2019), Korman (2019).)

4. Criticisms of Constructivism in Analytic Metaphysics

The previous two sections examined two central versions of Constructivism within analytic metaphysics and provided overviews of the works of their most prominent adherents. The article concludes by asking what—all things considered—we should make of Constructivism in analytic metaphysics. Before the question can be answered, there must be an examination of the central criticisms of Constructivism. These criticisms can be divided into two main sorts: (1) coherence criticisms—which argue that Constructivism is in some way internally flawed to the extent that we cannot form coherent, evaluable versions of the view, and (2) substantive criticisms—which take Constructivism to be coherent and evaluable, but argue that we have good reason to think it is false.

a. Coherence Criticisms

Consider these four coherence criticisms: (i) Constructivism is not a distinct view, (ii) The term “constructivism” is too over-used to be valuable, (iii) Constructivism is too metaphorical, and (iv) Constructivism is incoherent.

Consider, first, whether Constructivism is a distinct view within the anti-Realist family of metaphysical views. Meta-ethicists, for instance, sometimes worry about whether Ethical Constructivism is sufficiently distinct from other views (for example, emotivism or response-dependence) within ethics. (See, for example, Jezzi (2019) and Street (2008 and 2010).) Does a similar worry arise with regard to Constructivism in analytic metaphysics? It does not. Constructivism is a broad view within anti-Realism; there are many more specific versions of it, but Constructivism is sufficiently distinct from other anti-Realist views. It is not, for example, Berkeleyan Idealism (that is, because Berkeleyan Idealism requires that God play a central role in determining what exists and Constructivism has no such reliance on God) or Eliminativism (that is, because Eliminativists about x deny that x exists, whereas Constructivists about x claim that x exists).

Consider, next, whether the term “constructivism” is too over-used to be valuable. Haslanger notes that, “The term ‘social construction’ has become commonplace in the humanities. [The] variety of different uses of the term has made it increasingly difficult to determine what claim authors are using it to assert or deny” (Haslanger 2003, 301-302). The term “constructivism” certainly is not over-used within analytic metaphysics. If anything, it is underused; authors only very rarely use the term “constructivism” to refer to their own views. We need not fear that the variety of uses which plagues the humanities in general will be an issue in analytic metaphysics. The term is uncommon within analytic metaphysics; and there is value in introducing the label within analytic metaphysics—as such labels serve to emphasize the similarity both in content and in underlying motivation between views whose authors use quite disparate terms to identify their own views.

Consider, third, whether “constructivism,” as used in analytic metaphysics, is too metaphorical. This criticism has been directed primarily at Global Constructivism. Understandably, when, for instance, Goodman writes, “The worldmaking mainly in question here is making not with hands but with minds, or rather with languages or other symbol systems. Yet when I say that worlds are made, I mean it literally” (Goodman 1980 213), we want to know exactly what it is to literally make a world with words—it is difficult to parse this phrase if we do not take either the making or the world to be metaphorical. Global Constructivists, themselves, often stress—as Goodman does in the above passage—that they mean their views to be taken non-metaphorically: we really do construct the stars, the planets, and the rocks. Critics of Global Constructivism, however, often find it almost irresistible to take the writings of Global Constructivists to be metaphorical: “The anti-realist [Constructivist] is of course speaking in metaphor. If we took him to be speaking literally, what he says would be wildly false—so much so that we would question his sanity” (Devitt 2010, 237—quoting Wolterstorff). There is something to the worry that what Global Constructivists say is just so radical (and frequently, so convoluted) that the only way we can make any sense of it at all is to take it metaphorically (regardless of whether its proponents intend us to take it this way).

A final coherence criticism is that Constructivism is simply incoherent: we cannot make enough sense of what the view is to be in a position to evaluate it. This criticism takes various forms, including that Constructivism (a) is incompatible with what we know about our terms; (b) relies on a notion of a conceptual scheme which is, itself, incoherent; (c) requires unconstructed entities of a sort Global Constructivism cannot accept; (d) relies on a notion of unconstructed objects which is itself contradictory; and (e) allows for the construction of incompatible objects.

Consider, first, the claim that Constructivism is incompatible with what we know about our terms. Boghossian, for example, writes:

Isn’t it part of the very concept of an electron, or of a mountain, that these things were not constructed by us? Take electrons, for example. Is it not part of the very purpose of having such a concept that it is to designate things that are independent of us? If we insist on saying that they were constructed by our descriptions of them, don’t we run the risk of saying something not merely false but conceptually incoherent, as if we hadn’t quite grasped what an electron was supposed to be? (Boghossian 39)

The idea behind Boghossian’s worry is that linguistic and conceptual competence reveal to us that the term “electron” and the concept electron denote something which is independent of us. If so, then any theory that proposes that electrons depend on us is simply confused about the meaning of the term “electron” or, more seriously, about the nature of electrons. There are a variety of ways one can address this concern. One could argue that externalism is true and, thus, that competent users can be radically mistaken about what their terms refer to and still successfully refer. Historically, we have often been mistaken both about what exists and about what the nature of existing objects is. We were able to successfully refer to water even when we thought it was a basic substance (rather than a compound of hydrogen and oxygen) and we can refer successfully to electrons even if we are deeply mistaken about their nature, that is, we think they are independent entities when they are really dependent entities. The more serious version of Boghossian’s worry casts it as a worry about changing the subject matter rather than as a worry about reference. It may be that electrons-which-depend-on-us are so radically different from what we originally thought electrons were that Constructivists (who claim electrons so depend) are (i) proposing Eliminativism about electrons-which-are-independent-of-us, and (ii) introducing an entirely new ontology, namely electrons-which-depend-on-us. (See Evnine (2016) for arguments that taking electrons to depend on humans changes the subject matter so radically that Eliminativism is preferable.) The critic could press this point, but it is not very convincing. To see this, hold a rock in your hand.
On the most reasonable way of casting the debate, the Realist to your right and the Constructivist to your left can both point to the rock and utter, “we have different accounts of the existence and nature of that rock.” It is uncharitable to interpret them as talking about different objects, rather than as having different views about the same object. Boghossian overestimates the extent of our knowledge of, for example, the term “electron,” the concept electron, and the objects electrons. We are not so infallible with regard to such terms, concepts, and objects that views which dissent from the mainstream Realist position are simply incoherent.

Consider, next, the criticism that Constructivism relies on a notion of a conceptual scheme which is, itself, incoherent. Goodman and Putnam both endorsed relativistic versions of Global Constructivism which rely on different cultures having different conceptual schemes and on the idea that truth can be relative to a conceptual scheme. Davidson (1974) attacks the intelligibility of truth relative to a conceptual scheme. Cortens (2002) argues that, “Many relativists run into serious trouble on this score; rarely do they provide a satisfactory explanation of just what sort of thing a conceptual scheme is” (Cortens 46). Although there are responses to this criticism, they are not presented here. (See the entries for Goodman, Putnam, and Schwartz in the bibliography.) Goodman/Putnam’s Global Constructivism is a dated view, and contemporary versions of Constructivism do not utilize the old-fashioned notion of a conceptual scheme or of truth relative to a conceptual scheme.

Another criticism which attacks the coherence of Constructivism is the claim that Constructivism requires unconstructed entities of a sort Global Constructivism cannot accept. Boghossian (2006) and Scheffler (2009) argue that Constructivism presupposes the existence of at least some unconstructed objects which we have epistemic access to. If this is correct, then Global Constructivism is contradictory, that is, since it would require unconstructed objects we have epistemic access to (to serve as the basis of our constructing) whilst also claiming that all objects we have epistemic access to are constructed:

If our concepts are cutting lines into some basic worldly dough and thus imbuing it with a structure it would not otherwise possess, doesn’t there have to be some worldly dough for them to work on, and mustn’t the basic properties of that dough be determined independently of all this [constructivist] activity? (Boghossian 2006, 35)

There are various answers Constructivists can give to this worry. Goodman, for instance, insists that everything is constructed:

The many stuffs—matter, energy, waves, phenomena—that worlds are made of are made along with the worlds. But made from what? Not from nothing, after all, but from other worlds. Worldmaking as we know it always starts from worlds already on hand; the making is a remaking. (Goodman 1978, 6-7)

Goodman’s view may be hard to swallow, but it is not internally inconsistent. Another approach is to argue that although all objects are constructed, there are other types of entities (for example, Sidelle’s nonmodal stuff, Kant’s noumena) which are not constructed. (See also Remhof (2014).)

A fourth incoherence criticism is that Constructivism relies on a notion of unconstructed objects which is itself (at worst) contradictory or (at best) underexplained. How cutting a worry this is depends on what a particular version of Constructivism takes to be unconstructed. Kriegel’s Local Constructivism about composite objects, for instance, allows that all mereologically simple objects are unconstructed—such simples provide a rich building base for his constructivism. Similarly, Local Constructivists about artifacts claim that natural objects are unconstructed. They are, that is, Realists about all the objects Realists typically give as paradigms. This, too, provides a rich and uncontroversially non-contradictory building base for their constructed objects. Other views—such as Global Constructivism and Local Constructivism about modal objects—do face a difficulty regarding how to allow unconstructed entities to have enough structure that we can grasp what they are, without claiming they have so much structure that they become constructed entities. Wieland and Elder give voice to this common Realist complaint against Constructivism:

When it comes to [the question of what unconstructed entities are], those who are sympathetic to [Constructivism] are remarkably vague. … The problem [is that constructivists] want to reconcile our freedom of carving with serious, natural constraints. … [The] issue is about the elusive nature of non-perspectival facts in a world full of facts which do depend on our perspective. (Wieland 22)

[Constructivists] are generally quite willing to characterize the world as it exists independently of our exercise of our conceptual scheme. It is simply much stuff, proponents say, across which a play of properties occurs. … But just which properties is it that get instantiated in the world as it mind-independently exists? (Elder 14)

Global Constructivists are quite perplexing when they try to explain how they can construct in the absence of any unconstructed entities to which we have epistemic access. This is a central problem with Global Constructivism and one reason it lacks contemporary adherents. The situation is different with Local Constructivism. Local Constructivists are vocal about the fact that they endorse the existence of unconstructed entities to which we have epistemic access and that such entities play a crucial role in our constructing. (Baker, for example, notes that, “I do not hold that thought and talk alone bring into existence any physical objects … pre-existing materials are also required” (2007 46). Devitt argues that, “Neither designing something to hammer nor using it to hammer is sufficient to make it a hammer … only things of certain physical types could be [hammers]” (1997 248). Einheuser emphasizes that the application of our concepts to stuff is only object creating when our concepts are directed at independently existing stuff which has the right nonmodal properties (Einheuser 2011).) Local Constructivists—even those such as Sidelle who think unconstructed entities have no “deep” modal properties—can provide an account of unconstructed entities which is coherent. There are a variety of ways to do this. (See, for example, Sidelle (1989), Goswick (2015, 2018a, 2018b), Remhof (2014).) Rather than presenting any one of them, I offer a few general points which should enable the reader to see for herself that Local Constructivists about modal objects can provide a coherent view of unconstructed entities. The easiest way to see this is to note two things: (1) The Local Constructivist about modal objects does not think that every entity which has a modal property is constructed; they only think that objects which have “deep” modal properties are constructed.
So, for example, arguments such as the following will not work: Let F denote some property purportedly unconstructed entity e has. Every entity that is actually F is possibly F. So, e is possibly F. Thus, e has a modal property—which contradicts the Local Constructivists’ claim that unconstructed entities do not have modal properties. But, of course, Local Constructivists are happy for unconstructed objects to have a plethora of modal properties, so long as they are “shallow” modal properties. (A “deep” modal property, remember, is any constant de re necessity or de re possibility which is non-trivial. A “shallow” modal property is any modal property which is not “deep.”) (2) Most of us have no trouble understanding Quine when he defines objects as “the material content of a region of spacetime, however heterogeneous or gerrymandered” (Quine 171). But, of course, Quine rejected “deep” modality. The Local Constructivist about modal objects can simply point to Quine’s view and use Quine’s objects as their unconstructed entities. (See Blackson (1992) and Goswick (2018c).)

A final coherence criticism of Constructivism is the claim that Constructivism licenses the construction of incompatible objects, for example, society A constructs object o (which entails the non-existence of object o*), whilst society B constructs object o* (which entails the non-existence of object o). (Suppose, for example, that there are no coinciding objects, so at most one object occupies region r. Then, society A’s constructing a statue (at region r) rules out the existence of a mere-lump (at region r) and society B’s constructing a mere-lump (at region r) rules out the existence of a statue (at region r).) What, then, are we to say with regard to the existence of o and o*? Do both exist, neither, one but not the other? Boghossian puts the worry this way:

[How could] it be the case both that the world is flat (the fact constructed by pre-Aristotelian Greeks) and that it is round (the fact constructed by us)? [Constructivism faces] a problem about how we are to accommodate the possible simultaneous construction of logically incompatible facts. (Boghossian 39-40)

Different versions of Constructivism will have different responses to this worry, but every version is able to give a response that dissolves the worry. Relativists will say that o exists only relative to society A, whereas o* exists only relative to society B. Constructivists who are not relativists will pick some subject to privilege, for example, society A gets to do the constructing, so what they say goes—o exists and o* does not.

b. Substantive Criticisms

Now that Constructivism has been shown to satisfactorily respond to the coherence criticisms, let’s turn to presenting and evaluating the eight main substantive criticisms of Constructivism: (i) If Constructivism were true, then multiple systems of classification would be equally good, but they are not, (ii) Constructivism is under-motivated, (iii) Constructivism is incompatible with naturalism, (iv) Constructivism should be rejected outright because Realism is so obviously true, (v) Constructivism requires constitutive dependence, but really, insofar as objects do depend on us, they depend on us only causally, (vi) Constructivism is not appropriately constrained, (vii) Constructivism is crazy, and (viii) Constructivism conflicts with obvious empirical facts.

Consider, first, the criticism that if Constructivism were true, then multiple systems of classification would be equally good; but they are not, so Constructivism is not true. The main proponent of this criticism is Elder. He expresses the concern in the following way:

If there were something particularly … unobjective about sameness in natural kind, one might expect that we could prosper just as well as we do even if we wielded quite different sortals for nature’s kinds. (Elder 10)

The basic idea is that, as a matter of fact, dividing up the world into rocks and non-rocks works better for us than does dividing up the world into dry-rocks, wet-rocks, and non-rocks: the sortal rock is better than the alternative sortals dry-rock and wet-rock. Why is this? Elder’s explanation is that rock is a natural kind sortal which traces the existence of real objects. Dry-rock and wet-rock do not work as well as rock because there are rocks and there are not dry-rocks and wet-rocks. Since we cannot empirically distinguish between a rock that is (accidentally) dry and an (essentially dry) dry-rock or between a rock that is (accidentally) wet and an (essentially wet) wet-rock, Elder provides no empirical basis for his claim. The Constructivist will point out that she is not arguing that any set of constructed objects is as good as any other. It may very well be the case that rock works better for us than do dry-rock and wet-rock. The Constructivist attributes this to contingent facts about us (for example, our biology and social history) rather than to its being the case that Realism is true of rocks and false of dry-rocks and wet-rocks. Nothing Elder says blocks this way of describing the facts. Pending some argument showing that the only way (or, at least, the best way) we can explain the fact that rock works better for us than do dry-rock and wet-rock is if Realism is true of rocks, Elder has no argument against the Constructivist.

A second criticism one sometimes hears is that Constructivism is undermotivated. Global Constructivism is seen as an overly radical metaphysical response to minor semantic and epistemic problems with Realism. (See, for example, Devitt (1997) and Wieland (2012).) How good a criticism this is depends on how minor the semantic and epistemic problems with Realism are and how available a non-metaphysical solution to them is. This issue is not explored further here because this sort of criticism cannot be evaluated in general but must be looked at with regard to each individual view: Is Goodman’s Global Constructivism undermotivated? Is Sidelle’s Local Constructivism about modal objects undermotivated? Is Thomasson’s Local Constructivism about artifacts undermotivated? Whether the criticism is convincing will depend on how well each view does at showing that there is a real problem with Realism and that its own preferred way of resolving the problem is compelling. If Sidelle is really correct that the naturalist/empiricist stance most analytic philosophers embrace in the twenty-first century is incompatible with the existence of ordinary objects with “deep” modal properties, then we should be strongly motivated to seek a non-Realist account of ordinary objects. If Thomasson is really right that existence is easy and that some terms really are such that anything that satisfies them depends constitutively on humans, then we should be strongly motivated to seek a non-Realist account of the referents of such terms.

A third criticism one sometimes hears is that Constructivism is incompatible with the naturalized metaphysics which is in vogue. Most contemporary metaphysicians are heavily influenced by Lewisian naturalized metaphysics: they believe that there is an objective reality, that science has been fairly successful in examining this reality, that the target of metaphysical inquiry is this objective reality, and that our metaphysical theorizing should be in line with what our best science tells us about reality. If Constructivism really is incompatible with naturalized metaphysics, it will ipso facto be unattractive to most contemporary metaphysicians. However, although one frequently hears this criticism, upon closer examination it is seen to lack teeth. The crucial issue, with regard to compatibility with naturalistic metaphysics, is whether one’s view is adequately constrained by an independent, objective reality that is open to scientific investigation. All versions of Realism are so constrained, so Realism wears its compatibility with naturalistic metaphysics on its sleeve. Not all versions of Constructivism are so constrained; for example, Goodman’s and Putnam’s Global Constructivisms are not. But it would be overly hasty to throw out all of Constructivism simply because some versions of Constructivism are incompatible with naturalistic metaphysics. Some versions of Constructivism are more compatible with naturalized metaphysics than is Realism. Suppose Ladyman and Ross are correct when they say our best science shows there are no ordinary objects (2007). Suppose Einheuser is correct when she says our best science shows there are no objects with modal properties (2011). Suppose, however, that in daily human life we presuppose (as we seem to) the existence of ordinary objects with modal properties. Then, Local Constructivism about ordinary objects is motivated from within the perspective of naturalistic metaphysics. 
That is, one’s naturalism prevents one from being a Realist about ordinary objects, because all the subject-independent world contains is ontic structure (if Ladyman and Ross are correct) or nonmodal stuff (if Einheuser is correct). One’s desire to account for human behavior prevents one from being an Eliminativist about ordinary objects. A Constructivism which builds ordinary objects out of human responses to ontic structure or nonmodal stuff is the natural position to take. Although some versions of Constructivism (for example, Global Constructivism) may be incompatible with naturalistic metaphysics, there is no argument from naturalized metaphysics against Constructivism per se.

A fourth substantive criticism leveled against Constructivism is that it should be rejected outright because Realism is so obviously true:

A certain knee-jerk realism is an unargued presupposition of this book. (Sider 2011, 18)

Realism is much more firmly based than these speculations that are thought to undermine it. We have started the argument in the wrong place: rather than using the speculations as evidence against Realism, we should use Realism as evidence against the speculations. We should “put metaphysics first.” (Devitt 2010, 109)

[Which] organisms and other natural objects there are is entirely independent of our beliefs about the world. If indeed there are trees, this is not because we believe in trees or because we have experiences as of trees. (Korman 92)

For example, facts about mountains, dinosaurs or electrons seem not to be description-dependent. Why should we think otherwise? What mistake in our ordinary, naive realism about the world has the [Constructivist] uncovered? What positive reason is there to take such a prima facie counterintuitive view seriously. (Boghossian 28)

All that the Constructivist can say in response to this criticism (which is not an argument against Constructivism but rather a sharing of the various authors’ inclinations) is that she does not think Realism is so obviously true. She can, perhaps, motivate others to see it as less obviously true by casting the debate not as a global choice between Global Constructivism and Global Realism, but rather as a more local debate concerning the ontological status of, for example, tables, rocks, money, and dogs. We are no longer playing a global game; one can be an anti-Realist about, for example, money without thereby embracing global anti-Realism.

Another criticism of Constructivism is that Constructivism is only true if objects constitutively depend on us, but really, insofar as objects do depend on us, they depend on us only causally. As this article has defined “Constructivism,” it has room for both causal versions and constitutive versions. (Hacking (1999) and Goswick (2018b) present causal versions of Constructivism. Baker (2007) and Thomasson (2007) present constitutive versions of Constructivism.) One could, instead, define “Constructivism” more narrowly so that it only included constitutive accounts. This would be a mistake. Consider a (purported) causal version of Local Constructivism about modal objects: Jane is a Realist about nonmodal stuff and claims we have epistemic access to it. She thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. Jane does not think that rocks depend constitutively on us—it is not part of what it is to be a rock that we have to F in order for rocks to exist. But we do play a causal role in bringing about the existence of rocks. If there were some modal magic, then rocks could have existed without us (nothing about the nature of rocks bars this from being the case); but there is no modal magic, so all the rocks that exist do causally depend on us. Now consider a (purported) constitutive version of Local Constructivism about modal objects: James is a Realist about nonmodal stuff and claims we have epistemic access to it. He thinks that when we respond to rock-appropriate nonmodal stuff s with the rock-response we bring a new object into existence: a rock. James thinks that rocks depend constitutively on us—it is part of what it is to be a rock that we have to F in order for rocks to exist. Even if there were modal magic, rocks could not have existed without us. Do Jane and James’ views differ to the extent that one of them deserves the label “Constructivist” and the other does not? 
Their views are very similar: after all, they both take rocks to be composite objects which come to exist when we F in circumstances c, that is, they tell the same origin story for rocks. What they differ over is the nature of rocks: is their dependence on us constitutive of what it is to be a rock (as James says), or is it just a feature that all rocks in fact have (as Jane says)? Jane and James’ views are so similar (and the objections that will be leveled against them are so similar) that taking both to be versions of the same general view (that is, Constructivism) is more perspicuous than not so doing. More generally, causal constructivism is similar enough to constitutive constructivism that defining “constructivism” in such a way that it excludes the former would be a mistake.

A sixth substantive criticism of Constructivism is that it is not appropriately constrained.

Putnam does talk, in a Kantian way, of the noumenal world and of things-in-themselves [but] he seems ultimately to regard this talk as “nonsense” … This avoids the facile relativism of anything goes by fiat: we simply are constrained, and that’s that. … [But to] say that our construction is constrained by something beyond reach of knowledge or reference is whistling in the dark. (Devitt 1997, 230)

The worry here is that it is not enough just to say “our constructing is constrained”; what does the constraining and how it does so must be explained. Global Constructivists have fared very poorly with regard to this criticism. They (for example, Goodman, Putnam, Schwartz) certainly intend their views to be so constrained. What is less clear, however, is whether they are able to accomplish this aim. They provide no satisfactory account of how, given that we have no epistemic access to them, the unconstructed entities they endorse are able to constrain our constructing. This is a serious mark against Global Constructivism. Local Constructivists fare better in this regard. They place a high premium on our constructing being constrained by the (subject-independent) world and each Local Constructivist is able to explain what constrains constructing on her view and how it does so. Baker, for example, argues that all constructed objects stand in a constitution chain which eventuates in an unconstructed aggregate. These aggregates constrain which artifacts can be in their constitution chains, namely (i) an artifact with function f can only be constituted by an aggregate which contains enough items of suitable structure to enable the proper function of the artifact to be performed, and (ii) an artifact with function f can only be constituted by an aggregate which is such that the items in the aggregate are available for assembly in a way suitable for enabling the proper function of the artifact to be performed (Baker 2007, 53). For another example, consider Einheuser’s explanation of what constrains her Local Constructivism about modal objects: Every (constructed) modal object coincides with some (unconstructed) nonmodal stuff. A modal object of sort s (for example, a rock) can only exist at region r if the nonmodal stuff that occupies region r has the right nonmodal properties (Einheuser 2011). 
This ensures that, for example, we cannot construct a rock at a region that contains only air molecules.

A seventh substantive criticism of Constructivism is the claim that Constructivism is crazy. Consider:

We should not close our eyes to the fact that Constructivism is prima facie absurd, a truly bizarre doctrine. … How could dinosaurs and stars be dependent on the activities of our minds? It would be crazy to claim that there were no dinosaurs or stars before there were people to think about them. [The claim that] there would not have been dinosaurs or stars if there had not been people (or similar thinkers) seems essential to Constructivism: unless it were so, dinosaurs and stars could not be dependent on us and our minds. [So Constructivism is crazy.] (Devitt 2010, 105 and Devitt 1997, 238)

The idea that we in any way determine whether there are stars and what they are like seems so preposterous, if not incomprehensible, that any thesis that leads to this conclusion must be suspect. … And a forceful, “But people don’t make stars” is often thought to be the simplest way to bring proponents of such metaphysical foolishness back to their senses. For isn’t it obvious that … there were stars long before sentient beings crawled about and longer still before the concept star was thought of or explicitly formulated? (Schwartz 1986, 429 and 427)

The “but Constructivism is crazy” locution is not a specific argument but is rather an expression of the utterer’s belief that Constructivism has gone wrong in some serious way. Arguments lie behind the “Constructivism is crazy” utterance and the arguments, unlike the emotive outburst, can be defused. Behind Devitt’s “it’s crazy” utterance is the worry that Constructivism simply gets the existence conditions for natural objects wrong. It is just obvious that dinosaurs and stars existed before any people did and it follows from this that they must be unconstructed objects. There are two ways to respond to this objection: (1) Argue that even if humans construct dinosaurs and stars it can still be the case that dinosaurs and stars existed prior to the existence of humans. (For this approach, see Remhof, “If there had been no people there would still have been stars and dinosaurs; there would still have been things that would be constructed by humans were they around” (Remhof 2014, 3); Searle, “From the fact that a description can only be made relative to a set of linguistic categories, it does not follow that the objects described can only exist relative to a set of categories. … Once we have fixed the meaning of terms in our vocabulary by arbitrary definitions, it is no longer a matter of any kind of relativism or arbitrariness whether representation-independent features of the world that satisfy or fail to satisfy the definitions exist independently of those or any other definitions” (Searle 166); and Schwartz, “In the process of fashioning classificatory schemes and theoretical frameworks, we organize our world with a past, as well as a future, and provide for there being objects or states of affairs that predate us. Although these facts may be about distant earlier times, they are themselves retrospective facts, not readymade or built into the eternal order” (Schwartz 1986, 436).) (2) Bite the bullet. 
Agree that, if Constructivism is true, dinosaurs and stars did not exist before there were any people. Defuse the counter-intuitiveness of this claim by, for example, arguing that, although dinosaurs per se did not exist, entities that were very dinosaur-like did exist. For this approach, see Goswick (2018b):

The [Constructivist] attempts to mitigate this cost by pointing out that which ordinary object claims are false is systematic and explicable. In particular, we’ll get the existence and persistence conditions of ordinary objects wrong when we confuse the existence/persistence of an s-apt n-entity for the existence/persistence of an ordinary object of sort s. We think dinosaurs existed because we mistake the existence of dinosaur-apt n-entities for the existence of dinosaurs (Goswick 2018b, 58).

Behind Schwartz’s “Constructivism is crazy” utterance is the same worry Devitt has, namely, that Constructivism simply gets the existence conditions for natural objects wrong. It can be defused in the same way Devitt’s utterance was.

The final substantive criticism of Constructivism to be considered is the claim that Constructivism conflicts with obvious empirical facts.

It is sometimes said, for example, that were it not for the fact that we associated the word “star” with certain criteria of identity, there would be no stars. It seems to me that people who say such things are guilty of [violating well-established empirical facts]. Are we to swallow the claim that there were no stars around before humans arrived on the scene? Even the dimmest student of astronomy will tell you that this is non-sense. (Cortens 45)

This worry has largely been answered in responding to the previous criticism. However, Cortens makes one point beyond that which Devitt and Schwartz make, namely, that it is not just our intuitions that tell us stars existed before humans, but also our best science. Any naturalist who endorses Constructivism about stars will be skeptical that our best science really tells us this. Even the brightest student of astronomy is unlikely to make the distinctions metaphysicians make, for example, between a star and the atoms that compose it. Does the astronomy student really study whether there are stars or only atoms-arranged-starwise? If not, how can she be in a position to tell us whether there were stars before there were humans or whether there were only atoms-arranged-starwise? The distinction between stars and atoms-arranged-starwise is not an empirical one. In general, the issues Constructivists and Realists differ over are not ones that can be resolved empirically. Given this, it is implausible that Constructivism conflicts with obvious empirical facts. It would conflict with an obvious empirical fact (or, at least, with what our best science takes to be the history of our solar system) if, for example, Constructivists denied that there was anything star-like before there were humans. But Constructivists do not do this; rather, they replace the Realists’ pre-human stars with entities which are empirically indistinguishable from stars but which lack some of the metaphysical features (for example, being essentially F) they think an entity must have to be a star.

5. Evaluating Constructivism within Analytic Metaphysics

Having explicated what Constructivism within analytic metaphysics is and what the central criticisms of it are, let’s examine what, all things considered, should be made of Constructivism within analytic metaphysics.

Global Constructivism is no longer a live option within analytic metaphysics. Our understanding of Realism, and our ability to clearly state various versions of it, has expanded dramatically since the 1980s. Realists have found answers to the epistemic and semantic concerns which originally motivated Global Constructivism, so the view is no longer well motivated. (See, for example, Devitt (1997) and Devitt (2010).) Moreover, there are compelling objections to Global Constructivism. Two in particular stand out. First, how can we construct entities if we have no epistemic access to any unconstructed entities to construct them from? Second, what can constrain our constructing? Given that we have epistemic access only to the constructed, it appears that nothing unconstructed can do so.

Local Constructivism fares better for reasons both sociological and philosophical. Sociologically, Local Constructivism has not been around for long and, rather than being one view, it is a whole series of loosely connected views, so it has not yet drawn the sort of detailed criticism that squashed Global Constructivism. Additionally, being a Local Constructivist about x is compatible with being a Realist about y, z, a, b, … (all non-x entities). As such, it is not a global competitor to Realism and has not drawn the Realists’ ire in the way Global Constructivism did. Philosophically, Local Constructivism is also on firmer ground than was Global Constructivism. By endorsing unconstructed entities which we have epistemic access to and which constrain our constructing, Local Constructivists are able to side-step many of the central criticisms which plague Global Constructivism. Local Constructivism looks well poised to provide an intuitive middle ground between a naturalistic Realism (which often unacceptably alters either the existence or the nature of the ordinary objects we take ourselves to know and love) and an overly subjective anti-Realism (which fails to recognize the role the objective world plays in determining our experiences and the insights we can gain from science).

6. Timeline of Constructivism in Analytic Metaphysics

1781	Kant’s Critique of Pure Reason distinguishes between noumena and phenomena, thereby laying the groundwork for future work on constructivism
1907	James’ Pragmatism: A New Name for Some Old Ways of Thinking defends Global Constructivism
1978-1993	Goodman and Putnam publish a series of books and papers defending Global Constructivism
1986 and 2000	Schwartz defends Global Constructivism
1990	Heller defends an eliminativist view of vague objects; along the way, he shows how to be a constructivist about vague objects
1990s-2000s	Baker, Thomasson, Searle, and Devitt endorse Local Constructivism about artifacts
Post 1988	Sidelle, Einheuser, and Goswick argue that objects having “deep” modal properties are constructed
2008	Kriegel argues that composite objects are constructed
2011	Varzi argues that objects with conventional boundaries are constructed

7. References and Further Reading

a. Constructivism: General

Alward, Peter. (2014) “Butter Knives and Screwdrivers: An Intentionalist Defense of Radical Constructivism,” The Journal of Aesthetics and Art Criticism, 72(3): 247-260.
Boyd, R. (1992) “Constructivism, Realism, and Philosophical Method” in Inference, Explanation, and Other Frustrations: Essays in the Philosophy of Science (ed. Earman). Los Angeles: University of California Press: 131-198.
Bridges and Palmgren. (2018) “Constructive Mathematics” in The Stanford Encyclopedia of Philosophy.
Chakravartty, Anjan. (2017) “Scientific Realism” in The Stanford Encyclopedia of Philosophy.
Downes, Stephen. (1998) “Constructivism” in the Routledge Encyclopedia of Philosophy.
Feyerabend, Paul. (2010) Against Method. USA: Verso Publishing.
Foucault, Michel. (1970) The Order of Things. USA: Random House.
Hacking, Ian. (1986) “Making Up People,” in Reconstructing Individualism: Autonomy, Individuality, and the Self in Western Thought (eds. Heller, Sosna, Wellbery). Stanford: Stanford University Press, 222-236.
Hacking, Ian. (1992) “World Making by Kind Making: Child-Abuse for Example,” in How Classification Works: Nelson Goodman among the Social Sciences (eds. Douglas and Hull). Edinburgh: Edinburgh University Press, 180-238.
Hacking, Ian. (1999) The Social Construction of What? Cambridge: Harvard University Press.
Haslanger, Sally. (1995) “Ontology and Social Construction,” Philosophical Topics, 23(2): 95-125.
Haslanger, Sally. (2003) “Social Construction: The ‘Debunking’ Project,” Socializing Metaphysics: The Nature of Social Reality (ed. Schmitt). Lanham: Rowman & Littlefield Publishers, 301-326.
Haslanger, Sally. (2012) Resisting Reality: Social Construction and Social Critique, New York: Oxford University Press.
Jezzi, Nathaniel. (2019) “Constructivism in Metaethics,” Internet Encyclopedia of Philosophy. https://iep.utm.edu/con-ethi/
Kuhn, Thomas. (1996) The Structure of Scientific Revolutions. Chicago: Chicago University Press.
Mallon, Ron. (2019) “Naturalistic Approaches to Social Construction” in the Stanford Encyclopedia of Philosophy.
Rawls, John. (1980) “Kantian Constructivism in Moral Theory,” Journal of Philosophy, 77: 515-572.
Remhof, J. (2017) “Defending Nietzsche’s Constructivism about Objects,” European Journal of Philosophy, 25(4): 1132-1158.
Street, Sharon. (2008) “Constructivism about Reasons,” Oxford Studies in Metaethics, 3: 207-245.
Street, Sharon. (2010) “What Is Constructivism in Ethics and Metaethics?” Philosophy Compass, 5(5): 363-384.
Werner, Konrad. (2015) “Towards a PL-Metaphysics of Perception: In Search of the Metaphysical Roots of Constructivism,” Constructivist Foundations, 11(1): 148-157.

b. Constructivism: Analytic Metaphysics

Baker, Lynne Rudder. (2004) “The Ontology of Artifacts,” Philosophical Explorations, 7: 99-111.
Baker, Lynne Rudder. (2007) The Metaphysics of Everyday Life: An Essay in Practical Realism. USA: Cambridge University Press.
Bennett, Karen. (2017) Making Things Up. Oxford: Oxford University Press.
Dummett, Michael. (1993) Frege: Philosophy of Language. Cambridge: Harvard University Press.
Einheuser, Iris. (2011) “Towards a Conceptualist Solution to the Grounding Problem,” Nous, 45(2): 300-314.
Evnine, Simon. (2016) Making Objects and Events: A Hylomorphic Theory of Artifacts, Actions, and Organisms. Oxford: Oxford University Press.
Goodman, Nelson. (1978) Ways of Worldmaking. USA: Hackett Publishing Company.
Goodman, Nelson. (1980) “On Starmaking,” Synthese, 45(2): 211-215.
Goodman, Nelson. (1983) “Notes on the Well-Made World,” Erkenntnis, 19: 99-108.
Goodman, Nelson. (1993) “On Some Worldly Worries,” Synthese, 95(1): 9-12.
Goswick, Dana. (2015) “Why Being Necessary Really Isn’t the Same As Being Not Possibly Not,” Acta Analytica, 30(3): 267-274.
Goswick, Dana. (2018a) “A New Route to Avoiding Primitive Modal Facts,” Brute Facts (eds. Vintiadis and Mekios). Oxford: OUP, 97-112.
Goswick, Dana. (2018b) “The Hard Question for Hylomorphism,” Metaphysics, 1(1): 52-62.
Goswick, Dana. (2018c) “Ordinary Objects Are Nonmodal Objects,” Analysis and Metaphysics, 17: 22-37.
Goswick, Dana. (2019) “A Devitt-Proof Constructivism,” Analysis and Metaphysics, 18: 17-24.
Hale and Wright. (2017) “Putnam’s Model-Theoretic Argument Against Metaphysical Realism” in A Companion to the Philosophy of Language (eds. Hale, Wright, and Miller). USA: Wiley-Blackwell, 703-733.
Heller, Mark. (1990) The Ontology of Physical Objects. Cambridge: CUP.
Irmak. (2019) “An Ontology of Words,” Erkenntnis, 84: 1139-1158.
James, William. (1907) Pragmatism: A New Name for Some Old Ways of Thinking. New York: Longmans Green Publishing (especially lectures 6 and 7).
James, William. (1909) The Meaning of Truth: A Sequel to Pragmatism. New York: Longmans Green Publishing.
Kant, Immanuel. (1965) The Critique of Pure Reason. London: St. Martin’s Press.
Kitcher, Philip. (2001) “The World As We Make It” in Science, Truth and Democracy. Oxford: Oxford University Press, ch. 4.
Korman. (2019) “The Metaphysics of Establishments,” The Australasian Journal of Philosophy, DOI: 10.1080/00048402.2019.1622140.
Kriegel, Uriah. (2008) “Composition as a Secondary Quality,” Pacific Philosophical Quarterly, 89: 359-383.
Ladyman, James and Ross, Don. (2007) Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
Levinson. (1980) “What a Musical Work Is,” The Journal of Philosophy, 77(1): 5-28.
McCormick, Peter. (1996) Starmaking: Realism, Anti-Realism, and Irrealism. Cambridge: MIT Press.
Putnam, Hilary. (1979) “Reflections on Goodman’s Ways of Worldmaking,” Journal of Philosophy, 76: 603-618.
Putnam, Hilary. (1981) Reason, Truth, and History. Cambridge: Cambridge University Press.
Putnam, Hilary. (1982) “Why There Isn’t a Ready-Made World,” Synthese, 51: 141-168.
Putnam, Hilary. (1987) The Many Faces of Realism. LaSalle: Open Court Publishing.
Quine, W.V.O. (1960) Word and Object. Cambridge: MIT Press.
Remhof, J. (2014) “Object Constructivism and Unconstructed Objects,” Southwest Philosophy Review, 30(1): 177-186.
Rorty, Richard. (1972) “The World Well Lost,” The Journal of Philosophy, 69(19): 649-665.
Schwartz, Robert. (1986) “I’m Going to Make You a Star,” Midwest Studies in Philosophy, 11: 427-438.
Schwartz, Robert. (2000) “Starting from Scratch: Making Worlds,” Erkenntnis, 52: 151-159.
Searle, John. (1995) The Construction of Social Reality. USA: Free Press.
Sidelle, Alan. (1989) Necessity, Essence, and Individuation. London: Cornell University Press.
Thomasson, Amie. (1999) Fiction and Metaphysics. Cambridge: Cambridge University Press.
Thomasson, Amie. (2003) “Realism and Human Kinds,” Philosophy and Phenomenological Research, 67(3): 580-609.
Thomasson, Amie. (2007) Ordinary Objects. Oxford: OUP.
Varzi, Achille. (2011) “Boundaries, Conventions, and Realism” in Carving Nature at Its Joints (eds. Campbell et al.). Cambridge: MIT Press, 129-153.

c. Critics of Analytic Metaphysical Constructivism

Blackson, Thomas. (1992) “The Stuff of Conventionalism,” Philosophical Studies, 68(1): 65-81.
Boghossian, Paul. (2006) Fear of Knowledge: Against Relativism and Constructivism. New York: Oxford University Press.
Cortens, Andrew. (2002) “Dividing the World Into Objects” in Realism and Antirealism. (ed. Alston). Ithaca: Cornell University Press.
Davidson, Donald. (1974) “On the Very Idea of a Conceptual Scheme,” Proceedings and Addresses of the American Philosophical Association, 47: 5-20.
Devitt, Michael. (1997) Realism and Truth. Princeton: Princeton University Press.
Devitt, Michael. (2010) Putting Metaphysics First: Essays on Metaphysics and Epistemology. Oxford: Oxford University Press.
Elder, Crawford. (2011) “Carving Up a Reality in Which There Are No Joints” in A Companion to Relativism (ed. Hales). London: Blackwell, 604-620.
Korman, Daniel. (2016) Objects: Nothing Out of the Ordinary. Oxford: Oxford University Press.
Scheffler, Israel. (1980) “The Wonderful Worlds of Goodman,” Synthese, 45(2): 201-209.
Sider, Ted. (2011) Writing the Book of the World. Oxford: Oxford University Press.
Wieland, Jan. (2012) “Carving the World as We Please,” Philosophica, 84: 7-24.

Author Information

Dana Goswick
Email: dgoswick@unimelb.edu.au
University of Melbourne
Australia

Precautionary Principles

The basic idea underlying a precautionary principle (PP) is often summarized as “better safe than sorry.” Even if it is uncertain whether an activity will lead to harm, for example, to the environment or to human health, measures should be taken to prevent harm. This demand is partly motivated by the consequences of regulatory practices of the past. Often, chances of harm were disregarded because there was no scientific proof of a causal connection between an activity or substance and chances of harm, for example, between asbestos and lung diseases. When this connection was finally established, it was often too late to prevent severe damage.

However, it is highly controversial how the vague intuition behind “better safe than sorry” should be understood as a principle. As a consequence, we find a multitude of interpretations ranging from decision rules over epistemic principles to procedural frameworks. To acknowledge this diversity, it makes sense to speak of precautionary principles (PPs) in the plural. PPs are not without critics. For example, it has been argued that they are paralyzing, unscientific, or promote a culture of irrational fear.

This article systematizes the different interpretations of PPs according to their functions, gives an overview of the main lines of argument in favor of PPs, and outlines the most frequent and important objections made to them.

1. The Idea of Precaution and Precautionary Principles
2. Interpretations of Precautionary Principles
3. Justifications for Precautionary Principles
a. Practical Rationality
b. Moral Justifications for Precaution
4. Main Objections and Possible Rejoinders
5. References and Further Reading

1. The Idea of Precaution and Precautionary Principles

We can identify three main motivations behind the postulation of a PP. First, it stems from a deep dissatisfaction with how decisions were made in the past: Often, early warnings have been disregarded, leading to significant damage which could have been avoided by timely precautionary action (Harremoës and others 2001). This motivation for a PP rests on some sort of “inductive evidence” that we should reform (or maybe even replace) our current practices of risk regulation, demanding that uncertainty must not be a reason for inaction (John 2007).

Second, it expresses specific moral concerns, usually pertaining to the environment, human health, and/or future generations. This second motivation is often related to the call for sustainability and sustainable development: important resources should not be destroyed for short-term gains, and future generations should be left with an intact environment.

Third, PPs are discussed as principles of rational choice under conditions of uncertainty and/or ignorance. Typically, rational decision theory is well suited for situations where we know the possible outcomes of our actions and can assign probabilities to them (a situation of “risk” in the decision-theoretic sense). However, the situation is different for decision-theoretic uncertainty (where we know the possible outcomes, but cannot assign any, or at least no meaningful and precise, probabilities to them) or decision-theoretic ignorance (where we do not know the complete set of possible outcomes). Although there are several suggestions for decision rules under these circumstances, it is far from clear what is the most rational way to decide when we are lacking important information and the stakes are high. PPs are one proposal to fill this gap.
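
The contrast between decision-theoretic risk and uncertainty can be made concrete with a small sketch. The actions, states, utilities, and probabilities below are hypothetical numbers chosen purely for illustration, and maximin is used only as one familiar example of a decision rule for uncertainty; it is not presented here as the content of any particular PP.

```python
# Hypothetical utilities of two actions in three possible states of the world.
# All numbers are illustrative assumptions, not data from any study.
UTILITIES = {
    "permit_activity": {"no_harm": 10, "minor_harm": 4, "severe_harm": -100},
    "restrict_activity": {"no_harm": 2, "minor_harm": 2, "severe_harm": 0},
}


def expected_utility_choice(utilities, probabilities):
    """Decision under risk: choose the action with the highest expected utility."""
    def expected(action):
        return sum(probabilities[state] * value
                   for state, value in utilities[action].items())
    return max(utilities, key=expected)


def maximin_choice(utilities):
    """Decision under uncertainty: choose the action whose worst outcome is best."""
    return max(utilities, key=lambda action: min(utilities[action].values()))


# Risk: the probabilities are (assumed to be) known, so expected utility applies.
probabilities = {"no_harm": 0.95, "minor_harm": 0.04, "severe_harm": 0.01}
print(expected_utility_choice(UTILITIES, probabilities))

# Uncertainty: the outcomes are known but the probabilities are not, so a rule
# such as maximin compares only the worst case of each action.
print(maximin_choice(UTILITIES))
```

With these assumed numbers the two rules disagree: expected utility favors permitting the activity, while maximin favors restricting it. Decision-theoretic ignorance cannot be represented in this way at all, since the table of states would itself be incomplete.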

Although they are often asserted individually, these motivations also complement each other: If, as following from the first motivation, uncertainty is not allowed to be a reason for inaction, then we need some guidance for how to decide under such circumstances, for example, in the form of a decision principle. And in many cases, it is the second motivation—concerns for the environment or human health—which makes the demand for precautionary action before obtaining scientific certainty especially pressing.

Many existing official documents cite the demand for precaution. One often-quoted example for a PP is principle 15 of the Rio Declaration on Environment and Development, a result of the United Nations Conference on Environment and Development (UNCED) in 1992. It refers to a “precautionary approach”:

Rio PP—In order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation. (United Nations Conference on Environment and Development 1992, Principle 15)

Another prominent example is the formulation that resulted from the Wingspread Conference on the Precautionary Principle in 1998, where around 35 scientists, lawyers, policy makers and environmentalists from the United States, Canada and Europe met to define a PP:

Wingspread PP—When an activity raises threats of harm to human health or the environment, precautionary measures should be taken even if some cause and effect relationships are not fully established scientifically. In this context the proponent of an activity, rather than the public, should bear the burden of proof. The process of applying the precautionary principle must be open, informed and democratic and must include potentially affected parties. It must also involve an examination of the full range of alternatives, including no action. (Science & Environmental Health Network (SEHN) 1998)

Both formulations are often cited as paradigmatic examples of PPs. Although they both mention uncertain threats and measures to prevent them, they also differ in important points, for example their strength: The Rio PP makes a weaker claim, stating that uncertainty is not a reason for inaction, whereas the Wingspread PP puts more emphasis on the fact that measures should be taken. They both give rise to a variety of questions: What counts as “serious or irreversible damage”? What does “(lack of) scientific certainty” mean? How plausible does a threat have to be in order to warrant precaution? What counts as precautionary measures? Additionally, PPs face many criticisms, like being too vague to be action-guiding, paralyzing the decision-process, or being anti-scientific and promoting a culture of irrational fear.

Thus, inspired by these regulatory principles in official documents, a lively debate has developed around how PPs should be interpreted in order to arrive at a version applicable in practical decision-making. This resulted in a multitude of PP proposals that are formulated and defended (or criticized) in different theoretical and practical contexts. Most of the existing PP formulations share the elements of uncertainty, harm, and (precautionary) action. Different ways of spelling out these elements result in different PPs (Sandin 1999, Manson 2002). For example, they can vary in how serious a harm has to be in order to trigger precaution, or which amount of evidence is needed. Additionally, PP interpretations differ with respect to the function they are intended to fulfill. They are typically classified based on some combination of the following categories according to their function (Sandin 2007, 2009; Munthe 2011; Steel 2014):

Action-guiding principles tell us which course of action to choose given certain circumstances;
(sets of) epistemic principles tell us what we should reasonably believe under conditions of uncertainty;
procedural principles express requirements for decision-making, and tell us how we should choose a course of action.

These categories can overlap, for example, when action- or decision-guiding principles come with at least some indication for how they should be applied. Some interpretations explicitly aim at integrating the different functions, and warrant their own category:

Integrated PP interpretations: Approaches that integrate action-guiding, epistemic, and procedural elements associated with PPs. Consequently, they tell us which course of action should be chosen through which procedure, and on what epistemic base.

This article starts in Section 2 with an overview of different PP interpretations according to this functional categorization. Section 3 describes the main lines of arguments that have been presented in favor of PPs, and Section 4 presents the most frequent and most important objections that PPs face, along with possible rejoinders.

2. Interpretations of Precautionary Principles

a. Action-Guiding Interpretations

Action-guiding PPs are often seen on a par with decision rules from rational decision theory. On the one hand, authors formalize PPs by using decision rules already established in decision theory, like maximin. On the other hand, they formulate new principles. While not necessarily located within the framework of decision theory, those are intended to work at the same level. Understood as principles of risk management, they are supposed to help to determine a course of action given our knowledge and our values.

i. Decision Rules

The terms used for decision-theoretic categories of non-certainty differ. In this article, they are used as follows: Decision-theoretic risk denotes situations in which we know the possible outcomes of actions and can assign probabilities to them. Decision-theoretic uncertainty refers to situations in which we know the possible outcomes, but either no or only partial or imprecise probability information is available (Hansson 2005a, 27). When we don’t even know the full set of possible outcomes, we have a situation of decision-theoretic ignorance. When formulated as decision rules, the “(scientific) uncertainty” component of PPs is often spelled out as decision-theoretic uncertainty.

Maximin
The idea to operationalize a PP with the maximin decision rule occurred early within the debate and is therefore often associated with PPs (for example, Hansson 1997; Sunstein 2005b; Gardiner 2006; Aldred 2013).

In order to be able to apply the maximin rule, we have to know the possible outcomes of our actions and be able to at least rank them on an ordinal scale (meaning that for each outcome, we can tell whether it is better than, worse than, or as good as each other possible outcome). It then tells us to select the option with the best worst case in order to “maximize the minimum”. Thus, the maximin rule seems like a promising candidate for a PP. It pays special attention to the prevention of threats, and is applicable under conditions of uncertainty. However, as has repeatedly been pointed out, maximin is not a plausible rule of choice in general. Consider the decision matrix in Table 1.

	Scenario₁	Scenario₂
Alternative₁	7	6
Alternative₂	15	5

Table 1: Simplified Decision-Matrix with Two Alternative Courses of Action.

Maximin selects Alternative₁. This seems excessively risk-averse because the best case in Alternative₂ is much better, and the worst case is only slightly worse, as long as we assume (a) that the utilities in this example are cardinal utilities, and (b) that there is not some kind of relevant threshold passed. If we knew that the probability for Scenario₁ is 0.99 and the probability for Scenario₂ only 0.01, then it would arguably be absurd to apply maximin. Proponents of interpreting a PP with maximin thus have stressed that it needs to be qualified by some additional criteria in order to provide a plausible PP interpretation.
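The contrast can be made concrete with a minimal sketch in Python. It assumes the utilities of Table 1 are cardinal and uses the probabilities 0.99 and 0.01 mentioned above; the function names `maximin` and `expected_utility` are illustrative choices, not taken from the literature.

```python
# Utilities from Table 1; rows are alternatives, columns are scenarios.
table_1 = {
    "Alternative 1": [7, 6],
    "Alternative 2": [15, 5],
}

def maximin(matrix):
    """Return the option whose worst-case utility is highest."""
    return max(matrix, key=lambda option: min(matrix[option]))

def expected_utility(matrix, probabilities):
    """Return each option's probability-weighted utility."""
    return {
        option: sum(p * u for p, u in zip(probabilities, utilities))
        for option, utilities in matrix.items()
    }

# Maximin ignores probabilities and looks only at the worst cases (6 versus 5).
maximin_choice = maximin(table_1)

# With the probabilities from the text, the expected utilities are roughly 6.99 and 14.9.
expected = expected_utility(table_1, [0.99, 0.01])
expected_choice = max(expected, key=expected.get)

print(maximin_choice)   # Alternative 1
print(expected_choice)  # Alternative 2
```

Under these assumptions the two rules disagree, which is why unqualified maximin looks implausible once reliable probability information is available.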

The most prominent example is Gardiner (2006), who draws on criteria suggested by Rawls to determine conditions under which the application of maximin is plausible:

(1) Knowledge of likelihoods for the possible outcomes of the actions is impossible or at best extremely insecure;
(2) the decision-makers care relatively little for potential gains that might be made above the minimum that can be guaranteed by the maximin approach;
(3) the alternatives that will be rejected by maximin have unacceptable outcomes; and
(4) the outcomes considered are in some adequate sense “realistic”, that is, only credible threats should be considered.

Condition (3) makes it clear that the guaranteed minimum (condition 2) needs to be acceptable to the decision-makers (see also Rawls 2001, 98). What it means that ‘gains above the guaranteed minimum are relatively little cared for’ (condition 2) has been spelled out by Aldred (2013) in terms of incommensurability between outcome values, that is, that some outcomes are so bad that they cannot be outweighed by potential gains. It is thus better to choose an option that promises only small gains but guarantees that the extremely bad outcome can’t materialize.

Gardiner argues that a maximin rule that is qualified by these criteria fits well with some core cases where we agree that precaution is necessary and calls it the “Rawlsian Core Precautionary Principle (RCPP)”. He names the purchase of insurance as an everyday-example where his RCPP fits well with our intuitive judgments and where precaution seems already justified on its own. According to Gardiner, it also fits well with often-named paradigmatic cases for precaution like climate change: The controversy whether or not we should take precautions in the climate case is not a debate around the right interpretation of the RCPP but rather about whether the conditions for its application are fulfilled—for example, which outcomes are unacceptable (Gardiner 2006, 56).

Minimax Regret
Another decision rule that is discussed in the context of PPs is the minimax regret rule. Whereas maximin selects the course of action with the best worst case, minimax regret selects the course of action with the lowest maximal regret. The regret of an outcome is calculated by subtracting its utility from the highest utility one could have achieved under the same state by selecting any of the available courses of action. This strategy tries to minimize one’s regret for not having made the superior choice in hindsight. Like the maximin rule, the minimax regret rule does not presuppose any probability information. However, while for the maximin rule it is enough if outcomes can be ranked on an ordinal scale, the minimax regret rule requires that we are able to assign cardinal utilities to the possible outcomes. Otherwise, regret cannot be calculated.

Take the following example from Hansson (1997), in which a lake seems to be dying for reasons that we do not fully understand: “We can choose between adding substantial amounts of iron acetate, and doing nothing. There are three scientific opinions about the effects of adding iron acetate to the lake. According to opinion (1), the lake will be saved if iron acetate is added, otherwise not. According to opinion (2), the lake will self-repair anyhow, and the addition of iron acetate makes no difference. According to opinion (3), the lake will die whether iron acetate is added or not.” The consensus is that the addition of iron acetate will have certain negative effects on land animals that drink water from the lake, but that effect is less serious than the death of the lake. Assigning the value -12 to the death of the lake and -5 to the negative effects of iron acetate in the drinking water, we arrive at the utility matrix in Table 2.

	(1)	(2)	(3)
Add iron acetate	-5	-5	-17
Do nothing	-12	0	-12

Table 2: Utility-Matrix for the Dying-Lake Case

We can then obtain the regret table by subtracting the utility of each outcome from the highest utility in each column, the result being Table 3. Minimax regret then selects the option to add iron acetate to the lake.

	(1)	(2)	(3)
Add iron acetate	0	5	5
Do nothing	7	0	0

Table 3: Regret-Matrix for the Dying-Lake Case

Chisholm and Clarke (1993) strongly support the minimax regret rule. They argue that it is better suited for a PP than maximin, since it gives some weight to foregone benefits. They also show that even if it is uncertain whether precautionary measures will be effective, minimax regret still recommends them as long as the expected damage from not implementing them is large enough. They advocate so-called “dual purpose” policies, where precautionary measures have other positive effects, even if they do not fulfill their main purpose. One example is measures that are aimed at abating global climate change, but at the same time have direct positive effects on local environmental problems. By contrast, Hansson (1997) argues that to take precautions means to avoid bad outcomes, and especially to avoid worst cases. Consequently, he defends maximin and not minimax regret as the adequate PP interpretation. Maximin would, as Table 2 shows, recommend not adding iron acetate to the lake. According to Hansson, this is the precautionary choice as adding iron acetate could lead to a worse outcome than not adding it.
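The disagreement between the two rules in the dying-lake case can be reproduced with a short Python sketch. The utilities are derived from the values stated in the text (-12 for the death of the lake, -5 for the side effects of iron acetate), on the assumption that the two harms simply add up when both occur; the function names are illustrative.

```python
LAKE_DEATH = -12    # value the text assigns to the death of the lake
SIDE_EFFECTS = -5   # value the text assigns to iron acetate in the drinking water

# Columns correspond to scientific opinions (1), (2), and (3).
utilities = {
    "Add iron acetate": [SIDE_EFFECTS, SIDE_EFFECTS, LAKE_DEATH + SIDE_EFFECTS],
    "Do nothing": [LAKE_DEATH, 0, LAKE_DEATH],
}

def regret_matrix(matrix):
    """Regret = best utility achievable in a column minus the utility obtained."""
    columns = list(zip(*matrix.values()))
    best = [max(column) for column in columns]
    return {
        option: [b - u for b, u in zip(best, row)]
        for option, row in matrix.items()
    }

def minimax_regret(matrix):
    """Return the option whose largest regret is smallest."""
    regrets = regret_matrix(matrix)
    return min(regrets, key=lambda option: max(regrets[option]))

def maximin(matrix):
    """Return the option whose worst-case utility is highest."""
    return max(matrix, key=lambda option: min(matrix[option]))

regrets = regret_matrix(utilities)
print(regrets)                    # {'Add iron acetate': [0, 5, 5], 'Do nothing': [7, 0, 0]}
print(minimax_regret(utilities))  # Add iron acetate
print(maximin(utilities))         # Do nothing
```

The maximal regrets are 5 for adding iron acetate and 7 for doing nothing, whereas the worst cases are -17 and -12 respectively, so the two rules recommend opposite courses of action.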

ii. Context-Sensitive Principles

Other interpretations of PPs as action-guiding principles differ from stand-alone if-this-then-that decision rules. They stress that principles have to be interpreted and concretized depending on the specific context (Fisher 2002; Randall 2011).

A Virtue Principle
Sandin (2009) argues that one can reinterpret a PP as an action-guiding principle not by reference to decision theory, but by using cautiousness as a virtue. He formulates an action-guiding virtue principle of precaution (VPP):

VPP—Perform those, and only those, actions that a cautious agent would perform in the circumstances. (Sandin 2009, 98)

Although virtue principles are commonly criticized as not being action-guiding, Sandin argues that understanding a PP in this way actually makes it more action-guiding. “Cautious” is interpreted as a virtue term that refers to a property of an agent, like “courageous” or “honest”. Sandin states that it is often possible to identify what the virtuous agent would do: Either because it is obvious, or because at least some agreement can be reached. Even the uncertain cases the VPP deals with belong to classes of situations with which we have experience, for example, failed regulations of the past, and we can therefore assess what the cautious agent would (not) have done and extrapolate from that to other cases (Sandin 2009, 99). According to Sandin, interpreting a PP as a virtue principle will avoid the objections of both extremism and paralysis. It is unlikely that the virtuous agent will choose courses of action which will, in the long run, have overall negative effects or are self-refuting (like “ban activity a and do not ban activity a!”). However, even if one accepts that it makes sense to interpret “cautious” as a virtue, “the circumstances” under which one should choose the course of action that the cautious agent would choose are not specified in the VPP as it is formulated by Sandin. This makes it an incomplete proposal.

Reasonableness and Plausibility
Another important example is the PP interpretation by Resnik (2003, 2004), who defends a PP as an alternative to maximin and other strategies for decision-making in situations where we lack the type of empirical evidence that one would need for a risk management that uses probabilities obtained from risk assessment. His PP interpretation, which we can call the “reasonable measures precautionary principle” (RMPP), reads as follows:

RMPP—One should take reasonable measures to prevent or mitigate threats that are plausible and serious.

The seriousness of a threat relates to its potential for harm, as well as to whether the possible damage is seen as reversible (Resnik 2004, 289). Resnik emphasizes that reasonableness is a highly pragmatic and situation-specific concept. He names some neither exhaustive nor necessary criteria for reasonable responses: They should be effective, proportional to the nature of the threat, take a realistic attitude toward the threat, be cost-effective, and be applied consistently (Resnik 2003, 341–42). Lastly, that threats have to be plausible means that there have to be scientific arguments for the plausibility of a hypothesis. These can be based on epistemic and/or pragmatic criteria, including for example coherence, explanatory power, analogy, precedence, precision, or simplicity. Resnik stresses that a threat being plausible is not the same as a threat being even minimally probable: We might accept threats as plausible that we think to be all but impossible to come to fruition (Resnik 2003, 341).

This shows that the question when a threat should count as plausible enough to warrant precautionary measures is very important for the application of an action-guiding PP. Consequently, such PPs are often very sensitive to how a problem is framed. Some authors took these aspects—the weighing of evidence and the description of the decision problem—to be central points of PPs, and interpreted them as epistemic principles, that is, principles at the level of risk assessment.

b. Epistemic Interpretations

Authors that defend an epistemic PP interpretation argue that we should accept that PPs are not principles that can guide our actions, but that this is neither a problem nor against their spirit. Instead of telling us how to act when facing uncertain threats of harm, they propose that PPs tell us something about how we should perceive these threats, and what we should take as a basis for our actions, for example, by relaxing the standard for the amount of evidence required to take action.

i. Standards of Evidence

One interpretation of an epistemic PP is to give more weight to evidence suggesting a causal link between an activity and threats of serious and irreversible harm than one gives to evidence suggesting less dangerous, or beneficial, effects. This could mean to assign a higher probability for an effect to occur than one would in other circumstances based on the same evidence. Arguably, the underlying idea of this PP can be traced back to the German philosopher Hans Jonas, who proposed a “heuristic of fear”, that is, to give more weight to pessimistic forecasts than to optimistic ones (Jonas 2003). However, this PP interpretation has been criticized on the basis that it systematically discounts evidence pointing in one direction, but not in the other. This could lead to distorted beliefs about the world in the long run, being detrimental to our epistemic and scientific progress and eventually doing more harm than good (Harris and Holm 2002).

However, other authors point out that we might have to distinguish between “regulatory science” and “normal science”. Different epistemic standards are appropriate for the two contexts since they have different aims: In normal science, we are searching for truth; in regulatory science, we are primarily interested in reducing risk and avoiding harm (John 2010). Accordingly, Peterson (2007a) refers in his epistemic PP interpretation only to decision makers—not scientists—who find themselves in situations involving risk or uncertainty. He argues that in such cases, decision-makers should strive to acquire beliefs that are likely to protect human health, and that it is less important whether they are also likely to be true. One principle that has been promoted in order to capture this idea is the preference for false positives, that is, for type I errors over type II errors.

ii. Type I and Type II Errors

Is it worse to falsely assert that there is a relationship between two classes of events, which does not exist (false positives), or to fail to assert such a relationship, when it in fact exists (false negatives)? For example, would you prefer antivirus software on your computer that classifies a harmless program as a virus (false positive) or rather one that misses a malicious program (false negative)? Statistical hypothesis testing tests the so-called null hypothesis, which is the default view that there is no relationship between two classes of events, or groups. Rejecting a true null hypothesis is called a type I error, whereas failing to reject a false null hypothesis is a type II error. Which type of possible error should we try to minimize, if we cannot minimize both at once?

In (normal) science, it is valued higher not to include false assertions into the body of knowledge, which would distort it in the long term. Thus, the default assumption—the null hypothesis—is that there is no connection between two classes of events, and typically statistical procedures are used that minimize type I errors (false positives) even if this might mean that an existing connection is missed (at least at first, or for a long time) (John 2010). To believe that a certain existing deterministic or probabilistic connection between two classes of events does not exist might slow down the scientific progress in normal science aiming at truth. However, in regulatory contexts it might be disastrous to believe falsely that a substance is safe when it is not. Consequently, a prominent interpretation of an epistemic PP takes it to entail a preference for type I errors over type II errors in regulatory contexts (see for example Lemons, Shrader-Frechette, and Cranor 1997; Peterson 2007a; John 2010).
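The trade-off between the two error types can be illustrated with a small Python sketch. The evidence scores and thresholds below are invented purely for illustration and do not come from any of the cited studies; a substance is treated as harmful whenever its score reaches the chosen threshold.

```python
# Hypothetical (evidence_score, truly_harmful) pairs, invented for illustration.
substances = [
    (0.10, False), (0.25, False), (0.40, False), (0.55, False),
    (0.35, True), (0.50, True), (0.70, True), (0.90, True),
]

def error_counts(data, threshold):
    """Count type I errors (false positives) and type II errors (false negatives)."""
    false_positives = sum(
        1 for score, harmful in data if score >= threshold and not harmful
    )
    false_negatives = sum(
        1 for score, harmful in data if score < threshold and harmful
    )
    return false_positives, false_negatives

# A demanding standard of evidence avoids false alarms but misses real harms.
strict = error_counts(substances, 0.60)

# A relaxed, precautionary standard catches every harm at the price of false alarms.
precautionary = error_counts(substances, 0.30)

print(strict)         # (0, 2)
print(precautionary)  # (2, 0)
```

On this toy data neither threshold eliminates both kinds of error, so choosing a threshold amounts to choosing which error to tolerate. An epistemic PP of the kind described above would favor the lower threshold in regulatory contexts.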

Merely favoring one type of error over another might not be enough. It has been argued that the underlying methodology of either rejecting or accepting hypotheses does not sufficiently allow for identifying and tracking uncertainties. If a PP is understood as a principle that relaxes the standard for the amount of evidence required to take action, then a new epistemology might be needed: One that allows integrating the uncertainty about the causal connection between, for example, a drug and a harm, in the decision (Osimani 2013).

iii. Precautionary Defaults

The use of precautionary regulatory defaults is one proposal for how to deal with having to make regulatory decisions in the face of insufficient information (Sandin and Hansson 2002; Sandin, Bengtsson, and others 2004). In regulatory contexts, there are often situations in which a decision has to be made on how to treat a potentially harmful substance that also has some (potential) benefits. Unlike in normal science, it is not possible to wait and collect further evidence before a verdict is made. The substance has to be treated one way or another while waiting for further evidence. Thus, it has been suggested that we should use regulatory defaults, that is, assumptions that are used in the absence of adequate information and that should be replaced once such information is obtained. They should be precautionary defaults, building in special margins of safety in order to make sure that the environment and human health get sufficient protection. One example is the use of uncertainty factors in toxicology. Such uncertainty factors play a role in estimating reference doses which are acceptable for humans by dividing a level of exposure found acceptable in animal experiments by a number (usually 100) (Steel 2011, 356). This takes into account that there are significant uncertainties, for example, in extrapolating the results from animals to humans. Such defaults are a way to handle uncertain threats. Nevertheless, they should not be confused with actual judgments about what properties a particular substance has (Sandin, Bengtsson, and others 2004, 5). Consequently, an epistemic PP does not have to be understood as a belief-guiding principle, but can be understood as saying something about which methods for risk assessment are legitimate, for example, for quantifying uncertainties (Steel 2011). According to this view, precautionary defaults like uncertainty factors in toxicology are methodological implications of a PP that allow it to be applied in a scientifically sound way while protecting human health and the environment.
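The arithmetic behind such an uncertainty factor is simple, as the following Python sketch shows. The input dose of 50 mg per kg of body weight per day is a made-up figure, and the function name `reference_dose` is an illustrative choice; only the default divisor of 100 is taken from the text.

```python
def reference_dose(animal_dose, uncertainty_factor=100):
    """Divide an exposure level found acceptable in animals by an uncertainty factor."""
    if uncertainty_factor <= 0:
        raise ValueError("uncertainty factor must be positive")
    return animal_dose / uncertainty_factor

# Hypothetical input: 50 mg per kg of body weight per day found acceptable in animals.
human_reference = reference_dose(50.0)
print(human_reference)  # 0.5
```

The result is a regulatory default rather than a claim about the substance's actual properties: it would be revised if better information about the extrapolation from animals to humans became available.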

Given this, it might be misleading to interpret a PP as a purely epistemic principle, if it is not guiding our beliefs but telling us what assumptions to accept, that is, to act as if certain things were true, as long as we do not have more information. Thus, it has been argued that a PP is better interpreted as a procedural requirement, or as a principle that imposes several such procedural requirements (Sandin 2007, 103–4).

c. Procedural Interpretations

It has been argued that we should shift our attention when interpreting PPs from the question of what action to take to the question of what is the best way to reach decisions.

i. Argumentative, or “Meta”-PPs

Argumentative PPs are procedural principles specifying what kinds of arguments are admissible in decision-making (Sandin, Peterson, and others 2002). They are different from prescriptive, or action-guiding, PPs in that they do not directly prescribe actions that should be taken. Take principle 15 of the Rio Declaration on Environment and Development. On one interpretation, it states that arguments for inaction which are based solely on the ground that we are lacking full scientific certainty, are not acceptable arguments in the decision-making procedure:

Rio PP—“In order to protect the environment, the precautionary approach shall be widely applied by states according to their capabilities. Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation.” (United Nations Conference on Environment and Development 1992, Principle 15)

Such an argumentative PP is seen as a meta-rule that places real constraints on what types of decision rules should be used: For example, by entailing that decision-procedures should be used that are applicable under conditions of uncertainty, it recommends against some of the traditional approaches in risk regulation like cost-benefit analysis (Steel 2014). Similarly, it has been proposed that the idea behind PPs is best interpreted as a general norm that demands a fundamental shift in our way of risk regulation, based on an obligation to learn from regulatory mistakes of the past (Whiteside 2006).

ii. Transformative Decision Rules

Similar to argumentative principles, an interpretation of a PP as a transformative decision rule doesn’t tell us which action should be taken, but it puts constraints on which actions can be considered as valid options. Informally, a transformative decision rule is defined as a decision rule that takes one decision problem as input, and yields a new decision problem as output (Sandin 2004, 7). For example, the following formulation of a PP as a transformative decision rule (TPP) has been proposed by Peterson (2003):

TPP—If there is a non-zero probability that the outcome of an alternative act is very low, that is, below some constant c, then this act should be removed from the decision-maker’s list of options.

Thus, the TPP excludes courses of actions that could lead, for example, to catastrophic outcomes, from the options available to the decision maker. However, it does not tell us which of the remaining options should be chosen.
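A minimal Python sketch can show how such a rule transforms one decision problem into another. The acts, probabilities, utilities, and the constant c below are hypothetical, and representing a decision problem as a dictionary of (probability, utility) pairs is an assumption made for illustration, not Peterson's own formalism.

```python
def apply_tpp(problem, c):
    """Return a new decision problem without acts that might fall below c.

    An act is removed if any of its outcomes has non-zero probability
    and a utility below the constant c.
    """
    return {
        act: outcomes
        for act, outcomes in problem.items()
        if not any(p > 0 and utility < c for p, utility in outcomes)
    }

# Hypothetical decision problem: each act maps to (probability, utility) pairs.
problem = {
    "deploy untested technology": [(0.999, 50), (0.001, -1000)],
    "deploy with safeguards": [(0.9, 30), (0.1, -10)],
    "do not deploy": [(1.0, 0)],
}

remaining = apply_tpp(problem, c=-100)
print(sorted(remaining))  # ['deploy with safeguards', 'do not deploy']
```

The output is itself a decision problem. Some further rule is still needed to choose between the two remaining acts, which is exactly the limitation noted above.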

iii. Reversing the Burden of Proof

The requirement of reversal of burden of proof is one of the most prominent specific procedural requirements that are named in connection with PPs. For example, in the influential communication on the PP from the Wingspread Conference on the Precautionary Principle (1998), it is stated, “the proponent of an activity, rather than the public bears the burden of proof.”

One common misconception is that the proponent of a potentially dangerous activity would have to prove with absolute certainty that the activity is safe. This gave rise to the objection that PPs are too demanding, and therefore would bring all progress to a halt (Harris and Holm 2002). However, the idea is rather that we have to change our approach to regulatory policy: Proponents of an activity have to prove to a certain threshold that it is safe in order to employ it, instead of the opponents having to prove to a certain threshold that it is harmful in order to ban it.

Thus, whether or not the situation is one in which the burden of proof is reversed depends on the status quo. Instead of speaking of shifting the burden of proof, it seems more sensible to ask what has to be proven, and who has to provide what kind of evidence for it. The important point that then remains to be clarified is what standards of proof are accepted.

An alternative proposal to shifting the burden of proof is that both regulators and proponents of an activity should share it (Arcuri 2007): If opponents want to regulate an activity, they should at least provide some evidence that the activity might lead to serious or irreversible harm, even though we are lacking evidence to prove it with certainty. Proponents, on the other hand, should provide certain information about the activity in order to get it approved. Who has the burden of proof can play an important role in the production of information: If proponents have to show (to a certain standard) that their activity is safe, this generates an incentive to gather information about the activity, whereas in the other case (“safe until proven otherwise”) they might deliberately refrain from this (Arcuri 2007, 15).

iv. Procedures for Determining Precautionary Measures

Interpreted in a procedural way, a PP puts constraints on how a problem should be described or how a decision should be made. It does not dictate a specific decision or action. This is in line with one interpretation of what it means to be a principle as opposed to a rule. While rules specify precise consequences that follow automatically when certain conditions are met, principles are understood as guidelines whose interpretation will depend on specific contexts (Fisher 2002; Arcuri 2007).

Developing a procedural precautionary framework that integrates different procedural requirements is a way to enable the context-dependent specification and implementation of such a PP. One example is Tickner’s (2001) “precautionary assessment” framework, which consists of six steps that are supposed to guide decision-making as a heuristic device. The first five steps—(1) Problem Scoping, (2) Participant Analysis, (3) Burden/Responsibility Allocation Analysis, (4) Environment and Health Impact Analysis, and (5) Alternatives Assessment—serve to describe the problem, identify stakeholders, and assess possible consequences as well as available alternatives. In the final step, (6) Precautionary Action Analysis, the appropriate precautionary measure(s) are determined based on the results from the other steps. These decisions are not permanent, but should be part of a continuous process of increasing understanding and reducing overall impacts.

That the components are clarified on a case-by-case basis is a big advantage of such procedural implementations of PPs. It avoids an oversimplification of the decision process and takes the complexity of decisions under uncertainty into account. However, they are criticized for losing the “principle” part of PPs: For example, Sandin (2007) argues that procedural requirements form a heterogeneous category. A procedural PP would soon dissolve beyond recognition because it is intermingled with other (rational, legal, moral, and so forth) principles and rules. As an answer, some authors try to preserve the “principle” in PPs, while also taking into account procedural as well as epistemic elements.

d. Integrated Interpretations

We can find two main strategies for formulating a PP that is still identifiable as an action-guiding principle while integrating procedural as well as epistemic considerations: Either (1) developing particular principles that are specific to a certain context, and accompanied by a procedural framework for this context; or (2) describing the structure and the main elements of a PP plus naming criteria for adjusting those elements on a case-by-case basis.

i. Particular Principles for Specific Contexts

It has been argued that the general talk of “the” PP should be abandoned in favor of formulating distinct precautionary principles (Hartzell-Nichols 2013). This strategy aims to arrive at action-guiding and coherent principles by formulating particular PPs that apply to a narrow range of threats and express a specific obligation. One example is the “Catastrophic Harm PP (CHPP)” of Hartzell-Nichols (2012, 2017), which is restricted to catastrophic threats. It consists of eight conditions that specify when precautionary measures have to be taken, spelling out (a) what counts as a catastrophe, (b) the knowledge requirements for taking precaution, and (c) criteria for appropriate precautionary measures. The CHPP is accompanied by a “Catastrophic Precautionary Decision-Making Framework” which guides the assessment of threats in order to decide whether they meet the CHPP’s criteria, and guides decision-makers in determining what precautionary measures should be taken against a particular threat of catastrophe. This framework lists key considerations and steps that should be performed when applying the CHPP, for example, drawing on all available sources of information, assessing likelihoods of potential harmful outcomes under different scenarios, identifying all available courses of precautionary action and their effectiveness, and identifying specific actors who should be held responsible for taking the prescribed precautionary measures.

ii. An Adjustable Principle with Procedural Instructions

Identifying main elements of a PP and accompanying them with rules for adjusting them on a case-by-case basis is another strategy to preserve the idea of a precautionary principle while avoiding both inconsistency as well as vagueness. It has been shown that as diverse as PP formulations are, they typically share the elements of uncertainty, harm, and (precautionary) action (Sandin 1999, Manson 2002). By explicating these concepts and, most importantly, by defining criteria for how they should be adjusted with respect to each other, some authors obtain a substantial PP that can be adjusted on a case-by-case basis without becoming arbitrary.

One example is the PP that Randall (2011) develops in the context of an in-depth analysis of traditional, or as he calls it, ordinary risk management (ORM). Randall identifies the following “general conceptual form of PP”:

If there is evidence stronger than E that an activity raises a threat more serious than T, we should invoke a remedy more potent than R.

Threat, T, is explicated as chance of harm, meaning that threats are assessed and compared according to their magnitude and likelihood. Our knowledge of outcomes and likelihoods is explicated with the concept of evidence, E, referring to uncertainty in the sense of our incomplete knowledge about the world. The precautionary response is conceptualized as remedy, R, which covers a wide range of responses from averting the threat, remediating its damage, mitigating harm, and adapting to changed conditions after other remedies have been exhausted. Remedies should fulfill a double function, (1) providing protection from a plausible threat, while at the same time (2) generating additional evidence about the nature of the threat and the effectiveness of various remedial actions. The main relations between the three elements are that the higher the likelihood that the remedy-process will generate more evidence, the smaller is the threat-standard and the lower is the evidence-standard that should be required before invoking the remedy even if we have concerns about its effectiveness (Randall 2011, 167).
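The conditional structure of this general form can be made concrete in a minimal sketch. Randall's text does not provide numerical scales, so the 0-to-1 scores, the names `Standards`, `required_remedy`, and `relax_for_learning`, and the linear relaxation factor are all illustrative assumptions. The sketch only shows how the evidence and threat standards jointly trigger a remedy, and how both standards drop when the remedy is likely to generate further evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Standards:
    """Hypothetical thresholds on illustrative 0-to-1 scales (not from Randall)."""
    evidence: float  # E: evidence must be stronger than this
    threat: float    # T: threat must be more serious than this
    remedy: float    # R: the invoked remedy must be more potent than this


def required_remedy(evidence: float, threat: float,
                    standards: Standards) -> Optional[float]:
    """Return the potency the remedy must exceed, or None if the PP is not triggered."""
    if evidence > standards.evidence and threat > standards.threat:
        return standards.remedy
    return None


def relax_for_learning(base: Standards, learning_likelihood: float) -> Standards:
    """Lower the E and T standards as the chance that the remedy yields new evidence rises.

    The linear scaling factor is an assumption made for illustration only.
    """
    if not 0.0 <= learning_likelihood <= 1.0:
        raise ValueError("learning_likelihood must lie in [0, 1]")
    factor = 1.0 - 0.5 * learning_likelihood
    return Standards(base.evidence * factor, base.threat * factor, base.remedy)


base = Standards(evidence=0.6, threat=0.5, remedy=0.3)
not_triggered = required_remedy(0.5, 0.4, base)           # below both standards
relaxed = relax_for_learning(base, learning_likelihood=0.8)
triggered = required_remedy(0.5, 0.4, relaxed)            # standards now lower
```

With these made-up numbers, the same evidence and threat scores fail to trigger the principle under the base standards but do trigger it once the remedy is expected to be informative.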

Having clarified the concepts used in the ETR-framework, Randall specifies them in order to formulate a PP that accounts for the weaknesses of ORM:

Credible scientific evidence of plausible threat of disproportionate and (mostly but not always) asymmetric harm calls for avoidance and remediation measures beyond those recommended by ordinary risk management. (Randall 2011, 186)

He then goes on to combine this PP and ORM into an integrated risk management framework. Randall makes sure to stress that a PP cannot determine the decision-process on its own. As a moral principle, it has to be weighed against other moral, political, economic, and legal considerations. Thus, he also calls for the development of a procedural framework to ensure that its substantial normative commitments will be implemented on the ground (Randall 2011, 207).

Steel (2014, 2013) develops a comprehensive PP interpretation which is intended to be “a procedural requirement, a decision rule, and an epistemic rule” (Steel 2014, 10). Referring to the Rio Declaration, Steel argues that such a formulation of a PP states that our decision-process should be structured differently, namely that decision-rules should be used that can be applied in an informative way under uncertainty. However, he does not take this procedural element to be the whole PP, but interprets it as a “meta”-rule which guides the application and specification of the precautionary “tripod” of threat, uncertainty, and precautionary action. More specifically, Steel’s proposed PP consists of three core elements:

The Meta Precautionary Principle (MPP): Uncertainty must not be a reason for inaction in the face of serious threats.
The Precautionary Tripod: The elements that have to be specified in order to obtain an action-guiding precautionary principle version, namely: If there is a threat that meets the harm condition under a given knowledge condition then a recommended precaution should be taken.
Proportionality: Demands that the elements of the Precautionary Tripod are adjusted proportionally to each other, understood as Consistency: The recommended precaution must not be recommended against by the same PP version, and Efficiency: Among those precautionary measures that can be consistently recommended by a PP version, the least costly one should be chosen.

An application of this PP requires selecting what Steel calls a “relevant version of PP,” that is, a specific instance of the Precautionary Tripod that meets the constraints from both MPP and Proportionality. To obtain such a version, Steel (2014, 30) proposes the following strategy: (1) select a desired safety target and define the harm condition as a failure to meet this target, (2) select the least stringent knowledge condition that results in a consistently applicable version of PP given the harm condition. To comply with the MPP, uncertainty must neither turn the PP version inapplicable nor lead to continual delay in taking measures to prevent harm.
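The interplay of Consistency and Efficiency can be illustrated with a small sketch. It is not Steel's formalism: the `Option` type, the single numerical harm score, and the example figures are hypothetical simplifications. Each option (including inaction) carries a cost and the worst harm it plausibly threatens under the chosen knowledge condition; an option is ruled out if it itself meets the harm condition, and the cheapest remaining option is recommended.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Option:
    name: str
    cost: float
    # Worst harm the option itself plausibly threatens, judged under the
    # chosen knowledge condition (an illustrative single number).
    worst_plausible_harm: float


def recommend(options: List[Option], harm_threshold: float) -> Optional[Option]:
    """Apply Consistency, then Efficiency, for one hypothetical PP version.

    Consistency: discard options that themselves meet the harm condition.
    Efficiency: among the remaining options, pick the least costly one.
    Returns None if no option can be consistently recommended.
    """
    consistent = [o for o in options if o.worst_plausible_harm < harm_threshold]
    if not consistent:
        return None
    return min(consistent, key=lambda o: o.cost)


options = [
    Option("business as usual", cost=0.0, worst_plausible_harm=9.0),
    Option("total ban", cost=8.0, worst_plausible_harm=7.0),
    Option("phase-out", cost=5.0, worst_plausible_harm=2.0),
    Option("emission cap", cost=3.0, worst_plausible_harm=3.0),
]

chosen = recommend(options, harm_threshold=5.0)
paralysed = recommend(options, harm_threshold=1.0)
```

Here `chosen` is the emission cap, since it is the cheapest option that does not itself meet the harm condition. With the much stricter threshold every option is ruled out and `paralysed` is `None`, which corresponds to a PP version that cannot be applied consistently and would therefore have to be revised.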

Thus, Steel’s PP proposal guides decision-makers both in formulating the appropriate PP version as well as in its application. The process of formulating the particular version already deals with many questions like how evidence should be assessed, who has to prove what, to what kind of threats we should react, and what appropriate precautionary measures would be. Arguably, this PP can thereby be action-guiding, since it helps to select specific measures, without being a rigid prescriptive rule that is not suited for decisions under uncertainty.

Additionally, proposals like those of Randall and Steel have the advantage that they are not rigidly tied to a specific category of decision-theoretic non-certainty, that is, decision-theoretic risk, uncertainty, or ignorance. They can be adjusted with respect to varying degrees of knowledge and available evidence, taking into account that we typically have some imprecise or vague sense of how likely various outcomes are, but not enough of a sense to assign meaningful precise probabilities to the outcomes. While these situations do not amount to decision-theoretic risk, they nonetheless include more information than what is often taken to be available in decision-theoretic uncertainty. Arguably, this corresponds better to the notion of “scientific uncertainty” than equating the latter with decision-theoretic uncertainty does (see Steel 2014, Chapter 4).

3. Justifications for Precautionary Principles

This section surveys different normative backgrounds that have been used to defend a PP. It starts by addressing arguments that can be located in the framework of practical rationality, before moving to substantial moral justifications for precautions.

a. Practical Rationality

When PPs are proposed as principles of practical rationality, they are typically seen as principles of risk regulation. This includes, but is not reduced to, rational choice theory. When we examine the justifications for PPs in this context, we have to do this against the background of established risk regulation practices. We can identify a rather standardized approach to the assessment and management of risks, which Randall (2011, 43) calls “ordinary risk management (ORM).”

i. Ordinary Risk Management

Although there are different understandings of ORM, we can identify a rather robust “core” of two main parts. First, a scientific risk assessment is conducted, where potential outcomes are identified and their extent and likelihood estimated (compare Randall 2011, 43–46). Typically, risk assessment is understood as a quantitative endeavor, expressing numerical results (Zander 2010, 17). Second, on the basis of the data obtained from the risk assessment, the risk management phase takes place. Here, alternative regulatory courses of action in response to the scientifically estimated risks are discussed, and a choice is made between them. While the risk assessment phase should be as objective and value-free as possible, the decisions that take place in the risk management phase should be, although informed by science, based on the values and interests of the parties involved. In ORM, cost-benefit analysis (CBA) is a powerful and widely used tool for making these decisions in the risk-management phase. To conduct a CBA, the results from the risk assessment, that is, what outcomes are possible under which course of action, are evaluated according to the willingness to pay (WTP) or willingness to accept compensation (WTA) of individuals in order to estimate the benefits and costs of different courses of action. That means that non-economic values, like human lives or environmental preservation, are monetized in order to be comparable on a common ratio-scale. Since we rarely if ever face cases of certainty, where each course of action has exactly one outcome which will materialize if we choose it, these so-reached utilities are then probability-weighted and added up in order to arrive at the expected utility of the different courses of action. On this basis, it is possible to calculate which regulatory actions have the highest expected net benefits (Randall 2011, 47), that is, to apply the principle of maximizing expected utility (MEU) and to choose the option with the highest expected utility. CBA is seen as a tool that enables decision-makers to rationally compare costs and benefits, helping them to come to an informed decision (Zander 2010, 4).
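The probability-weighting step can be shown in a minimal sketch. The action names, probabilities, and monetized net benefits below are invented for illustration and are not taken from any of the cited sources; the function names are likewise hypothetical.

```python
from typing import Dict, List, Tuple

# Each action maps to a list of (probability, monetized net benefit) pairs.
Lottery = List[Tuple[float, float]]


def expected_net_benefit(outcomes: Lottery) -> float:
    """Probability-weight the monetized outcomes of one action and add them up."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * benefit for p, benefit in outcomes)


def meu_choice(actions: Dict[str, Lottery]) -> str:
    """Return the action with the highest expected net benefit (the MEU rule)."""
    return max(actions, key=lambda name: expected_net_benefit(actions[name]))


# Illustrative figures only.
actions = {
    "no regulation": [(0.9, 100.0), (0.1, -500.0)],
    "moderate regulation": [(0.9, 60.0), (0.1, -50.0)],
    "strict regulation": [(1.0, 30.0)],
}

best = meu_choice(actions)
```

In this example the expected net benefits are 40, 49, and 30 respectively, so the MEU rule selects moderate regulation. Note that the calculation presupposes both a probability and a monetized value for every outcome, which is exactly the informational demand that the criticisms of ORM discussed in this section target.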

In the context of ORM, we can distinguish two main lines of argumentation for PPs: On the one hand, authors argue that PPs are rational by trying to show that they gain support from ORM. On the other hand, authors argue that ORM itself is problematic in some aspects, and propose PPs as a supplement or alternative to it. In both cases, we find justifications for PPs as decision rules for risk management as well as principles that pertain to the risk assessment stage and are concerned with problem-framing (this includes epistemic and value-related questions).

ii. PPs in the Framework of Ordinary Risk Management

To begin, here are some ways in which people propose to locate and defend PPs within ORM.

Expected Utility Theory
Some authors claim that as long as we can assign probabilities to the various outcomes, that is, as long as we are in a situation of decision-theoretic risk, precaution is already “built in” into ORM (Chisholm and Clarke 1993; Gardiner 2006; Sunstein 2007). The argument is roughly that no additional PP is necessary because expected utility theory in combination with the assumption of decreasing marginal utility allows for risk aversion by placing greater weight on the disutility of large damages. Not to choose options with possibly catastrophic outcomes, even if they only have a small probability, would thus be recommended by the principle of maximizing expected utility (MEU) as a consequence of their large disutility.
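A small numerical sketch shows how decreasing marginal utility can produce this result. The wealth figures, the 2% probability, and the choice of a logarithmic utility function are assumptions made purely for illustration; the cited authors do not commit to these numbers.

```python
import math
from typing import Callable, Dict, List, Tuple

# Each option maps to a list of (probability, resulting wealth) pairs.
Lottery = List[Tuple[float, float]]


def expected_utility(lottery: Lottery, utility: Callable[[float], float]) -> float:
    """Probability-weighted sum of the utilities of the possible wealth levels."""
    return sum(p * utility(wealth) for p, wealth in lottery)


def best_option(options: Dict[str, Lottery],
                utility: Callable[[float], float]) -> str:
    """Option with the highest expected utility under the given utility function."""
    return max(options, key=lambda name: expected_utility(options[name], utility))


# Illustrative figures: paying 5 for precaution versus a 2% chance of near-total loss.
options = {
    "precaution": [(1.0, 95.0)],
    "no precaution": [(0.98, 100.0), (0.02, 1.0)],
}

# Linear utility: each unit of wealth counts the same (risk neutrality).
risk_neutral_choice = best_option(options, lambda wealth: wealth)

# Logarithmic utility: marginal utility decreases, so large losses weigh heavily.
risk_averse_choice = best_option(options, math.log)
```

Under linear utility the unprotected option has the higher expected value (98.02 against 95) and is chosen. Under the concave logarithmic utility the small chance of a very large loss dominates, and MEU recommends the precaution without any additional principle being invoked.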

This argumentation does not go unchallenged, as the next subsection (3.a.iii) shows. Additionally, MEU itself is not uncontroversial (see Buchak 2013). Still, even if we accept it, we cannot use MEU under conditions of decision-theoretic uncertainty, since it relies on probability information. Consequently, authors have proposed PPs for decisions under uncertainty in order to fill this “gap” in the ORM framework. They argue that under decision-theoretic uncertainty, it is rational to be risk-averse, and try to demonstrate this with arguments based on rational choice theory. However, it is not always clear whether the discussed decision rule is used to justify a PP that has, in some form, already been formulated, or whether the decision rule is itself proposed as a PP.

Maximin and Minimax Regret
Both the maximin rule (selecting the course of action with the best worst case) and the minimax regret rule (selecting the course of action whose maximal regret across the possible scenarios is the smallest) have been proposed and discussed as possible formalizations of a PP within the ORM framework. It has been argued that maximin captures the underlying intuitions of PPs (namely, that the worst should be avoided) and that it yields rational decisions in relevant cases (Hansson 1997). Although the rationality of maximin is contested (Harsanyi 1975; Bognar 2011), it is argued that we can qualify it with criteria to single out the cases in which it can, and should, rationally be applied (Gardiner 2006). This is done by showing that a so-qualified maximin rule fits with paradigm cases of precaution and commonsense decisions that we make, arguing that it is plausible to adopt it also for further cases.

Chisholm and Clarke (1993) argue that the minimax regret rule leads to the prevention of uncertain harm in line with the basic idea of a PP, while also giving some weight to forgone benefits. Against minimax regret and in favor of maximin, Hansson (1997, 297) argues that, firstly, minimax regret presupposes more information, since we need to be able to assign numerical utilities to outcomes. Secondly, he uses a specific example to show that minimax regret and maximin can lead to conflicting recommendations. According to Hansson, the recommendation made by maximin expresses a higher degree of precaution.
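That the two rules can come apart is easy to verify with a toy payoff table. The example below is not Hansson's own; the action names, scenario names, and utilities are invented for illustration.

```python
from typing import Dict

# payoffs[action][scenario] = utility of the action if the scenario obtains.
Payoffs = Dict[str, Dict[str, float]]


def maximin(payoffs: Payoffs) -> str:
    """Choose the action whose worst-case utility is highest."""
    return max(payoffs, key=lambda action: min(payoffs[action].values()))


def minimax_regret(payoffs: Payoffs) -> str:
    """Choose the action whose largest regret across scenarios is smallest.

    Regret in a scenario is the shortfall relative to the best action
    available in that scenario.
    """
    scenarios = list(next(iter(payoffs.values())).keys())
    best_in = {s: max(payoffs[a][s] for a in payoffs) for s in scenarios}

    def max_regret(action: str) -> float:
        return max(best_in[s] - payoffs[action][s] for s in scenarios)

    return min(payoffs, key=max_regret)


# Illustrative utilities only.
payoffs = {
    "cautious": {"harm occurs": 1.0, "no harm": 1.0},
    "bold": {"harm occurs": 0.0, "no harm": 10.0},
}

maximin_choice = maximin(payoffs)
regret_choice = minimax_regret(payoffs)
```

Maximin picks the cautious action because its worst case (1) beats the bold action's worst case (0). Minimax regret picks the bold action, since its largest regret is 1 whereas the cautious action forgoes a benefit of 9 if no harm occurs. The sketch also reflects Hansson's first point: computing regrets requires subtracting utilities, so it needs numerical utilities, whereas maximin only needs the outcomes to be ranked.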

Quasi-Option Value
Irreversible harm is mentioned in many PP formulations, for example in the Rio Declaration. One proposal to justify why “irreversibility” justifies precautions refers to the concept of “(quasi-)option value” (Chisholm and Clarke 1993; Sunstein 2005a, 2009), which was first introduced by Arrow and Fisher (1974). They show that when regulators are confronted with decision problems where they are (a) uncertain about the outcomes of the options, but there are (b) chances for resolving or reducing these uncertainties in the future, and (c) one or more of the options might entail irreversible outcomes, then they should attach an extra value, that is, an option value, to the reversible options. This takes into account the value of the options that choosing an alternative with an irreversible outcome would foreclose. To illustrate this, think of the logging of (a part of) a rain forest: It is a very complex ecosystem, which we could use in many ways. But once it is clear-cut, it is almost impossible to restore to its original state. By choosing the option to cut it down, all options to use the rain forest in any other way would practically be lost forever. As Chisholm and Clarke (1993, 115) point out, irreversibility might sometimes be associated with not taking actions now: Not mitigating greenhouse gas (GHG) emissions means that more and more GHGs accumulate in the atmosphere, where they stay for a century or more. They argue that introducing the concept of quasi-option value supports the application of a PP even if decision makers are not risk-averse.
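The idea can be illustrated with a stylized two-period calculation. This is a simplified sketch in the spirit of the argument, not Arrow and Fisher's own model: there is no discounting, the decision maker is risk-neutral, all benefit figures are invented, and the function names are hypothetical. Developing the forest now is irreversible; preserving it keeps open the choice of developing later, after the uncertainty about the value of preservation has been resolved.

```python
from typing import List, Tuple

# Possible second-period benefits of preservation: (probability, benefit).
Scenarios = List[Tuple[float, float]]


def value_develop_now(dev_benefit: float) -> float:
    """Irreversible development yields the development benefit in both periods."""
    return 2 * dev_benefit


def value_preserve_naive(first_period: float, scenarios: Scenarios,
                         dev_benefit: float) -> float:
    """Preserve now, but fix the second-period plan using expected values only."""
    expected_preservation = sum(p * b for p, b in scenarios)
    return first_period + max(expected_preservation, dev_benefit)


def value_preserve_with_learning(first_period: float, scenarios: Scenarios,
                                 dev_benefit: float) -> float:
    """Preserve now, then choose the better option once the scenario is known."""
    return first_period + sum(p * max(b, dev_benefit) for p, b in scenarios)


# Illustrative figures only.
dev_benefit = 6.0
first_period_preservation = 5.0
scenarios = [(0.5, 10.0), (0.5, 0.0)]

develop = value_develop_now(dev_benefit)
naive = value_preserve_naive(first_period_preservation, scenarios, dev_benefit)
informed = value_preserve_with_learning(first_period_preservation, scenarios,
                                        dev_benefit)
quasi_option_value = informed - naive
```

With these numbers, immediate development is worth 12. Preservation is worth only 11 if the prospect of learning is ignored, but 13 once it is taken into account. The difference of 2 is the quasi-option value of keeping the reversible option open, and it reverses the recommendation even though the decision maker is risk-neutral.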

iii. Reforming Ordinary Risk Management

After reviewing attempts to justify a PP in the ORM framework, without challenging the framework itself, let us now examine justifications for PPs that are partially based on criticisms of ORM.

Deficits of ORM
As a first point, ORM as a regulatory practice tends toward oversimplification that neglects uncertainty and imprecision, leading to irrational and harmful decisions. This is seen as a systematic deficit of ORM itself, not only of its users (see Randall 2011, 77), and not only as a problem under decision-theoretic uncertainty, that is, situations where no reliable probabilities are available, but already under decision-theoretic risk. First, decision makers tend to ignore low probabilities as irrelevant, focusing on the “more realistic,” higher ones. This means that low but significant probabilities of catastrophe are ignored, for example, so-called “fat tails” in climate scenarios (Randall 2011, 77). Second, decision makers are often “myopic,” placing higher weight on current costs than on future benefits, avoiding high costs today. This often leads to even higher costs in the future. Third, disutilities might get calculated too optimistically, neglecting so-called “secondary effects” or “social amplifications,” for example, the psychological and social effects of catastrophes (see Sunstein 2007, 7). Lastly, since cost-benefit analysis (CBA) provides such a clear view, there is a tendency to apply it even if the conditions for its application are not fulfilled. We tend to assume more than we know, and to decide according to the MEU criterion although no reliable probability information and/or no precise utility information is available. This so-called “tuxedo fallacy” is seen as a dangerous fallacy because it creates an “illusion of control” (Hansson 2008, 426–27).

PPs are seen as principles that address exactly such problems: they draw our attention to unlikely catastrophic possibilities, demand action despite uncertainty, ask us to consider the worst possible outcomes, and warn us not to assume more than we know. They therefore gain indirect support from these arguments. ORM in its current form tempts us to apply it incorrectly and to neglect rational precautionary action. At least some sort of overarching PP that reminds us of correct practices seems necessary.

As a second point, it is argued that the regulatory practice of ORM has not only the “built-in” tendency to misapply its tools, but that it has fundamental flaws in itself which should be corrected by a PP. Randall (2011, 46–70) criticizes risk assessment in ORM on the grounds that it is typically built on simple models of the threatened system, for example, the climate system. Such models neglect systemic risks like the possibility of feedback effects or sudden regime shifts. By depending on the law of large numbers, ORM is also not a decision framework that is suitable to deal with potential catastrophes, since they are singular events (Randall 2011, 52). Similarly, Chisholm and Clarke (1993, 112) argue that expected utility theory is only useful as long as “probabilities and possible outcomes are within the normal range of human experience.” Examples of such probabilities and outcomes in the normal range of human experience are the risks covered by car and fire insurance: We have statistics about the probabilities of accidents or fires, and can calculate reasonable insurance premiums based on the law of large numbers. Furthermore, we have experience with how to handle them, and have institutions in place like fire departments. None of this is true for singular events like anthropogenic climate change. Consequently, it is argued that we cannot just leave ORM relatively unaltered, and support it with a PP for decisions under uncertainty, and perhaps a more general, overarching PP as a normative guideline. Instead, it is demanded that we also reform the existing ORM framework in order to include precautionary elements.

Historical Arguments for Revising ORM
In the past, failures to take precautionary measures often resulted in substantial, widespread, and long-term harm to the environment and human health (Harremoës and others 2001, Gee and others 2013). This insight has been used to defend adopting a precautionary principle as a corrective to existing practices: For John (2007, 222), these past failures can be used as “inductive evidence” in an argument for reforming our regulatory policies. Whiteside (2006, 146) defends a PP as a product of social learning from past mistakes. According to Whiteside, these past mistakes reveal that (a) our knowledge about the influences of our actions on complex ecological systems is insufficient, and (b) that how decisions were reached was an important part of their inefficiency, leading to insufficient protection of the environment and human health. As such, to Whiteside, the PP generates a normative obligation to re-structure our decision-procedures (Whiteside 2006, 114). The most elaborate historical argument is made by Steel (2014, Chapter 5). Steel’s argument rests on the following premise:

If a systematic pattern of serious errors of a specific type has occurred, then a corrective for that type of error should be sought. (Steel 2014, 91)

By critically examining not only cases of failed precautions and harmful outcomes, but also counter-examples of allegedly “excessive” precaution, Steel shows that such a pattern of serious errors in fact exists. Cases such as the ones described in “Late Lessons from Early Warnings” (Harremoës and others 2001) demonstrate that continuous delays in response to emerging threats have frequently led to serious and persistent harms. Steel (2014, 74–77) goes on to examine cases that have been named as examples of excessive precaution. He finds that in fact, often no regulation whatsoever was implemented in the first place. And in cases where regulations were put in place, they were mostly very restricted, had only minimal negative effects, and were relatively easily reversible. For example, one of the “excessive precautions” consisted in putting a warning label on products containing saccharin in the US. According to Steel (2014, 82), the historical argument thus supports a PP as a corrective against a systematic bias that is entrenched in our practices. This bias emerges because there are informational and political asymmetries that make continual delays more likely than precautionary measures when there are trade-offs between short-term economic gain for an influential party and harms that are uncertain or distant in terms of space or time (or all three).

Epistemic Implications
The justifications presented so far all concern PPs aiming at the management of risks, that is, action-guiding interpretations. But we can also find discussions of a PP for the assessment of threats, so-called “epistemic” PPs. It is not enough to just supply existing practices with a PP; clearly, risk assessment has to be changed, too, in order to be able to apply a PP. This means that uncertainties have to be taken seriously and to be communicated clearly, that we need to employ more adequate models which take into account the existence of systemic risks (Randall 2011, 77–78), that we need criteria to identify plausible (as opposed to “mere”) possibilities, and so on. However, this is more a question of the implications of adopting a PP, not an expression of a genuine PP itself. Thus, these kinds of argument are either presuppositions for a PP, because we need to identify uncertain harms first in order to do something about them, or they are implications of a PP, because it is not admissible to conduct a risk assessment that makes it impossible to apply a PP.

Procedural Precaution
Authors who favor a procedural interpretation of PPs stress that they are concerned especially with decisions under conditions of uncertainty. They point out that while ORM, with its focus on cost-effectiveness and maximizing benefits, might be appropriate for conditions of decision-theoretic risk, the situation is fundamentally different if we have to make decisions under decision-theoretic uncertainty or even decision-theoretic ignorance. For example, Arcuri (2007, 20) points out that since PPs are principles particularly for decisions under decision-theoretic uncertainty, they cannot be prescriptive rules which tell us what the best course of action is—because the situation is essentially characterized by the fact that we are uncertain about the possible outcomes to which our actions can lead. Tickner (2001, 14) claims that this should lead to redirecting the questions that are asked in environmental decision-making: The focus should be moved from the hazards associated with a narrow range of options to solutions and opportunities. Thus, the assessment of alternatives is a central point of implementing PPs in procedural frameworks:

In the end, acceptance of a risk must be a function not only of hazard and exposure but also of uncertainty, magnitude of potential impacts and the availability of alternatives or preventive options. (Tickner 2001, 122)

Although (economic) efficiency should not be completely dismissed and still should have its place in decision-making, proponents of a procedural PP proclaim that we should shift our aim in risk regulation from maximizing benefits to minimizing threats, especially in the environmental domain where harms are often irreversible (compare Whiteside 2006, 75). They also advocate democratic participation, pointing out that a decision-making process under scientific uncertainty cannot be a purely scientific one (Whiteside 2006, 30–31; Arcuri 2007, 27). They thus see procedural interpretations of PPs as justified with respect to the goal of ensuring that decisions are made in a responsible and defensible way, which is especially important when there are substantial uncertainties about their outcomes.

Challenging the Underlying Value Assumptions
In addition to scientific uncertainty, Resnik (2003, 334) distinguishes another kind of uncertainty, which he calls “axiological uncertainty.” Both kinds make it difficult to implement ORM in making decisions. While scientific uncertainty arises due to our lack of empirical evidence, axiological uncertainty is concerned with our value assumptions. This kind of uncertainty can take on different forms: We can be unsure about whether to measure utilities in dollars lost/saved, lives lost/saved, species lost/saved, or something else. We can also be uncertain about how to aggregate costs and benefits, and how to compare, for example, economic values with ecological ones. Values cannot always be measured on a common ordinal scale, much less on a common cardinal scale (as ORM requires, at least in some senses such as those including the use of a version of cost-benefit analysis). Thus, it is irrational to treat them as if they fulfilled this requirement (Thalos 2012, 176–77; Aldred 2013). This challenges the value assumptions underlying ORM, and is seen as a problem that should be fixed by a PP.

Additionally, authors like Hansson (2005b, 10) object that costs and benefits get aggregated without regard to who bears them, and that person-related aspects, such as autonomy or whether a risk is willingly taken or imposed by others, are unjustly neglected.

To sum up, we can say that when the underlying value assumptions of ORM are challenged, either the criticism pertains to how values are estimated and assigned, or the utilitarian decision criterion of maximizing overall expected utility is criticized. In both cases, we are arguably leaving the framework of rational choice and ORM, and moving toward genuine moral justifications for PPs.

b. Moral Justifications for Precaution

Some authors stress that, regardless of whether a PP is thought to supplement ordinary risk management (ORM) or whether it is a more substantive claim, a PP is essentially a moral principle, and has to be justified on explicitly moral grounds. (Note that depending on the moral position one holds, many of the considerations in 3.a can also be seen as discussions of PPs from a moral standpoint; most prominently utilitarianism, since ORM uses the rule of maximizing expected utility.) They argue that taking precautionary measures under uncertainty is morally demanded, because otherwise we risk damages that are in some way morally unacceptable.

i. Environmental Ethics

PPs are often associated with environmental ethics, and the concept of sustainable development (O’Riordan and Jordan 1995; Kaiser 1997; Westra 1997; McKinney and Hill 2000; Steele 2006; Paterson 2007). Some authors take environmental preservation to be at the core of PPs. PP formulations such as the Rio or the Wingspread PP emerged in a debate about the necessity to prevent environmental degradation, which explains why many PPs highlight environmental concerns. It seems plausible that a PP can be an important part of a broader approach to environmental preservation and sustainability (Ahteensuu 2008, 47). But it seems difficult to justify a PP with recourse to sustainability, since the concept itself is vague and contested. Indeed, when PPs have been discussed in the context of sustainability, they are often proposed as ways to operationalize the vague concept into a principle for policymaking, along with other principles like the “polluter pays” principle (Dommen 1993; O’Riordan and Jordan 1995). Thus, while PPs are partly motivated by the insight that our way of life is not sustainable, and that we should change how we approach environmental issues, it is difficult to justify them solely on such grounds. However, the hope is that a clarification of the normative (moral) underpinnings of PPs will help to justify a PP for sustainable development. In the following, we will see that it might make sense to take special precautions with respect to ecological issues, not only because they often are complex and might entail unresolvable uncertainties (Randall 2011, 64–70), but also because harm to the environment can affect many other moral concerns, for example, human rights and both international and intergenerational justice. As we will see, these moral issues might provide justifications for PPs on their own, without explicit reference to sustainability.

ii. Harm-Based Justifications

PPs that apply to governmental regulatory decisions have been defended as an extension of the harm principle. There are different versions of the harm principle, but roughly, it states that the government is justified in restricting citizens’ individual liberty only to avoid harm to others.

The application of the harm principle normally presupposes that certain conditions are fulfilled, for example, that the harms in question must be (1) involuntarily taken, (2) sufficiently severe and (3) probable, and (4) the prescribed measures must be proportional to the harms (compare Jensen 2002, Petrenko and McArthur 2011). If these conditions are fulfilled, the prevention principle can be applied, prescribing proportional measures to prevent the harm in question from materializing. However, PPs apply to cases where we are unsure about the extent and/or the probability of a possible harm. Consequently, PPs are seen as a “clarifying amendment” (Jensen 2002, 44) which extends the normative foundation of the harm principle from prevention to precaution (Petrenko and McArthur 2011, 354): The impossibility to assign probabilities does not negate the obligation to act as long as possible harms are severe enough and scientifically plausible. Even for the prevention principle, it holds that the more severe a threat is, the less probable it has to be in order to warrant preventive measures. Thus, it has been argued that the probability of high-magnitude harms becomes almost irrelevant, as long as they are scientifically plausible (Petrenko and McArthur 2011, 354–55). Additionally, some harm is seen as so serious that it warrants special precaution, for example, if it is irreversible or cannot be (fully) compensated (Jensen 2002, 49–50). In such situations, the government is justified in restricting liberties by, for example, prohibiting a technology, even if there remains uncertainty about whether or not the technology would actually have harmful effects.

A related idea is that governments have an institutional obligation not to harm the population, which overrides the weaker obligation to do good—meaning that it is worse if certain regulatory decisions of the government lead to harm than if they lead to foregone benefits (John 2007).

The question of what exactly makes a threat severe enough to justify the implementation of precautionary measures has also been discussed with reference to justice- and rights-based considerations.

iii. Justice-Based Justifications

McKinnon (2009, 2012) presents two independent arguments for precautions, both of which are justice-based. These arguments are developed with respect to the possibility of a climate change catastrophe (CCC), and concern two alternative courses of action and their worst cases. The case of “Unnecessary Expenditure” means taking precautions which turn out to have been unnecessary, thereby wasting money which could have been spent on other, better purposes. “Methane Nightmare” describes the case of not taking precautions, leading to CCCs with catastrophic consequences, making survival on earth very difficult if not impossible. McKinnon argues that CCCs are uncertain in the sense that they are scientifically plausible, even though we cannot assign probabilities to them (McKinnon 2009, 189).

Playing it Safe
McKinnon’s first argument for why uncertain yet plausible harm with the characteristics of CCCs justifies precautionary measures is called the “playing safe” argument. It is based on two Rawlsian commitments about justice (McKinnon 2012, 56): (1) that treating people as equals means (among other things) ensuring a distribution of (dis)advantage among them that makes the worst-off group as well off as possible, and (2) that justice is intergenerational in scope, governing relations across generations as well as within them.

McKinnon (2009, 191–92) argues that the distributive injustice would be so much greater if “Methane Nightmare” materialized than if it came to “Unnecessary Expenditure” that we have to choose to take precautionary measures, even though we do not know how probable “Methane Nightmare” is. That is to say, such a situation warrants the application of the maximin principle, because distributive justice in the sense of making the worst-off as well off as possible has lexical priority over maximizing the overall benefits for all. It would be inadmissible to choose an option that has a much better best case but whose worst case would lead to distributive injustice over another option whose best case is less good but whose worst case does not entail such distributive injustice.
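The structure of the “playing safe” argument can be made concrete with a minimal sketch of the maximin rule. The scores below are purely hypothetical ordinal rankings of how well off the worst-off group would be (higher is better); they are illustrative assumptions and do not come from McKinnon. The names OUTCOMES, worst_case, and maximin are likewise introduced here only for illustration. The rule compares worst cases only and uses no probabilities.

```python
# Hypothetical ordinal scores for the worst-off group (higher is better).
# These numbers are illustrative assumptions, not values from McKinnon.
OUTCOMES = {
    "take precautions": {
        "catastrophe was possible": 5,        # catastrophe averted
        "catastrophe was never possible": 4,  # "Unnecessary Expenditure"
    },
    "no precautions": {
        "catastrophe was possible": 0,        # "Methane Nightmare"
        "catastrophe was never possible": 6,  # best case overall
    },
}


def worst_case(option):
    """Return the lowest score an option can yield across all states."""
    return min(OUTCOMES[option].values())


def maximin(options):
    """Pick the option whose worst case is best; no probabilities are used."""
    return max(options, key=worst_case)


choice = maximin(OUTCOMES)
print(choice)
```

With these assumed scores, the sketch selects “take precautions”, because its worst case (“Unnecessary Expenditure”) is ranked above the worst case of the alternative (“Methane Nightmare”), even though “no precautions” has the better best case.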

Unbearable Strains of Commitment
As McKinnon notes, the “playing safe” justification only holds if one accepts a very specific understanding of distributive (in)justice. However, she claims to have an even more fundamental argument for precautionary measures in this context, which is also based on Rawlsian arguments concerning intergenerational justice, but does not rely on a specific conception of distributive justice. It is called the “unbearable strains of commitment” argument and is based on a combination of the “just savings” principle for intergenerational justice and the “impartiality” principle. It states that we should not choose courses of action that impose on future generations conditions which we ourselves could not agree to and which would undermine the bare possibility of justice itself (McKinnon 2012, 61). This justifies taking precautions against CCCs, since the worst case in that option is “Unnecessary Expenditure”, which, in contrast to “Methane Nightmare”, would not lead to justice-jeopardizing consequences.

iv. Rights-Based Justifications

Strict precautionary measures concerning climate change have been demanded based on the possible rights violations that such climate change might entail. For example, Caney (2009) claims that although other benefits and costs might be discounted, human rights are so fundamental that they must not be discounted. He argues that the possible harms involved in climate change justify precautions: Unmitigated climate change entails possible outcomes which would lead to serious or catastrophic rights violations, while a policy of strict mitigation would not involve a loss of human rights, at least not if it is carried out by the affluent members of the world. Additionally, “business as usual” from the affluent would mean gambling with the conditions of those who already lack fundamental rights protection, because the negative effects of climate change would come to bear especially in poor countries. Moreover, the benefits of taking the “risk of catastrophic climate change” would accrue almost entirely to the risk-takers, not the risk-bearers (Caney 2009, 177–79). If we extrapolate from this concrete application, the basic justification for precaution seems to be: If a rights violation is plausibly possible, and there are ways to avoid this possibility by choosing another course of action, which does not involve the plausible possibility of rights violations, then we have to choose the second option. It does not matter how likely the rights violations are; as long as they are plausible, we have to treat them as if they would materialize with certainty.

Thus, in this interpretation, precaution means making sure that no rights violations happen, even if we (because of uncertainty) “run the risk” of doing more than what would have been necessary—as long as we don’t have to jeopardize our own rights in order to do so.

v. Ethics of Risk and Risk Impositions

Some authors see the PP as an expression of a problem with what they call standard ethics (Hayenhjelm and Wolff 2012, e28). According to them, standard ethical theories, with their focus on evaluations of actions and their outcomes under conditions of certainty, fail to keep up with the challenges that technological development poses. PPs are then placed in the broader context of developing and defending an ethics of risk, that is, a moral theory about the permissibility of risk impositions. Surprisingly, so far there are few explicit connections between the discussion of the ethics of risk impositions (see for example Hansson 2013, Lenman 2008, Suikkanen 2019) and the discussion of PPs.

One exception is Munthe (2011), who argues that before we can formulate an acceptable and intelligible PP, we first need at least the basic structure of an ethical theory that deals directly with issues of creating and avoiding risks of harm. In Chapter 5 of his book, Munthe (2011) sets out to develop such a theory, which focuses on the responsibility of a decision, specifically, responsibility as a property of decisions: Decisions and risk impositions may be morally appraised in their own right. When one does not know what the outcome of a decision will be, it is important to make responsible decisions, that is, decisions that remain defensible as responsible given the information one had at the time the decision was made, even if the outcome turns out to be bad. However, even though Munthe’s discussion starts out from the PP, he ultimately concludes that we do not need a PP, but a policy that expresses a proper degree of precaution: “What is needed is plausible theoretical considerations that may guide decision makers also employing their own judgement in specific cases. We do not need a precautionary principle, we need a policy that expresses a proper degree of precaution.” Thus, the idea seems to be that while a fully developed ethics of risk will justify demands commonly associated with PPs, it ultimately will replace the need for a PP.

4. Main Objections and Possible Rejoinders

This section presents the most frequent and the most important objections and challenges PPs face. They can be roughly divided into three groups. The first argues that there are fundamental conceptual problems with PPs, which make them unable to guide our decisions. The second claims that PPs, in any reasonable interpretation, are superfluous and can be reduced to existing practices done right. The third rejects PPs as irrational, saying that they are based on unfounded fears and that they contradict science, leading to undesirable consequences. While some objections are aimed at specific PP proposals, others are intended as arguments against PPs in general. However, even the latter typically hold only for specific interpretations. This section briefly presents the main points of these criticisms, and then discusses how they might be answered.

a. PPs Cannot Guide Our Decisions

There are two main reasons why PPs are seen as unable to guide us in our decision-making: They are rejected either as incoherent, or as being vacuous and devoid of normative content.

Objection: PPs are incoherent
One frequent criticism, most prominently advanced by Sunstein (2005b), is that a “strong PP” leads to contradictory recommendations and therefore paralyzes our decision-making. He understands the “strong PP” as a very demanding principle which states that “regulation is required whenever there is a possible risk to health, safety, or the environment, even if the supporting evidence remains speculative and the economic costs of regulation are high” (Sunstein 2005b, 24). The problem is that every action poses such a possible risk, and thus, both regulation and non-regulation would be prohibited by the “strong PP,” resulting in paralysis (Sunstein 2005b, 31). Hence, the “strong PP” is rejected as an incoherent decision-rule, because it leads to contradictory recommendations.
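The paralysis charge can be illustrated with a short sketch. It encodes the “strong PP” in the way Sunstein reads it, as a rule that excludes any option carrying some possible risk. The two options and the harms listed for them are made-up placeholders, and the names POSSIBLE_RISKS and strong_pp_permits are assumptions introduced only for this example.

```python
# Hypothetical policy options, each paired with made-up possible harms.
POSSIBLE_RISKS = {
    "regulate the technology": ["forgone health benefits", "economic loss"],
    "do not regulate": ["environmental damage"],
}


def strong_pp_permits(option):
    """Strong PP on Sunstein's reading: any possible risk rules an option out."""
    return len(POSSIBLE_RISKS[option]) == 0


# Collect every option the rule still allows.
permitted = [option for option in POSSIBLE_RISKS if strong_pp_permits(option)]
print(permitted)  # prints []
```

Because each option carries at least one possible risk, nothing is left to choose from. The sketch only reproduces the objection under its own assumptions; it says nothing about PP formulations that are triggered by severe harms alone.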

Peterson (2006) makes another argument that rejects PPs as incoherent. He claims that he can prove formally as well as informally that every serious PP formulation is logically inconsistent with reasonable conditions of rational choice, and should therefore be given up as a decision-rule (Peterson 2006, 597).

Rejoinder
Both criticisms have been rejected as being based on a skewed PP interpretation. Sunstein’s argument attacks a straw man. His critique of the “strong PP” as paralyzing relies on two assumptions which are not made explicit, namely (a) that a PP is invoked by any and all risks, and (b) that risks of action and inaction are typically equally balanced (Randall 2011, 20). However, this is an atypical PP interpretation. Most formulations make explicit reference to severe dangers, meaning that not just any possible harm, no matter how small, will invoke a PP. And, as the case studies in Harremoës and others (2001) illustrate, the possible harms from action and inaction (or, more precisely, regulation or no regulation) are typically not equally balanced (see also Steel 2014, Chapter 9). Still, Sunstein’s critique calls attention to the important point of risk-risk trade-offs, which every sound interpretation and application of a PP has to take into account: Taking precautions against a possible harm should not lead to an overall higher level of threat (Randall 2011, 84–85). Nevertheless, there seems to be no reason why a PP should not be able to take this into account, and the argument thus fails as a general rejection of PPs.

Similarly, it can be contested whether Peterson’s (2006) PP formalization is a plausible PP candidate: He presupposes that we can completely enumerate the list of possible outcomes, that we have rational preferences that allow for a complete ordering of the outcomes, and that we can estimate at least the relative likelihood of the outcomes. As Randall (2011, 86) points out, this is an ideal setup for ordinary risk management (ORM), and the three conditions for rational choice that Peterson cites, and with which he shows his PP to be inconsistent, have their place in the ORM framework. Thus, one can object that it is not very surprising if a PP, which aims especially at situations in which the ideal conditions are not met, does not do very well under the ideal conditions.

Objection: PPs are vacuous
On the other hand, it is argued that if a PP is attenuated in order not to be paralyzing, it becomes such a weak claim that it is essentially vacuous. Sunstein (2005b, 18) claims that weaker formulations of PPs are, although not incoherent, trivial: They merely state that lack of absolute scientific proof is no reason for inaction, which, according to Sunstein, has no normative force because everyone is already complying with it. Similarly, McKinnon (2009) takes a weak PP formulation to state that precautionary measures are permissible, which she also rejects as a hollow claim, stating that everyone could comply with it without ever taking any precautionary action.

Additionally, PPs are rejected as vacuous because of the multitude of formulations and interpretations. Turner and Hartzell (2004), examining different formulations of PPs, come to the conclusion that they are all beset with unclarity and ambiguities. They argue that there is no common core of the different interpretations, and that the plausibility of a PP actually rests on its vagueness. This makes it unsuitable as a guide for decision-making. Similarly, Peterson (2007b, 306) states that such a “weak” PP has no normative content and no implications for what ought to be done. He claims that in order to have normative content, a PP would need to give us a precise instruction for what to do for each input of information (Peterson 2007b, 306). By formulating a minimal normative PP interpretation and showing that it is incoherent, he argues that there cannot be a PP with normative content.

Rejoinder
Firstly, let us address the criticism that PPs are vacuous because they express a claim that is too weak to have any impact on decision-making. Against this, Steel (2013, 2014) has argued that even if these supposedly “weak” or “argumentative” principles do not directly recommend a specific decision, they nonetheless have an impact on the decision-making process if taken seriously. He interprets them as a meta-principle that puts constraints on what decision-rules should be used, namely, none that would lead to inaction in the face of uncertainty. As, for example, cost-benefit analysis needs numerical probabilities to be applicable, the Meta PP will recommend against it in situations where no such probability information is available. This is a substantial constraint, meaning that the Meta PP is not vacuous. One can reasonably doubt that Sunstein is right that everyone follows such an allegedly “weak” principle anyway. There are many historical cases where there was some positive evidence that an activity caused harm, but the fact that the activity-harm link had not been irrefutably proven was used to argue against regulatory action (Harremoës and others 2001, Gee and others 2013). Thus, in cases where no proof, or at least no reliable probability information, concerning the possibility of harm is available, uncertainty is often used as a reason to not take precautionary action. Additionally, this criticism clearly does not concern all forms of PPs, and only amounts to a full-fledged rejection of PPs if combined with the claim that so-called “stronger” PPs which are not trivial, will always be incoherent. And both Sunstein (2005b) and McKinnon (2009, 2012) do propose other PPs which express a stronger claim, albeit with a restricted scope (for example, only pertaining to catastrophic harm, or damage which entails specific kinds of injustice). 
This form of the “vacuous” objection can thus be seen not as an attack on the general idea of PPs, but more as the demand that the normative obligation they express should be made clear in order to avoid downplaying it.

Let us now consider the other form of the objection, namely the claim that PPs are essentially vague and that there cannot be a precise formulation of a PP that is both action-guiding and plausible. It is true that so far, there does not seem to exist a “one size fits all” PP that yields clear instructions for every input and that captures all the ideas commonly associated with PPs. However, even if this demand reflected a correct interpretation of what a “principle” is (which many authors deny, compare for example Randall 2011, 97), it is not the only one. Peterson (2007b) presumes that only a strict “if this, then that” rule can have normative force, and consequently be action-guiding. In contrast, other authors stress the difference between a principle and a rule (Fisher 2002; Arcuri 2007; Randall 2011). According to them, while rules specify precise consequences that follow automatically when certain conditions are met, principles express normative obligations that need to be specified according to different contexts, and that need to be implemented and operationalized in rules, laws, policies, and so on (Randall 2011, 97). When authors reject PPs as incoherent (see the objection above), they might sometimes make the same mistake, confusing a general principle that needs to be specified on a case-by-case basis with a stand-alone decision rule that should fit any and all cases.

As for PPs being essentially vague: This criticism seems to presuppose that in order to formulate a clarified PP, we have to capture and unify everything that is associated with it. However, explicating a concept in a way that clarifies it and captures as many of the ideas associated with it as possible does not mean that we have to preserve all of the ideas commonly associated with it. The same is true for explicating a principle such as a PP. Additionally, this article shows that many different ways of interpreting PPs in a precise way are possible, and not all of them exclude each other.

b. PPs are Redundant

Some authors reject PPs by arguing that they are just a narrow and complicated way of expressing what is already incorporated into established, more comprehensive approaches. For example, Bognar (2011) compares Gardiner’s (2006) “Rawlsian Core PP” (RCPP) interpretation with what he calls a “utilitarian principle”, which combines the principle of indifference with the principle of maximizing expected utility. He concludes that this “utilitarian principle” leads to the same results as the RCPP in the cases where the RCPP applies, but, unlike the RCPP, is not restricted to such a narrow range of cases. His conclusion is that we can dispose of PPs, at least in formulations of maximin (Bognar 2011, 345).
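A minimal sketch may help to show how the two rules compared here operate. It assumes a made-up utility table in which one option risks a catastrophic outcome; the figures, the table name UTILITIES, and the function names are hypothetical and are not taken from Bognar or Gardiner. The “utilitarian principle” is rendered as averaging utilities with equal weights for all outcomes, and the maximin core of the RCPP as a comparison of worst cases.

```python
# Hypothetical utilities: each option maps to its possible outcomes.
# The numbers are illustrative assumptions only.
UTILITIES = {
    "precaution": [-10, -10],      # a fixed cost in every outcome
    "no precaution": [5, -1000],   # small gain or catastrophic loss
}


def expected_utility_under_indifference(utilities):
    """Principle of indifference: weight every outcome equally, then average."""
    return sum(utilities) / len(utilities)


def utilitarian_choice(table):
    """Maximize expected utility computed with equal outcome weights."""
    return max(table, key=lambda o: expected_utility_under_indifference(table[o]))


def maximin_choice(table):
    """Choose the option whose worst outcome is least bad."""
    return max(table, key=lambda o: min(table[o]))


print(utilitarian_choice(UTILITIES))
print(maximin_choice(UTILITIES))
```

With the assumed catastrophic payoff of -1000, both rules select “precaution”. If the bad outcome of “no precaution” were set to a much milder value such as -20, averaging would favor “no precaution” while maximin would not, which may be one way to see why the claimed agreement concerns only the cases in which the RCPP applies.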

In the same vein, Peterson (2006, 600) asserts that if formulated in a consistent way, a PP would not be different from the “old” rules for risk-averse decision-making, while other authors have shown that we can use existing ordinary risk management (ORM) tools to implement a PP (Farrow 2004; Gollier, Moldovanu, and Ellingsen 2001). This allegedly would make PPs redundant (Randall 2011, 25; 87).

Rejoinder
Particularly against the criticism of Bognar (2011), one can counter that his “utilitarian principle” falls victim to the so-called “tuxedo fallacy” (Hansson 2008). Using the principle of indifference, that is, treating all outcomes as equally probable when one does not have enough information to assign reliable probabilities, can be seen as creating an “illusion of control”: It replaces missing probability information with the assumption of equal probabilities. It neither pays special attention to catastrophic harms nor takes the special challenges of decision-theoretic uncertainty adequately into account.

More generally, one can make the following point: Even though there might be plausible ways to translate a PP into the ORM framework and implement it using ORM tools, there is more to it than that. Even if we use ORM methods to implement precaution, in the end this might still be based on a normative obligation to enact precautionary measures. This obligation has to be spelled out, because ORM can allow for precaution, but does not demand it in itself (and, as a regulatory practice, tends to neglect it).

c. PPs are Irrational

The last line of criticism accuses PPs of being based on unfounded fears, expressing cognitive biases, and therefore leading to decisions with undesirable and overall harmful consequences.

Objection: Unfounded Panic
One criticism that is especially frequent in discussions aimed at a broader audience is that PPs lead to unrestrained regulation because they can be invoked by uncertain harm. Therefore, the argument goes, PPs carry the danger of unnecessary expenditures to reduce insignificant risks, lead to forgone benefits by regulating or prohibiting potentially beneficial activities, and are prone to being exploited, for example, by interest groups or for protectionism in international trade (Peterson 2006). A PP would stifle innovation, resulting in an overall less safe society: Many (risk-reducing) beneficial innovations of the past were only possible because risks had been taken (Zander 2010, 9), and technical innovation takes place in a process of trial-and-error, which would be seriously disturbed by a PP (Graham 2004, 5).

These critics see this as a consequence of PPs, because PPs do not require scientific certainty in order to take action, which they interpret as making merely speculative harm a reason for strict regulation. Thus, science would be marginalized or even rejected as a basis for decision-making, giving way to cognitive biases of ordinary people.

Objection: Cognitive biases
Sunstein claims that PPs are based on cognitive biases of ordinary people, who tend to systematically mis-assess risks (Sunstein 2005b, Chapter 4). By reducing the importance of scientific risk-assessment and marginalizing the role of experts, decisions resulting from the application of a PP will be influenced by these biases and result in negative consequences, the criticism goes.

Rejoinder
As has been pointed out by Randall (2011, 89), these criticisms seem to be misguided. Lower standards of evidence do not mean no standards at all. It is surely an important challenge for the implementation of a PP to find a way to define plausible possibilities, but this by no means requires less science. Instead, as Sandin, Bengtsson, and others (2004) point out, more, and different, scientific approaches are needed. Uncertainties need to be communicated more clearly and tools need to be developed that allow taking uncertainties better into account. For decisions where we lack scientific information, but great harms are possible, ways need to be found for how public concerns can be taken into consideration (Arcuri 2007, 35). This, however, seems to be more a question of implementation than of either the formulation or the justification of a PP.

5. References and Further Reading

Ahteensuu, Marko. 2008. “In Dubio Pro Natura? A Philosophical Analysis of the Precautionary Principle in Environmental and Health Risk Governance.” PhD thesis, Turku, Finland: University of Turku.
Aldred, Jonathan. 2013. “Justifying Precautionary Policies: Incommensurability and Uncertainty.” Ecological Economics 96 (December): 132–40.
Arcuri, Alessandra. 2007. “The Case for a Procedural Version of the Precautionary Principle Erring on the Side of Environmental Preservation.” SSRN Scholarly Paper ID 967779. Rochester, NY: Social Science Research Network.
Arrow, Kenneth J., and Anthony C. Fisher. 1974. “Environmental Preservation, Uncertainty, and Irreversibility.” The Quarterly Journal of Economics 88 (2): 312–19.
Bognar, Greg. 2011. “Can the Maximin Principle Serve as a Basis for Climate Change Policy?” Edited by Sherwood J. B. Sugden. Monist 94 (3): 329–48. https://doi.org/10.5840/monist201194317.
Buchak, Lara. 2013. Risk and Rationality. Oxford University Press.
Caney, Simon. 2009. “Climate Change and the Future: Discounting for Time, Wealth, and Risk.” Journal of Social Philosophy 40 (2): 163–86. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-9833.2009.01445.x/full.
Chisholm, Anthony Hewlings, and Harry R. Clarke. 1993. “Natural Resource Management and the Precautionary Principle.” In Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries, edited by Edward Dommen, 109–22.
Dommen, Edward (Ed.). 1993. Fair Principles for Sustainable Development: Essays on Environmental Policy and Developing Countries. Edward Elgar.
Farrow, Scott. 2004. “Using Risk Assessment, Benefit-Cost Analysis, and Real Options to Implement a Precautionary Principle.” Risk Analysis 24 (3): 727–35.
Fisher, Elizabeth. 2002. “Precaution, Precaution Everywhere: Developing a Common Understanding of the Precautionary Principle in the European Community.” Maastricht Journal of European and Comparative Law 9: 7.
Gardiner, Stephen M. 2006. “A Core Precautionary Principle.” Journal of Political Philosophy 14 (1): 33–60.
Gee, David, Philippe Grandjean, Steffen Foss Hansen, Sybille van den Hove, Malcolm MacGarvin, Jock Martin, Gitte Nielsen, David Quist, and David Stanners. 2013. Late Lessons from Early Warnings: Science, Precaution, Innovation. European Environment Agency.
Gollier, Christian, Benny Moldovanu, and Tore Ellingsen. 2001. “Should We Beware of the Precautionary Principle?” Economic Policy, 303–27.
Graham, John D. 2004. The Perils of the Precautionary Principle: Lessons from the American and European Experience. Vol. 818. Heritage Foundation.
Hansson, Sven Ove. 1997. “The Limits of Precaution.” Foundations of Science 2 (2): 293–306.
Hansson, Sven Ove. 2005a. Decision Theory: A Brief Introduction, Uppsala University class notes.
Hansson, Sven Ove. 2005b. “Seven Myths of Risk.” Risk Management 7 (2): 7–17.
Hansson, Sven Ove. 2008. “From the Casino to the Jungle.” Synthese 168 (3): 423–32. https://doi.org/10.1007/s11229-008-9444-1.
Hansson, Sven Ove. 2013. The Ethics of Risk: Ethical Analysis in an Uncertain World. Palgrave Macmillan.
Harremoës, Poul, David Gee, Malcolm MacGarvin, Andy Stirling, Jane Keys, Brian Wynne, and Sofia Guedes Vaz. 2001. Late Lessons from Early Warnings: The Precautionary Principle 1896-2000. Office for Official Publications of the European Communities.
Harris, John, and Søren Holm. 2002. “Extending Human Lifespan and the Precautionary Paradox.” Journal of Medicine and Philosophy 27 (3): 355–68.
Harsanyi, John C. 1975. “Can the Maximin Principle Serve as a Basis for Morality? A Critique of John Rawls’s Theory.” Edited by John Rawls. The American Political Science Review 69 (2): 594–606. https://doi.org/10.2307/1959090.
Hartzell-Nichols, Lauren. 2012. “Precaution and Solar Radiation Management.” Ethics, Policy & Environment 15 (2): 158–71. https://doi.org/10.1080/21550085.2012.685561.
Hartzell-Nichols, Lauren. 2013. “From ‘the’ Precautionary Principle to Precautionary Principles.” Ethics, Policy & Environment 16 (3): 308–20.
Hartzell-Nichols, Lauren. 2017. A Climate of Risk: Precautionary Principles, Catastrophes, and Climate Change. Taylor & Francis.
Hayenhjelm, Madeleine, and Jonathan Wolff. 2012. “The Moral Problem of Risk Impositions: A Survey of the Literature.” European Journal of Philosophy 20 (S1): E26–E51.
Jensen, Karsten K. 2002. “The Moral Foundation of the Precautionary Principle.” Journal of Agricultural and Environmental Ethics 15 (1): 39–55. https://doi.org/10.1023/A:1013818230213.
John, Stephen. 2007. “How to Take Deontological Concerns Seriously in Risk–Cost–Benefit Analysis: A Re-Interpretation of the Precautionary Principle.” Journal of Medical Ethics 33 (4): 221–24.
John, Stephen. 2010. “In Defence of Bad Science and Irrational Policies: An Alternative Account of the Precautionary Principle.” Ethical Theory and Moral Practice 13 (1): 3–18.
Jonas, Hans. 2003. Das Prinzip Verantwortung: Versuch Einer Ethik Für Die Technologische Zivilisation. 5th ed. Frankfurt am Main: Suhrkamp Verlag.
Kaiser, Matthias. 1997. “Fish-Farming and the Precautionary Principle: Context and Values in Environmental Science for Policy.” Foundations of Science 2 (2): 307–41.
Lemons, John, Kristin Shrader-Frechette, and Carl Cranor. 1997. “The Precautionary Principle: Scientific Uncertainty and Type I and Type II Errors.” Foundations of Science 2 (2): 207–36.
Lenman, James. 2008. “Contractualism and Risk Imposition.” Politics, Philosophy & Economics 7 (1): 99–122. https://doi.org/10/fqkwg3.
Manson, Neil A. 2002. “Formulating the Precautionary Principle.” Environmental Ethics 24 (3): 263–74.
McKinney, William J., and H. Hammer Hill. 2000. “Of Sustainability and Precaution: The Logical, Epistemological, and Moral Problems of the Precautionary Principle and Their Implications for Sustainable Development.” Ethics and the Environment 5 (1): 77–87.
McKinnon, Catriona. 2009. “Runaway Climate Change: A Justice-Based Case for Precautions.” Journal of Social Philosophy 40 (2): 187–203.
McKinnon, Catriona. 2012. Climate Change and Future Justice: Precaution, Compensation and Triage. Routledge.
Munthe, Christian. 2011. The Price of Precaution and the Ethics of Risk. Vol. 6. The International Library of Ethics, Law and Technology. Springer.
O’Riordan, Timothy, and Andrew Jordan. 1995. “The Precautionary Principle in Contemporary Environmental Politics.” Environmental Values 4 (3): 191–212.
Osimani, Barbara. 2013. “An Epistemic Analysis of the Precautionary Principle.” Dilemata: International Journal of Applied Ethics, 149–67.
Paterson, John. 2007. “Sustainable Development, Sustainable Decisions and the Precautionary Principle.” Natural Hazards 42 (3): 515–28. https://doi.org/10.1007/s11069-006-9071-4.
Peterson, Martin. 2003. “Transformative Decision Rules.” Erkenntnis 58 (1): 71–85.
Peterson, Martin. 2006. “The Precautionary Principle Is Incoherent.” Risk Analysis 26 (3): 595–601.
Peterson, Martin. 2007a. “Should the Precautionary Principle Guide Our Actions or Our Beliefs?” Journal of Medical Ethics 33 (1): 5–10. https://doi.org/10.1136/jme.2005.015495.
Peterson, Martin. 2007b. “The Precautionary Principle Should Not Be Used as a Basis for Decision‐making.” EMBO Reports 8 (4): 305–8. https://doi.org/10.1038/sj.embor.7400947.
Petrenko, Anton, and Dan McArthur. 2011. “High-Stakes Gambling with Unknown Outcomes: Justifying the Precautionary Principle.” Journal of Social Philosophy 42 (4): 346–62.
Randall, Alan. 2011. Risk and Precaution. Cambridge University Press.
Rawls, John. 2001. Justice as Fairness: A Restatement. Belknap, Harvard University Press.
Resnik, David B. 2003. “Is the Precautionary Principle Unscientific?” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 34 (2): 329–44.
Resnik, David B. 2004. “The Precautionary Principle and Medical Decision Making.” Journal of Medicine and Philosophy 29 (3): 281–99.
Sandin, Per. 1999. “Dimensions of the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 5 (5): 889–907.
Sandin, Per. 2004. “Better Safe Than Sorry: Applying Philosophical Methods to the Debate on Risk and the Precautionary Principle.” PhD thesis, Stockholm.
Sandin, Per. 2007. “Common-Sense Precaution and Varieties of the Precautionary Principle.” In Risk: Philosophical Perspectives, edited by Tim Lewens, 99–112. London; New York.
Sandin, Per. 2009. “A New Virtue-Based Understanding of the Precautionary Principle.” Ethics of Protocells: Moral and Social Implications of Creating Life in the Laboratory, 88–104.
Sandin, Per, Bengt-Erik Bengtsson, Ake Bergman, Ingvar Brandt, Lennart Dencker, Per Eriksson, Lars Förlin, and others. 2004. “Precautionary Defaults—a New Strategy for Chemical Risk Management.” Human and Ecological Risk Assessment 10 (1): 1–18.
Sandin, Per, and Sven Ove Hansson. 2002. “The Default Value Approach to the Precautionary Principle.” Human and Ecological Risk Assessment: An International Journal 8 (3): 463–71. https://doi.org/10.1080/10807030290879772.
Sandin, Per, Martin Peterson, Sven Ove Hansson, Christina Rudén, and André Juthe. 2002. “Five Charges Against the Precautionary Principle.” Journal of Risk Research 5 (4): 287–99.
Science & Environmental Health Network (SEHN). 1998. Wingspread Statement on the Precautionary Principle.
Steel, Daniel. 2011. “Extrapolation, Uncertainty Factors, and the Precautionary Principle.” Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences 42 (3): 356–64.
Steel, Daniel. 2013. “The Precautionary Principle and the Dilemma Objection.” Ethics, Policy & Environment 16 (3): 321–40.
Steel, Daniel. 2014. Philosophy and the Precautionary Principle. Cambridge University Press.
Steele, Katie. 2006. “The Precautionary Principle: A New Approach to Public Decision-Making?” Law, Probability and Risk 5 (1): 19–31. https://doi.org/10.1093/lpr/mgl010.
Suikkanen, Jussi. 2019. “Ex Ante and Ex Post Contractualism: A Synthesis.” The Journal of Ethics 23 (1): 77–98. https://doi.org/10/ggjn22.
Sunstein, Cass R. 2005a. “Irreversible and Catastrophic.” Cornell Law Review 91: 841–97.
Sunstein, Cass R. 2005b. Laws of Fear: Beyond the Precautionary Principle. Cambridge University Press.
Sunstein, Cass R. 2007. “The Catastrophic Harm Precautionary Principle.” Issues in Legal Scholarship 6 (3).
Sunstein, Cass R. 2009. Worst-Case Scenarios. Harvard University Press.
Thalos, Mariam. 2012. “Precaution Has Its Reasons.” In Topics in Contemporary Philosophy 9: The Environment, Philosophy, Science and Ethics, edited by W. Kabasenche, M. O’Rourke, and M. Slater, 171–84. Cambridge, MA: MIT Press.
Tickner, Joel A. 2001. “Precautionary Assessment: A Framework for Integrating Science, Uncertainty, and Preventive Public Policy.” In The Role of Precaution in Chemicals Policy, edited by Elisabeth Freytag, Thomas Jakl, G. Loibl, and M. Wittmann, 113–27. Diplomatische Akademie Wien.
Turner, Derek, and Lauren Hartzell. 2004. “The Lack of Clarity in the Precautionary Principle.” Environmental Values 13 (4): 449–60.
United Nations Conference on Environment and Development. 1992. Rio Declaration on Environment and Development.
Westra, Laura. 1997. “Post-Normal Science, the Precautionary Principle and the Ethics of Integrity.” Foundations of Science 2 (2): 237–62.
Whiteside, Kerry H. 2006. Precautionary Politics: Principle and Practice in Confronting Environmental Risk. Cambridge, MA: MIT Press.
Zander, Joakim. 2010. The Application of the Precautionary Principle in Practice: Comparative Dimensions. Cambridge: Cambridge University Press.

Research for this article was part of the project “Reflective Equilibrium – Reconception and Application” (Swiss National Science Foundation grant no. 150251).

Author Information

Tanja Rechnitzer
Email: tanja.rechnitzer@philo.unibe.ch
University of Bern
Switzerland

Conspiracy Theories

The term “conspiracy theory” refers to a theory or explanation that features a conspiracy among a group of agents as a central ingredient. Popular examples are the theory that the first moon landing was a hoax staged by NASA, or the theory that the 9/11 attacks on the World Trade Center were not (exclusively) conducted by al-Qaeda, but that the US government conspired to let these attacks succeed. Conspiracy theories have long been an element of popular culture; and cultural theorists, sociologists and psychologists have had things to say about conspiracy theories and the people who believe in them. This article focuses on the philosophy of conspiracy theories, that is, on what philosophers have had to say about conspiracy theories. Conspiracy theories meet philosophy when it comes to questions concerning epistemology, science, society and ethics.

After giving a brief history of philosophical thinking about conspiracy theories in section 1, this article considers in more detail the definition of the term “conspiracy theory” in section 2. As it turns out, the definition of the term has received a lot of attention in philosophy, mainly because the common usage of the term has negative connotations (as in, “It’s just a conspiracy theory!”), raising the question whether our definition should reflect these. As there is a great variety of conspiracy theories on offer, section 3 considers ways of classifying conspiracy theories into distinct types. Such a classification may be useful when it comes to identifying possible problems with a conspiracy theory.

The main part of this article, section 4, is devoted to the question when one should believe in a conspiracy theory. In general, the philosophical literature has been more positive about conspiracy theories than other fields, being careful not to dismiss such theories too easily. Hence, it becomes important to come up with criteria that one may use to evaluate a given conspiracy theory. Section 4 provides such a list of criteria, distilled from the philosophical literature.

Turning from questions about belief to questions about society, ethics and politics, section 5 addresses the societal effects of conspiracy theories that philosophers have identified, also asking to what extent these are positive or negative. Given these effects, the last question this article addresses, in section 6, is what, if anything, we should do about conspiracy theories. Answering this question does not, of course, depend on philosophical thinking alone. For this reason, section 7 briefly mentions some relevant work outside of philosophy.

History of Philosophizing about Conspiracy Theories
Problems of Definition
Types of Conspiracy Theories
Criteria for Believing in a Conspiracy Theory
Social and Political Effects of Conspiracy Theories
What to Do about Conspiracy Theories?
Related Disciplines
References and Further Reading

1. History of Philosophizing about Conspiracy Theories

Philosophical thinking about conspiracies can be traced back at least as far as Niccolo Machiavelli. Machiavelli discussed conspiracies in his best-known work, The Prince (for example in chapter 19), but more extensively in his Discourses on the First Ten Books of Titus Livius, where he devotes the whole sixth chapter of the third book to a discussion of conspiracies. Machiavelli’s aim in his discussion of conspiracies is to help the ruler guard against conspiracies directed against him. At the same time, he warns subjects not to engage in conspiracies, partly because he believes that conspiracies rarely achieve what the conspirators desire.

Where Machiavelli discussed conspiracies as a political reality, Karl Raimund Popper is the philosopher who put conspiracy theories on the philosophical agenda. The philosophical discussion of conspiracy theories begins with Popper’s dismissal of what he calls “the conspiracy theory of society” (Popper, 1966 and 1972). Popper sees the conspiracy theory of society as a mistaken approach to the explanation of social phenomena: It attempts to explain a social phenomenon by discovering people who have planned and conspired to bring the phenomenon about. While Popper thinks that conspiracies do occur, he thinks that few conspiracies are ultimately successful, since few things turn out exactly as intended. It is precisely the unforeseen consequences of intentional human action that social science should explain, according to Popper.

Popper’s comments on the conspiracy theory of society comprised only a few pages, and they did not trigger critical discussion until many years later. It was only in 1995 that Charles Pigden critically examined Popper’s views (Pigden, 1995). Besides Pigden’s critique of Popper, it was Brian Keeley (1999) and his attempt at defining what he called “unwarranted conspiracy theories” that started the philosophical literature on conspiracy theories. The question raised by Keeley’s paper is essentially the demarcation problem for conspiracy theories: Just as Popper’s demarcation problem was to separate science from pseudoscience, within the realm of conspiracy theories, the problem Keeley raised was to separate warranted from unwarranted conspiracy theories. However, Keeley concluded that the problem is a difficult one, admitting that the five criteria he proposed were not adequate for specifying when we are (un)warranted to believe in a conspiracy theory. This article returns to this problem in section 4.

After Popper’s work in the late 1960s and early 1970s, and Pigden’s and Keeley’s in the 1990s, philosophical work on conspiracy theories took off in the first decade of the 21st century. Particularly important in this development is the collection of essays by Coady (2006a), which made the philosophical debate about conspiracy theories visible both within philosophy and to a wider audience. Since the publication of this collection, philosophical thinking has continued to evolve, as evidenced by special issues of Episteme (volume 4, issue 2, 2007), Critical Review (volume 28, issue 1, 2016), and Argumenta (volume 3, issue 2, 2018).

Looking at the history of philosophizing about conspiracy theories, a useful distinction that has been applied to philosophers writing about conspiracy theories is the distinction between generalists and particularists (Buenting and Taylor, 2010). Following in the footsteps of Popper, generalists believe that conspiracy theories in general have an epistemic problem. For them, there is something about a theory being a conspiracy theory that should lower its credibility. It is this kind of generalism which underlies the popular dismissal, “It’s just a conspiracy theory.” Particularists like Pigden, on the other hand, argue that there is nothing problematic about conspiracy theories per se, but that each conspiracy theory needs to be evaluated on its own (de)merits.

2. Problems of Definition

The definition of the term “conspiracy theory” given at the beginning of this article is neutral in the sense that it does not imply that a conspiracy theory is wrong or unlikely to be true. In popular discourse, however, an epistemic deficit is often implied. Tracking this popular use, the Wikipedia entry on the topic (consulted 26 July 2019) defined a conspiracy theory as “an explanation of an event or situation that invokes a conspiracy by sinister and powerful actors, often political in motivation, when other explanations are more probable.”

We can order possible definitions of the term “conspiracy theory” in terms of logical strength. The definition given at the beginning of this article is minimal in this sense; it says that a conspiracy theory is a theory that involves a conspiracy. Slightly more elaborate, but still in line with this weak notion of conspiracy theory, Keeley (1999, p.116) sees a conspiracy theory as an explanation of an event by the causal agency of a small group of people acting in secret. What Keeley has added to the minimal definition is that the group of conspirators is small. Other additions that have been considered are that the group is powerful and/or that it has nefarious intentions. While these additions create a stronger notion of conspiracy theory, they all remain epistemically neutral; that is, they do not state that the explanation is unlikely or otherwise problematic. On the other end of the logical spectrum, definitions like the Wikipedia definition cited above are not only logically stronger than the minimal definition—the conspirators are powerful and sinister—but are also epistemically laden: A conspiracy theory is unlikely.

Within this spectrum of possibilities, philosophers have generally opted for a rather minimal definition that is epistemically neutral. As explicated by Dentith (2016, p.577), the central ingredients of a conspiracy are (a) a group of conspirators, (b) secrecy and (c) a shared goal. Similarly separating out the different ingredients of a conspiracy theory, Mandik (2007, p.206) states that conspiracy theories postulate “(1) explanations of (2) historical events in terms of (3) intentional states of multiple agents (the conspirators) who, among other things, (4) intended the historical events in question to occur and (5) keep their intentions and actions secret.” He sees these five conditions as necessary conditions for being a conspiracy theory, but he remains agnostic as to whether they are jointly sufficient.

A second approach to defining conspiracy theories has been proposed by Coady (2006b, p.2). He sees conspiracy theories as explanations that are opposed to the official explanation of an event at a given time. Coady points out that explanations that are conspiracy theories in this sense are usually also conspiracy theories in the sense discussed earlier, but not vice versa, since official theories can also refer to conspiracies, for example the official account of 9/11. Often, according to Coady, an explanation will be a conspiracy theory in both senses.

Which definition to adopt—strong or weak, epistemically neutral or not—is ultimately a question of what purpose the definition is to serve. No matter what definition one chooses, such a choice will have consequences. As an example, Watergate will not count as a conspiracy theory under the Wikipedia definition, but it will under the minimal definition given at the beginning of this article. Furthermore, this minimal definition of conspiracy theories will have as a consequence that an explanation of a surprise party will be considered a conspiracy theory. Hence, to be put to use, the minimal definition may need to be supplemented by an extra condition like nefariousness.

Finally, besides using the term “conspiracy theory,” some authors also use the term “conspiracism.” This latter term has been used in different ways in the literature. Pipes (1997) has used the term to indicate a particular paranoid style of thinking. Muirhead and Rosenblum (2019) have used it to describe an evolving phenomenon of political culture, distinguishing classic conspiracism from new conspiracism. While classic conspiracism involves the development of conspiracy theories as alternative explanations of phenomena, new conspiracism has shed the interest in explanation and theory building. Instead, it is satisfied with bare assertion or insinuation of a conspiracy and aims at political delegitimation and destabilization.

3. Types of Conspiracy Theories

Conspiracy theories come in great variety, and typologies can help to order this variety and to further guide research to a particular type of conspiracy theory that is particularly interesting or problematic. Räikkä (2009a, p.186 and 2009b, p.458-9) distinguishes political from non-political conspiracy theories. Räikkä mentions conspiracy theories about the death of Jim Morrison or Elvis Presley as examples of non-political conspiracy theories. He furthermore divides political conspiracy theories into local, global and total conspiracy theories depending on the scale of the event to be explained.

Huneman and Vorms (2018, p.251) provide further useful categories for distinguishing different types of conspiracy theories. They distinguish scientific from non-scientific conspiracy theories, depending on whether the theory deals with the domain of science, as the AIDS conspiracy theory does; ideological from neutral conspiracy theories, depending on whether a strong ideology, such as anti-Semitism, drives the theory; official from anti-institutional conspiracy theories, as exemplified by official versus unofficial conspiracy theories about 9/11; and alternative explanations from denials, that is, theories that provide a different explanation for an event versus theories that deny that the event took place.

A further way to distinguish conspiracy theories is by looking at what kind of theoretical object we are dealing with. In general, a conspiracy theory is an explanation of some event or phenomenon, but one can examine what kind of explanation it is. Some conspiracy theories may be full-blown theories, whereas others may not be theories in the scientific or philosophical sense. Clarke (2002 and 2007) thinks that some conspiracy theories are actually only proto-theories, not worked out sufficiently to count as theories, while others may be degenerating research programs in the sense defined by Imre Lakatos. There is more on the relationship between conspiracy theories and Lakatosian research programs in section 4, but here it is important to realize that while all conspiracy theories are explanations of some sort, certain conspiracy theories may be theories, others may be proto-theories or research programs.

4. Criteria for Believing in a Conspiracy Theory

A number of criteria have been offered, sometimes implicitly, in the philosophical literature to evaluate whether we should believe in a particular conspiracy theory, and these are surveyed below. Partly, such criteria will be familiar from scientific theory choice, but given that we are dealing with a specific type of theory, more can be said and more has been said. Due to the number of criteria, it is useful to group them into categories. There are different ways of grouping these criteria. The one adopted here tries to stay close to the labels and classifications common in the philosophy of science.

Although not explicitly stated, the dominant view in the philosophical literature from which the criteria below are taken is a realist view: Our (conspiracy) theories and beliefs should aim at the truth. Alternatively, one may propose an instrumentalist criterion which advocates a (conspiracy) theory or belief for its usefulness, for example in making predictions. Finally, while instrumentalism still has epistemic aims, we can also identify a more radical pragmatist view which focuses more generally on the consequences, for example political and social consequences, of holding a particular (conspiracy) theory or belief.

As mentioned, most of the criteria from the philosophical literature fit into the realist view. Within this view, we can distinguish three groups of criteria. First, we have criteria coming from the philosophy of science. These criteria have to do with the scientific methodology of theory choice, and here the question is how these play out when applied to conspiracy theories. Second, we have criteria dealing with motives. These can be the motives of the agents proposing a conspiracy theory, the motives of institutions relevant to the propagation of a conspiracy theory, or, finally, the motives of the agents the conspiracy theory is about. Third, there are a number of other criteria dealing neither with motives nor with scientific methodology. The picture arising from this way of organizing the criteria is presented in figure 1. The figure is not intended as a decision tree. Rather, it is more like an organized toolbox from which multiple tools may be chosen, depending, for example, on one’s philosophical commitments and one’s existing beliefs.

Figure 1

a. Criteria Concerning Scientific Methodology

i. Internal Faults (C1)

Basham (2001, p.275) advocates skepticism of a conspiracy theory if it suffers from what he calls “internal faults,” among which he lists “problems with self-consistency, explanatory gaps, appeals to unlikely or obviously weak motives and other unrealistic psychological states, poor technological claims, and the theory’s own incongruencies with observed facts it grants (including failed predictions).” Räikkä (2009a, p.196f) refers to a similar list of criteria. Basham thinks that this criterion, while seemingly straightforward, will already exclude many conspiracy theories. A historical example he mentions is the theory that identifies the antichrist of the biblical Book of Revelation with Adolf Hitler. According to Basham, the fact that Hitler is dead and the kingdom of God nowhere near shows that this theory has internal faults, presumably a big explanatory gap or a failed prediction.

Note that the list of things mentioned by Basham as internal faults is rather diverse, and one can debate whether all of these faults should really be considered internal to the theory. More narrowly, one could restrict internal faults to problems with self-consistency. Most of the other elements mentioned by Basham return below as separate criteria. For instance, an appeal to “unlikely or obviously weak motives” is discussed as C5.

ii. Progress: Is the Conspiracy Theory Part of a Progressive Research Program? (C2)

Clarke (2002; 2007) sees conspiracy theories as degenerating research programs in the sense developed by Lakatos (1970). In Clarke’s description of a degenerating research program, “successful novel predictions and retrodictions are not made. Instead, auxiliary hypotheses and initial conditions are successively modified in light of new evidence, to protect the original theory from apparent disconfirmation” (Clarke 2002, p.136). By contrast, a progressive research program would make successful novel predictions and retrodictions. Clarke cites the Watergate conspiracy theory as an example of a progressive research program: It led the journalists to make successful predictions and retrodictions about the behavior of those involved in the conspiracy. By contrast, Clarke uses the conspiracy theory about Elvis Presley’s fake funeral as an example of a degenerating research program (p.136-7), since it did not come up with novel predictions that were confirmed, for example, concerning the unusual behavior of Elvis’s relatives. Going further, Clarke (2007) also views other conspiracy theories—the controlled demolition theory of 9/11, for instance—as only proto-theories, something that is not sufficiently worked out to count as a theoretical core of a degenerating or progressive research program. Proto-theories are similar to what Muirhead and Rosenblum (2019) call new conspiracism.

Pigden (2006, footnote 17 and p.29) criticizes Clarke for not providing any evidence that conspiracy theories are in fact degenerating research programs and points to the many conspiracy theories accepted by historians as counterevidence. In any case, we might consider evaluating a given conspiracy theory by trying to see to what extent it is, or is part of, a progressive or a degenerating research program. Furthermore, as Lakatos’s notion of a research program comes with a hard core—the central characteristic claims not up for modification—and a protective belt—auxiliary hypotheses which can be changed—applying this notion also gives us tools to analyze a conspiracy theory in more detail. Such an analysis might yield, for example, that the problematic aspects of a conspiracy theory all concern its protective belt rather than its hard core.

iii. Inference to the Best Explanation: Evidence, Prior, Relative and Posterior Probability (C3)

Dentith (2016) views conspiracy theories as inferences to the best explanation. To judge such inferences using a Bayesian framework, we need to look at the prior probability of the conspiracy theory, the prior probability of the evidence and its likelihood given the conspiracy theory, thereby allowing us to calculate the posterior probability of the conspiracy theory. Furthermore, we need to look at the relative probability of the conspiracy theory when comparing it to competing hypotheses explaining the same event. Crucial in this calculation is our estimation of the prior probability of the conspiracy theory, which Dentith thinks we usually set too low (p.584) because we tend to underestimate how often conspiracies occur in history.
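
The Bayesian bookkeeping described above can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration: the function names posterior and posterior_odds are hypothetical, and the probabilities are made-up numbers, not estimates taken from Dentith or from any actual conspiracy theory. The sketch only shows how strongly the posterior depends on the prior one assigns to the conspiracy theory.

```python
def posterior(prior: float, likelihood: float, likelihood_if_false: float) -> float:
    """Posterior P(H | E) of a hypothesis H given evidence E, by Bayes' theorem.

    prior               -- P(H), prior probability of the conspiracy theory
    likelihood          -- P(E | H), probability of the evidence if H is true
    likelihood_if_false -- P(E | not H), probability of the evidence otherwise
    """
    # Prior probability of the evidence, P(E), by the law of total probability.
    evidence = likelihood * prior + likelihood_if_false * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the evidence has probability zero under these inputs")
    return likelihood * prior / evidence


def posterior_odds(prior_h1: float, prior_h2: float,
                   likelihood_h1: float, likelihood_h2: float) -> float:
    """Relative probability P(H1 | E) / P(H2 | E) of two competing hypotheses."""
    # P(E) cancels out, so only priors and likelihoods are needed.
    return (likelihood_h1 * prior_h1) / (likelihood_h2 * prior_h2)


# Illustrative, made-up numbers: the same evidence, two different priors.
low_prior = posterior(prior=0.01, likelihood=0.8, likelihood_if_false=0.1)
higher_prior = posterior(prior=0.10, likelihood=0.8, likelihood_if_false=0.1)
print(round(low_prior, 3), round(higher_prior, 3))  # 0.075 0.471
```

With these assumed numbers, raising the prior from 0.01 to 0.10 moves the posterior from roughly 0.075 to roughly 0.471 on identical evidence, which is why the estimate of the prior carries so much weight in the evaluation.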

There is some disagreement between authors about whether conspiracy theories may be selective in their choice of evidence. Hepfer (2015, p.78) warns against the selective acceptance of evidence which he calls selective coherentism (p.92), which for Hepfer explains, for example, the wealth of different conspiracy theories surrounding the assassination of John F. Kennedy. Dentith (2019, section 2), on the other hand, argues that scientific theories are also selective in their use of evidence, and that conspiracy theories are not different from other theories, such as scientific ones, in the way they use evidence. Dentith compares conspiracy theories about 9/11 to the work that historians usually do. In both cases, says Dentith, we see a selection of only part of the total evidence as salient.

Finally, Keeley (2003, p.106) considers whether lack of evidence for a conspiracy should count against a theory positing such a conspiracy. On the one hand, he points out that it is in general true that we should not confuse absence of evidence for a conspiracy with evidence of absence of a conspiracy. After all, since we are dealing with a conspiracy, we should expect that evidence will be hard to come by. This is also why falsifiability is in general not advocated as a criterion for evaluating conspiracy theories (see, e.g., Keeley 1999, p.121 and Basham 2003, p.93): In the case of conspiracy theories, something approaching unfalsifiability is a consequence of the theory. Nonetheless, Keeley (2003, p.106) thinks that if diligent efforts to find evidence for a conspiracy fail where similar efforts in other similar cases have succeeded, we are justified in lowering the credibility of the conspiracy theory.

iv. Errant Data (C4)

While the previous criterion already discussed how conspiracy theories relate to data, there is a particular kind of data that receives special attention both by conspiracy theorists and in the philosophical literature about conspiracy theories. Many conspiracy theories claim that they can explain “errant data” (Keeley, 1999, p.117), data which either contradicts the official theory or which the official theory leaves unexplained. According to Keeley (1999), conspiracy theories place great emphasis on errant data, an emphasis that also exists in periods of scientific innovation. However, Keeley argues that conspiracy theories are wrong to treat errant data by itself as a problem for a theory, since not all of the available data will in fact be true. Clarke (2002, p.139f) and Dentith (2019, section 3) are skeptical of Keeley’s argument: Clarke points out that the data labelled as “errant” will depend on the theory one adheres to, and Dentith thinks that conspiracy theories are no different from other theories in relation to such data.

Dentith (2014, p.129ff), following Coady (2006c), points out that any theory, official or unofficial, will have errant data. While advocates of a conspiracy theory will point to data problematic for the official theory which the conspiracy theory can explain, there will usually also be data problematic for the conspiracy theory which the official theory can explain. As an example of data errant with regard to the official theory, Dentith mentions that the official theory about the assassination of John F. Kennedy does not explain why some witnesses heard more gunshots than the three gunshots Oswald is supposed to have fired. As an example of data errant with regard to a conspiracy theory, Dentith points out that some of the conspiracy theories about 9/11 cannot explain why there is a video of Osama Bin Laden claiming responsibility for the attacks. When it comes to evaluating a specific conspiracy theory, the conclusion is that we should be looking at the errant data of both the conspiracy theory and alternative theories.

b. Criteria Concerning Motives

i. Cui Bono: Who Benefits from the Conspiracy? (C5)

Hepfer (2015, p.98ff) uses the assassination of John F. Kennedy in 1963 to illustrate how motives enter into our evaluation of conspiracy theories. While there seems to be widespread agreement that the assassin was in fact Lee Harvey Oswald, conspiracy theories doubt the official theory that he was acting on his own. There are a number of possible conspirators with plausible motives that may have been behind Oswald: The military-industrial complex, the American mafia, the Russian secret service, the United States Secret Service and Fidel Castro. Which of these conspiracy theories we should accept also depends on how plausible we find the ascribed motives given our other beliefs about the world.

According to Hepfer (2015, p.98 and section 2.3), a conspiracy theory should be (a) clear about the motives or goals of the conspirators and (b) rational in the means-ends sense of rationality; that is, if successful, the conspiracy should further the goals the conspirators are claimed to have. If the goals of the conspirators are not explicitly part of the theory, we should be able to infer these goals, and they should be reasonable. Problematic conspiracy theories are those where the motives or goals of the conspirators are unclear, the goals ascribed to the conspirators conflict with our other knowledge about the goals of these agents, or a successful conspiracy would not further the goals the theory itself ascribes to the conspirators.

ii. Individual Trust (C6)

Trust plays a role in two different ways when it comes to conspiracy theories. First, Räikkä (2009b, section 4) raises the question of whether we can trust the motives of the author(s) or proponents of a conspiracy theory. Some conspiracy theorists may not themselves believe the theory they propose, and instead may have other motives for proposing the theory; for example, to manipulate the political debate or make money. Other conspiracy theorists may genuinely believe the conspiracy theory they propose, but the fact that the alleged conspirators are the political enemy of the theory’s proponent may cast doubt on the likelihood of the theory. The general question here is whether the author or proponent of a conspiracy theory has a motive to lie or mislead. Here, Räikkä uses as an example the conspiracy theory about global warming (p.462). If a person working for the fossil-fuel industry claims that there is a global conspiracy propagating the idea of global warming, the financial motive is clear. Conversely, people who reject a particular theory as “just” a conspiracy theory may also have a motive to mislead. As an example, Pigden discusses the case of Tony Blair, who labeled the idea that the Iraq war was fought for oil a mere conspiracy theory.

A second way in which trust enters into the analysis of conspiracy theories is in terms of epistemic authority. Many conspiracy theories refer to various authorities for the justification of certain claims. For instance, a 9/11 conspiracy theory may refer to a structural engineer who made a certain claim regarding the collapse of the World Trade Center. The question arises as to what extent we should trust claims of alleged epistemic authorities, that is, people who have relevant expertise in a particular domain. Levy (2007) takes a radically socialized view of knowledge: Since knowledge can only be produced by a complex network of inquiry in which the relevant epistemic authorities are embedded, a conspiracy theory conflicting with the official story coming out of this network is “prima facie unwarranted” (p.182, italics in the original). According to Levy, the best epistemic strategy is simply to “adjust one’s degree of belief in an explanation of an event or process to the degree to which the epistemic authorities accept that explanation” (p.190). Dentith (2018) criticizes Levy’s trust in epistemic authority. First, Dentith argues that since conspiracy theories cross disciplinary boundaries, there is no obvious group of experts when it comes to evaluating a conspiracy theory, since a conspiracy theory will usually involve claims connecting various disciplines. Furthermore, Dentith points out that the fact that a theory has authority in the sense of being official does not necessarily mean that it has epistemic authority, a point Levy also makes. Related to our first point about trust, Dentith also points out that epistemic authorities might have a motive to mislead, for example, when the funding sources might have influenced research. Finally, our trust in epistemic authority will also depend on the trust we place in the institutions accrediting expertise, and hence questions of individual trustworthiness relate to questions of institutional trustworthiness.

iii. Institutional Trust (C7)

As mentioned when discussing individual trust, when we want to assess the credibility of experts, part of that credibility judgment will depend on the extent to which we trust the institution accrediting the expertise, assuming there is such an institution to which the expert is linked. The question of institutional trust is relevant more generally when it comes to conspiracy theories, and this issue has been discussed at length in the philosophical literature on conspiracy theories.

The starting point of the discussion of institutional trust is Keeley (1999, p.121ff), who argues that the problem with conspiracy theories is that these theories cast doubt on precisely those institutions which are the guarantors of reliable data. If a conspiracy theory contradicts an official theory based on scientific expertise, this produces skepticism not only with regard to the institution of science, but may also produce skepticism with regard to other public institutions, for example the press, which accepts the official story instead of uncovering the conspiracy, and the parliament and the government, which produce or propagate the official story in the first place. Thus, the claim is that believing in a conspiracy theory implies a quite widespread distrust of our public institutions. If this implication is true, it can be used in two ways: Either to discredit the conspiracy theory, which is the route Keeley advocates, or to discredit our public institutions. In any case, our trust in our public institutions will influence the extent to which we hold a particular conspiracy theory to be likely. For this reason, both Keeley (1999, p.121ff) and Coady (2006a, p.10) think that conspiracy theories are more trustworthy in non-democratic societies.

Basham (2001, p.270ff) argues that it would be a mistake to simply assume our public institutions to be trustworthy and dismiss conspiracy theories. His position is one he calls “studied agnosticism” (p.275): In general, we are not in a position to decide for or against a conspiracy theory, except—and this is where the “studied” comes in—where a conspiracy theory can be dismissed due to internal faults (see C1). In fact, we are caught in a vicious circle: “We cannot help but assume an answer to the essential issue of how conspirational our society is in order to derive a well justified position on it” (p.274). Put differently, while an open society provides fewer grounds for believing in conspiracy theories, we cannot really know how open our society actually is (Basham 2003, p.99). In any case, an individual who tries to assess a particular conspiracy theory should thus also consider to what extent they trust or distrust our public institutions.

Clarke (2002, p.139ff) questions Keeley’s link between belief in conspiracy theories and general distrust in our public institutions. He claims that conspiracy theories actually do not require general institutional skepticism. Instead, in order to believe in a conspiracy theory, it will usually suffice to confine one’s skepticism to particular people and issues. Räikkä (2009a) also criticizes Keeley’s supposed link between conspiracy theories and institutional distrust, claiming that most conspiracy theories do not entail such pervasive institutional distrust, but that if such pervasive distrust were entailed by a conspiracy theory, it would lower the conspiracy theory’s credibility. A global conspiracy theory like the Flat Earth theory tends to involve more pervasive institutional distrust, since it involves multiple institutions from various societal domains, than a local conspiracy theory like the Watergate conspiracy. According to Clarke, even the latter does not have to engender institutional distrust with regard to the United States government as an institution, since distrust could remain limited to specific agents within the government.

c. Other Realist Criteria

i. Fundamental Attribution Error (C8)

Starting with Clarke (2002; see also his response to criticism in 2006), philosophers have discussed whether conspiracy theories commit the fundamental attribution error (FAE). In psychology, the fundamental attribution error refers to the human tendency to overestimate dispositional factors and underestimate situational factors in explaining the behavior of others. Clarke (p.143ff) claims that conspiracy theories commit this error: They tend to be dispositional explanations whereas official theories often are more situational explanations. As an example, Clarke considers the funeral of Elvis Presley. The official account is situational since it explains the funeral in terms of his death due to heart problems. On the other hand, the conspiracy theory which claims Elvis is still alive and staged his funeral is dispositional since it sees Elvis and his possible co-conspirators as having the intention to deceive the public.

Dentith (2016, p.580) questions whether conspiracy theories are generally more dispositional than other theories. Moreover, as in the case of 9/11, the official theory may itself be dispositional. Pigden (2006, footnotes 27 and 30, and p.29) is critical of the psychological literature about the FAE, claiming that “if we often act differently because of different dispositions, then the fundamental attribution error is not an error” (footnote 30). Pigden is also critical of Clarke’s application of the FAE to conspiracy theories: Given that conspiracies are common, what Pigden calls “situationism” is either false or it does not imply that conspiracies are unlikely. Hence, Pigden concludes, the FAE has no relevant implications for our thinking about conspiracy theories. Coady (2003) also doubts the existence of the FAE. Furthermore, he claims that belief in the FAE is paradoxical in that it commits the FAE: Believing that people think dispositionally rather than situationally is itself dispositional thinking.

ii. Ontology: Existence Claims the Conspiracy Theory Makes (C9)

Some conspiracy theories claim the existence or non-existence of certain entities. Among the examples Hepfer (2015, p.45) cites is a theory by Heribert Illig that claims that the years between 614 and 911 never actually happened. Another example would be a theory claiming the existence of a perpetual motion machine that is kept secret. Both existence claims go against the scientific consensus of what exists and what does not. Hepfer (2015, p.42) claims that the more unusual a conspiracy theory’s existence claims are, the more we should doubt its truth. This is because of the ontological baggage (p.49) that comes with such existence claims: Accepting these claims will force us to revise a major part of our hitherto accepted knowledge, and the more substantial the revision needed, the more we should be suspicious of such a theory.

iii. Übermensch: Does the Conspiracy Theory Ascribe Superhuman Qualities to Conspirators? (C10)

Hepfer (2015, p.104) and Räikkä (2009a, p.197) note that some conspiracy theories ascribe superhuman qualities to the conspirators that border on divine attributes like omnipotence and omniscience. Examples here might be the idea that Freemasons, Jews or George Soros control the world economy or the world’s governments. Sometimes the superhuman qualities ascribed to conspirators are moral and negative, that is, conspirators are demonized (Hepfer, 2015, p.131f). The antichrist has not only been seen in Adolf Hitler but also in the pope. In general, the more extraordinary the qualities ascribed to the conspirators, the more they should lower the credibility of the conspiracy theory.

iv. Scale: The Size and Duration of the Conspiracy (C11)

The general claim here is that the more agents that are supposed to be involved in a conspiracy (its size) and the longer the conspiracy is supposed to have been in existence (its duration), the less likely the conspiracy theory is to be true. Hepfer (2015, p.97) makes this point, and similarly Keeley (1999, p.122) says that the more institutions are supposed to be involved in a conspiracy, the less believable the theory should become. To some extent, this point is simply a matter of logic: The claim that A and B are involved in a conspiracy cannot be more likely than the claim that A is involved in a conspiracy. Similarly, the claim that a conspiracy has been going on for at least 20 years cannot be more likely than the claim that it has been going on for at least 10 years. In this sense, conspiracy theories involving many agents over a long period of time will tend to be less likely than conspiracy theories involving fewer agents over a shorter period of time. Furthermore, Grimes (2016) has conducted simulations showing that large conspiracies with 1000 agents or more are unlikely to succeed due to problems with maintaining secrecy.
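
The logical point about scale can be made concrete with a small numerical sketch. The Python snippet below is a toy model and not the model used by Grimes (2016): it assumes, purely for illustration, that each conspirator independently leaks the secret with the same fixed probability per year, so that the chance of the secret surviving is (1 - p) raised to the number of agent-years. The function name secrecy_probability and the chosen value of p are hypothetical.

```python
# Toy model (an illustrative assumption, not Grimes's published model):
# every conspirator independently leaks with the same probability each year.

def secrecy_probability(agents: int, years: int, leak_prob_per_agent_year: float) -> float:
    """Return the probability that no agent leaks during the whole period."""
    if agents < 0 or years < 0:
        raise ValueError("agents and years must be non-negative")
    if not 0.0 <= leak_prob_per_agent_year <= 1.0:
        raise ValueError("leak probability must lie in [0, 1]")
    # Secrecy survives only if every agent stays silent in every year.
    return (1.0 - leak_prob_per_agent_year) ** (agents * years)


if __name__ == "__main__":
    p = 1e-4  # hypothetical leak probability per agent per year
    for agents, years in [(10, 10), (1000, 10), (1000, 20)]:
        print(agents, years, secrecy_probability(agents, years, p))
```

Under these assumptions, adding agents or years can only lower the probability that the conspiracy stays secret, which mirrors the conjunction argument above. The assumptions themselves are contestable, for instance the assumption that all agents are equally likely to leak.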

Basham (2001, p.272; 2003, p.93) takes an opposing view by referring to social hierarchies and mechanisms of control, saying that “the more fully developed and high placed a conspiracy is, the more experienced and able are its practitioners at controlling information and either co-opting, discrediting, or eliminating those who go astray or otherwise encounter the truth” (Basham 2001, p.272). Dentith (2019, section 7) also counters the scale argument by pointing out that any time an institution is involved in a conspiracy, only very few people in that institution are actually involved in the conspiracy. This reduces the total number of conspirators and calls into question the relevance of Grimes’s results, of which Dentith is very critical.

d. Non-Realist Criteria

i. Instrumentalism: Conspiracy Theories as “as if” Theories (C12)

Grewal (2016) has shown how the philosophical opposition between scientific realism and various kinds of anti-realism also shows up in how we evaluate conspiracy theories. While most authors implicitly seem to interpret the claims of conspiracy theories along the lines of realism, Grewal has suggested that adherents of conspiracy theories may interpret or at least use these theories instrumentally. Viewed this way, conspiracy theories are “as-if”-theories which allow their adherents to make sense of a world that is causally opaque in a way that may often yield quite adequate predictions. “An assumption that the government operated as if it were controlled by a parallel and secret government may fit the historical data…while also providing better predictions than would, say, an exercise motivated by an analysis of constitutional authority or the statutory limitations to executive power” (p.36). As a more concrete example, Grewal mentions that “the most parsimonious way to understand financial decision making in the Eurozone might be to treat it as if it were run by and for the benefit of the Continent’s richest private banks” (p.37). Hence, our evaluation of a given conspiracy theory will also depend on basic philosophical commitments like what we expect our theories to do for us.

ii. Pragmatism (C13)

The previous arguments have mostly been epistemic or epistemological arguments, arguments that bear on the likelihood of a conspiracy theory being true or at least epistemically useful. However, as with Blaise Pascal’s pragmatic argument for belief in God (Pascal, 1995), some arguments concerning conspiracy theories that have nothing to do with their epistemic value can be reinterpreted pragmatically as arguments about belief: Pragmatically, our belief or disbelief should depend on the consequences the (dis)belief has for us personally or for society more generally.

Basham (2001) claims that epistemic rejection of conspiracy theories will often not work, and we have to be agnostic about their truth. Still, we should reject them for pragmatic reasons because “[t]here is nothing you can do,” given the impossibility of finding out the truth, and “[t]he futile pursuit of malevolent conspiracy theory sours and distracts us from what is good and valuable in life” (p.277). Similarly, Räikkä (2009a) says that “a person who strives for happiness in her personal life should not ponder on vicious conspiracies too much” (p.199). Then again, contrary to Basham’s claim, what you can do with regard to conspiracy theories will depend on your role. As a journalist, you may decide to investigate certain claims, and Räikkä (2009a, p.199f) thinks that “it is important that in every country there are some people who are interested in investigative journalism and political conspiracy theorizing.”

Like journalists, politicians play a special role when it comes to conspiracy theories. Muirhead and Rosenblum (2016) argue that politicians should oppose conspiracy theories when they (1) are fueled by hatred, (2) present political opposition as treason and illegitimate, or (3) undermine epistemic or expert authority generally. Similarly, Räikkä (2018, p.213) argues that we must interfere with conspiracy theories when they include libels or hate speech. The presumed negative consequences of such conspiracy theories would be pragmatic reasons for disbelief.

Räikkä (2009b) lists both positive and negative effects of conspiracy theorizing, and we may apply these to concrete conspiracy theories to see which ones to believe in. The two positive effects he mentions are (a) that “the information gathering activities of conspiracy theorists and investigative journalists force governments and government agencies to watch out for their decisions and practices” (p.460) and (b) that conspiracy theories help to maintain openness in society. As negative effects, he mentions that a conspiracy theory “tends to undermine trust in democratic political institutions and its implications may be morally questionable, as it has close connections to populist discourse, as well as anti-Semitism and racism” (p.461). When a conspiracy theory blames certain people, Räikkä points out that there are moral costs for the people blamed. Furthermore, he thinks that the moral costs will depend on whether the people blamed are private individuals or public figures (p.463f).

5. Social and Political Effects of Conspiracy Theories

Räikkä (2009b, section 3) and Moore (2016, p.5) survey some of the social and political effects of conspiracy theories and conspiracy theorizing. One may look at the positive and negative effects of conspiracy theorizing in general, but it is also useful to consider the effects of a specific conspiracy theory, by looking at which effects mentioned below are likely to obtain for the conspiracy theory in question. Such an evaluation is related to the pragmatist evaluation criterion C13 just discussed, so some of the points mentioned there are revisited in what follows. Also, the effects of a conspiracy theory may be related to the type of conspiracy theory we are dealing with; see section 3 of this article.

On the positive side, conspiracy theories may be tools to uncover actual conspiracies, with the Watergate scandal as the standard example. When these conspiracies take place in our public institutions, conspiracy theories can thereby also help us to keep these institutions in check and to uncover institutional problems. Conspiracy theories can help us to remain critical of those holding power in politics, science and the media. One of the ways they can achieve this is by forcing these institutions to be more transparent. Since conspiracy theories claim the secret activity of certain agents, transparent decision making, open lines of communication and the public availability of documents are possible responses to conspiracy theories which can improve a democratic society, independent of whether they suffice to convince those believing conspiracy theories. We may call this the paradoxical effect of conspiracy theories: Conspiracy theories can help create or maintain the open society whose existence they deny.

Turning from positive to possible negative effects of conspiracy theories, a central point that already came up when discussing criterion C7 is institutional trust. Conspiracy theories can contribute to eroding trust in the institutions of politics, science and the media. The anti-vaccination conspiracy theory which claims that politicians and the pharmaceutical industry are hiding the ineffectiveness or even harmfulness of vaccines is an example of a conspiracy theory which can undermine public trust in science. Huneman and Vorms (2018) discuss how at times it can be difficult to draw the line between rational criticism of science and unwarranted skepticism. One fear is that eroding trust in institutions leads us via unwarranted skepticism to an all-out relativism or nihilism, a post-truth world where it suffices that a claim is repeated by a lot of people to make it acceptable (Muirhead and Rosenblum, 2019). Conspiracy theories have also been linked to increasing polarization, populism and racism (see Moore, 2016). Finally, as alluded to in section 1, Popper’s dislike of conspiracy theories also stemmed from his view that they create wrong ideas about the root causes of social events. By seeing social events as being caused by powerful people acting in secret, rather than as effects of structural social conditions, conspiracy theories arguably undermine effective political action and social change.

Bjerg and Presskorn-Thygesen (2017) have claimed that conspiracy theories cause a state of exception in the sense introduced by Giorgio Agamben. Just as terrorism undermines democracy in such a way that it licenses a state of political exception justifying undemocratic measures, a conspiracy theory undermines rational discourse in such a way that it licenses a state of epistemic exception justifying irrational measures. Those measures consist in placing conspiracy theories outside of official public discourse, labeling them as irrational, as “just” conspiracy theories, and as not worthy of serious critical consideration and scrutiny. Seen in this way, conspiracy theories appear as a form of epistemic terrorism, through their erosion of trust in our knowledge-producing institutions.

6. What to Do about Conspiracy Theories?

Besides deciding to believe or not to believe in a conspiracy theory (section 4), there are other actions one may consider with regard to conspiracy theories. Philosophical discussion has mainly focused on what actions governments and politicians can or should take.

The seminal article concerning the question of government action is by Sunstein and Vermeule (2009). Besides describing different psychological and social mechanisms underlying belief in conspiracy theories, they consider a number of policy and legal responses a government might take when it comes to false and harmful conspiracy theories: banning conspiracy theories, taxing the dissemination of conspiracy theories, counterspeech and cognitive infiltration of groups producing conspiracy theories. While dismissing the first two options, Sunstein and Vermeule consider counterspeech and cognitive infiltration in more detail. First, the government may itself speak out against a conspiracy theory by providing its own account. However, Sunstein and Vermeule think that such official counterspeech will have only limited success, in particular when it comes to conspiracy theories involving the government. Alternatively, the government may try to involve private parties to infiltrate online fora and discussion groups associated with conspiracy theories in order to introduce cognitive diversity, breaking up one-sided discussion and introducing non-conspirational views.

The proposals by Sunstein and Vermeule have led to strong opposition, most explicitly by Coady (2018). He points out that Sunstein and Vermeule too easily assume good intentions on the part of the government. Furthermore, these policy proposals, coming from academics who have also been involved in governmental policy making, will only confirm the fears of the conspiracy theorists that the government is involved in conspirational activities. If the cognitive infiltration proposed by Sunstein and Vermeule were discovered, conspiracy theorists would be led to believe in conspiracy theories even more. Put differently, we are running the risk of a pragmatic inconsistency: The government would try to deceive, via covert cognitive infiltration, a certain part of the population to make it believe that it does not deceive, that it is not involved in conspiracies.

As mentioned when discussing evaluation criterion C13 in section 4, Muirhead and Rosenblum (2016) consider three kinds of conspiracy theories that should give politicians cause for official opposition. These are conspiracy theories that fuel hatred, equate political opposition with treason, or that express a general distrust of expertise. In these cases, politicians are called to speak truth to conspiracy, even though this might create a divide between them and their electorate. Muirhead and Rosenblum (2019) also consider what to do against new conspiracism (see the end of section 2). They note that such conspiracism is rampant in our society despite ever more transparency. As a counter measure, they not only advocate speaking truth to conspiracy, but also what they call “democratic enactment,” by which they mean “a strenuous adherence to the regular processes and forms of public decision-making” (p.175).

Both Sunstein and Vermeule, as well as Muirhead and Rosenblum, agree that what we should do about conspiracy theories will depend on the theory we are dealing with. They do not advocate action against all theories about groups acting in secret to achieve some aim. However, when a theory is of a particularly problematic kind—false and harmful, fueling hatred, and so forth—political action may be needed.

7. Related Disciplines

Philosophy is not the only discipline dealing with conspiracy theories, and in particular when it comes to discussing what to do about conspiracy theories, research from other fields is important. We have already seen some ways in which philosophical thinking about conspiracy theories touches on other disciplines, in particular in the previous section’s discussion of political science and law. As for other related fields, psychologists have done a lot of research about conspirational thinking and the psychological characteristics of people who believe in conspiracy theories. Historians have presented histories of conspiracy theories in the United States, the Arab world and elsewhere. Sociologists have studied how conspiracy theories can target racial minorities, as well as the structure and group dynamics of specific conspirational milieus. Uscinski (2018) covers many of the relevant disciplines which this article does not cover and also includes an interdisciplinary history of conspiracy theory research.

8. References and Further Reading

To get an overview of the philosophical thinking about conspiracy theories, the best works to start with are Dentith (2014), Coady (2006a) and Uscinski (2018).

Basham, L. (2001). “Living with the Conspiracy”, The Philosophical Forum, vol. 32, no. 3, p.265-280.
Basham, L. (2003). “Malevolent Global Conspiracy”, Journal of Social Philosophy, vol. 34, no. 1, p.91-103.
Bjerg, O. and T. Presskorn-Thygesen (2017). “Conspiracy Theory: Truth Claim or Language Game?”, Theory, Culture and Society, vol. 34, no. 1, p.137-159.
Buenting, J. and J. Taylor (2010). “Conspiracy Theories and Fortuitous Data”, Philosophy of the Social Sciences, vol. 40, no. 4, p.567-578.
Clarke, St. (2002). “Conspiracy Theories and Conspiracy Theorizing”, Philosophy of the Social Sciences, vol. 32, no. 2, p.131-150.
Clarke, St. (2006). “Appealing to the Fundamental Attribution Error: Was it All a Big Mistake?”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.129-132.
Clarke, St. (2007). “Conspiracy Theories and the Internet: Controlled Demolition and Arrested Development”, Episteme, vol. 4, no. 2, p.167-180.
Coady, D. (2003). “Conspiracy Theories and Official Stories”, International Journal of Applied Philosophy, vol. 17, no. 2, p.197-209.
Coady, D., ed. (2006a). Conspiracy Theories: The Philosophical Debate. Ashgate.
Coady, D. (2006b). “An Introduction to the Philosophical Debate about Conspiracy Theories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.1-11.
Coady, D. (2006c). “Conspiracy Theories and Official Stories”, in Conspiracy Theories: The Philosophical Debate. Edited by David Coady. Ashgate, p.115-128.
Coady, D. (2018). “Cass Sunstein and Adrian Vermeule on Conspiracy Theories”, Argumenta, vol. 3, no. 2, p.291-302.
Dentith, M. (2014). The Philosophy of Conspiracy Theories. Palgrave Macmillan.
Dentith, M. (2016). “When Inferring to a Conspiracy might be the Best Explanation”, Social Epistemology, vol. 30, nos. 5-6, p.572-591.
Dentith, M. (2018). “Expertise and Conspiracy Theories”, Social Epistemology, vol. 32, no. 3, p.196-208.
Dentith, M. (2019). “Conspiracy theories on the basis of the evidence”, Synthese, vol. 196, no. 6, p.2243-2261.
Grewal, D. (2016). “Conspiracy Theories in a Networked World”, Critical Review, vol. 28, no. 1, p.24-43.
Grimes, D. (2016). “On the Viability of Conspiratorial Beliefs”, PLoS ONE, vol. 11, no. 1.
Hepfer, K. (2015). Verschwörungstheorien: Eine philosophische Kritik der Unvernunft. Transcript Verlag.
Huneman, Ph. and M. Vorms (2018). “Is a Unified Account of Conspiracy Theories Possible?”, Argumenta, vol. 3, no. 2, p.247-270.
Keeley, B. (1999). “Of Conspiracy Theories”, The Journal of Philosophy, vol. 96, no. 3, p.109-126.
Keeley, B. (2003). “Nobody Expects the Spanish Inquisition! More Thoughts on Conspiracy Theory”, Journal of Social Philosophy, vol. 34, no. 1, p.104-110.
Lakatos, I. (1970). “Falsification and the Methodology of Scientific Research Programmes”, in I. Lakatos and A. Musgrave, editors, Criticism and the Growth of Knowledge. Cambridge University Press, p.91-196.
Levy, N. (2007). “Radically Socialized Knowledge and Conspiracy Theories”, Episteme, vol. 4, no. 2, p.181-192.
Mandik, P. (2007). “Shit Happens”, Episteme, vol. 4, no. 2, p.205-218.
Moore, A. (2016). “Conspiracy and Conspiracy Theories in Democratic Politics”, Critical Review, vol. 28, no. 1, p.1-23.
Muirhead, R. and N. Rosenblum (2016). “Speaking Truth to Conspiracy: Partisanship and Trust”, Critical Review, vol. 28, no. 1, p.63-88.
Muirhead, R. and N. Rosenblum (2019). A Lot of People are Saying: The New Conspiracism and the Assault on Democracy. Princeton University Press.
Pascal, B. (1995). Pensées and Other Writings, H. Levi (trans.). Oxford University Press.
Pigden, Ch. (1995). “Popper Revisited, or What Is Wrong With Conspiracy Theories?”, Philosophy of the Social Sciences, vol. 25, no. 1, p.3-34.
Pigden, Ch. (2006). “Complots of Mischief”, in David Coady (ed.), Conspiracy Theories: The Philosophical Debate. Ashgate, p.139-166.
Pipes, D. (1997). Conspiracy: How the Paranoid Style Flourishes and Where It Comes From. Free Press.
Popper, K.R. (1966). The Open Society and Its Enemies, vol. 2: The High Tide of Prophecy, 5th edition, Routledge and Kegan Paul.
Popper, K.R. (1972). Conjectures and Refutations. 4th edition, Routledge and Kegan Paul.
Räikkä, J. (2009a). “On Political Conspiracy Theories”, Journal of Political Philosophy, vol. 17, no. 2, p.185-201.
Räikkä, J. (2009b). “The Ethics of Conspiracy Theorizing”, Journal of Value Inquiry, vol. 43, p.457-468.
Räikkä, J. (2018). “Conspiracies and Conspiracy Theories: An Introduction”, Argumenta, vol. 3, no. 2, p.205-216.
Sunstein, C. and A. Vermeule (2009). “Conspiracy Theories: Causes and Cures”, Journal of Political Philosophy, vol. 17, no. 2, p.202-227.
Uscinski, J.E., editor (2018). Conspiracy Theories and the People Who Believe Them. Oxford University Press.

Author Information

Marc Pauly
Email: m.pauly@rug.nl
University of Groningen
The Netherlands

René Descartes: Ethics

This article describes the main topics of Descartes’ ethics through discussion of key primary texts and corresponding interpretations in the secondary literature. Although Descartes never wrote a treatise dedicated solely to ethics, commentators have uncovered an array of texts that demonstrate a rich analysis of virtue, the good, happiness, moral judgment, the passions, and the systematic relationship between ethics and the rest of philosophy. The following ethical claims are often attributed to Descartes: the supreme good consists in virtue, which is a firm and constant resolution to use the will well; virtue presupposes knowledge of metaphysics and natural philosophy; happiness is the supreme contentment of mind which results from exercising virtue; the virtue of generosity is the key to all the virtues and a general remedy for regulating the passions; and virtue can be secured even though our first-order moral judgments never amount to knowledge.

Descartes’ ethics was a neglected aspect of his philosophical system until the late 20th century. Since then, standard interpretations of Descartes’ ethics have emerged, debates have ensued, and commentators have carved out key interpretive questions that anyone must answer in trying to understand Descartes’ ethics. For example: what kind of normative ethics does Descartes espouse? Are the passions representational or merely motivational states? At what point in the progress of knowledge can the moral agent acquire and exercise virtue? Is Descartes’ ethics as systematic as he sometimes seems to envision?

1. Methodology
2. The Provisional Morality
3. Cartesian Virtue
   a. The Unity of the Virtues
   b. Virtue qua Perfection of the Will
4. The Epistemic Requirements of Virtue
   a. Knowledge of the Truth
      i. Theoretical Knowledge of the Truth
      ii. Practical Knowledge of the Truth
   b. Intellect, Will, and Degrees of Virtue
5. Moral Epistemology
6. The Passions
7. Generosity
8. Love
   a. The Metaphysical Reading
   b. The Practical Reading
9. Happiness
10. Classifying Descartes’ Ethics
11. Systematicity Revisited
   a. The Epistemological Reading
   b. The Organic Reading
12. References and Further Reading

1. Methodology

a. Identifying the Texts

When one considers the heyday of early modern ethics, the following philosophers come to mind: Hobbes, Hutcheson, Hume, Butler, and, of course, Kant. Descartes certainly does not. Indeed, many philosophers and students of philosophy are unaware that Descartes wrote about ethics. Standard interpretations of Descartes’ philosophy place weight on the Discourse on the Method, Rules for the Direction of the Mind, Meditations on First Philosophy (with the corresponding Objections and Replies), and the Principles of Philosophy. Consequently, Descartes’ philosophical contributions to the early modern period are typically understood as falling under metaphysics, epistemology, philosophy of mind, and natural philosophy. When commentators do consider Descartes’ ethical writings, these writings are often regarded as an afterthought to his mature philosophical system. Indeed, Descartes’ contemporaries often did not think much of Descartes’ ethics. For example, Leibniz writes: “Descartes has not much advanced the practice of morality” (Letter to Molanus, AG: 241).

This view is understandable. Descartes certainly does not have a treatise devoted solely to ethics. This lack, in and of itself, creates an interpretive challenge for the commentator. Where does one even find Descartes’ ethics? On close inspection of Descartes’ corpus, however, one finds him tackling a variety of ethical themes, such as virtue, happiness, moral judgment, the regulation of the passions, and the good, throughout his treatises and correspondence. The following texts are of central importance in unpacking Descartes’ ethics: the Discourse on the Method, the French Preface to the Principles, the Dedicatory Letter to Princess Elizabeth for the Principles, the Passions of the Soul, and perhaps most importantly, the correspondence with Princess Elizabeth of Bohemia, Queen Christina of Sweden, and the envoy Pierre Chanut (for more details on these important interlocutors, Princess Elizabeth in particular, and on how they all interacted with each other in bringing about these letters, see Shapiro [2007: 1–21]).

These ethical writings can be divided into an early period and a later, possibly mature, period: the early period is that of the Discourse (1637), and the later period spans (roughly) from the French Preface to the Passions of the Soul (1644–1649).

b. The Tree of Philosophy and Systematicity

Why should we take seriously Descartes’ interspersed writings on ethics, especially since he did not take the time to write a systematic treatment of the topic? Indeed, one might think that we should not give much weight to Descartes’ ethical musings, given his expressed aversion to writing about ethics. In a letter to Chanut, Descartes writes:

It is true that normally I refuse to write down my thoughts concerning morality. I have two reasons for this. One is that there is no other subject in which malicious people can so readily find pretexts for vilifying me; and the other is that I believe only sovereigns, or those authorized by them, have the right to concern themselves with regulating the morals of other people. (Letter to Chanut 20 November 1647, AT V: 86–7/CSMK: 326)

However, one should take this text with a grain of salt. For in other texts, Descartes clearly does express a deep interest in ethics. Consider the famous tree of philosophy passage:

The whole of philosophy is like a tree. The roots are metaphysics, the trunk is physics, and the branches emerging from the trunk are all the other sciences, which may be reduced to three principal ones, namely, medicine, mechanics, and morals. By ‘morals’ I understand the highest and most perfect moral system, which presupposes a complete knowledge of the other sciences and is the ultimate level of wisdom.

Now just as it is not the roots or the trunk of a tree from which one gathers the fruit, but only the ends of the branches, so the principal benefit of philosophy depends on those parts of it which can only be learnt last of all. (French Preface to the Principles, AT IXB: 14/CSM I: 186)

This passage is surprising, to say the least. Descartes seems to claim that the proper end of his philosophical program is to establish a perfect moral system, as opposed to (say) overcoming skepticism, proving the existence of God, and establishing a mechanistic science. Moreover, Descartes seems to claim that ethics is systematically grounded in metaphysics, physics, medicine, and mechanics. Ethics is not supposed to float free from the metaphysical and scientific foundations of the system.

The tree of philosophy passage is a guiding text for many commentators in interpreting Descartes’ ethics, primarily because of its vision of philosophical systematicity (Marshall 1998, Morgan 1994, Rodis-Lewis 1987, Rutherford 2004, Shapiro 2008a). Indeed, the nature of the systematicity of Descartes’ ethics has been one of the main interpretive questions for commentators. Two distinct questions of systematicity are of importance here, which the reader should keep in mind as we engage Descartes’ ethical writings.

The first question of systematicity is internal to Descartes’ ethics itself. The early period of Descartes’ ethics, that is, the Discourse, is characterized by Descartes’ provisional morality. Broadly construed, the provisional morality seems to be a temporary moral guide (a stopgap, as it were) so that one can still live in the world of bodies and people while simultaneously engaging in hyperbolic doubt for the sake of attaining true and certain knowledge (scientia). As such, one might expect Descartes to revise the four maxims of the provisional morality once foundational scientia is achieved. Presumably, the perfect moral system that Descartes envisions in the tree of philosophy is not supposed to be a provisional morality. However, some commentators have claimed that the provisional morality is actually Descartes’ final moral view (Cimakasky & Polansky 2012). Others take a developmental view, arguing that Descartes’ later period, although related to the provisional morality, makes novel and distinct advancements (Marshall 1998, Shapiro 2008a).

The second question of systematicity concerns how Descartes’ ethics relates to the rest of his philosophy. To fully understand this question, we must distinguish two senses of ethics (la morale) in the tree of philosophy (Parvizian 2016). First, there is ethics qua theoretical enterprise, which concerns a theory of virtue, happiness, the passions, and other areas. Second, there is ethics qua practical enterprise, that is, the exercise of virtue, the attainment of happiness, and the regulation of the passions. Thus, one may distinguish, for example, the question of whether a theory of virtue depends on metaphysics, physics, and the like, from the question of whether exercising virtue depends on knowledge of metaphysics, physics, and the like. Commentators tend to agree that theoretical ethics presupposes the other parts of the tree, although how this is supposed to work out with respect to each field has not been fully fleshed out. For example, what is the relationship between mechanics and ethics? However, there is substantive disagreement about whether exercising virtue presupposes knowledge of metaphysics or contributes to knowledge of metaphysics.

c. The Issue of Novelty

Another broad interpretive question concerns how Descartes’ ethics relates to past ethical theories, and whether Descartes’ ethics is truly novel (as he sometimes claims). It is undeniable that Descartes’ ethics is, in certain respects, underdeveloped. Given that Descartes is well versed in the ethical theories of his predecessors, one might be tempted to fill in the details Descartes does not spell out by drawing on other sources (for example, the Stoics).

This is a complicated matter. As discussed in section 3, Descartes claims that he is advancing beyond ancient ethics, particularly with his theory of virtue. This is in line with Descartes’ more general tendency to claim that his philosophical system breaks from the ancient and scholastic philosophical tradition (Discourse I, AT VI: 4–10/CSM I: 112–115). However, in some texts Descartes suggests that he is building upon past ethical theories. For example, Descartes tells Princess Elizabeth:

To entertain you, therefore, I shall simply write about the means which philosophy provides for acquiring that supreme felicity which common souls vainly expect from fortune, but which can be acquired only from ourselves.

One of the most useful of these means, I think, is to examine what the ancients have written on this question, and try to advance beyond them by adding something to their precepts. For in this way we can make the precepts perfectly our own and become disposed to put them into practice. (Letter to Princess Elizabeth 21 July 1645, AT IV: 252/CSMK: 256; emphasis added)

Given such a text, a commentator would certainly be justified in drawing on other sources to illuminate Descartes’ ethical positions (such as the nature of happiness vis-à-vis the Stoics). Thus, although Descartes claims that he is breaking with the past, one still ought to explore the possibility that his ethics builds on, for example, the Aristotelian and Stoic ethics with which he was surely acquainted. Indeed, some commentators have argued that Descartes’ ethics is indebted to Stoicism (Kambouchner 2009, Rodis-Lewis 1957, Rutherford 2004 & 2014).

2. The Provisional Morality

Descartes’ first stab at ethics is in Discourse III. In the Discourse, Descartes lays out a method for conducting reason in order to acquire knowledge. This method requires an engagement with skepticism, which raises the question of how one should live in the world when one has yet to acquire knowledge and must suspend judgment about all dubitable matters. Perhaps to ward off the classic apraxia objection to skepticism, that is, the objection that one cannot engage in practical affairs if one is truly a skeptic (Marshall 2003), Descartes offers a “provisional morality” to help the temporary skeptic and seeker of knowledge still act in the world. Descartes writes:

Now, before starting to rebuild your house, it is not enough simply to pull it down, to make provision for materials and architects (or else train yourself in architecture), and to have carefully drawn up the plans; you must also provide yourself with some other place where you can live comfortably while building is in progress. Likewise, lest I should remain indecisive in my actions while reason obliged me to be so in my judgements, and in order to live as happily as I could during this time, I formed for myself a provisional moral code consisting of just three or four maxims, which I should like to tell you about. (Discourse III, AT VI: 22/CSM I: 122)

Notice that Descartes is ambiguous about whether the provisional morality consists of three or four maxims. There is some interpretive debate about this matter. We will discuss all four candidate maxims. Furthermore, we will bracket the issue of how to understand the provisional nature of this morality (see, for example, LeDoeuff 1989, Marshall 1998 & 2003, Morgan 1994, Shapiro 2008a). However, it should be noted that Descartes does refer to the provisional morality even in his later ethical writings, which suggests that the maxims are not entirely abandoned once skepticism is defeated (see Letter to Princess Elizabeth 4 August 1645, AT IV: 265–6/CSMK: 257–8).

a. The First Maxim

Maxim One can be divided into three claims:

M1a: The moral agent ought to obey the laws and customs of her country.

M1b: The moral agent ought to follow her religion.

M1c: In all other matters not addressed by M1a and M1b, the moral agent ought to follow the most commonly accepted and sensible opinions of her community. (Discourse III, AT VI: 23/CSM I: 122)

Descartes claims that during his skeptical period he found his own “opinions worthless” (Ibid.). In the absence of genuine moral knowledge to guide our practical actions, Descartes claims that the best we can do is conform to the moral guidelines offered in the laws and customs of one’s country, religion, and the moderate and sensible opinions of one’s community. As Vance Morgan notes, M1 is strikingly anti-Cartesian, as it calls the moral agent to an “unreflective social conformism” (1994: 45). But as we see below, M1, at least partially, does not seem to be abandoned in Descartes’ later ethical writings.

b. The Second Maxim

Maxim Two states:

M2: The moral agent ought to be firm and decisive in her actions, and to follow even doubtful opinions once they are adopted, with no less constancy than if they were certain.

The motivation for M2 seems to be the avoidance of irresolution, which Descartes later characterizes as an anxiety of the soul in the face of uncertainty that prevents or delays the moral agent from taking up a course of action (Passions III.170, AT XI: 459–60/CSM I: 390–1). Descartes writes that, since “in everyday life we must often act without delay, it is a most certain truth that when it is not in our power to discern the truest opinions, we must follow the most probable” (Discourse III, AT VI: 25/CSM I: 123). Descartes discusses a traveler lost in a forest to illustrate the usefulness of M2. The traveler is lost, and he does not know how to get out of the woods. Descartes’ advice is that the traveler should pick a route, even if it is uncertain, and resolutely stick to it:

Keep walking as straight as he can in one direction, never changing it for slight reasons even if mere chance made him choose it in the first place; for in this way, even if he does not go exactly where he wishes, he will at least end up in a place where he is likely to be better off than in the middle of a forest. (Ibid.)

Descartes claims that following M2 prevents the moral agent from undergoing regret and remorse. This is important because regret and remorse prevent the moral agent from attaining happiness. The notion of sticking firmly and constantly to one’s moral judgments, even if they are not certain, is a recurring theme in Descartes’ later ethical writings (it is indeed constitutive of his virtue theory).

c. The Third Maxim

Maxim Three states:

M3: The moral agent ought to master herself rather than fortune, and to change her desires rather than the order of the world.

The justification for M3 is that “nothing lies entirely within our power except our thoughts” (Ibid.). Knowing this truth will lead the moral agent to orient her desires properly, because she will have accepted that “after doing our best in dealing with matters external to us, whatever we fail to achieve is absolutely impossible so far as we are concerned” (Ibid.). To be clear, the claim is that we should consider “all external goods as equally beyond our power” (Discourse III, AT VI: 26/CSM I: 124). Unsurprisingly, Descartes claims that it takes much work to accept M3: “it takes long practice and repeated meditation to become accustomed to seeing everything in this light” (Ibid.). The claim that only our thoughts lie within our power—and that knowing this is a key to regulating the passions—is another recurring theme in Descartes’ ethical writings, particularly in his theory of the passions and generosity (see section 7).

d. The Fourth Maxim

When reading Discourse III, it seems that the provisional morality ends after the discussion of M3. Indeed, in some texts Descartes refers to “three rules of morality” (see, for instance, Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 257). However, Descartes does seem to tack on a final Fourth Maxim:

M4: The moral agent ought to devote her life to cultivating reason and acquiring knowledge of the truth, according to the method outlined in the Discourse.

M4 has a different status than the other three maxims: it is the “sole basis of the foregoing three maxims” (Discourse III, AT VI: 27/CSM I: 124). It seems that M4 is not truly a maxim of morality, however, but a re-articulation of Descartes’ commitment to acquiring genuine knowledge. The moral agent must not get stuck in skepticism, resorting to a life of provisional morality, but rather must continue and persist in her search for knowledge of the truth (with the hope of establishing a well-founded morality—perhaps the “perfect moral system” of the tree of philosophy).

3. Cartesian Virtue

We now turn to Descartes’ later ethical writings (ca. 1644–1649). Arguably, the centerpiece of these writings is a theory of (moral) virtue. Though formulated in different ways, Descartes offers a consistent definition of virtue throughout his later ethical writings, namely, that virtue consists in the firm and constant resolution to use the will well (see Letter to Princess Elizabeth 18 August 1645, AT IV: 277/CSMK: 262; Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258; Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268; Passions II.148, AT XI: 442/CSM I: 382; Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325). This resolution to use the will well has two main features: (1) the firm and constant resolution to arrive at one’s best moral judgments, and (2) the firm and constant resolution to carry out these best moral judgments to the best of one’s abilities. It is important to note that the scope of the discussion here concerns moral virtue, not epistemic virtue (for an account of epistemic virtue see Davies 2001, Shapiro 2013, Sosa 2012).

a. The Unity of the Virtues

Descartes claims that his definition of virtue is wholly novel, and that he is breaking off from Scholastic and ancient definitions of virtue:

He should have a firm and constant resolution to carry out whatever reason recommends without being diverted by his passions or appetites. Virtue, I believe, consists precisely in sticking firmly to this resolution; though I do not know that anyone has ever so described it. Instead, they have divided it into different species to which they have given various names, because of the various objects to which it applies. (Letter to Princess Elizabeth 4 August 1645, AT IV: 265/CSMK: 258)

It is unclear what conception of virtue Descartes is criticizing here, but it is not far-fetched that he has in mind Aristotle’s account of virtue (arete) in the Nicomachean Ethics. For, according to Aristotle, there are a number of virtues, such as courage, temperance, and wisdom, each of which is a distinct characterological trait that consists of a mean between an excess and a deficiency and is guided by practical wisdom (phronesis) (Nicomachean Ethics II, 1106b–1107a). For example, the virtue of courage is the mean between rashness and cowardice. Although Descartes is willing to use a similar conceptual apparatus for distinguishing different virtues (for example, he will talk extensively about a “distinct” virtue of generosity), at bottom he thinks that there are no strict metaphysical divisions between the virtues. All of the so-called virtues have one and the same nature: they are reducible to the resolution to use the will well. As he tells Queen Christina:

I do not see that it is possible to dispose it [that is, the will] better than by a firm and constant resolution to carry out to the letter all the things which one judges to be best, and to employ all the powers of one’s mind in finding out what these are. This by itself constitutes all the virtues. (Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)

Similarly, he writes in the Dedicatory Letter to Princess Elizabeth for the Principles:

The pure and genuine virtues, which proceed solely from knowledge of what is right, all have one and the same nature and are included under the single term ‘wisdom’. For whoever possesses the firm and powerful resolve always to use his reasoning powers correctly, as far as he can, and to carry out whatever he knows best, is truly wise, so far as his nature permits. And simply because of this, he will possess justice, courage, temperance, and all the other virtues; but they will be interlinked in such a way that no one virtue stands out among the others. (AT VIIIA: 2–3/CSM I: 191)

In these passages, Descartes is espousing a unique version of the unity of the virtues thesis. An Aristotelian unity of the virtues entails a reciprocity or inseparability among distinct virtues (Nicomachean Ethics VI, 1144b–1145a). According to Descartes, however, there is a unity of the “virtues” because, strictly speaking, there is only one virtue, namely, the resolution to use the will well (Alanen and Svensson 2007: fn. 8; Naaman-Zauderer 2010: 179–181). When the virtues are unified in this way, they exemplify wisdom.

b. Virtue qua Perfection of the Will

But what exactly is the nature of this resolution to use the will well? And how does one go about exercising this virtue? There are three main issues that need to be addressed in order to unpack Cartesian virtue. The first and foundational issue is Descartes’ rationale for locating virtue in a perfection of the will (section 3b). The second concerns the distinct epistemic requirements for virtue (section 4a). The third concerns Descartes’ characterization of virtue as a resolution of the will (section 5c).

According to Descartes, virtue is our “supreme good” (Letter to Princess Elizabeth 6 October 1645, AT IV: 305/CSMK: 268, Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325; see also Svensson 2019b). One avenue for tackling this claim about the supreme good is to think about what we can be legitimately praised or blamed for (Parvizian 2016). According to Descartes, virtue is certainly something that we can be praised for, and vice is certainly something that we can be blamed for. Now, in order to be legitimately praised or blamed for some property, f, f must be fully within our control. If f is not fully within our control, then we cannot truly be praised or blamed for possessing f. For example, Descartes cannot be praised or blamed for being French. This is a circumstantial fact about Descartes that is wholly outside of his control. However, Descartes can be praised or blamed for his choice to join the army of Prince Maurice of Nassau, for this is presumably a decision within his control, and it is either virtuous or vicious.

But what does it mean for f to be within our control? According to Descartes, control needs to be understood vis-à-vis the freedom to dispose of our volitions. The will is the source of our power and control: it is through the will that we affirm and deny perceptions at the cognitive level, and correspondingly act at the bodily level (Fourth Meditation, AT VII: 57/CSM II: 40). We have control over f insofar as f is fully under the purview of the will. As such, the reason why our supreme good lies in our will, or more specifically in a virtuous use of our will, is that our will is the only thing we truly have control over. At bottom, everything else (our bodies, our historical circumstances, and even our intellectual capacities) is beyond the scope of our finite power.

This is not to deny that things outside of our control might be perfections or goods. Descartes clearly recognizes that wealth, beauty, intelligence and so forth are perfections, and desirable ones (Passions III.158, AT XI: 449/CSM I: 386). They can certainly contribute, in some sense, to well-being (see section 9). However, they are neither necessary nor sufficient for virtue and happiness. Descartes certainly allows for the possibility of the virtuous moral agent who is tortured “on the rack.” What matters is how we respond to the contingencies of the world, and how we incorporate contingent perfections into our life. Such responses are, of course, dependent on the will. Thus, it is through the will alone that we attain virtue.

As such, the will is also the only legitimate source of our personal value, and thus justified self-esteem. Indeed, Descartes claims that it is through the will alone that we bear any similarity to God. For it is through the will that we can become masters of ourselves, just as God is a master of Himself (Passions III.152, AT XI: 445/CSM I: 384).

4. The Epistemic Requirements of Virtue

Although virtue is located in a perfection of the will, the intellect does have a role in Cartesian virtue. One cannot use the will well in practical affairs unless the will is guided by the right kinds of perceptions, leaving open for now what we mean by ‘right’ (Morgan 1994: 113–128; Shapiro 2008: 456–7; Williston 2003: 308–310). Whatever ‘right’ turns out to mean, Descartes clearly claims that the virtuous will must be guided by the intellect:

Virtue unenlightened by the intellect is false: that is to say, the will and resolution to do well can carry us to evil courses, if we think them good; and in such a case the contentment which virtue brings is not solid. (Letter to Princess Elizabeth 4 August 1645, AT IV: 267/CSMK: 258)

More specifically, Descartes claims that we need knowledge of the truth to exercise virtue. However, Descartes recognizes that this knowledge cannot be comprehensive given our limited intellectual capacities:

It is true that we lack the infinite knowledge which would be necessary for a perfect acquaintance with all the goods between which we have to choose in the various situations of our lives. We must, I think, be contented with a modest knowledge of the most necessary truths. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

This section tackles the issue of how to judge well based on knowledge of the truth, in other words how to arrive at our best moral or practical judgments. Notice that this seems to mark a departure from the provisional morality of the Discourse, in particular M1, where our moral judgments are not guided by any knowledge given the background engagement with skepticism.

a. Knowledge of the Truth

According to Descartes, in order to judge (and act) well we need to have knowledge of the truth in both a theoretical and practical sense. That is, we must assent to a certain set of truths at a theoretical level. However, in order to judge well in a moral situation, we need to have these truths ready at hand, that is, we need practical habits of belief.

i. Theoretical Knowledge of the Truth

In a letter to Princess Elizabeth, Descartes identifies six truths that we need in order to judge well in moral situations. Four of these truths are general in that they apply to all of our actions, and two of these truths are particular in that they are applicable to specific moral situations. Let us first examine what these truths are at a theoretical level, before turning to how these truths must be transformed into practical habits of belief.

Broadly put, the four general truths are:

T1: The existence of God

T2: The real distinction between mind and body

T3: The immensity of the universe

T4: The interconnectedness of the parts of the universe

The two particular truths are:

T5: The passions can misguide us.

T6: One can follow customary moral opinions when it is reasonable to do so.

On T1: Descartes claims that we must know that “there is a God on whom all things depend, whose perfections are infinite, whose power is immense and whose decrees are infallible” (Letter to Princess Elizabeth 15 September 1645, AT IV: 291/CSMK: 265). Knowing T1 is necessary for virtue, because it “teaches us to accept calmly all the things which happen to us as expressly sent by God,” and it engenders love for God in the moral agent (Ibid.).

On T2: Descartes says that we must know the nature of the soul, “that it subsists apart from the body, and is much nobler than the body, and that it is capable of enjoying countless satisfactions not to be found in this life” (Letter to Princess Elizabeth 15 September 1645, AT IV: 292/CSMK: 265–6). Knowing T2 is necessary for virtue because it prevents the moral agent from fearing death and helps her prioritize her intellectual pursuits over her bodily pursuits.

On T3: Descartes says that we must have a “vast idea of the extent of the universe” (Ibid.). He says this vast idea of the universe is conveyed in Principles III, and that it would be useful for moral agents to have read at least that part of his physics. Having knowledge of physics is necessary for virtue, because it prevents the moral agent from thinking that the universe was only created for her, thus wishing to “belong to God’s council” (Ibid.). It is important to note that this is one of the few places where Descartes draws out any connection between his physics and ethics, although he claims in a number of places that there are fundamental connections between these two disparate fields (Letter to Chanut 15 June 1646, AT IV: 441/CSMK: 289, Letter to Chanut 26 February 1649, AT V: 290–1/CSMK: 368).

On T4: Descartes says that “though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe” (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266). Knowing T4 is necessary for virtue, because it helps engender an other-regarding character—perhaps love and generosity—that is particularly relevant to Cartesian virtue. Indeed, virtue requires that the “interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person” (Ibid.).

On T5: Descartes seems to claim that the passions exaggerate the value of the goods they represent (and thus are misrepresentational), and that the passions correspondingly impel us to the pleasures of the body. Knowing T5 is necessary for virtue, because it helps us suspend our judgments when we are in the throes of the passions, so that we are not “deceived by the false appearances of the goods of this world” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–5/CSMK: 267).

On T6: Descartes claims that “one must also examine minutely all the customs of one’s place of abode to see how far they should be followed” (Ibid.). T6 is necessary for virtue because “though we cannot have demonstrations of everything, still we must take sides, and in matters of custom embrace the opinions that seem the most probable, so that we may never be irresolute when we need to act” (Ibid.). T6 seems to be a re-articulation of M1 in the provisional morality, specifically M1a above.

ii. Practical Knowledge of the Truth

T1–T6 must be known at a theoretical level. However, Descartes claims that we also need to transform T1–T6 into habits of belief:

Besides knowledge of the truth, practice is also required if one is to be always disposed to judge well. We cannot continually pay attention to the same thing; and so, however clear and evident the reasons may have been that convinced us of some truth in the past, we can later be turned away from believing it by some false appearances unless we have so imprinted it on our mind by long and frequent meditation that it has become a settled disposition within us. In this sense the Scholastics are right when they say that virtues are habits; for in fact our failings are rarely due to lack of theoretical knowledge of what we should do, but to lack of practical knowledge—that is, lack of a firm habit of belief. (Letter to Princess Elizabeth 15 September 1645, AT IV: 295–6/CSMK: 267)

The idea seems to be this: in order to actually judge well in a moral situation, T1–T6 need to be ready at hand. We need to bring them forth before the mind swiftly and efficiently in order to respond properly in a moral situation. In order to do that, we must meditate on T1–T6 until they become habits of belief.

b. Intellect, Will, and Degrees of Virtue

There seems to be an inconsistency between Descartes’ theory of virtue and his account of the epistemic requirements for virtue. Descartes is committed to the following two claims:

(1) Theoretical and practical knowledge of T1–T6 is a necessary condition for virtue.

(2) One can be virtuous even if one does not have theoretical and practical knowledge of T1–T6.

We have seen that Descartes is committed to claim (1). But why is he committed to claim (2)? Consider the following passage from the Dedicatory Letter to Elizabeth:

Now there are two prerequisites for the kind of wisdom [that is, the unity of the virtues] just described, namely the perception of the intellect and the disposition of the will. But whereas what depends on the will is within the capacity of everyone, there are some people who possess far sharper intellectual vision than others. Those who are by nature somewhat backward intellectually should make a firm and faithful resolution to do their utmost to acquire knowledge of what is right, and always to pursue what they judge to be right; this should suffice to enable them, despite their ignorance on many points, to achieve wisdom according to their lights and thus to find great favour with God. (AT VIIIA: 3/CSM I: 191)

Descartes clearly commits himself to (2) in this passage. But in the continuation of this passage he offers a way to reconcile (1) and (2):

Nevertheless, they will be left far behind by those who possess not merely a very firm resolve to act rightly but also the sharpest intelligence combined with the utmost zeal for acquiring knowledge of the truth.

According to Descartes, virtue, at its essence, is a property of the will, not the intellect. Virtue consists in the firm and constant resolution to use the will well, which requires determining one’s best practical judgments and executing them to the best of one’s abilities. However, virtue comes in degrees, in accordance with what these best practical judgments are based on. The more knowledge one has (essentially, the more perfected one’s intellect is), the higher one’s degree of virtue.

In its ideal form, virtue presupposes, at a minimum, theoretical and practical knowledge of T1–T6 (and arguably one’s virtue would be improved by acquiring further relevant knowledge). But Descartes acknowledges that not everyone has the capacity, or is in a position, to acquire knowledge of the truth (for instance, the peasant). Nonetheless, Descartes does not want to exclude such moral agents from acquiring virtue. Virtue is not just for the philosopher. If such moral agents resolve to acquire as much knowledge as they can, and have a firm and constant resolution to use their will well (according to that knowledge), then they will secure virtue (even if they have the wrong metaphysics, epistemology, natural philosophy, or the like). Claims (1) and (2) are rendered consistent, then, once they are properly revised:

(1)* Theoretical and practical knowledge of T1–T6 is a necessary condition for ideal virtue.

(2)* One can be non-ideally virtuous while lacking full theoretical and practical knowledge of T1–T6, so long as one does one’s best to acquire as much relevant knowledge as one can, and has the firm and constant resolution to use one’s will well accordingly.

It is clear that Descartes is usually talking about an ideal form of virtue, whenever he uses the term ‘virtue.’ When he wants to highlight discussion of non-ideal forms of virtue, he is usually clear about his target (see, for example, Dedicatory Letter to Elizabeth, AT VIIIA: 2/CSM I: 190–1). In what follows, then, the reader should assume that the virtue being discussed is of the ideal variety, that is, it is based on some perfection of the intellect. As flagged earlier, there is disagreement, of course, about how much knowledge one must have to acquire certain virtues (for example, generosity).

5. Moral Epistemology

Does Descartes have a distinct moral epistemology? In the epistemology of the Meditations, Descartes distinguishes three different kinds of epistemic states: scientia/perfecte scire (perfect knowledge), cognitio (awareness), and persuasio (conviction or opinion). Broadly construed, the distinction between these three epistemic states is as follows. Scientia is an indefeasible judgment (it is true and absolutely certain), whereas cognitio and persuasio are both defeasible judgments. Nonetheless, cognitio is of a higher status than persuasio because there is—to some degree—better justification for cognitio than persuasio. Persuasio is mere opinion or belief, whereas cognitio is an opinion or awareness backed by some legitimate justification. For example, the atheist geometer can have cognitio of the Pythagorean theorem, and can justify that cognitio with a geometrical proof. However, this cognitio fails to achieve the status of scientia because the atheist geometer is unaware of God, and thus does not know the Truth Rule, namely, that her clear and distinct perceptions are true because God is not a deceiver (Second Replies, AT VII: 141/CSM II: 101; Third Meditation, AT VII: 35/CSM II: 24; Fourth Meditation, AT VII: 60–1/CSM II: 41).

a. The Contemplation of Truth vs. The Conduct of Life

There is an important question that must be raised about the epistemic status of our best moral judgments. In what sense is a “best moral judgment” the best? That is, is a best moral judgment the best because it amounts to scientia or does it fall short—that is, is it the best cognitio or persuasio? In the Meditations, where Descartes is engaged in a sustained hyperbolic doubt, he identifies two jointly necessary and sufficient conditions for knowledge in the strict sense, that is, scientia. In the standard interpretation, a judgment will amount to scientia when it is both true and absolutely certain (Principles I.45, AT VIIIA: 21–22/CSM I: 207). A judgment can meet the conditions of truth and absolute certainty when it is grounded in divinely guaranteed clear and distinct perceptions. Though the details are tricky, it is ultimately clear and distinct perceptions that make scientia indefeasible, because the intellect and its clear and distinct perceptions are epistemically guaranteed, in some sense, by God’s benevolence and non-deceptive nature. According to Descartes, however, the epistemic standards that we must abide by in theoretical matters or “the contemplation of the truth” should not be extended to practical matters or the “conduct of life.” As he writes in the Second Replies,

As far as the conduct of life is concerned, I am very far from thinking that we should assent only to what is clearly perceived. On the contrary, I do not think that we should always wait even for probable truths; from time to time we will have to choose one of many alternatives about which we have no knowledge, and once we have made our choice, so long as no reasons against it can be produced, we must stick to it as firmly as if it had been chosen for transparently clear reasons. (AT VII: 149/CSM II: 106)

This passage tells us that our best practical judgments cannot be the best in virtue of meeting the strict standards for scientia. This is because of the factors that distinguish the contemplation of truth from the conduct of life. First and foremost, unlike the contemplation of truth, where the goal is to arrive at a true and absolutely certain theoretical judgment that amounts to knowledge, the conduct of life is concerned with arriving at a best practical moral judgment for the sake of carrying out a course of action. Given that in morality we are ultimately concerned with action in the conduct of life, we must keep in mind that there is a temporally indexed window of opportunity to act in a moral situation (Letter to Princess Elizabeth 6 October 1645, AT IV: 307/CSMK: 269). If a moral agent tries to obtain clear and distinct perceptions in moral deliberation, something that can take weeks or even months according to Descartes (Second Replies, AT VII: 131/CSM II: 94; Seventh Replies, AT VII: 506/CSM II: 344), the opportunity to act will pass and thus the moral agent will have failed to use her will well. In short, trying to attain clear and distinct perceptions in a moral situation is not advisable.

Second, and perhaps more importantly, it seems that we cannot attain clear and distinct perceptions in the conduct of life. Although our best moral judgments are guided by knowledge of the truth (which is presumably based on clear and distinct perceptions), we also base our best moral judgments, in part, on perceptions of the relevant features of the moral situation. These include information about other mind-body composites, bodies, and the consequences of our action. For example, in the famous trolley problem, the moral agent has to consider her perceptions of the people tied to the track, the train and the rails, and the possible consequences that follow from directing the train one way or another at the fork in the tracks. Such information about the other mind-body composites and bodies in this moral situation is ultimately provided by sensations. And sensations, according to Descartes, provide obscure and confused content to the mind about the nature of bodies (Principles I.45, AT VIIIA: 21–2/CSM I: 207–8, Principles I.66–68, AT VIIIA: 32–33/CSM I: 126–7). As for predicting the consequences of an action, this is done through the imagination, for these consequences do not exist yet. I need to represent to myself, through imagination, the potential consequences my action will produce in the long run. And such fictitious representations can only be obscure and confused. In short, given the imperfect kinds of perceptions that are involved in moral deliberation, our best moral judgments can never be fully grounded in clear and distinct perceptions.

b. Moral Certainty and Moral Skepticism

These perceptual facts help explain why Descartes claims that our best moral judgments can achieve only moral certainty, that is,

[C]ertainty which is sufficient to regulate our behaviour, or which measures up to certainty we have on matters relating to the conduct of life which we never doubt, though we know that it is possible, absolutely speaking, that they may be false. (Principles IV.204, AT VIIIA: 327/CSM I: 289, fn. 1; see also Schachter 2005, Voss 1993)

Given that even our best moral judgments can achieve only moral certainty, Descartes seems to be claiming that we cannot have first-order moral knowledge. That is, when I make a moral judgment of the form “I ought to φ in moral situation x,” that moral judgment will never amount to knowledge in the strict sense. Nonetheless, morally certain moral judgments are not persuasio, as they are backed with some justification. Thus, we should regard them as attaining the status of cognitio, just shy of scientia (but for different reasons than the cognitio of the atheist geometer, presuming that the moral agent has completed the Meditations and knows that her faculties are, in normal circumstances, reliable).

However, it is important to note that Descartes is not claiming that first-order moral knowledge is impossible tout court. That is, Descartes is not a non-cognitivist about moral judgments, claiming that moral judgments are neither true nor false. Cartesian moral judgments are truth-evaluable; that is, they are capable of being true or false. Descartes, then, is a cognitivist about moral judgments. As Descartes says, we must recognize that although our best practical moral judgments are morally certain, they may still, “absolutely speaking,” be false. If Descartes is a moral skeptic of some stripe, he should be understood as making a plausible claim about our limitations as finite minds. A finite mind, given its limited and imperfect perceptions, cannot attain first-order moral knowledge because it cannot ultimately know whether its first-order moral judgments are true or false. However, an infinite mind—God—surely knows whether the first-order moral judgments of finite minds are true or false. First-order moral knowledge is possible—finite minds just cannot attain it.

One final remark. One might resist the standard interpretation that we cannot have first-order moral knowledge, by claiming that Descartes is not a moral skeptic at all, because the standards for knowledge shift from the contemplation of truth to the conduct of life. That is, Descartes might be an epistemic contextualist. Epistemic contextualism is the claim that the meaning of the term ‘knows’ shifts depending on the context, in the same way the meaning of the indexical ‘here’ shifts depending on the context. If Jones utters the sentence ‘Brown is here,’ the meaning of the sentence will shift depending on where Jones is when he utters it (Rysiew 2007). This kind of contextualist view has been suggested in passing by Lex Newman (2016), who argues that Descartes’ epistemic standards shift depending on whether he is doing metaphysics or science (Principles IV.205–6, AT VIIIA: 327–9/CSM I: 289–291). Although Newman does not extend this contextualist interpretation to Descartes’ moral epistemology, it would take only a few steps to do so. Nonetheless, it strains credulity to think that first-order moral judgments could ever meet the standards of scientia in the Meditations.

c. Virtue qua Resolution

We can now clarify why Descartes characterizes virtue in terms of a resolution. The conduct of life presents us with a unique epistemic challenge that does not arise in the contemplation of truth. That is: (1) we have a short window of opportunity to arrive at a moral judgment and then act, and (2) the perceptions that in part serve as the basis for our judgments are ultimately obscure and confused. These two features can give rise to irresolution. Irresolution, according to Descartes, is a kind of anxiety which causes a person to hold back from performing an action, creating a cognitive space for the person to make a choice (Passions III.170, AT XI: 459/CSM I: 390). As such, irresolution can be a good cognitive trait. However, irresolution becomes problematic when one has “too great a desire to do well” (Passions III.170, AT XI: 460/CSM I: 390). If one wants to arrive at the most perfect moral judgment (such as through grounding one’s moral judgments in clear and distinct perceptions), one will ultimately fall into an excessive kind of irresolution which prevents one from judging and acting at all. Given the nature of moral situations and what is at stake within them (essentially, how we ought to treat other people), the conduct of life is ripe for producing this excessive kind of irresolution. Arguably, we do want perfection in our moral conduct.

This is why Descartes says virtue involves a resolution: we need to establish a firm and constant resolve to arrive at our best moral judgments and to carry them out, even though we realize that these judgments are only morally certain and can be false. So long as we have this firm resolve (which is of course guided by knowledge of the truth), we can be assured that we have done our duty, even if it turns out that we retrospectively determine that what we did was wrong. For we can control only our will—how our action plays out in the real world is beyond our control, and there is no way we can guarantee that we will always produce the right consequences. As Descartes tells Princess Elizabeth:

There is nothing to repent of when we have done what we judged best at the time when we had to decide to act, even though later, thinking it over at our leisure, we judge that we made a mistake. There would be more ground for repentance if we had acted against our conscience, even though we realized afterwards that we had done better than we thought. For we are only responsible for our thoughts, and it does not belong to human nature to be omniscient, or always to judge as well on the spur of the moment as when there is plenty of time to deliberate. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269; see also Letter to Queen Christina 20 November 1647, AT V: 83/CSMK: 325)

Consistent with Descartes’ grounding of virtue in a perfection of the will, Descartes’ view of moral responsibility is that we are responsible only for what is truly under our control, namely our thoughts (or, more specifically, our volitions). Notice that the seeds of this full analysis of virtue qua resolution are present in the provisional morality, namely, in M2.

6. The Passions

Strictly speaking, Descartes’ Passions of the Soul is not an ethical treatise. As Descartes writes, “my intention was to explain the passions only as a natural philosopher, and not as a rhetorician or even as a moral philosopher” (Prefatory Letters, AT XI: 326/CSM I: 327). Nonetheless, the passions have a significant status within Descartes’ ethics. At the end of Passions, Descartes writes: “it is on the passions alone that all the good and evil of this life depends” and “the chief use of wisdom lies in its teaching us to be masters of our passions and to control them with such skill that the evils which they cause are quite bearable, and even become a source of joy” (Passions III.212, AT XI: 488/CSM I: 404). Thus, it is important to discuss Cartesian passions in order to understand Descartes’ ethics. We will consider (1) the function of the passions and (2) whether the passions are merely motivational or representational states.

a. The Definition of the Passions

Descartes identifies a general sense of the term ‘passion,’ which covers all states of the soul that are not, in any way, active. That is, passions are passive and thus are perceptions: “all our perceptions, both those we refer to objects outside us and those we refer to the various states of our body, are indeed passions with respect to our soul, so long as we use the term ‘passion’ in its most general sense” (Passions I.25, AT XI: 347–8/CSM I: 337). Thus, a general use of the term ‘passion’ would include the following kinds of perceptions: smells, sounds, colors, hunger, pain, and thirst, all of which are states that we refer either to objects outside us or to our own body (Passions I.29, AT XI: 350/CSM I: 339). However, the narrower and stricter sense of ‘passion’ that is examined in the Passions covers “those perceptions, sensations, or emotions of the soul which we refer particularly to it, and which are caused, maintained and strengthened by some movement of the spirits” (Passions I.27, AT XI: 349/CSM I: 338–9). Descartes identifies six primitive passions, out of which all of the other more complex passions are composed. These are wonder, love, hatred, joy, sadness, and desire. Each primitive and complex passion is distinguished from the others in terms of its physiological and causal basis (roughly, the animal spirits which give rise to it) and its cognitive nature and specific function (Passions II.51–2, AT XI: 371–2/CSM I: 349).

b. The Function of the Passions

Given Descartes’ general resistance to teleology, there is much to be said about how to understand the nature of Cartesian functions in general, and specifically the function of the passions (Brown 2012). Setting aside the issue of reconciling any metaphysical inconsistencies, it is clear that Descartes does think that the passions have some kind of function, and we must be mindful of this in interpreting Descartes.

In Passions II.52, Descartes identifies the general function of the passions:

I observe, moreover, that the objects which stimulate the senses do not excite different passions in us because of differences in the objects, but only because of the various ways in which they may harm or benefit us or in general have importance for us. The function of all the passions consists solely in this, that they dispose our soul to want the things which nature deems useful for us, and to persist in this volition; and the same agitation of the spirits which normally causes the passions also disposes the body to make movements which help us to attain these things. (AT XI: 372/CSM I: 349, cf. Passions I.40, AT XI: 359/CSM I: 343)

Descartes claims that the general function of the passions is to dispose the soul to want the things which nature deems useful for us, and to also dispose the body to move in the appropriate ways so as to attain those things. Put more simply, the passions are designed to preserve the mind-body composite. How exactly that plays out will depend on the kind of passion under consideration. As Descartes writes in Passions I.40, fear disposes the soul to want to flee (a bodily action) and courage disposes the soul to want to fight (a bodily action as well).

It is important to note that the general function assigned to the passions is similar to, but slightly different from, the one assigned to sensations in the Sixth Meditation. In the context of his sensory theodicy, Descartes writes: “the proper purpose of the sensory perceptions given me by nature is simply to inform the mind of what is beneficial or harmful for the composite of which the mind is a part” (Sixth Meditation, AT VII: 83/CSM II: 57). Supposing that Descartes does not have passions in mind in the Sixth Meditation, and given Descartes’ strict distinction between sensations and passions in the Passions, it seems that passions and sensations have different functions. The function of a passion is to dispose the soul to want what is beneficial for it and to avoid what is harmful, while the function of a sensation is to inform the soul of what is beneficial or harmful for it. This would suggest that sensations are perhaps representational states (De Rosa 2007, Gottlieb & Parvizian 2018, Hatfield 2013, Simmons 1999), whereas the passions are merely motivational.

But matters are more complicated. A vexing issue for commentators has been how the passions fulfill their function of disposing the soul to want certain things. It is clear that the passions are motivational. But the interpretive issue for commentators has been whether the passions are merely motivational (and thus non-intentional, affective states), or whether they are, to some degree, representational as well. Settling this issue is important, because it helps clarify whether the passions ought to serve as guides to our practical behavior.

c. Whether the Passions are Representational or Motivational

The standard interpretation is that the passions are representational in addition to being motivational (Alanen 2003a & 2003b, Brown 2006, Clarke 2005, Franco 2015). Sometimes commentators describe the passions as being informative, but the best way to cash this out given Descartes’ philosophy of mind is in terms of representation. There are broad reasons for claiming that the passions are representational. If one thinks that the passions are a type of idea, then it seems that they must be representational, for Descartes claims in the Third Meditation that all ideas have intentionality: “there can be no ideas which are not as it were of things” (AT VII: 44/CSM II: 30). Moreover, Descartes seems to make a representationalist claim about the passions in T5: “all our passions represent to us the goods to whose pursuit they impel us as being much greater than they really are” (Letter to Princess Elizabeth 15 September 1645, AT IV: 294–295/CSMK: 267; see also Passions II. 90, AT XI: 395/CSM I: 360). Strictly speaking, the claim here seems to be that the passions have representational content—they represent goods for the mind-body composite—but they are ultimately misrepresentational because they exaggerate the value of those goods. However, it is claimed that the passions can be a guide to our survival and preservation once they are regulated by reason. According to John Marshall, once the passions are regulated they can become accurate representations of goods (1998: 119–125). As such, the passions can be reliable guides to our survival and preservation under the right circumstances.

Alternatively, it has been argued that, despite the textual evidence, Descartes’ considered view is that the passions are merely motivational states (Greenberg 2007, Brassfield 2012). Shoshana Brassfield has argued that the passions are motivational states which serve to strengthen and prolong certain thoughts which it is good for the soul to sustain. When Descartes speaks of the passions representing, we need to re-read him as actually saying one of two things. First, he may be clarifying a representational content (distinct from a passion) that a particular passion strengthens and prolongs. Second, he may be discussing how the passions lead us to exaggerate the value of objects in our judgments: by prolonging and strengthening certain judgments, they lead us to affirm mistakenly that a particular object is more valuable than it actually is.

The upshot of this type of motivational reading of the passions is that the passions are not intrinsic guides to our survival and preservation, and that we should suspend judgment about how to act when we are moved by the passions. It is reason alone that is the guide to what is good and beneficial for the mind-body composite. The passions are, in some sense, beneficial when they are regulated by reason (and thus lead, for example, to proper experiences of joy and thus happiness), but they are not beneficial when reason is guided by the passions.

7. Generosity

According to Descartes, generosity—a species of wonder—is both a passion and a virtue (Passions III.153, AT XI: 445–6/CSM I: 384, Passions III.161, AT XI: 453–4/CSM I: 387–8). Generosity transitions from a passion to a virtue once the passion becomes a habit of the soul (Passions III.161, AT XI: 453–4/CSM I: 387–8). Having already discussed passions, we will focus on generosity qua virtue. Generosity is the chief virtue in Descartes’ ethics because it is the “key to all the virtues and a general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes defines generosity as that:

Which causes a person’s self-esteem to be as great as it may legitimately be, [and] has only two components. The first consists in his knowing that nothing truly belongs to him but this freedom to dispose his volitions, and that he ought to be praised or blamed for no other reason than his using this freedom well or badly. The second consists in his feeling within himself a firm and constant resolution to use it well—that is, never to lack the will to undertake and carry out whatever he judges to be best. To do that is to pursue virtue in a perfect manner. (Passions III.153, AT XI: 445–6/CSM I: 384)

Generosity has two components. The first, broadly construed, consists in the knowledge that the only thing that truly belongs to us is our free will. The second, broadly construed, consists in feeling the firm and constant resolution to use this free will well.

a. Component One: What Truly Belongs to Us

What is particularly noteworthy about Descartes’ definition of generosity is the first component. Descartes claims that the first component of generosity consists in knowledge of the following proposition: the only thing that truly belongs to me is my free will. This is certainly a strong claim, which goes beyond Descartes’ account of the role of the will in virtue, as discussed in section 3. Recall that we claimed that virtue is grounded in a perfection of the will, because only our volitions are under our control. Descartes is taking this a step further here: he now seems to be claiming that the only thing that truly belongs to us is free will. In claiming that free will “truly belongs” to us, Descartes seems to be making a new metaphysical claim about the status of free will within a finite mind. But how exactly should this claim be interpreted?

The locutions “belongs” and “truly belongs” are typically used by Descartes to make a metaphysical claim about the essence of a substance. For example, Descartes claims that his body does not truly belong to his essence (see Sixth Meditation, AT VII: 78/CSM II: 54). If Descartes is making a claim about our metaphysical essence in the definition of generosity, then this claim seems to be in clear contradiction with the account of our metaphysical essence in the Meditations and Principles. There, Descartes claims that he is essentially a thinking thing, res cogitans (Second Meditation, AT VII: 28/CSM II: 19). Although a body also belongs to him in some sense (Sixth Meditation, AT VII: 80/CSM II: 56; see also Chamberlain 2019), he can still draw a real distinction between his mind and body, which implies that what truly belongs to him is thought. Thought, in the Meditations, has a broad scope: in particular, it includes both the intellect and the will as well as all of the different types of perceptions and volitions that fall under these two faculties (Principles I.9, AT VIIIA: 7–9/CSM I: 195). However, in the first component of generosity, Descartes seems to be claiming that there is a particular kind of thought that truly belongs to us, namely, our free will and its corresponding volitions. As such, the moral agent is not strictly speaking a res cogitans; rather, she is a willing thing, res volans (Brown 2006: 25; Parvizian 2016).

Commentators have picked up on this difficulty in Descartes’ definition of generosity. There are two interpretations in the literature. The first adopts a metaphysical reading of ‘truly belongs,’ according to which Descartes is making a claim about our true essence (Boehm 2014: 718–19). The second adopts an evaluative reading, which approximates the standard account of why virtue is a perfection of the will: Descartes is making a claim about what is under our control (that is, our volitions) and thus about what we can be truly praised and blamed for (Parvizian 2016). On this second reading, there is a sense in which a human being is truly a res volans, but this does not metaphysically exclude the other properties of a res cogitans from its nature.

b. Acquiring Generosity

How is the chief virtue of generosity acquired? Descartes writes:

If we occupy ourselves frequently in considering the nature of free will and the many advantages which proceed from a firm resolution to make good use of it—while also considering, on the other hand, the many vain and useless cares which trouble ambitious people—we may arouse the passion of generosity in ourselves and then acquire the virtue. (Passions III. 161, AT XI: 453–4/CSM I: 388)

Here, Descartes claims that we need to reflect on two aspects of the will. First, we need to reflect on the very nature of the will. This includes facts such as its freedom, its being infinite in scope, and its different functional capacities. Second, we need to reflect on the advantages and disadvantages that come from using it well and poorly, respectively. This reflection on the advantages and disadvantages, interestingly, seems to require observation of other people’s behavior. As Descartes writes, we need to consider “the many vain and useless cares which trouble ambitious people,” which will help us appreciate the value and efficacy of the will. Some commentators have claimed that this process for acquiring generosity is exemplified in the Second or Fourth Meditation (Boehm 2014, Shapiro 2005), while others have argued that the meditator cannot engage in the process of acquiring generosity until after the Meditations have been completed (Parvizian 2016).

c. Generosity and the Regulation of the Passions

Throughout the Passions, Descartes indicates different ways to remedy the disorders of the passions. Descartes claims, for example, that the exercise of virtue is a remedy against the disorders of the passions, because then “his conscience cannot reproach him,” which allows the moral agent to be happy amidst “the most violent assaults of the passions” (Passions II.148, AT XI: 441–2/CSM I: 381–2). However, Descartes claims that generosity is a “general remedy for every disorder of the passions” (Passions III.161, AT XI: 454/CSM I: 388). Descartes writes:

They [generous people] have mastery over their desires, and over jealousy and envy, because everything they think sufficiently valuable to be worth pursuing is such that its acquisition depends solely on themselves; over hatred of other people, because they have esteem for everyone; over fear, because of the self-assurance which confidence in their own virtue gives them; and finally over anger, because they have little esteem for everything that depends on others, and so they never give their enemies any advantage by acknowledging that they are injured by them. (Passions III.156, AT XI: 447–8/CSM I: 385)

Generosity is a general remedy for the disorders of the passions because it ultimately leads the moral agent to a proper conception of what she ought to esteem. At bottom, the problem of the passions is that they lead us to misunderstand the value of various external objects, and to place our own self-esteem in them. Once we understand that the only property that is truly valuable is a virtuous will, then all the passions will be regulated.

d. The Other-Regarding Nature of Generosity

Although Descartes’ definition of generosity is certainly not standard, his account of how generosity manifests in the world does coincide with our standard intuitions about what generosity looks like. According to Descartes, the truly generous person is fundamentally other-regarding:

Those who are generous in this way are naturally led to do great deeds, and at the same time not to undertake anything of which they do not feel themselves capable. And because they esteem nothing more highly than doing good to others and disregarding their own self-interest, they are always perfectly courteous, gracious, and obliging to everyone. (Passions III.156, AT XI: 447–8/CSM I: 385)

The fundamental reason why the generous person is other-regarding is that she realizes that the very same thing that causes her own self-esteem, a virtuous will, is present or at least capable of being present in other people (Passions III.154, AT XI: 446–7/CSM I: 384). That is, since others have a free will, they are also worthy of value and esteem and thus must be treated in the best possible way. A fundamental task of the generous person is to help secure the conditions for other people to realize their potential to acquire a virtuous will.

8. Love

Love is a passion that has direct ethical implications for Descartes, for in its ideal form love is altruistic, other-regarding, and requires self-sacrifice. Descartes distinguishes between different kinds of love: affection, friendship, devotion, sensory love, and intellectual love (Passions II.83, AT XI: 389–90/CSM I: 357–8; Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). We examine love in general:

Love is an emotion of the soul caused by a movement of the spirits, which impels the soul to join itself willingly to objects that appear to be agreeable to it. (Passions II.79, AT XI: 387/CSM I: 356)

In explicating what it means for the soul to join itself willingly to objects, Descartes writes:

In using the word ‘willingly’ I am not speaking of desire, which is a completely separate passion relating to the future. I mean rather the assent by which we consider ourselves henceforth as joined with what we love in such a manner that we imagine a whole, of which we take ourselves to be only one part, and the thing loved to be the other. (Passions II.80 AT XI: 387/CSM I: 356)

In short, love involves an expansion of the self. The lover regards herself and the beloved as two parts of a larger whole. But this raises an important question: is there a metaphysical basis for this part-whole relationship? Or is the part-whole relationship merely a product of the imagination and the will?

a. The Metaphysical Reading

One could try to provide a metaphysical basis for love by arguing that people are metaphysical parts of larger wholes. If so, then there would be metaphysical grounds “to justify a very expansive love” (Frierson 2002: 325). Indeed, Descartes seems to claim as much in his account of T4:

Though each of us is a person distinct from others, whose interests are accordingly in some way different from those of the rest of the world, we ought still to think that none of us could subsist alone and that each one of us is really one of the many parts of the universe, and more particularly a part of the earth, the state, the society and the family to which we belong by our domicile . . . and the interests of the whole, of which each of us is a part, must always be preferred to those of our own particular person. (Letter to Princess Elizabeth 15 September 1645, AT IV: 293/CSMK: 266)

Descartes uses suggestive metaphysical language here. Indeed, he claims that people cannot subsist without the other parts of the universe (which includes other people), and that we are parts of a larger whole. Given this metaphysical basis of love, then, the interests of the whole should be preferred to the interests of any given part.

There are interpretive problems for a metaphysical basis of love, however. For one, Descartes does not spell out this metaphysical relation in any detail. Moreover, such a metaphysical relation seems to fly in the face of Descartes’ account of the independent nature of substances and the real distinction between minds and bodies. To say that persons (mind-body composites) are parts of larger wholes would seem to suggest that (1) mind-body composites are modes and not substances, and consequently that (2) there is no real distinction between mind-body composites.

b. The Practical Reading

Alternatively, one could give a practical basis for love, by arguing that we ought to consider or imagine ourselves as parts of larger wholes, even though metaphysically we are not (Frierson 2002). As Descartes writes to Princess Elizabeth:

If we thought only of ourselves, we could enjoy only the goods which are peculiar to ourselves; whereas, if we consider ourselves as parts of some other body, we share also in the goods which are common to its members, without losing any of those which belong only to ourselves. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

There are practical reasons for loving others, because doing so allows us to partake in their joy and perfections. Of course, this raises the problem that we will also partake in their imperfections and suffering. On this issue Descartes writes:

With evils, the case is not the same, because philosophy teaches that evil is nothing real, but only a privation. When we are sad on account of some evil which has happened to our friends, we do not share in the defect in which this evil consists. (Letter to Princess Elizabeth 6 October 1645, AT IV: 308/CSMK: 269)

In either the metaphysical or practical reading, however, it is clear that love has a central role in Descartes’ ethics. According to Descartes, inculcating and exercising love is central for curbing one’s selfishness and securing the happiness, well-being, and virtue of others (see also Letter to Chanut 1 February 1647, AT IV: 600–617/CSMK: 305–314). For further important work on Cartesian love see Frigo (2016), Boros (2003), Beavers (1989), and Williston (1997).

9. Happiness

In general, Descartes characterizes happiness as an inner contentment or satisfaction of the mind that results from the satisfaction of one’s desires. However, he draws a distinction between mere happiness (bonheur) and genuine happiness or blessedness (felicitas; félicité/béatitude). Mere happiness, according to Descartes, is contentment of mind that is acquired through luck and fortune. This occurs through the acquisition of goods (such as honors, riches, and health) that do not truly depend on the moral agent (that is, her will) but on external conditions. Although the moral agent is satisfying her desires, these desires are not regulated by reason. As such, she seeks things beyond her control. Blessedness, however, is a supreme contentment of mind achieved when the moral agent satisfies desires that are regulated by reason, and reason dictates that we ought to prioritize and desire virtue and wisdom. This is because virtue and wisdom are goods that truly depend on the moral agent, as they proceed from the right use of the will, and do not depend on any external conditions. As Descartes writes:

We must consider what makes a life happy, that is, what are the things which can give us this supreme contentment. Such things, I observe, can be divided into two classes: those which depend on us, like virtue and wisdom, and those which do not, like honors, riches, and health. For it is certain that a person of good birth who is not ill, and who lacks nothing, can enjoy a more perfect contentment than another who is poor, unhealthy and deformed, provided the two are equally wise and virtuous. Nevertheless, a small vessel may be just as full as a large one, although it contains less liquid; and similarly if we regard each person’s contentment as the full satisfaction of all his desires duly regulated by reason, I do not doubt that the poorest people, least blest by nature and fortune, can be entirely content and satisfied just as much as everyone else, although they do not enjoy as many good things. It is only this sort of contentment which is here in question; to seek the other sort would be a waste of time, since it is not in our own power. (Letter to Princess Elizabeth 4 August 1645, AT IV: 264–5/CSMK: 257)

It is important to note that Descartes is not denying that honors, riches, beauty, health, and so on are genuine goods or perfections. Nor is he claiming that they are not desirable. Rather, he is merely claiming that such goods are neither necessary nor sufficient for blessedness. Virtue alone is necessary and sufficient for blessedness (Svensson 2015).

However, such external goods are conducive to well-being (the quality of life), and for that reason they are desirable (Svensson 2011). Compare a virtuous person, S, who is poor, unhealthy, and ugly and a virtuous person, S*, who is rich, healthy, and beautiful. In Svensson’s reading, S and S* will have the same degree of happiness. However, Descartes does have room to acknowledge that S* has more well-being than S, because S* possesses more perfections.

10. Classifying Descartes’ Ethics

We have examined the main features of Descartes’ ethics. But what kind of ethics does Descartes espouse? There are three distinct classifications of Descartes’ ethics in the literature: virtue ethics, deontological virtue ethics, and perfectionism.

a. Virtue Ethics

Given that virtue is the undeniable centerpiece of Descartes’ ethics, it is natural to read Descartes as a virtue ethicist. Broadly construed, according to virtue ethics, the standard for morality in ethics is possession of the right kinds of character traits (virtues), as opposed to producing the right sorts of consequences, or following the right kinds of moral laws, duties, or rules.

Lisa Shapiro has argued that Descartes is a virtue ethicist. Her contention is that Descartes’ commitment to virtue (as opposed to happiness) being the supreme good makes Descartes a virtue ethicist (2008a: 454). In this view, the ultimate explanation for why an action is good or bad is whether it proceeds from virtue. This would place Descartes in the tradition of Aristotelian virtue ethics, but Shapiro notes that there are significant differences. For Aristotle, virtue must be successful: “virtue requires the world cooperate with our intentions” (2008a: 455). By contrast, given Descartes’ moral epistemology, for Descartes “good intentions are sufficient for virtue” (Ibid.).

b. Deontological Virtue Ethics

Noa Naaman-Zauderer (2010) agrees with Lisa Shapiro that Descartes is a virtue ethicist, due to his commitment to virtue being the supreme good. However, Naaman-Zauderer claims that Descartes has a deontological understanding of virtue, and thus Descartes is actually a deontological virtue ethicist. Broadly construed, deontological ethics maintains that the standard of morality consists in the fulfillment of imperatives, duties, or ends.

Descartes indeed speaks of virtue in deontological terms. For example, he writes that the supreme good (virtue) is “undoubtedly the thing we ought to set ourselves as the goal of all our actions” (Letter to Princess Elizabeth 18 August 1645, AT IV: 275/CSMK: 261). According to Naaman-Zauderer, Descartes is claiming that we have a duty to practice virtue: “the practice of virtue as a command of reason, as a constitutive moral imperative that we must fulfill for its own sake” (2010: 185).

c. Perfectionism

Frans Svensson (2010; compare 2019a) has argued that Descartes is not a virtue ethicist, and that other commentators have mistakenly classified him as such due to a misunderstanding of the criteria of virtue ethics. Recall that Shapiro and Naaman-Zauderer claim that Descartes must be a virtue ethicist (of whatever stripe) due to his claim that virtue is the supreme good. However, Svensson claims that virtue ethics, deontological ethics, and consequential ethics alike can, strictly speaking, admit that virtue is the supreme good, in the sense that virtue should be the goal in all of our actions (2010: 217). Descartes’ account of the supreme good, then, does not make him a virtue ethicist.

The criterion for being a virtue ethicist is that “morally right conduct should be grounded ultimately in an account of virtue or a virtuous agent” (Ibid. 218). This requires an explanation of the nature of virtue that does not depend on some independent account of morally right conduct. The problem, however, is that although Descartes agrees that virtue can be explained without reference to some independent account of morally right conduct, Descartes departs from the virtue ethicist in that he thinks that virtue is not constitutive of morally right conduct.

Instead, Svensson proposes that Descartes is committed to perfectionism. In this view, what Descartes’ ethics demands is that the moral agent pursue “everything in his power in order to successfully promote his own overall perfection as far as possible” (Ibid. 221). As such, Svensson claims that Descartes’ ethics is “outcome-based, rather than virtue-based, and it is thus best understood as a kind of teleological, or even consequentialist ethics” (Ibid. 224).

11. Systematicity Revisited

Are there systematic connections between Descartes’ ethics and his metaphysics, epistemology, and natural philosophy? There are broadly two answers to this question in the literature: the epistemological reading and the organic reading.

a. The Epistemological Reading

In the epistemological reading, the tree of philosophy conveys an epistemological order to Cartesian philosophy (Marshall 1998, 2–4, 72–74, 59–60; Morgan 1994, 204–211; Rutherford 2004, 190). One must learn philosophy in the following order: metaphysics and epistemology, physics, and then the various sub-branches of natural philosophy, and finally ethics. As applied to ethics, proponents of the epistemological reading are primarily concerned with an epistemological order to ethics qua practical enterprise, not theoretical enterprise. For example, in order to acquire virtue and happiness, one must have knowledge of metaphysics and epistemology. As Donald Rutherford writes: virtue and happiness “can be guaranteed only if reason itself has been perfected through the acquisition and proper ordering of intellectual knowledge” (2004: 190).

A consequence of the epistemological reading is that one cannot read any ethical practices into the Meditations. While there may be ethical themes in the Meditations, the meditator cannot acquire or exercise any kind of moral virtue (epistemic virtue is a separate matter). The issue of whether virtue has a role in the Meditations has been a contemporary topic of debate. In particular, there has been a debate about whether the meditator acquires the virtue of generosity. Recall that the virtue of generosity consists of two components: the knowledge that the only thing that truly belongs to us is free will, and the firm and constant resolution to use the will well. It seems that the meditator, in the Fourth Meditation, acquires both of these components through her reflection on the nature of the will and her resolution to use the will well. Indeed, Lisa Shapiro has argued extensively that this is exactly what is happening, and thus generosity, and ethics more generally, has a role in the epistemic achievements of the meditator and the regulation of her passions. Omri Boehm (2014) has also argued that the virtue of generosity is actually acquired in the Second Meditation vis-à-vis the cogito. Parvizian (2016) has argued against Shapiro’s and Boehm’s views on the grounds that generosity presupposes the knowledge of T1–T6 explained in section 4, which the meditator does not have access to by the Second or Fourth Meditation. But let us turn to the view that argues that ethics does have a role in metaphysics and epistemology.

b. The Organic Reading

In the organic reading, the tree of philosophy does not represent strict divisions between philosophical fields, and there is not a strict epistemological order to philosophy, and especially ethics qua practical enterprise. Rather the tree is organic. This reading is drawn from Lisa Shapiro (2008a), Genevieve Rodis-Lewis (1987), Amy Schmitter (2002), and Vance Morgan (1994) (although Morgan does not draw the same conclusion about ethics as the rest of these commentators). Morgan writes: “in a living organism such as a tree, all the connected parts grow simultaneously, dependent upon one another . . . hence the basic structure of the tree, branches and all, is apparent at the very early stage in its development” (1994, 25). Developing Rodis-Lewis’ interpretation, Shapiro writes:

Generosity is a seed-bearing fruit, and that seed, if properly cultivated, will grow into the tree of philosophy. In this way, morals is not simply one branch among the three branches of philosophy, but provides the ‘ultimate level of wisdom’ by leading us to be virtuous and ensuring the tree of philosophy continues to thrive. (2008a: 459)

Applying this view to generosity, Shapiro claims that generosity is “the key to Cartesian metaphysics and epistemology” (2008a: 459). Placing generosity in the Meditations has interpretive benefits. In particular, it may be able to explain the presence and regulation of the meditator’s passions from the First to Sixth Meditation (Shapiro 2005). Moreover, it shows the deep systematicity of Descartes’ ethics, for ethical themes are present right at the foundations of the system.

12. References and Further Reading

a. Abbreviations

AG: Philosophical Essays (cited by page)
AT: Oeuvres de Descartes (cited by volume and page)
CSM: The Philosophical Writings of Descartes, vol. 1 & 2 (cited by volume and page)
CSMK: The Philosophical Writings of Descartes, vol. 3 (cited by page)

b. Primary Sources

Aristotle. (2000). Nicomachean Ethics. Translated by Terence Irwin (Second Edition). Indianapolis: Hackett.
Descartes, R. (1996), Oeuvres de Descartes. (C. Adam, & P. Tannery, Eds.) Paris: J. Vrin.
Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. II). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
Descartes, R. (1985). The Philosophical Writings of Descartes (Vol. I). (J. Cottingham, R. Stoothoff, & D. Murdoch, Trans.) Cambridge: Cambridge University Press.
Descartes, R. (1991). The Philosophical Writings of Descartes: The Correspondence (Vol. III). (J. Cottingham, R. Stoothoff, D. Murdoch, & A. Kenny, Trans.) Cambridge: Cambridge University Press.
Leibniz, G. W. (1989). Philosophical Essays. Trans. Ariew, R. and Garber, D. Indianapolis: Hackett.
Princess Elizabeth and Descartes (2007). The Correspondence Between Princess Elizabeth of Bohemia and René Descartes. Edited and Translated by Lisa Shapiro. University of Chicago Press.

c. Secondary Sources

Alanen, L. (2003a). Descartes’s Concept of Mind. Harvard University Press.
Alanen, L. (2003b). “The Intentionality of Cartesian Emotions,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 107–27.
Alanen, L. and Svensson, F. (2007). “Descartes on Virtue.” In Hommage à Wlodek: Philosophical Papers Dedicated to Wlodek Rabinowicz, ed. by T. Rønnow-Rasmussen, B. Petersson, J. Josefsson, and D. Egonsson. http://www.fil.lu.se/hommageawlodek.
Ariew, R. (1992). “Descartes and the Tree of Knowledge,” Synthese, 1:101–116.
Beardsley, W. (2005), “Love in the Ruins: Passions in Descartes’ Meditations.” In J. Jenkins, J. Whiting, & C. Williams (Eds.), Persons and Passions: Essays in Honor of Annette Baier (pp. 34–47). Notre Dame: University of Notre Dame Press.
Beavers, A. F. (1989). “Desire and Love in Descartes’s Late Philosophy.” History of Philosophy Quarterly 6 (3):279–294.
Boehm, O. (2014), “Freedom and the Cogito,” British Journal for the History of Philosophy, 22: 704–724.
Boros, G. (2003). “Love as a Guiding Principle of Descartes’s Late Philosophy.” History of Philosophy Quarterly, 20(2), 149–163.
Brassfield, S. (2013), “Descartes and the Danger of Irresolution.” Essays in Philosophy, 14: 162–78.
Brown, D. J. (2006), Descartes and the Passionate Mind. Cambridge: Cambridge University Press.
Brown, D. J. (2012). Cartesian Functional Analysis. Australasian Journal of Philosophy 90 (1):75–92.
Chamberlain, C. (2019). “The body I call ‘mine’”: A sense of bodily ownership in Descartes. European Journal of Philosophy 27 (1): 3–24.
Clarke, D. M. (2005). Descartes’s Theory of Mind. Oxford University Press.
Cimakasky, Joseph & Polansky, Ronald (2012). Descartes’ ‘provisional morality’. Pacific Philosophical Quarterly 93 (3):353–372.
Davies, R. (2001), Descartes: Belief, Skepticism, and Virtue. London: Routledge.
Des Chene, D. (2012), “Using the Passions,” in M. Pickavé and L. Shapiro (eds.), Emotion and Cognitive Life in Medieval and Early Modern Philosophy. Oxford: Oxford University Press.
De Rosa, R. (2007a). ‘The Myth of Cartesian Qualia,’ Pacific Philosophical Quarterly 88(2), pp. 181–207.
Franco, A. B. (2015). “The Function and Intentionality of Cartesian Émotions.” Philosophical Papers 44 (3):277–319.
Franco, A. B. (2016). “Cartesian Passions: Our (Imperfect) Natural Guides Towards Perfection.” Journal of Philosophical Research 41: 401–438
Frierson, Patrick (2002). “Learning to love: From egoism to generosity in Descartes.” Journal of the History of Philosophy 40 (3):313–338.
Frigo, Alberto (2016). A very obscure definition: Descartes’s account of love in the Passions of the Soul and its scholastic background. British Journal for the History of Philosophy 24 (6):1097–1116.
Gottlieb, Joseph & Parvizian, Saja (2018). “Cartesian Imperativism.” Pacific Philosophical Quarterly (99): 702–725
Greenberg, Sean (2007). Descartes on the passions: Function, representation, and motivation. Noûs 41 (4):714–734.
Hatfield, G. (2013). ‘Descartes on Sensory Representation, Objective Reality, and Material Falsity,’ in K. Detlefsen (ed.) Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press, pp. 127–150.
Kambouchner, D. (2009). Descartes, la philosophie morale, Paris: Hermann.
LeDoeuff, M. (1989). “Red Ink in the Margins.” In The Philosophical Imaginary, trans. C. Gordon. Stanford: Stanford University Press.
Marshall, J. (1998), Descartes’s Moral Theory. Ithaca: Cornell University Press.
Marshall, J. (2003). “Descartes’ Morale Par Provision,” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 191–238.
Mihali, A. (2011). “Sum Res Volans: The Centrality of Willing for Descartes.” International Philosophical Quarterly 51 (2):149–179.
Morgan, V. G. (1994), Foundations of Cartesian Ethics. Atlantic Highlands: Humanities Press.
Murdoch, D. (1993). “Exclusion and Abstraction in Descartes’ Metaphysics,” The Philosophical Quarterly, 43: 38–57.
Naaman-Zauderer, N. (2010), Descartes’ Deontological Turn: Reason, Will, and Virtue in the Later Writings. Cambridge: Cambridge University Press.
Newman, L., “Descartes’ Epistemology”, The Stanford Encyclopedia of Philosophy (Winter 2014 Edition), Edward N. Zalta (ed.).
Parvizian, Saja (2016). “Generosity, the Cogito, and the Fourth Meditation.” Res Philosophica 93 (1):219–243
Pereboom, Derk, 1994. “Stoic Psychotherapy in Descartes and Spinoza,” Faith and Philosophy, 11: 592–625.
Rodis-Lewis, G. (1957). La morale de Descartes. [1. ed.] Paris: Presses universitaires de France
Rodis-Lewis, G. (1987), “Le Dernier Fruit de la Métaphysique Cartésienne: la Générosité”, Etudes Philosophiques, 1: 43–54.
Rutherford, D. (2004), “On the Happy Life: Descartes vis-à-vis Seneca,” in S. K. Strange, & J. Zupko (eds.), Stoicism: Traditions and Transformations. Cambridge: Cambridge University Press.
Rutherford, D. (2014). “Reading Descartes as a Stoic: Appropriate Actions, Virtue, and the Passions,” Philosophie antique, 14: 129–155.
Rysiew, Patrick, “Epistemic Contextualism”, The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.).
Schmitter, A. M. (2002), “Descartes and the Primacy of Practice: The Role of the Passions in the Search for Truth,” Philosophical Studies (108), 99–108.
Shapiro, L. (1999), “Cartesian Generosity,” Acta Philosophica Fennica, 64: 249–75.
Shapiro, L. (2005), “What Are the Passions Doing in the Meditations?,” in J. Jenkins, J. Whiting, & C. Williams (eds.), Persons and Passions: Essays in Honor of Annette Baier. Notre Dame: University of Notre Dame Press.
Shapiro, L. (2008a), “Descartes’s Ethics,” In J. Broughton, & J. Carriero (eds.), A Companion to Descartes. Malden: Blackwell Publishing.
Shapiro, L. (2008b), “‘Turn My Will in Completely the Opposite Direction’: Radical Doubt and Descartes’s Account of Free Will,” in P. Hoffman, D. Owen, & G. Yaffe (eds.), Contemporary Perspectives on Early Modern Philosophy. Buffalo: Broadview Press.
Shapiro, L. (2011), “Descartes on Human Nature and the Human Good,” in C. Fraenkel, J. E. Smith, & P. Dario (eds.), The Rationalists: Between Tradition and Innovation. New York: Springer.
Shapiro, L. (2013), “Cartesian Selves,” in K. Detlefsen (ed.), Descartes’ Meditations: A Critical Guide. Cambridge: Cambridge University Press.
Simmons, A. (1999). ‘Are Cartesian Sensations Representational?’ Noûs 33(3), pp. 347–369.
Svensson, F. (2010). The Role of Virtue in Descartes’ Ethical Theory, Or: Was Descartes a Virtue Ethicist?. History of Philosophy Quarterly 27(3): 215–236
Svensson, F. (2011). Happiness, Well-being, and Their Relation to Virtue in Descartes’ Ethics. Theoria 77 (3):238–260.
Svensson, F. (2015). Non-Eudaimonism, The Sufficiency of Virtue for Happiness, and Two Senses of the Highest Good in Descartes’s Ethics. British Journal for the History of Philosophy 23 (2):277–296.
Svensson, F. (2019a) “A Cartesian Distinction in Virtue: Moral and Perfect.” In Mind, Body, and Morality: New Perspectives on Descartes and Spinoza edited by Martina Reuter, Frans Svensson. Routledge.
Svensson, F. (2019b). “Descartes on the Highest Good.” American Catholic Philosophical Quarterly 93 (4):701–721.
Sosa, E. (2012), “Descartes and Virtue Epistemology,” in K. J. Clark, & M. Rea (eds.), Reason, Metaphysics, and Mind: New Essays on the Philosophy of Alvin Plantinga. Oxford: Oxford University Press.
Williston, B. (1997). Descartes on Love and/as Error. Journal of the History of Ideas 58 (3):429–444.
Williston, B. (2003). “The Cartesian Sage and the Problem of Evil” in Passion and Virtue in Descartes, edited by B. Williston and A. Gombay. Amherst, NY: Humanity Books. 301–331.

Author Information

Saja Parvizian
Email: sparvizia@coastal.edu
Coastal Carolina University
U. S. A.

Aristotle (384 B.C.E.—322 B.C.E.)

Aristotle is a towering figure in ancient Greek philosophy, who made important contributions to logic, criticism, rhetoric, physics, biology, psychology, mathematics, metaphysics, ethics, and politics. He was a student of Plato for twenty years but is famous for rejecting Plato’s theory of forms. He was more empirically minded than both Plato and Plato’s teacher, Socrates.

A prolific writer, lecturer, and polymath, Aristotle radically transformed most of the topics he investigated. In his lifetime, he wrote dialogues and as many as 200 treatises, of which only 31 survive. These works are in the form of lecture notes and draft manuscripts never intended for general readership. Nevertheless, they are the earliest complete philosophical treatises we still possess.

As the father of western logic, Aristotle was the first to develop a formal system for reasoning. He observed that the deductive validity of any argument can be determined by its structure rather than its content, for example, in the syllogism: All men are mortal; Socrates is a man; therefore, Socrates is mortal. Even if the content of the argument were changed from being about Socrates to being about someone else, because of its structure, as long as the premises are true, then the conclusion must also be true. Aristotelian logic dominated until the rise of modern propositional logic and predicate logic 2000 years later.

The emphasis on good reasoning serves as the backdrop for Aristotle’s other investigations. In his natural philosophy, Aristotle combines logic with observation to make general, causal claims. For example, in his biology, Aristotle uses the concept of species to make empirical claims about the functions and behavior of individual animals. However, as revealed in his psychological works, Aristotle is no reductive materialist. Instead, he thinks of the body as the matter, and the psyche as the form of each living animal.

Though his natural scientific work is firmly based on observation, Aristotle also recognizes the possibility of knowledge that is not empirical. In his metaphysics, he claims that there must be a separate and unchanging being that is the source of all other beings. In his ethics, he holds that it is only by becoming excellent that one could achieve eudaimonia, a sort of happiness or blessedness that constitutes the best kind of human life.

Aristotle was the founder of the Lyceum, a school based in Athens, Greece; and he was the first of the Peripatetics, his followers from the Lyceum. Aristotle’s works exerted tremendous influence on ancient and medieval thought and continue to inspire philosophers to this day.

Life and Lost Works
Analytics or “Logic”
Theoretical Philosophy
Practical Philosophy
Aristotle’s Influence
Abbreviations
1. Abbreviations of Aristotle’s Works
2. Other Abbreviations
References and Further Reading
1. Aristotle’s Complete Works
2. Secondary Sources

1. Life and Lost Works

Though our main ancient source on Aristotle’s life, Diogenes Laertius, is of questionable reliability, the outlines of his biography are credible. Diogenes reports that Aristotle’s Greek father, Nicomachus, served as private physician to the Macedonian king Amyntas (DL 5.1.1). At the age of seventeen, Aristotle migrated to Athens where he joined the Academy, studying under Plato for twenty years (DL 5.1.9). During this period Aristotle acquired his encyclopedic knowledge of the philosophical tradition, which he draws on extensively in his works.

Aristotle left Athens around the time Plato died, in 348 or 347 B.C.E. One explanation is that as a resident alien, Aristotle was excluded from leadership of the Academy in favor of Plato’s nephew, the Athenian citizen Speusippus. Another possibility is that Aristotle was forced to flee as Philip of Macedon’s expanding power led to the spread of anti-Macedonian sentiment in Athens (Chroust 1967). Whatever the cause, Aristotle subsequently moved to Atarneus, which was ruled by another former student at the Academy, Hermias. During his three years there, Aristotle married Pythias, the niece or adopted daughter of Hermias, and perhaps engaged in negotiations or espionage on behalf of the Macedonians (Chroust 1972). In any event, the couple relocated to Macedonia, where Aristotle was employed by Philip, serving as tutor to his son, Alexander the Great (DL 5.1.3–4). Aristotle’s philosophical career was thus directly entangled with the rise of a major power.

After some time in Macedonia, Aristotle returned to Athens, where he founded his own school in rented buildings in the Lyceum. It was presumably during this period that he authored most of his surviving texts, which have the appearance of lecture transcripts edited so they could be read aloud in Aristotle’s absence. Indeed, this must have been necessary, since after his school had been in operation for thirteen years, he again departed from Athens, possibly because a charge of impiety was brought against him (DL 5.1.5). He died at age 63 in Chalcis (DL 5.1.10).

Diogenes tells us that Aristotle was a thin man who dressed flashily, wearing a fashionable hairstyle and a number of rings. If the will quoted by Diogenes (5.1.11–16) is authentic, Aristotle must have possessed significant personal wealth, since it promises a furnished house in Stagira, three female slaves, and a talent of silver to his concubine, Herpyllis. Aristotle fathered a daughter with Pythias and, with Herpyllis, a son, Nicomachus (named after his grandfather), who may have edited Aristotle’s Nicomachean Ethics. Unfortunately, since there are few extant sources on Aristotle’s life, one’s judgment about the accuracy and completeness of these details depends largely on how much one trusts Diogenes’ testimony.

Since commentaries on Aristotle’s work have been produced for around two thousand years, it is not immediately obvious which sources are reliable guides to his thought. Aristotle’s works have a condensed style and make use of a peculiar vocabulary. Though he wrote an introduction to philosophy, a critique of Plato’s theory of forms, and several philosophical dialogues, these works survive only in fragments. The extant Corpus Aristotelicum consists of Aristotle’s recorded lectures, which cover almost all the major areas of philosophy. Before the invention of the printing press, handwritten copies of these works circulated in the Near East, northern Africa, and southern Europe for centuries. The surviving manuscripts were collected and edited in August Immanuel Bekker’s authoritative 1831–1836 Berlin edition of the Corpus (“Bekker” 1910). All references to Aristotle’s works in this article follow the standard Bekker numbering.

The extant fragments of Aristotle’s lost works, which modern commentators sometimes use as the basis for conjectures about his philosophical development, are noteworthy. A fragment of his Protrepticus preserves a striking analogy according to which the psyche or soul’s attachment to the body is a form of punishment:

The ancients blessedly say that the psyche pays penalty and that our life is for the atonement of great sins. And the yoking of the psyche to the body seems very much like this. For they say that, as Etruscans torture captives by chaining the dead face to face with the living, fitting each to each part, so the psyche seems to be stretched throughout, and constrained to all the sensitive members of the body. (Pistelli 1888, 47.24–48.1)

According to this allegedly inspired theory, the fetters that bind the psyche to the body are similar to those by which the Etruscans torture their prisoners. Just as the Etruscans chain prisoners face to face with a dead body so that each part of the living body touches a part of the corpse, the psyche is said to be aligned with the parts of one’s living body. On this view, the psyche is embodied as a painful but corrective atonement for its badness. (See Bos 2003 and Hutchinson and Johnson’s webpage.)

The incompatibility of this passage with Aristotle’s view that the psyche is inseparable from the body (discussed below) has been explained in various ways. Neo-Platonic commentators distinguish between Aristotle’s esoteric and exoteric writings, that is, writings intended for circulation within his school, and writings like the Protrepticus intended for a broader reading public (Gerson 2005, 47–75). Some modern scholars have argued to the contrary that the imprisonment of the psyche in the body indicates that Aristotle was still a Platonist at the time he composed the Protrepticus, which must have been written earlier than his mature works (Jaeger 1948, 100). Aristotle’s dialogue Eudemus, which contains arguments for the immortality of the psyche, and his Politicus, which is about the ideal statesman, seem to corroborate the view that Aristotle’s exoteric works hold much that is Platonic in spirit (Chroust 1965; 1966). The latter contains the seemingly Platonic assertion that “the good is the most exact of measures” (Kroll 1902, 168: 927b4–5).

But not all agree. Owen (1968, 162–163) argues that Aristotle’s fundamental logical distinction between individual and species depends on an antecedent break with Plato. According to this view, Aristotle’s On Ideas (Fine 1993), a collection of arguments against Platonic forms, shows that Aristotle rejected Platonism early in his career, though he later became more sympathetic to the master’s views. However, as Lachterman (1980) points out, such historical theses depend on substantive hermeneutical assumptions about how to read Aristotle and on theoretical assumptions about what constitutes a philosophical system. This article focuses not on this historical debate but on the theories propounded in Aristotle’s extant works.

2. Analytics or “Logic”

Aristotle is usually identified as the founder of logic in the West (although autonomous logical traditions also developed in India and China), where his “Organon,” consisting of his works the Categories, On Interpretation, Prior Analytics, Posterior Analytics, Sophistical Refutations, and Topics, long served as the traditional manuals of logic. Two other works—Rhetoric and Poetics—are not about logic, but also concern how to communicate to an audience. Curiously, Aristotle never used the words “logic” or “organon” to refer to his own work but calls this discipline “analytics.” Though Aristotelian logic is sometimes referred to as an “art” (Ross 1940, iii), it is clearly not an art in Aristotle’s sense, which would require it to be productive of some end outside itself. Nevertheless, this article follows the convention of referring to the content of Aristotle’s analytics as “logic.”

a. The Meaning and Purpose of Logic

What is logic for Aristotle? On Interpretation begins with a discussion of meaning, according to which written words are symbols of spoken words, while spoken words are symbols of thoughts (Int.16a3–8). This theory of signification can be understood as a semantics that explains how different alphabets can signify the same spoken language, while different languages can signify the same thoughts. Moreover, this theory connects the meaning of symbols to logical consequence, since commitment to some set of utterances rationally requires commitment to the thoughts signified by those utterances and to what is entailed by them. Hence, though Cook Wilson (1926, 30–33) correctly notes that Aristotle nowhere defines logic, it may be called the science of thinking, where the role of the science is not to describe ordinary human reasoning but rather to demonstrate what one ought to think given one’s other commitments. Though the elements of Aristotelian logic are implicit in our conscious reasoning, Aristotelian “analysis” makes explicit what was formerly implicit (Cook Wilson 1926, 49).

Aristotle shows how logic can demonstrate what one should think, given one’s commitments, by developing the syntactical concepts of truth, predication, and definition. In order for a written sentence, utterance, or thought to be true or false, Aristotle says, it must include at least two terms: a subject and a predicate. Thus, a simple thought or utterance such as “horse” is neither true nor false but must be combined with another term, say, “fast” in order to form a compound—“the horse is fast”—that describes reality truly or falsely. The written sentence “the horse is fast” has meaning insofar as it signifies the spoken sentence, which in turn has meaning in virtue of its signifying the thought that the horse is fast (Int.16a10–18, Cat.13b10–12, DA 430a26–b1). Aristotle holds that there are two kinds of constituents of meaningful sentences: nouns and their derivatives, which are conventional symbols without tense or aspect; and verbs, which have a tense and aspect. Though all meaningful speech consists of combinations of these constituents, Aristotle limits logic to the consideration of statements, which assert or deny the presence of something in the past, present, or future (Int.17a20–24).

Aristotle analyzes statements as cases of predication, in which a predicate P is attributed to a subject S as in a sentence of the form “S is P.” Since he holds that every statement expresses something about being, statements of this form are to be read as “S is (exists) as a P” (Bäck 2000, 11). In every true predication, either the subject and predicate are of the same category, or the subject term refers to a substance while the predicate term refers to one of the other categories. The primary substances are individuals, while secondary substances are species and genera composed of individuals (Cat.2a11–18). This distinction between primary and secondary reflects a dependence relation: if all the individuals of a species or genus were annihilated, the species and genus could not, in the present tense, be truly predicated of any subject.

Every individual is of a species and that species is predicated of the individual. Every species is a member of a genus, which is predicated of the species and of each individual of that species (Cat.2b13–22). For example, if Callias is of the species “man,” and the species is a member of the genus “animal,” then “man” is predicated of Callias, and “animal” is predicated both of “man” and of Callias. The individual, Callias, inherits the predicate “animal” in virtue of being of the species “man.” But inheritance stops at the individual and does not apply to its proper parts. For example, “man” is not truly predicated of Callias’ hand. A genus can be divided with reference to the specific differences among its members; for example, “biped” differentiates “man” from “horse.”
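The inheritance of predicates described above can be sketched as a small lookup over a species and genus chain. This is only a toy model under stated assumptions: the dictionaries `KIND_OF` and `PART_OF` and the function names are hypothetical, chosen for illustration, and nothing in Aristotle's text corresponds to them directly.

```python
# Toy model (illustrative assumption): each entry maps a subject to the
# species or genus immediately predicated of it.
KIND_OF = {"Callias": "man", "man": "animal", "horse": "animal"}

# Parthood is a different relation from predication, so it is kept apart
# and is never followed when collecting predicates.
PART_OF = {"Callias' hand": "Callias"}


def predicates_of(subject):
    """Return the species and genera predicated of a subject, nearest first."""
    found = []
    current = subject
    while current in KIND_OF:
        current = KIND_OF[current]
        found.append(current)
    return found


def is_predicated_of(predicate, subject):
    """True when the predicate is inherited by the subject through its kind."""
    return predicate in predicates_of(subject)


print(predicates_of("Callias"))                   # ['man', 'animal']
print(is_predicated_of("animal", "man"))          # True
print(is_predicated_of("man", "Callias' hand"))   # False
```

Because the lookup follows only `KIND_OF`, the hand inherits nothing from Callias, which mirrors the point that inheritance stops at the individual and does not extend to its proper parts.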

While no definition can be given of an individual or primary substance such as Callias, when one gives the genus and all the specific differences possessed by a kind of thing, one can define a thing’s species. A specific difference is a predicate that falls under one of the categories. Thus, Aristotelian categories can be seen as a taxonomical scheme, a way of organizing predicates for discovery, or as a metaphysical doctrine about the kinds of beings there are. But any reading must accommodate Aristotle’s views that primary substances are never predicated of a subject (Cat.3a6), that a predicate may fall under multiple categories (Cat.11a20–39), and that some terms, such as “good,” are predicated in all the categories (NE 1096a23–29). Moreover, definitions are reached not by demonstration but by other kinds of inquiry, such as dialectic, the art by which one makes divisions in a genus; and induction, which can reveal specific differences from the observation of individual examples.

b. Demonstrative Syllogistic

Syllogistic reasoning builds on Aristotle’s theory of predication, showing how to reason from premises to conclusions. A syllogism is a discourse in which, when some statements are taken as premises, a different statement can be shown to follow as a conclusion (AnPr.24b18–22). The basic form of the Aristotelian syllogism involves a major premise, a minor premise, and a conclusion, so that it has the form

If A is predicated of all B,

And B is predicated of all C,

Then A is predicated of all C.

This is an assertion of formal logic, since by removing the values of the variables A, B, and C, one treats the inference formally, such that the values of the major term A, the middle term B, and the minor term C are not given as part of the syllogistic form (Łukasiewicz, 10–14).
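The sense in which this form holds whatever values A, B, and C take can be illustrated with a brute-force check. The sketch below rests on a modern simplifying assumption that is not Aristotle’s own formulation: “A is predicated of all B” is read extensionally, as “every member of B is a member of A.” The function names are hypothetical, and a search over a three-element universe is a finite illustration, not a proof.

```python
from itertools import product


def subsets(universe):
    """Yield every subset of a small universe as a frozenset."""
    items = sorted(universe)
    for mask in product([False, True], repeat=len(items)):
        yield frozenset(x for x, keep in zip(items, mask) if keep)


def predicated_of_all(a, b):
    """'A is predicated of all B', read extensionally: every B is an A."""
    return b <= a


def form_is_valid(premises, conclusion, universe):
    """Search all assignments of terms for a counterexample to the form."""
    terms = list(subsets(universe))
    for a, b, c in product(terms, repeat=3):
        if all(p(a, b, c) for p in premises) and not conclusion(a, b, c):
            return False  # true premises, false conclusion
    return True


# The form given above: A of all B, B of all C, therefore A of all C.
first_figure = form_is_valid(
    premises=[lambda a, b, c: predicated_of_all(a, b),
              lambda a, b, c: predicated_of_all(b, c)],
    conclusion=lambda a, b, c: predicated_of_all(a, c),
    universe={1, 2, 3},
)

# A contrasting invalid form: A of all B, A of all C, therefore B of all C.
shared_predicate = form_is_valid(
    premises=[lambda a, b, c: predicated_of_all(a, b),
              lambda a, b, c: predicated_of_all(a, c)],
    conclusion=lambda a, b, c: predicated_of_all(b, c),
    universe={1, 2, 3},
)

print(first_figure)      # True
print(shared_predicate)  # False
```

No particular terms are named anywhere in the check, which is the point of treating the inference formally: the first form survives every assignment, while the second fails as soon as B and C are distinct parts of A.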

Though this form can be utilized in dialectic, in which the major term A is related to C through the middle term B credibly rather than necessarily (AnPo.81b10–23), Aristotle is mainly concerned with how to use syllogistic in what he calls demonstrative reasoning, that is, in inference from certain premises to a certain conclusion. A demonstrative syllogism is not concerned with a mere opinion but proves a cause, that is, answers a “why” question (AnPo.85b23–26).

The validity of a syllogism can be tested through comparison of four basic types of assertions: All S are P (A), No S are P (E), Some S are P (I), and Some S are not P (O). The truth conditions of these assertions are determined relationally: through contradiction, in which if one of the assertions is true, the other must be false; contrariety, in which the two assertions cannot both be true; and subalternation, in which the universal assertion’s being true requires that the particular assertion must be true, as well. These relationships are summed up in the traditional square of opposition used by medieval Aristotelian logicians (see Groarke, Aristotle: Logic).

Figure 1: The Traditional Square of Opposition illustrates the relations between the fundamental judgment-forms in Aristotelian syllogistic: (A) All S are P, (E) No S are P, (I) Some S are P, and (O) Some S are not P.
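The relations in the square can be checked mechanically on small examples. The following sketch again assumes a modern set-based reading of the four assertion types, which is not how Aristotle or the medieval logicians stated them, and the function `square_relations` with its `allow_empty_subject` flag is a hypothetical name introduced for illustration. The flag makes explicit a further assumption of the traditional square: the subject term S is taken to have at least one member.

```python
from itertools import product


def subsets(universe):
    """Yield every subset of a small universe as a frozenset."""
    items = sorted(universe)
    for mask in product([False, True], repeat=len(items)):
        yield frozenset(x for x, keep in zip(items, mask) if keep)


def A(s, p):  # All S are P
    return s <= p


def E(s, p):  # No S are P
    return not (s & p)


def I(s, p):  # Some S are P
    return bool(s & p)


def O(s, p):  # Some S are not P
    return bool(s - p)


def square_relations(universe, allow_empty_subject=False):
    """Report which relations of the square hold for every pair of terms."""
    terms = list(subsets(universe))
    pairs = [(s, p) for s, p in product(terms, repeat=2)
             if allow_empty_subject or s]
    return {
        # A/O and E/I always take opposite truth values.
        "contradiction": all(A(s, p) != O(s, p) and E(s, p) != I(s, p)
                             for s, p in pairs),
        # A and E are never true together.
        "contrariety": all(not (A(s, p) and E(s, p)) for s, p in pairs),
        # A implies I, and E implies O.
        "subalternation": all((not A(s, p) or I(s, p)) and
                              (not E(s, p) or O(s, p)) for s, p in pairs),
    }


print(square_relations({1, 2, 3}))
# {'contradiction': True, 'contrariety': True, 'subalternation': True}
print(square_relations({1, 2, 3}, allow_empty_subject=True))
# {'contradiction': True, 'contrariety': False, 'subalternation': False}
```

On this reading, contradiction holds unconditionally, while contrariety and subalternation depend on the subject term being non-empty.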

Syllogistic may be employed dialectically when the premises are accepted on the authority of common opinion, from tradition, or from the wise. In any dialectical syllogism, the premises can be generally accepted opinions rather than necessary principles (Top.100a25–b21). At least some premises in rhetorical proofs must be not necessary but only probable, happening only for the most part.

When the premises are known, and conclusions are shown to follow from those premises, one gains knowledge by demonstration. Demonstration is necessary (AnPo.73a21–27) because the conclusion of a demonstrative syllogism predicates something that is either necessarily true or necessarily false of the subject of the premise. One has demonstrative knowledge when one knows the premises and has derived a necessary conclusion from them, since the cause given in the premises explains why the conclusion is so (AnPo.75a12–17, 35–37). Consequently, valid demonstration depends on the known premises containing terms for the genus of which the species in the conclusion is a member (AnPo.76a29–30).

One interesting problem that arises within Aristotle’s theory of demonstration concerns the connection between temporality and necessity. By the principle of excluded middle, necessarily, either there will be a sea-battle tomorrow or there will not be a sea-battle tomorrow. But since the sea-battle itself has yet neither come about nor failed to come about, it seems that one must say, paradoxically, that one alternative is necessary but that either alternative might come about (Int.19a22–34). The question of how to account for unrealized possibilities and necessities is part of Aristotle’s modal syllogistic, which is discussed at length in his Prior Analytics. For a discussion, see Malink (2013).

c. Induction, Experience, and Principles

Whenever a speaker reasons from premises, an auditor can ask for their demonstration. The speaker then needs to adduce additional premises for that demonstration. But if this line of questioning went on interminably, no demonstration could be made, since every premise would require a further demonstration, ad infinitum. In order to stop an infinite regress of premises, Aristotle postulates that for an inference to count as demonstrative, one must know its indemonstrable premises (AnPo.73a16–20). Thus, demonstrative science depends on the view that all teaching and learning proceed from already present knowledge (AnPo.72b5–20). In other words, the possibility of making a complete argument, whether inductive or deductive, depends on the reasoner possessing the concept in question.

The acquisition of concepts must in some way be perceptual, since Aristotle says that universals come to rest in the soul through experience, which comes about from many memories of the same thing, which in turn comes about by perception (AnPo.99b32–100a9). However, Aristotle holds that some concepts are already manifested in one’s perceptual experience: children initially call all men father and all women mother, only later developing the capacity to apply the relevant concepts to particular individuals (Phys.184b3–5). As Cook Wilson (1926, 45) puts it, perception is in a way already of a universal. Upon learning to speak, the child already possesses the concept “mother” but does not grasp the conditions of its correct application. The role of perception, and hence of memory and experience, is then not to supply the child with universal concepts but to fix the conditions under which they are correctly predicated of an individual or species. Hence the ability to arrive at definitions, which serve as starting points of a science, rests on the human being’s natural capacity to use language and on the culturally specific social and political conditions in which that capacity is manifested (Winslow 2013, 45–49).

While deduction proceeds by a form of syllogistic reasoning in which the major and minor premise both predicate what is necessarily true of a subject, inductive reasoning moves from particulars to universals, so it is impossible to gain knowledge of universals except by induction (AnPo.81a38–b9). This movement, from the observation of the same occurrence, to an experience that emerges from many memories, to a universal judgment, is a cognitive process by which human beings understand reality (see AnPo.88a2–5, Met.980b28–981a1, NE 1098b2–4, 1142a12).

But what makes such an inference a good one? Aristotle seems to say an inductive inference is sound when what is true in each case is also true of the class under which the cases fall (AnPr.68b15–29). For example, that all bileless animals are long-lived is inferred from the observation that each kind of bileless animal (men, horses, mules, and so on) is long-lived, just when the following syllogism is sound: (1) All men, horses, mules, and so on are long-lived; (2) All long-lived animals are bileless; therefore (3) all men, horses, mules, and so on are bileless (see Groarke sections 10 and 11). However, Aristotle does not think that knowledge of universals is pieced together from knowledge of particulars but rather he thinks that induction is what allows one to actualize knowledge by grasping how the particular case falls under the universal (AnPr.67a31–b5).

A true definition reveals the essential nature of something, what it is to be that thing (AnPo.90b30–31). A sound demonstration shows what is necessary of an observed subject (AnPo.90b38–91a5). It is essential, however, that the observation on which a definition is based be inductively true, that is, that it be based on causes rather than on chance. Regardless of whether one is asking what something is in a definition or why something is the way it is by giving its cause, it is only when the principles or starting points of a science are given that demonstration becomes possible. Since experience is what gives the principles of each science (AnPr.46a17–27), logic can only be employed at a later stage to demonstrate conclusions from these starting points. This is why logic, though it is employed in all branches of philosophy, is not a part of philosophy. Rather, in the Aristotelian tradition, logic is an instrument for the philosopher, just as a hammer and anvil are instruments for the blacksmith (Ierodiakonou 1998).

d. Rhetoric and Poetics

Just as dialectic searches for truth, Aristotelian rhetoric serves as its counterpart (Rhet.1354a1), searching for the means by which truth can be grasped through language. Thus, rhetorical demonstration, or enthymeme, is a kind of syllogism that strictly speaking belongs to dialectic (Rhet.1355a8–10). Because rhetoric uses the particularly human capacity of reason to formulate verbal arguments, it is the art that can cause the most harm when it is used wrongly. It is thus not a technique for persuasion at any cost, as some Sophists have taught, but a fundamentally second-personal way of using language that allows the auditor to reach a judgment (Grimaldi 1972, 3–5). More fundamentally, rhetoric is defined as the detection of persuasive features of each subject matter (Rhet.1355b12–22).

Proofs given in speech depend on three things: the character (ethos) of the speaker, the disposition (pathos) of the audience, and the meaning (logos) of the sounds and gestures used (Rhet.1356a2–6). Rhetorical proofs work by showing that the speaker is worthy of credence, by producing an emotional state (pathos) in the audience, or by demonstrating a consequence using the words alone. Aristotle holds that ethos is the most important of these elements, since trust in the speaker is required if one is to believe the speech. However, the best speech balances ethos, pathos, and logos. In rhetoric, enthymemes play a deductive role, while examples play an inductive role (Rhet.1356b11–18).

The deductive form of rhetoric, enthymeme, is a dialectical syllogism in which the probable premise is suppressed so that one reasons directly from the necessary premise to the conclusion. For example, one may reason that an animal has given birth because she has milk (Rhet.1357b14–16) without providing the intermediate premise. Aristotle also calls this deductive form of inference “reasoning by signs” or “reasoning from evidence,” since the animal’s having milk is a sign of, or evidence for, her having given birth. Though the audience seemingly “immediately” grasps the fact of birth without it being given in perception, the passage from the perception to the fact is inferential and depends on the background assumption of the suppressed premise.

The inductive form of rhetoric, reasoning from example, can be illustrated as follows. Peisistratus in Athens and Theagenes in Megara both petitioned for guards shortly before establishing themselves as tyrants. Thus, someone plotting a tyranny requests a guard (Rhet.1357b30–37). This proof by example does not have the force of necessity or universality and does not count as a case of scientific induction, since it is possible someone could petition for a guard without plotting a tyranny. But when it is necessary to base some decision, for example, whether to grant a request for a bodyguard, on its likely outcome, one must look to prior examples. It is the work of the rhetorician to know these examples and to formulate them in such a way as to suggest definite policies on the basis of that knowledge.

Rhetoric is divided into deliberative, forensic, and display rhetoric. Deliberative rhetoric is concerned with the future, namely with what to do, and the deliberative rhetorician is to discuss the advantages and harms associated with a specific course of action. Forensic rhetoric, typical of the courtroom, concerns the past, especially what was done and whether it was just or unjust. Display rhetoric concerns the present and is about what is noble or base, that is, what should be praised or denigrated (Rhet.1358b6–16). In all these domains, the rhetorician practices a kind of reasoning that draws on similarities and differences to produce a likely prediction that is of value to the political community.

A common characteristic of insightful philosophers, rhetoricians, and poets is the capacity to observe similarities in things that are unlike, as Archytas did when he said that a judge and an altar are kindred, since someone who has been wronged has recourse to both (Rhet.1412a10–14). This noticing of similarities and differences is part of what separates those who are living the good life from those who are merely living (Sens.437a2–3). Likewise, the highest achievement of poetry is to use good metaphors, since to make metaphors well is to contemplate what is like (Poet.1459a6–9). Poetry is thus closely related to both philosophy and rhetoric, though it differs from them in being fundamentally mimetic, imitating reality through an artistic form.

Imitation in poetry is achieved by means of rhythm, language, and harmony (Poet.1447a13–16, 21–22). While other arts share some or all these elements—painting imitates visually by the same means, while dance imitates only through rhythm—poetry is a kind of vocalized music, in which voice and discursive meaning are combined. Aristotle is interested primarily in the kinds of poetry that imitate human actions, which fall into the broad categories of comedy and tragedy. Comedy is an imitation of worse types of people and actions, which reflect our lower natures. These imitations are not despicable or painful, but simply ridiculous or distorted, and observing them gives us pleasure (Poet.1449a31–38). Aristotle wrote a book of his Poetics on comedy, but the book did not survive. Hence, through a historical accident, the traditions of aesthetics and criticism that proceed from Aristotle are concerned almost completely with tragedy.

Tragedy imitates actions that are excellent and complete. As opposed to comedy, which is episodic, tragedy should have a single plot that ends in a presentation of pity and fear and thus a catharsis—a cleansing or purgation—of the passions (Poet.1449b24–28). (As discussed below, the passions or emotions also play an important role in Aristotle’s practical philosophy.) The most important aspect of a tragedy is how it uses a story or myth to lead the psyches of its audience to this catharsis (Poet.1450a32–34). Since the beauty or fineness of a thing—say, of an animal—consists in the orderly arrangement of parts of a definite magnitude (Poet.1450b35–38), the parts of a tragedy should also be proportionate.

A tragedy’s ability to lead the psyche depends on its myth turning at a moment of recognition at which the central character moves from a state of ignorance to a state of knowledge. In the best case, this recognition coincides with a reversal of intention, such as in Sophocles’ Oedipus, in which Oedipus recognizes himself as the man who was prophesied to murder his father and marry his mother. This moment produces pity and fear in the audience, fulfilling the purpose of tragic imitation (Poet.1452a23–b1). The pity and fear produced by imitative poetry are the source of a peculiar form of pleasure (Poet.1453b11–14). Though the imitation itself is a kind of technique or art, this pleasure is natural to human beings. Because of this potential to produce emotions and lead the psyche, poetics borders both on what is well natured and on madness (Poet.1455a30–34).

Why do people write plays, read stories, and watch movies? Aristotle thinks that because a series of sounds with minute differences can be strung together to form conventional symbols that name particular things, hearing has the accidental property of supporting meaningful speech, which is the cause of learning (Sens.437a10–18). Consequently, though sound is not intrinsically meaningful, voice can carry meaning when it is “ensouled,” transmitting an appearance about how absent things might be (DA 420b5–10, 27–33). Poetry picks up on this natural capacity, artfully imitating reality in language without requiring that things are actually the way they are presented as being (Poet.1447a13–16).

The poet’s consequent power to lead the psyche through true or false imitations, like the rhetorician’s power to lead it through persuasive speech, leads to a parallel question: how should the poet use his power? Should the poet imitate things as they are, or as they should be? Though it is clear that the standard of correctness in poetry and politics is not the same (Poet.1460b13–1461a1), the question of how and to what extent the state should constrain poetic production remains unresolved.

3. Theoretical Philosophy

Aristotle’s classification of the sciences makes a distinction between theoretical philosophy, which aims at contemplation, and practical philosophy, which aims at action or production. Within theoretical philosophy, first philosophy studies objects that are motionless and separate from material things, mathematics studies objects that are motionless but not separate, and natural philosophy studies objects that are in motion and not separate (Met.1026a6–22).

This threefold distinction among the beings that can be contemplated corresponds to the level of precision that can be attained by each branch of theoretical philosophy. First philosophy can be perfectly exact because there is no variation among its objects and thus it has the potential to give one knowledge in the most profound sense. Mathematics is also absolutely certain because its objects are unchanging, but since there are many mathematical objects of a given kind (for example, one could draw a potentially infinite number of different triangles), mathematical proofs require a peculiar method that Aristotle calls “abstraction.” Natural philosophy gives less exact knowledge because of the diversity and variability of natural things and thus requires attention to particular, empirical facts. Studies of nature—including treatises on special sciences like cosmology, biology, and psychology—account for a large part of Aristotle’s surviving writings.

a. Natural Philosophy

Aristotle’s natural philosophy aims for theoretical knowledge about things that are subject to change. Whereas all generated things, including artifacts and products of chance, have a source that generates them, natural change is caused by a thing’s inner principle and cause, which may accordingly be called the thing’s “nature” (Phys.192b8–20). To grasp the nature of a thing is to be able to explain why it was generated essentially: the nature of a thing does not merely contribute to a change but is the primary determinant of the change as such (Waterlow 1982, 28).

Though some hold that Aristotle’s principles are epistemic, explanatory concepts, principles are best understood ontologically as unique, continuous natures that govern the generation and self-preservation of natural beings. To understand a thing’s nature is primarily to grasp “how a being displays itself by its nature.” Such a grasp counts as a correct explanation only insofar as it constitutes a form of understanding of beings in themselves as they give themselves (Winslow 2007, 3–7).

Aristotle’s description of principles as the start and end of change (Phys.235b6) distinguishes between two kinds of natural change. Substantial change occurs when a substance is generated (Phys.225a1–5), for example, when the seed of a plant gives rise to another plant of the same kind. Non-substantial change occurs when a substance’s accidental qualities are affected, for example, the change of color in a ripening pomegranate. Aristotelians describe this as the activity of contraries of blackness and whiteness in the plant’s material in which the fruit of the pomegranate, as its juices become colored by ripening, itself becomes shaded, changing to a purple color (de Coloribus 796a20–26). Ripening occurs when heat burns up the air in the part of the plant near the ground, causing convection that alters the originally light color of the fruit to its dark contrary (de Plantis 820b19–23). Both kinds of change are caused by the plant’s containing in itself a principle of change. In substantial change, a new primary substance is generated; in non-substantial change, some property of a preexisting substance changes to a contrary state.

A process of change is completely described when its four causes are given. This can be illustrated with Aristotle’s favorite example of the production of a bronze sculpture. The (1) material cause of the change is given when the underlying matter of the thing has been described, such as the bronze matter of which a statue is composed. The (2) formal cause is given when one says what kind of thing the thing is, for example, “sphere” for a bronze sphere or “Callias” for a bronze statue of Callias. The (3) efficient cause is given when one says what brought the change about, for example, when one names the sculptor. The (4) final cause is given when one says the purpose of the change, for example, when one says why the sculptor chose to make the bronze sphere (Phys.194b16–195a2).

In natural change the principle of change is internal, so the formal, efficient, and final causes typically coincide. Moreover, in such cases, the metaphysical and epistemological sides of causal explanation are normally unified: a formal cause counts both as a thing’s essence—what it is to be that thing—and as its rational account or reason for being (Bianchi 2014, 35). Thus, when speaking of natural changes rather than the making of an artifact, Aristotle will usually offer “hylomorphic” descriptions of the natural being as a compound of matter and form.

Because Aristotle holds that a thing’s underlying nature is analogous to the bronze in a statue (Phys.191a7–12), some have argued that the underlying thing refers to “prime matter,” that is, to an absolutely indeterminate matter that has no form. But Cook (1989) has shown that the underlying thing normally means matter that already has some form. Indeed, Aristotle claims that the matter of perceptible things has no separate existence but is always already informed by a contrary (Gen et Corr.329a25–27). The matter that traditional natural philosophy calls the “elements” (fire, water, air, and earth) already has the form of the basic contraries, hot and cold, and moist and dry, so that, for example, fire is matter with a hot and dry form (Gen et Corr.330a25–b4). Thus, even in the most basic cases, matter is always actually informed, even though the form is potentially subject to change. For example, throwing water on a fire cools and moistens it, bringing about a new quality in the underlying material. Thus, Aristotle sometimes describes natural powers as being latent or active “in the material” (Meteor.370b14–18).

Aristotle’s general works in natural philosophy offer analyses of concepts necessarily assumed in accounts of natural processes, including time, change, and place. In general, Aristotle will describe changes that occur in time as arising from a potential, which is actualized when the change is complete. However, what is actual is logically prior to what is potential, since a potentiality aims at its own actualization and thus must be defined in terms of what is actual. Indeed, generically the actual is also temporally prior to potentiality, since there must invariably be a preexisting actuality that brings the potentiality to its own actualization (Met.1049b4–19). Perhaps because of the priority of the actual to the potential, whenever Aristotle speaks of natural change, he is concerned with a field of naturalistic inquiry that is continuous rather than atomistic and purposeful or teleological rather than mechanical. In his more specific naturalistic works, Aristotle lays out a program of specialized studies about the heavens and Earth, living things, and the psyche.

i. Cosmology and Geology

Aristotle’s cosmology depends on the basic observation that while bodies on Earth either rise to a limit or fall to Earth, heavenly bodies keep moving, without any apparent external force being exerted on them (DC 284a10–15). On the basis of this observation, he distinguishes between circular motion, which is operative in the “superlunary” heavens, and rectilinear motion on “sublunary” Earth below the Moon. Since all sublunary bodies move in a rectilinear pattern, the heavenly bodies must be composed of a different body that naturally moves in a circle (DC 269a2–10, Meteor.340b6–15). This body cannot have an opposite, because there is no opposite to circular motion (DC 270a20, compare 269a19–22). Indeed, since there is nothing to oppose its motion, Aristotle supposes that this fifth element, which he calls “aether,” as well as the heavenly bodies composed of it, move eternally (DC 275b1–5, 21–25).

In Aristotle’s view the heavens are ungenerated, neither coming to be nor passing away (DC 279b18–21, 282a24–30). Aristotle defines time as the number of motion, since motion is necessarily measured by time (Phys.224a24). Thus, the motion of the eternal bodies is what makes time, so the life and being of sublunary things depend on them. Indeed, Aristotle says that their own time is eternal or “aeon.”

Noticing that water naturally forms spherical droplets and that it flows towards the lowest point on a plane, Aristotle concludes that both the heavens and the earth are spherical (DC 287b1–14). This is further confirmed by observations of eclipses (DC 297b23–31) and by the fact that different stars are visible at different latitudes (DC 297b14–298a22).

The gathering of such observations is an important part of Aristotle’s scientific procedure (AnPr.46a17–22) and sets his theories above those of the ancients that lacked such “experience” (Phys.191a24–27). Just as in his biology, where Aristotle draws on animal anatomy observed at sacrifices (HA 496b25) and records reports from India (HA 501a25), so in his astronomy he cites Egyptian and Babylonian observations of the planets (DC 292a4–9). By gathering evidence from many sources, Aristotle is able to conclude that the stars and the Moon are spherical (DC 291b11–20) and that the Milky Way is an appearance produced by the sight of many stars moving in the outermost sphere (Meteor.346a16–24).

Assuming the hypothesis that the Earth does not move (DC 289b6–7), Aristotle argues that there are in the heavens both stars, which are large and distant from earth, and planets, which are smaller and closer. The two can be distinguished since stars appear to twinkle while planets do not (Aristotle somewhat mysteriously attributes the twinkling of stars to their distance from the eye of the observer) (DC 290b14–24). Unlike earthly creatures, which move because of their distinct organs or parts, both the moving stars and the unmoving heaven that contains them are spherical (DC 289a30–b11). As opposed to superlunary (eternal) substances, sublunary beings, like clouds and human beings, participate in the eternal through coming to be and passing away. In doing so, the individual or primary substance is not preserved, but rather the species or secondary substance is preserved (as we shall see below, the same thought is utilized in Aristotle’s explanation of biological reproduction) (Gen et Corr.338b6–20).

Aristotle holds that the Earth is composed of four spheres, each of which is dominated by one of the four elements. The innermost and heaviest sphere is predominantly earth, on which rest upper spheres of water, air, and fire. The sun acts to burn up or vaporize the water, which rises to the upper spheres when heated, but when cooled later condenses into rain (Meteor.354b24–34). If unqualified necessity is restricted to the superlunary sphere, teleology—the seeking of ends that may or may not be brought about—seems to be limited to the sublunary sphere.

Due to his belief that the Earth is eternal, being neither created nor destroyed, Aristotle holds that the epochs move cyclically in patterns of increase and decrease (Meteor.351b5–19). Aristotle’s cyclical understanding of both natural and human history is implicit in his comment that while Egypt used to be a fertile land, it has over the centuries grown arid (Meteor.351b28–35). Indeed, parts of the world that are ocean periodically become land, while those that are land are covered over by ocean (Meteor.353a15–24). Because of periodic catastrophes, all human wisdom that is now sought concerning both the arts and divine things was previously possessed by forgotten ancestors. However, some of this wisdom is preserved in myths, which pass on knowledge of the divine by allegorically portraying the gods in human or animal form so that the masses can be persuaded to follow laws (Met.1074a38–b14, compare Meteor.339b28–30, Pol.1329b25).

Aristotle’s geology or earth science, given in the latter books of his Meteorology, offers theories of the formation of oceans, of wind and rainfall, and of other natural events such as earthquakes, lightning, and thunder. His theory of the rainbow suggests that drops of water suspended in the air form mirrors which reflect the multiply-colored visual ray that proceeds from the eye without its proper magnitude (Meteor.373a32–373b34). Though the explanations given by Aristotle of these phenomena contradict those of modern physics, his careful observations often give interest to his account.

Aristotle’s material science offers the first description of what are now called non-Newtonian fluids—honey and must—which he characterizes as liquids in which earth and heat predominate (Meteor.385b1–5). Although the Ancient Greeks did not distill alcohol, he reports on the accidental distillation of some ethanol from wine (“sweet wine”), which he observes is more combustible than ordinary wine (Meteor.387b10–14). Finally, Aristotle’s material science makes an informative distinction between compounds, in which the constituents maintain their identity, and mixtures, in which one constituent comes to dominate or in which a new kind of material is generated (see Sharvy 1983 for discussion). Though it would be inaccurate to describe him as a methodological empiricist, Aristotle’s collection and careful recording of observations shows that in all of his scientific endeavors, his explanations were designed to accord with publicly observable natural phenomena.

ii. Biology

The phenomenon of life, as opposed to inanimate nature, involves distinctive types of change (Phys.244b10–245a5) and thus requires distinctive types of explanation. Biological explanations should give all four causes of an organism or species—the material of which it is composed, the processes that bring it about, the particular form it has, and its purpose. For Aristotle, the investigation of individual organisms gives one causal knowledge since the individuals belong to a natural kind. Men and horses both have eyes, which serve similar functions in each of them, but because their species are different, a man’s eye is similar to the eyes of other men, while a horse’s eyes are similar to the eyes of other horses (HA 486a15–20). Biology should explain both why homologous forms exist in different species and the ways in which they differ, and therefore the causes for the persistence of each natural kind of living thing.

Although all four causes are relevant in biology, Aristotle tends to group final causes with formal causes in teleological explanations, and material causes with efficient causes in mechanical explanations. Boylan (section 4) shows, for example, that Aristotle’s teleological explanation of respiration is that it exists in order to bring air into the body to produce pneuma, which is the means by which an animal moves itself. Aristotle’s mechanical explanation is that air that has been heated in the lungs is pushed out by colder air outside the body (On Breath 481b10–16, PA 642a31–b4).

Teleological explanations are necessary conditionally; that is, they depend on the assumption that the biologist has correctly identified the end for the sake of which the organism behaves as it does. Mechanical explanations, in distinction, have absolute necessity in the sense that they require no assumptions about the purpose of the organism or behavior. In general, however, teleological explanations are more important in biology (PA 639b24–26), because making a distinction between living and inanimate things depends on the assumption that “nature does nothing in vain” (GA 741b5).

The final cause of each kind corresponds to the reason that it continues to persist. As opposed to superlunary, eternal substances, sublunary living things cannot preserve themselves individually or, as Aristotle puts it, “in number.” Nevertheless, because living is better than not living (EN 1170b2–5), each individual has a natural drive to preserve itself “in kind.” Such a drive for self-preservation is the primary way in which living creatures participate in the divine (DA 415a25–b7). Nutrition and reproduction therefore are, in Aristotle’s philosophy, value-laden and goal-directed activities. They are activated, whether consciously or not, for the good of the species, namely for its continuation, in which it imitates the eternal things (Gen et Corr.338b12–17). In this way, life can be considered to be directed toward and imitative of the divine (DC 292b18–22).

This basic teleological or goal-directed orientation of Aristotle’s biology allows him to explain the various functions of living creatures in terms of their growth and preservation of form. Perhaps foremost among these is reproduction, which establishes the continuity of a species through a generation. As Aristotle puts it, the seed is temporally prior to the fully developed organism, since each organism develops from a seed. But the fully developed organism is logically prior to the seed, since it is the end or final cause, for the sake of which the seed is produced (PA 641b29–642a2).

In asexual reproduction in plants and animals, the seed is produced by an individual organism and implanted in soil, which activates it and thus actualizes its potentiality to become an organism of the kind from which it was produced. Aristotle thus utilizes a conception of “type” as an endogenous teleonomic principle, which explains why an individual animal can produce other animals of its own type (Mayr 1982, 88). Hence, the natural kind to which an individual belongs makes it what it is. Animals of the same natural kind have the same form of life and can reproduce with one another but not with animals of other kinds.

In animal sexual reproduction, Aristotle understands the seed possessed by the male as the source or principle of generation, which contains the form of the animal and must be implanted in the female, who provides the matter (GA 716a14–25). In providing the form, the male sets up the formation of the embryo in the matter provided by the female, as rennet causes milk to coagulate into cheese (GA 729a10–14). Just as rennet causes milk to separate into a solid, earthy part (or cheese), and a fluid, watery part (or whey), so the semen causes the menstrual fluid to set. In this process, the principle of growth potentially contained in the seed is activated, which, like a seed planted in soil, produces an animal’s body as the embryo (GA 739b21–740a9).

The form of the animal, its psyche, may thus be said to be potentially in the matter, since the matter contains all the necessary nutrients for the production of the complete organism. However, it is invariably the male that brings about the reproduction by providing the principle of the perceptual soul, a process Aristotle compares with the movement of automatic puppets by a mover that is not in the puppet (GA 741b6–15). (Whether the female produces the nutritive psyche is an open question.) Thus, form or psyche is provided by the male, while the matter is provided by the female: when the two come together, they form a hylomorphic product—the living animal.

While the form of an animal is preserved in kind by reproduction, organisms are also preserved individually over their natural lifespans through feeding. In species that have blood, feeding is a kind of concoction, in which food is chewed and broken down in the stomach, then enters the blood, and is finally cooked up to form the external parts of the body. In plants, feeding occurs by the nutritive psyche alone. But in animals, the senses exist for the sake of detecting food, since it is by the senses that animals pursue what is beneficial and avoid what is harmful. In human beings, a similar explanation can be given of the intellectual powers: understanding and practical wisdom exist so that human beings might not only live but also enjoy the good life achievable by action (Sens.436b19–437a3).

Although Aristotle’s teleology has been criticized by some modern biologists, others have argued that his biological work is still of interest to naturalists. For example, Haldane (1955) shows that Aristotle gave the earliest report of the bee waggle dance, which received a comprehensive explanation only in the 20th-century work of Von Frisch. Aristotle also observed lordosis behavior in cattle (HA 572b1–2) and noted that some plants and animals are divisible (Youth and Old Age 468b2–15), a fact that has been vividly illustrated in modern studies of planaria. Even when Aristotle’s biological explanations are incorrect, his observations may be of enduring value.

iii. Psychology

Psychology is the study of the psyche, which is often translated as “soul.” While prior philosophers were interested in the psyche as a part of political inquiry, for Aristotle, the study of the psyche is part of natural science (Ibn Bajjah 1961, 24), continuous with biology. This is because Aristotle conceives of the psyche as the form of a living being, the body being its material. Although the psyche and body are never really separated, they can be given different descriptions. For example, the passion of anger can be described physiologically as a boiling of the blood around the heart, while it can be described dialectically as the desire to pay back with pain someone who has insulted one (DA 403a25–b2). While the physiologist examines the material and efficient causes, the dialectician considers only the form and definition of the object of investigation (DA 403a30–b3). Since the psyche is “the first principle of the living thing” (DA 402a6–7), neither the dialectical method nor the physiological method nor a combination of the two is sufficient for a systematic account of the psyche (DA 403a2, b8). Rather than relying on dialectical or materialist speculation, Aristotle holds that demonstration is the proper method of psychology, since the starting point is a definition (DA 402b25–26), and the psyche is the form and definition of a living thing.

Aristotle conceives of psychology as an exact science, with greater precision than the lesser sciences (DA 402a1–5), and accordingly offers a complete sequence of the kinds or “parts” of psyche. The nutritive psyche—possessed by both plants and animals—is responsible for the basic functions of nourishment and reproduction. Perception is possible only in an animal that also has the nutritive power that allows it to grow and reproduce, while desire depends on perceiving the object desired, and locomotion depends on desiring objects in different locations (DA 415a1–8). More intellectual powers like imagination, judgment, and understanding itself exist only in humans, who also have the lower powers.

The succession of psychological powers ensures the completeness, order, and necessity of the relations of psychological parts. Like rectilinear figures, which proceed from triangles to quadrilaterals, to pentagons, and so forth, without there being any intermediate forms, there are no other psyches than those in this succession (DA 414b20–32). This demonstrative approach ensures that although the methods of psychology and physiology are distinct, psychological divisions map onto biological distinctions. For Aristotle, the parts of the psyche are not separable or “modular” but related genetically: each posterior part of the psyche “contains” the parts before it, and each lower part is the necessary but not sufficient condition for possession of the part that comes after it.

The psyche is defined by Aristotle as the first actuality of a living animal, which is the form of a natural body potentially having life (DA 412a19–22). This form is possessed even when it is not being used; for example, a sleeping person has the power to hear a melody, though while he is sleeping, he is not exercising the power. By contrast, though a corpse looks just like a sleeping body, it has no psyche, since it lacks the power to respond to such stimuli. The second actuality of an animal comes when the power is actually exercised, such as when one actually hears the melody (DA 417b9–16).

Perception is the reception of the form of an object of perception without its matter, just as wax receives the seal of a ring without its iron or gold (DA 424a17–28). When one sees wine, for example, one perceives something dark and liquid without becoming dark and liquid. Some hold that Aristotle thinks the reception of the form happens in matter so that part of the body becomes like the object perceived (for example, one’s eye might be dark while one is looking at wine). Others hold that Aristotelian perception is a spiritual change so that no bodily change is required. But presumably one is changing both bodily and spiritually all the time, even when one is not perceiving. Consequently, the formulation that perception is of “form without matter” is probably not intended to describe physiological or spiritual change but rather to indicate the conceptual nature of perception. For, as discussed in the section on first philosophy below, Aristotle considers forms to be definitions or concepts; for example, one defines “horse” by articulating its form. If he is using “form” in the same way in his discussion of perception, he means that in perceiving something, such as in seeing a horse, one gains an awareness of it as it is; that is, one grasps the concept of the horse. In that case, all the doctrine means is that perception is conceptual, giving one a grasp not just of parts of perceptible objects, say, the color and shape of a horse, but of the objects themselves, that is, of the horse as horse. Indeed, Aristotle describes perception as conferring knowledge of particulars and in that sense being like contemplation (DA 417b19–24).

This theory of perception distinguishes three kinds of perceptible objects: proper sensibles, which are perceived only by one sense modality; common sensibles, which are perceived by all the senses; and accidental sensibles, which are facts about the sensible object that are not directly given (DA 418a8–23). For example, in seeing wine, its color is a proper sensible, its volume a common sensible, and the fact that it belongs to Callias an accidental sensible. While one normally could not be wrong about the wine’s color, one might overestimate or underestimate its volume under nonstandard conditions, and one is apt to be completely wrong about the accidental sensible (for example, Callias might have sold the wine).

The five senses are distinguished by their proper sensibles: though the wine’s color might accidentally make one aware that it is sweet, color is proper to sight and sweetness to taste. But this raises a question: how do the different senses work together to give one a coherent experience of reality? If they were not coordinated, then one would perceive each quality of an object separately, for example, darkness and sweetness without putting them together. However, actual perceptual experience is coordinated: one perceives wine as both dark and sweet. In order to explain this, Aristotle says that they must be coordinated by the central sense, which is probably located in the body’s central organ, the heart. When one is awake, and the external sense organs are functioning normally, they are coordinated in the heart to discern reality as being the way it is (Sens.448b31–449a22).

Aristotle claims that one hears that one hears and sees that one sees (DA 425b12–17). Though there is a puzzle as to whether such higher-order seeing is due to sight itself or to the central perceptual power (compare On Sleep 455a3–26), the higher-order perception counts as an awareness of how the perceptual power grasps an object in the world. Though later philosophers named this higher-order perception “consciousness” and argued that it could be separated from an actualized perception of a real object, for Aristotle it is intrinsically dependent on the first-order grasp of an object (Nakahata 2014, 109–110). Indeed, Aristotle describes perceptual powers as being potentially like the perceptual object in actuality (DA 418a3–5) and goes so far as to say that the activity of the external object and that of the perceptual power are one, though what it is to be each one is different (DA 425b26–27). Thus, consciousness seems to be a property that arises automatically when perception is activated.

In at least some animals, the perceptual powers give rise to other psychological powers that are not themselves perceptual in a strict sense. In one simple case, the perception of a color is altered by its surroundings, that is, by how it is illuminated and by the other colors in one’s field of vision. Far from assuming the constancy of perception, Aristotle notes that under such circumstances, one color can take the place of another and appear differently than it does under standard conditions, for example, of full illumination (Meteor.375a22–28).

Memory is another power that arises through the collection of many perceptions. Memory is an affection of perception (though when the content of the memory is intellectual, it is an affection of the judgmental power of the psyche, see Mem.449b24–25), produced when the motion of perception acts like a signet ring in sealing wax, impressing itself on an animal and leaving an image in the psyche (Mem.450a25–b1). The resultant image has a depictive function so that it can be present even when the object it portrays is absent: when one remembers a person, for example, the memory-image is fully present in one’s psyche, though the person might be absent (Mem.450b20–25).

Closely related to memory, the imagination is a power to present absent things to oneself. Identical neither to perception nor judgment (DA 427b27–8, 433a10), imagining has an “as if” quality. For example, imagining a terror is like looking at a picture without feeling the corresponding emotion of fear (DA 427b21–24). Imagination may be defined as a kind of change or motion that comes about by means of activated perception (DA 429a1–2). This does not entail that imagination is merely reproductive but simply that activated perceptions trigger the imagination, which in turn produces an image or appearance “before our eyes” (DA 427b19–20). The resultant appearance that “comes to be for us” (DA 428a1–2, 11–12) could be true or false, since unlike the object of perception, what is imagined is not present (Humphreys 2019).

Human beings are distinct from other animals, Aristotle says, in their possession of rational psyche. Foremost among the rational powers is intellect or understanding (this article uses the terms interchangeably), which grasps universals in a way that is analogous to the perceptual grasp of particulars. However, unlike material particulars grasped by perception, universals are not mixed with body and are thus in a sense contained in the psyche itself (DA 417b22–24, 432a1–3). This has sometimes been called the intentional inexistence of an object, or intentionality, the property of being directed to or about something. Since one can think or understand any universal, the understanding is potentially about anything, like an empty writing tablet (DA 429b29–430a1).

The doctrine of the intentionality of intellect leads Aristotle to make a distinction between two kinds of intellect. Receptive or passive intellect is characterized by the ability to become like all things and is analogous to the writing tablet. Productive or active intellect is characterized by the ability to bring about all things and is analogous to the act of writing. The active intellect is thus akin to the light that illuminates objects, making them perceptible by sight. Aristotle holds that the soul never thinks without an image produced by imagination to serve as its material. Thus, in understanding something, the productive intellect actuates the receptive intellect, which stimulates the imagination to produce a particular image corresponding to the universal content of the understanding. Hence, while Aristotle describes the active intellect as unaffected, separate, and immaterial, it serves to bring to completion the passive intellect, the latter of which is inseparable from imagination and hence from perception and nutrition.

Aristotle’s insistence that intellect is not a subject of natural science (PA 641a33–b9) motivates the view that thinking requires a contribution from the supernatural or divine. Indeed, in Metaphysics (1072b19–30) Aristotle argues that intellect actively understanding the intelligible is the everlasting God. For readers like the medieval Arabic commentator Ibn Rushd, passive intellect is spread like matter among thinking beings. This “material intellect” is activated by God, the agent intellect, so that when one is thinking, one participates in the activity of the divine intellect. According to this view, every act of thinking is also an act of divine illumination in which God actuates one’s thinking power as the writer actuates a blank writing tablet.

However, in other passages Aristotle says that when the body is destroyed, the soul is destroyed too (Length and Shortness of Life, 465b23–32). Thus, it seems that Aristotle’s psychological explanations assume embodiment and require that thinking be something done by the individual human being. Indeed, Aristotle argues that if thinking is either a kind of imaginative representation or impossible without imagination, then it will be impossible without body (DA 403a8–10). But the psyche never thinks without imagination (DA 431a16–17). It seems to follow that far from being a part of the everlasting thinking of God, human thinking is something that happens in a living body and ends when that body is no longer alive. Thus, Jiminez (2014, 95–99) argues that thinking is embodied in three ways: it is preceded by bodily processes, is simultaneous with embodied processes, and anticipates bodily processes, namely intentional actions. For further discussion see Jiminez (2017).

The whole psyche governs the characteristic functions and changes of a living thing. The nutritive psyche is the formal cause of growth and metabolism and is shared by plants, while the perceptual psyche gives rise to desire, which causes self-moving animals to act. When one becomes aware of an apparent good by perception or imagination, one forms either an appetite, the desire for pleasure, or thumos, the spirited desire for revenge or honor. A third form of desire, wish, is the product of the rational psyche (DA 433a20–30).

Boeri has pointed out that Aristotle’s psychology cuts a middle path between physicalism, which identifies the psyche with body, and dualism, which posits the independent existence of the soul and body. By characterizing the psyche as he does, Aristotle can at once deny that the psyche is a body but also insist that it does not exist without a body. The living body of an animal can thus be thought of as a form that has been “materialized” (Boeri 2018, 166–169).

b. Mathematics

Aristotle was educated in Plato’s Academy, in which it was commonly argued that mathematical objects like lines and numbers exist independently of physical beings and are thus “separable” from matter. Aristotle’s conception of the hierarchy of beings led him to reject Platonism since the category of quantity is posterior to that of substance. But he also rejects nominalism, the view that mathematical things are not real. Against both positions, Aristotle argues that mathematical things are real but do not exist separately from sensible bodies (Met.1090a29–30, 1093b27–28). Mathematical objects thus depend on the things in which they inhere and have no separate or independent being (Met.1059b12–14).

Although mathematical beings are not separate from the material cosmos, when the mathematician defines what it is to be a sphere or circle, he does not include a material like gold or bronze in the definition, because it is not the gold ball or bronze ring that the mathematician wants to define. The mathematician is justified in proceeding in this way, because although there are no separate entities beyond the concrete thing, it is just the mathematical aspects of real things that are relevant to mathematics (DC 278a2–6). This process by which the material features of a substance are systematically ignored by the mathematician, who focuses only on the quantitative features, Aristotle describes as “abstraction.” Because it always involves final ends, no abstraction is possible in natural science (PA 641b11–13, Phys.193b31–35). A consequence of this abstraction is that “why” questions in mathematics are invariably answered not by providing a final cause but by giving the correct definition (Phys.198a14–21, 200a30–34).

One reason that Aristotle believes that mathematics must proceed by abstraction is that he wants to prevent a multiplication of entities. For example, he does not want to say that, in addition to there being a sphere of bronze, there is another separate, mathematical sphere, and that in addition to that sphere, there is a separate mathematical plane cutting it, and that in addition to that plane, there is an additional line limiting the plane (see Katz 2014). It is enough for a mathematical ontology simply to acknowledge that natural objects have real mathematical properties not separate in being, which can nevertheless be studied independently from natural investigation. Aristotle also favors this view due to his belief that mathematics is a demonstrative science. Aristotle was aware that geometry uses diagrammatic representations of abstracted properties, which allow one to grasp how a demonstration is true not just of a particular object but of any class of objects that share its quantitative features (Humphreys 2017). Through the concept of abstraction, Aristotle could explain why a particular diagram may be used to prove a universal geometrical result.

Why study mathematics? Although Aristotle rejected the Platonic doctrine that mathematical beings are separate, intermediate entities between perceptible things and forms, he agreed with the Platonists that mathematics is about things that are beautiful and good, since it offers insight into the nature of arrangement, symmetry, and definiteness (Met.1078a31–b6). Thus, the study of mathematics reveals that beauty is not so much in the eye of the beholder as it is in the nature of things (Hoinski and Polansky 2016, 51–60). Moreover, Aristotle holds that mathematical beings are all potential objects of the intellect, which exist only potentially when they are not understood. The activity of understanding is the actuation of their being, but also actuates the intellect (Met.1051a26–33). Mathematics, then, not only gives insight into beauty but is also a source of intellectual pleasure, since gaining mathematical knowledge exercises the human being’s best power.

c. First Philosophy

In addition to natural and mathematical sciences, there is a science of independent beings that Aristotle calls “first philosophy” or “wisdom.” What is the proper aim of this science? In some instances, Aristotle seems to say that it concerns being insofar as it is (Met.1003a21–22), whereas in others, he seems to consider it to be equivalent to “theology,” restricting contemplation to the highest kind of being (Met.1026a19–22), which is unchanging and separable from matter. However, Menn (2013, 10–11) shows that Aristotle is primarily concerned with describing first philosophy as a science that seeks the causes and sources of being qua being. Hence, when Aristotle holds that wisdom is a kind of rational knowledge concerning causes and principles (Met.982a1–3), he probably means that the investigation of these causes of being as being seeks to discover the divine things as the cause of ordinary beings. First philosophy is consequently quite unlike natural philosophy and mathematics, since rather than proceeding from systematic observation or from hypotheses, it begins with an attitude of wonder towards ordinary things and aims to contemplate them not under a particular description but simply as beings (Sachs 2018).

The fundamental premise of this science is the law of noncontradiction, which states that something cannot both be and not be (Met.1006a1). Aristotle holds that this law is indemonstrable and necessary to assume in any meaningful discussion about being. Consequently, a person who demands a demonstration of this principle is no better than a plant. As Anscombe (1961, 40) puts it, “Aristotle evidently had some very irritating people to argue with.” But as Anscombe also points out, this principle is what allows Aristotle to make a distinction between substances as the primary kind of being and accidents that fall in the other categories. While it is possible for a substance to take on contrary accidents, for example, coffee first being hot and later cold, substances have no contraries. The law requires that a substance either is or is not, independently of its further, accidental properties.

Aristotle insists that in order for the word “being” to have any meaning at all, there must be some primary beings, whereas other beings modify these primary beings (Met.1003b6–10). As we saw in the section on Aristotle’s logic, primary substances are individual substances while their accidents are what is predicated of them in the categories. This takes on metaphysical significance when one thinks of this distinction in terms of a dependence relation in which substances can exist independently of their accidents, but accidents are dependent in being on a substance. For example, a shaggy dog is substantially a dog, but only accidentally shaggy. If it lost all its hair, it would cease to be shaggy but would be no less a dog: it would then be a non-shaggy dog. But if it ceased to be a dog—for example, if it were turned into fertilizer—then it would cease to be shaggy at the same moment. Unlike the “shagginess,” “dogness” cannot be separated from a shaggy dog: the “what it is to be” a dog is the dog’s dogness in the category of substance, while its accidents are in other categories, in this case shagginess being in the category of quality (Met.1031a1–5).

Given that substances can be characterized as forms, as matter, or as compounds of form and matter, it seems that Aristotle gives the cause and source of a being by listing its material and formal cause. Indeed, Aristotle sometimes describes primary being as the “immanent form” from which the concrete primary being is derived (Met.1037a29). This probably means that a primary substance is always a compound, its formal component serving as the substance’s final cause. However, primary beings are not composed of other primary beings (Met.1041a3–5). Thus, despite some controversy on the question, there seems to be no form of an individual, form being what is shared by all the individuals of a kind.

A substance is defined by a universal, and thus when one defines the form, one defines the substance (Met.1035b31–1036a1). However, when one grasps a substance directly in perception or thought, one grasps the compound of form and matter (Met.1036a2–8). Since form by itself does not make a primary substance, it must be immanent (that is, compounded with matter) in each individual primary substance. In a form-matter compound, such as a living thing, the matter is both the prior stuff out of which the thing has become and the contemporaneous stuff of which it is composed. The form is what makes what a thing is made of, its matter, into that thing (Anscombe 1961, 49, 53).

Due to this hylomorphic account, one might worry that natural science seems to explain everything there is to explain about substances. However, Aristotle insists that there is a kind of separable and immovable being that serves as the principle or source of all other beings, which is the special object of wisdom (Met.1064a35–b1). This being might be called the good itself, which is implicitly pursued by substances when they come to be what they are. In any case, Aristotle insists that this source and first of beings sets in motion the primary motion. But since whatever is in motion must be moved by something else, and the first thing is not moved by something else, it is itself motionless (Met.1073a25–34). As we have seen, even the human intellect is “not affected” (DA 429b19–430a9), producing its own object of contemplation in a pure activity. Following this, Aristotle describes the primary being as an intellect or a kind of intellect that “thinks itself” perpetually (Met.1072b19–20). Thus, we can conceive of the Aristotelian god as being like our own intellect but unclouded by what we undergo as mortal, changing, and fallible beings (Marx 1977, 7–8).

4. Practical Philosophy

Practical philosophy is distinguished from theoretical philosophy both in its goals and in its methods. While the aim of theoretical philosophy is contemplation and the understanding of the highest things, the aim of practical philosophy is good action, that is, acting in a way that constitutes or contributes to the good life. But human beings can only thrive in a political community: the human is a “political animal” and thus the political community exists by nature (Pol.1253a2–5, compare EN 1169b16–19). Thus, ethical inquiry is part of political inquiry into what makes the best life for a human being. Because of the intrinsic variability and complexity of human life, however, this inquiry does not possess the exactness of theoretical philosophy (EN 1094b10–27).

Just as animals are said to seek characteristic ends in his biology, so Aristotle holds in his “ergon argument” that the human being has a proper ergon, that is, a work or function (EN 1097b24–1098a18). Just as craftsmen like flautists and sculptors and bodily organs like eyes and ears have a peculiar work they do, so the human being must do something peculiarly human. Such a function is definitive, that is, it distinguishes what it is to be the thing that carries it out. For example, a flautist is a flautist insofar as she plays the flute. But the function also serves as an implicit success condition for being that thing. For example, what makes a flautist good as what she is (“good qua flautist,” one might say) is that she plays the flute well. Regardless of the other work she does in her other capacities (qua human, qua friend, and so forth), the question “is she a good flautist?” can be answered only in reference to the ergon of the flautist, namely flute playing.

The human function cannot be nutrition or perception, since those activities are shared with other living things. Since other animals lack reason, the human function must be an activity of the psyche not without reason. A human being that performs this function well will be functioning well as a human being. In other words, by acting virtuously one will by that fact achieve the human good (Angier 2010, 60–61). Thus, Aristotle can summarize the good life as consisting of activities and actions in accordance with arete—excellence or virtue—and the good for the human being as the activity of the psyche in accordance with excellence in a complete life (EN 1098a12–19). Though it has sometimes been objected that Aristotle assumes without argument that human beings must have a characteristic function, Angier (2010, 73–76) has shown that the key to Aristotle’s argument is his comparison of the human function to a craft: just as a sculptor must possess a wide variety of subordinate skills to achieve mastery in his specialized activity, so in acting well the human being must possess an inclusive set of dispositions and capacities that serve to fulfill the specialized task of reason.

Ethics and politics are, however, oriented not merely to describing human behavior but to saying what ends human beings ought to pursue, that is, to saying what constitutes the good life for man. While the many, who have no exposure to philosophy, should agree that the good life consists in eudaimonia (happiness or blessedness), there is disagreement as to what constitutes this state (EN 1095a18–26). The special task of practical philosophy is therefore to say what the good life consists in, that is, to give a more comprehensive account of eudaimonia than is available from the observation of the diverse ends pursued by human beings. As Baracchi (2008, 81–83) points out, eudaimonia indicates a life lived under the benevolent or beneficial sway of the daimonic, that is, of an order of existence beyond the human. Thus, the view that eudaimonia is a state of utmost perfection and completion for a human being (Magna Moralia 1184a14, b8) indicates that the full actualization of a human depends on seeking something beyond what is strictly speaking proper to the human.

a. Habituation and Excellence

Though the original meaning of ethics has been obscured by the modern confusion of pursuing proper ends with following moral rules, in the Aristotelian works, ethical inquiry is limited to the investigation of what it is for a human being to flourish according to her own nature. For the purposes of this inquiry, Aristotle distinguishes three parts of the psyche: passions, powers, and habits (EN 1105b20). Passions include attitudes such as feeling fear, hatred, or pity for others, while powers are those parts of our form that allow us to have such passions and to gain knowledge of the world. However, while all human beings share passions and powers, they differ with regard to how they are trained or habituated and thus with respect to their dispositions or states of character. Those who are habituated correctly are said to be excellent and praiseworthy, while those whose characters are misshapen through bad habituation are blameworthy (EN 1105b28–1106a2).

How does a human being become good, cultivating excellence within herself? Aristotle holds that this happens by two related but distinct mechanisms. Intellectual excellences arise by teaching, whereas ethical excellences, that is, excellences of character such as moderation and courage, arise by ethos, habituation, or training (EN 1103a14–26). Since pleasure or pain results from each of our activities (EN 1104b4), training happens through activity; for example, one learns to be just by doing just things (EN 1103a35–b36). Legislators, who aim to make citizens good, therefore must ensure that citizens are trained from childhood so as to produce certain good habits (excellences of character) in them (EN 1103b23–25).

Such training takes place via pleasure and pain. If one is brought up to take pleasure or suffer pain in certain activities, one will develop the corresponding character (EN 1104b18–25). This is why no one becomes good unless one does good things (EN 1105b11–12). Rather than trying to answer the question of why one ought to be good in the abstract, Aristotle assumes that taking pleasure in the right kinds of activities will lead one to have a good life, where “right kinds” means those activities that contribute to one’s goal in life. Hence the desires of children can be cultivated into virtuous dispositions by providing rewards and punishments that induce them to follow good reason (EN 1119b2–6).

Since Aristotle conceives of perception as the reception of the perceived object’s form without its matter, to perceive correctly is to grasp an object as having a pleasurable or painful generic form (DA 424a17–19, 434a27–30). The cognitive capacity of perception and the motive capacity of desire are linked through pleasure, which is also “in the soul” (EE 1218b35). Excellence is not itself a pleasure but rather a deliberative disposition to take pleasure in certain activities, a mean between extreme states (EN 1106b36–1107a2).

Although he offers detailed descriptions of the virtues in his ethical works, Aristotle summarizes them in a table:

Excess	Mean	Deficiency
Irascibility	Gentleness	Spiritlessness
Rashness	Courage	Cowardice
Shamelessness	Modesty	Diffidence
Profligacy	Temperance	Insensitiveness
Envy	Righteous Indignation	Malice
Greed	Justice	Loss
Prodigality	Liberality	Meanness
Boastfulness	Honesty	Self-deprecation
Flattery	Friendliness	Surliness
Subservience	Dignity	Stubbornness
Luxuriousness	Hardness	Endurance
Vanity	Greatness of Spirit	Smallness of Spirit
Extravagance	Magnificence	Shabbiness
Rascality	Prudence	Simpleness

This shows that each excellence is a mean between excessive and defective states of character (EE 1220b35–1221a15). Accordingly, good habituation is concerned with avoiding extreme or pathological states of character. Thus, Aristotle can say that ethical excellence is “concerned with pleasures and pains” (EN 1104b8–11), since whenever one has been properly trained to take the correct pleasure and suffer correct pain when one acts in excess or defect, one possesses the excellence in question.

b. Ethical Deliberation

Human action displays excellence only when it is undertaken voluntarily, that is, is chosen as the means to bring about a goal wished for by the agent. Excellence in general is thus best understood as a disposition to make correct choices (EN 1106b36–1107a2), where “choice” is understood as the product of deliberation or what “has been deliberated upon” (EN 1113a4). Deliberation is not about ends but about what contributes to an end already given by one of the three types of desire discussed above: appetite, thumos, or wish (EN 1112b11–12, 33–34).

But if all excellent action must be chosen, how can actions undertaken in an instant, such as when one acts courageously, be excellent? Since such actions can be undertaken without the agent having undergone a prior process of conscious deliberation, which takes time, it seems that one must say that quick actions were hypothetically deliberated, that is, that they count as what one would have chosen to do had one had time to deliberate (Segvic 2008, 162–163).

Such reasoning can be schematized by the so-called “practical syllogism.” For example, suppose one accepts the following premises:

One should not drink heavy water

This water in this cup is heavy

The syllogism concludes with one’s not drinking water from the cup (EN 1142a22–23). If this is how Aristotle understands ethical deliberation, then it seems that all one’s voluntary actions count as deliberated even if one has not spent any time thinking about what to do.

However, Contreras (2018, 341) points out that the “practical syllogism” cannot represent deliberation since its conclusion is an action, whereas the conclusion of deliberation is choice. Though one’s choice typically causes one to act, something external could prevent one from acting even once the choice has been made. Thus, neither are choice and action the same, nor are the processes or conditions from which they result identical. Moreover, even non-rational desires like appetite and thumos present things under the “guise of the good” so that whatever one desires appears to be good. Hence an action based on those desires could still be described by a practical syllogism, though it would not be chosen through deliberation. Deliberation does not describe a kind of deduction but a process of seeking things that contribute to an aim already presented under the guise of the good (Segvic 2008, 164–167).

This “seeking” aspect of deliberation is brought out in Aristotle’s comparison of the deliberator to the geometer, who searches and analyzes by diagrams (EN 1112b20–24). Geometrical analysis is the method by which a mathematician works backwards from a desired result to find the elements that constitute that result. Similarly, deliberation is a search for the elements that would allow the end one has in view to be realized (EN 1141b8–15).

However, while geometrical reasoning is abstracted from material conditions, the prospective reasoning of deliberation is constrained both modally and temporally. One cannot deliberate about necessities, since practical things must admit of being otherwise than they are (DA 433a29–30). Similarly, one cannot deliberate about the past, since what is chosen is not what has become—“no one chooses that Ilium be destroyed”—but what may or may not come about in the future (EN 1139b5–9, DA 431b7–8). One can describe deliberation, then, as starting from premises in the future perfect tense, and as working backwards to discover what actions would make those statements true.

In addition to these constraints, the deliberating agent must have a belief about herself, namely that she is able either to bring about or not to bring about the future state in question (EN 1112a18–31). Since rational powers alone are productive of contrary effects, deliberation must be distinctively rational, since it produces a choice to undertake or not to undertake a certain course of action (Met.1048a2–11). In contrast to technical deliberation, the goal of which is to produce something external to the activity that brings it about, in ethical deliberation there is no external end since good action is itself the end (EN 1140b7). So rather than concerning what an agent might produce externally, deliberation is ethical when it is about the agent’s own activity. Thus, deliberation ends when one has reached a decision, which may be immediately acted upon or put into practice later when the proper conditions arise.

c. Self and Others

Life will tend to go well for a person who has been habituated to the right kinds of pleasures and pains and who deliberates well about what to do. Unfortunately, this is not always sufficient for happiness. For although excellence might help one manage misfortunes well and avoid becoming miserable as their result, it is not reasonable to call someone struck with a major misfortune blessed or happy (EN 1100b33–1101a13). So there seems to be an element of luck in happiness: although bad luck cannot make one miserable, one must possess at least some external goods in order to be happy.

One could also ruin things by acting in ignorance. When one fails to recognize a particular as what it is, one might bring about an end one never intended. For example, one might set off a loaded catapult through one’s ignorance of the fact that it was loaded. Such actions are involuntary. But there is a more fundamental kind of moral ignorance for which one can be blamed, which is not the cause of involuntary actions but of badness (EN 1110b25–1111a11). In the first case, one does what one does not want to do because of ignorance, so is not worthy of blame. In the second case, one does what one wants to do and is thus to be blamed for the action.

Given that badness is a form of ignorance about what one should do, it is reasonable to ask whether acting acratically, that is, doing what one does not want to do, just comes down to being ignorant. This is the teaching of Socrates, who, arguing against what appears to be the case, reduced acrasia to ignorance (EN 1145b25–27). Though Aristotle holds that acrasia is distinct from ignorance, he also thinks it is impossible for knowledge to be dragged around by the passions like a slave. Aristotle must, then, explain how being overcome by one’s passions is possible, when knowledge is stronger than the passions.

Aristotle’s solution is to limit acrasia to those cases in which one generically knows what to do but fails to act on it because one’s knowledge of sensibles is dragged along by the passions (EN 1147b15–19). In other words, he admits that the passions can overpower perceptual knowledge of particulars but denies that they can dominate intellectual knowledge of universals. Hence, like Socrates, Aristotle thinks of acrasia as a form of ignorance, though unlike Socrates, he holds that this ignorance is temporary and relates only to one’s knowledge of particulars. Acrasia consists, then, in being unruled with respect to thumos or with respect to sensory pleasures. In such cases, one is unruled because one’s passions or lower desires temporarily take over and prevent one from grasping things as one should (EN 1148a2–22). In this sense, acrasia represents a conflict between the reasoning and unreasoning parts of the psyche (for discussion see Weinman 2007, 95–99).

If living well and acting well are the same (EN 1095a18–20, EE 1219b1–4) and acting well consists in part in taking the proper pleasure in one’s action, then living well must be pleasurable. Aristotle thinks the pleasure one has in living well comes about through a kind of self-consciousness, that of being aware of one’s own activity. In such activity, one grasps oneself as the object of a pleasurable act of perception or contemplation and consequently takes pleasure in that act (Ortiz de Landázuri 2012). But one takes pleasure in a friend’s life and activity almost as one takes pleasure in one’s own life (EN 1170a15–b8). Thus, the good life may be accompanied not only by a pleasurable relation to oneself but also by relationships to others in which one takes a contemplative pleasure in their activities.

The value of friendship follows from the idea that when a person is a friend to himself, he wishes the good for himself and thus wishes to improve his own character. Only a person who has such a healthy love of self can form a friendship with another person (EN 1166b25–29). Indeed, one’s attitudes towards a friend are based on one’s attitudes towards oneself (EN 1166a1–10), attitudes which are extended to another in the formation of a friendship (EN 1168b4–7). However, because people are by nature communal or political, in order to lead a complete life, one needs to form friendships with excellent people, and it is in living together with others that one comes to lead a happy life. When a true friendship between excellent persons is formed, each will regard the other with the same attitude with which he regards himself, and thus as “another self” (EN 1170b5–19).

Friendship is a bridging concept between ethics, which concerns the relations of individuals, and political science, which concerns the nature and function of the state. For Aristotle, friendship holds a state together, so the lawgiver must focus on promoting friendship above all else (EN 1155a22–26). Indeed, when people are friends, they treat one another with mutual respect so that justice is unnecessary or redundant (EN 1155a27–29). Aristotle’s ethics are thus part of his political philosophy. Just as an individual’s good action depends on her taking the right kinds of pleasures, so a thriving political community depends on citizens taking pleasure in one another’s actions. Such love of others and mutual pleasure are strictly speaking neither egoistic nor altruistic. Instead, they rest on the establishment of a harmony of self and others in which the completion of the individual life and the life of the community amount to the same thing.

d. The Household and the State

Aristotle’s political philosophy stems from the idea that the political community or state is a creation of nature prior to the individual who lives within it. This is shown by the fact that the individual human being is dependent on the political community for his formation and survival. One who lives outside the state is either a beast or a god, that is, does not participate in what is common to humanity (Pol.1253a25–31). The political community is natural and essentially human, then, because it is only within this community that the individual realizes his nature as a human being. Thus, the state exists not only for the continuation of life but for the sake of the good life (Pol.1280a31–33).

Aristotle holds that the human being is a “political animal” due to his use of speech. While other gregarious animals have voice, which nature has fashioned to indicate pleasure and pain, the power of speech enables human beings to indicate not only this but also what is expedient and inexpedient and what is just and unjust (Pol.1253a9–18). Berns (1976, 188–189) notes that for Aristotle, the speech symbol’s causes are largely natural: the material cause of sound, the efficient cause of the living creatures that produce it, and the final cause of living together are all parts of human nature. However, the formal cause, the distinctive way in which symbols are organized, is conventional. This allows for a variability of constitutions and hence the establishment of good or bad laws. Thus, although the state is natural for human beings, the specific form it takes depends on the wisdom of the legislator.

Though the various forms of constitution cannot be discussed here (for discussion, see Clayton, Aristotle: Politics), the purpose of the state is the good of all the citizens (Pol.1252a3), so a city is excellent when its citizens are excellent (Pol.1332a4). This human thriving is most possible, however, when the political community is ruled not by an individual but by laws themselves. This is because even the best rulers are subject to thumos, which is like a “wild beast,” whereas law itself cannot be perverted by the passions. Thus, Aristotle likens rule of law to the “rule of God and reason alone” (Pol.1287a16–32). Although this is the best kind of political community, Aristotle does not say that the best life for an individual is necessarily the political life. Instead he leaves open the possibility that the theoretical life, in which philosophy is pursued for its own sake, is the best way for a person to live.

The establishment of any political community depends on the existence of the sub-political sphere of the household, the productive unit in which goods are produced for consumption. Whereas the political sphere is a sphere of freedom and action, the household consists of relations of domination: that of the master and slave, that of marriage, and that of procreation. Hence household management or “economics” is distinct from politics, since the organization of the household has the purpose of production of goods rather than action (Pol.1253b9–14). Crucial to this household production is the slave, whom Aristotle defines as a living tool (Pol.1253b30–33) controlled by a master in order to produce the means necessary for the survival and thriving of the household and state. As household management, economics is concerned primarily with structuring slave labor, that is, with organizing the instruments of production so as to make the property necessary for the superior, political life.

Aristotle thus offers a staunch defense of the institution of slavery. Against those who claim that slavery is contrary to nature, Aristotle argues that there are natural slaves, humans who are born to be ruled by others (Pol.1254a13–17). This can be seen by analogy: the body is the natural slave of the psyche, such that a good person exerts a despotic rule over his body. In the same way, humans ought to rule over other animals, males over females, and masters over slaves (Pol.1254a20–b25). But this is only natural when the ruling part is more noble than the part that is ruled. Thus, the enslavement of the children of conquered nobles by victors in a war is a mere convention since the children may possess the natures of free people. For Aristotle, then, slavery is natural and just only when it is in the interest of slave and master alike (Pol.1255b13–15).

The result of these doctrines is the view that political community is composed of “unlikes.” Just as a living animal is composed of psyche and body, and psyche is composed of a rational part and an appetite, so the family is composed of husband and wife, and property of master and slave. It is these relations of domination, in Aristotle’s view, that constitute the state, holding it together and making it function (Pol.1277a5–11). As noted in the biographical section, Aristotle had close ties to the expanding Macedonian empire. Thus his political philosophy, insofar as it is prescriptive of how a political community should be managed, might have been intended to be put into practice in the colonies established by Alexander. If that is the case, then perhaps Aristotle’s politics is at base a didactic project intended to teach an indefinite number of future legislators (Strauss 1964, 21).

5. Aristotle’s Influence

Aristotle and Plato were the most influential philosophers in antiquity, both because their works were widely circulated and read and because the schools they founded continued to exert influence for hundreds of years after their deaths. Aristotle’s school gave rise to the Peripatetic movement, with his student Theophrastus being its most famous member. In late antiquity, there emerged a tradition of commentators on Aristotle’s works, beginning with Alexander of Aphrodisias, but including the Neo-Platonists Simplicius, Syrianus, and Ammonius. Many of their commentaries have been edited and translated into English as part of the Ancient Commentators on Aristotle project.

In the Middle Ages, Aristotle’s works were translated into Arabic, which led to generations of Islamic Aristotelians, such as Ibn Bajjah and Ibn Rushd (see Alwishah and Hayes 2015). In the Jewish philosophical tradition, Maimonides calls Aristotle the chief of the philosophers and uses Aristotelian concepts to analyze the contents of the Hebrew Bible. Though Boethius’ Latin commentaries on Aristotle’s logical works were available from the fifth century onwards, the publication of Aristotle’s works in Latin in the 11th and 12th centuries led to a revival of Aristotelian ideas in Europe. Indeed, a major controversy broke out at the University of Paris in the 1260s between the Averroists (followers of Ibn Rushd who believed that thinking happens through divine illumination) and those who held that the active intellect is individual in humans (see McInerny 2002). A further debate, concerning realism (the doctrine that universals are real) and nominalism (the doctrine that universals exist “in name” only), continued for centuries. Although they disagreed in their interpretations, prominent scholastics like Bacon, Buridan, Ockham, Scotus, and Aquinas tended to accept Aristotelian doctrines on authority, often referring to Aristotle simply as “The Philosopher.”

Beginning in the sixteenth century, the scholastics came under attack, particularly from natural philosophers, often leading to the disparagement of Aristotelian positions. Copernicus’ model made Earth not the center of the universe as in Aristotle’s cosmology but a mere satellite of the sun. Galileo showed that some of the predictions of Aristotle’s physical theory were incorrect; for example, heavier objects do not fall faster than lighter objects. Descartes attacked the teleological aspect of Aristotle’s physics, arguing for a mechanical conception of all of nature, including living things. Hobbes critiqued the theory of perception, which he believed unrealistically described forms or ideas as travelling through the air. Later, Hume disparaged causal powers as mysterious, thus undermining the conception of the four causes. Kantian and utilitarian ethics argued that duties to humanity rather than happiness were the proper norms for action. Darwin showed that species are not eternal, casting doubt on Aristotle’s conception of biological kinds. Frege’s logic in the late nineteenth century developed notions of quantification and predication that made the syllogism obsolete. By the beginning of the twentieth century, Aristotle looked not particularly relevant to modern philosophical concerns.

The latter part of the twentieth century, however, has seen a slow but steady intellectual shift, which has led to a large family of neo-Aristotelian positions being defended by contemporary philosophers. Anscombe’s (1958) argument for a return to virtue ethics can be taken as a convenient starting point of this change. Anscombe’s claim, in summary, is that rule-based ethics of the deontological or utilitarian style is unconvincing in an era wherein monotheistic religions have declined, and commandments are no longer understood to issue from a divine authority. Modern relativism and nihilism on this view are products of the correct realization that without anyone making moral commandments, there is no reason to follow them. Since virtue ethics grounds morality in states of character rather than in universal rules, only a return to virtue ethics would allow for a morality in a secular society. In accordance with this modern turn to virtue ethics, neo-Aristotelian theories of natural normativity have increasingly been defended, for example, by Thompson (2008). In political philosophy, Arendt’s (1958) distinction between the public and private spheres takes the tension between the political community and household as a fundamental force of historical change.

In the twenty-first century, philosophers have also drawn on Aristotle’s theoretical philosophy. Cartwright and Pemberton (2013) revive the concept of natural powers as part of the basic ontology of nature, arguing that such powers explain many of the successes of modern science. Umphrey (2016) argues for the real existence of natural kinds, which serve to classify material entities. Finally, the ‘Sydney School’ has adopted a neo-Aristotelian, realist ontology of mathematics that avoids the extremes of Platonism and nominalism (Franklin 2011). These philosophers argue that, far from being useless antiques, Aristotelian ideas offer fruitful solutions to contemporary philosophical problems.

6. Abbreviations

a. Abbreviations of Aristotle’s Works

Cat.	Categoriae	Categories
Int.	Liber de interpretatione	On Interpretation
AnPr.	Analytica priora	Prior Analytics
AnPo.	Analytica posteriora	Posterior Analytics
Phys.	Physica	Physics
Met.	Metaphysica	Metaphysics
Meteor.	Meteorologica	Meteorology
DC	De Caelo	On the Heavens
HA	Historia Animalium	The History of Animals
Gen. et Corr.	De Generatione et Corruptione	On Generation and Corruption
EN	Ethica Nicomachea	Nicomachean Ethics
DA	De Anima	On the Soul
MA	De Motu Animalium	On the Motion of Animals
Mem.	De Memoria	On Memory
Sens.	De Sensu et Sensibili	On Sense and its Objects
Pol.	Politica	Politics
Top.	Topica	Topics
Rhet.	Rhetorica	Rhetoric
Poet.	Poetica	Poetics
SophElen.	De Sophisticis Elenchis	Sophistical Refutations

b. Other Abbreviations

DL	Diogenes Laertius, The Life of Aristotle.
Bekker	“August Immanuel Bekker.” Encyclopedia Britannica. 9th ed., vol. 3, Cambridge University Press, 1910, p. 661.

7. References and Further Reading

a. Aristotle’s Complete Works

Aristotelis Opera. Edited by A.I. Bekker, Clarendon, 1837.
Complete Works of Aristotle. Edited by J. Barnes, Princeton University Press, 1984.

b. Secondary Sources

i. Life and Early Works

Bos, A.P. “Aristotle on the Etruscan Robbers: A Core Text of ‘Aristotelian Dualism.’” Journal of the History of Philosophy, vol. 41, no. 3, 2003, pp. 289–306.
Chroust, A-H. “Aristotle’s Politicus: A Lost Dialogue.” Rheinisches Museum für Philologie, Neue Folge, 108. Bd., 4. H, 1965, pp. 346–353.
Chroust, A-H. “Eudemus or on the Soul: A Lost Dialogue of Aristotle on the Immortality of the Soul.” Mnemosyne, Fourth Series, vol. 19, fasc. 1, 1966, pp. 17–30.
Chroust, A-H. “Aristotle Leaves the Academy.” Greece and Rome, vol. 14, issue 1, April 1967, pp. 39–43.
Chroust, A-H. “Aristotle’s Sojourn in Assos.” Historia: Zeitschrift für Alte Geschichte, Bd. 21, H. 2, 1972, pp. 170–176.
Fine, G. On Ideas. Oxford University Press, 1993.
Jaeger, W. Aristotle: Fundamentals of the History of His Development. 2nd ed., Oxford: Clarendon Press, 1948.
Kroll, W., editor. Syrianus Commentaria in Metaphysica (Commentaria in Aristotelem Graeca, vol. VI, part I). Berolini, typ. et impensis G. Reimeri, 1902.
Lachterman, D.R. “Did Aristotle ‘Develop’? Reflections on Werner Jaeger’s Thesis.” The Society for Ancient Greek Philosophy Newsletter, vol. 33, 1980.
Owen, G.E.L. “The Platonism of Aristotle.” Studies in the Philosophy of Thought and Action, edited by P.F. Strawson, Oxford University Press, 1968, pp. 147–174.
Pistelli, H., editor. Iamblichi Protrepticus. Lipsiae: In Aedibus B.G. Teubneri, 1888.

ii. Logic

Bäck, A.T. Aristotle’s Theory of Predication. Leiden: Brill, 2000.
Cook Wilson, J. Statement and Inference, vol. 1. Clarendon, 1926.
Groarke, L.F. “Aristotle: Logic.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-log.
Ierodiakonou, K. “Aristotle’s Logic: An Instrument, Not a Part of Philosophy.” Aristotle: Logic, Language and Science, edited by N. Avgelis and F. Peonidis, Thessaloniki, 1998, pp. 33–53.
Lukasiewicz, J. Aristotle’s Syllogistic. 2nd ed., Clarendon, 1957.
Malink, M. Aristotle’s Modal Syllogistic. Harvard University Press, 2013.

iii. Theoretical Philosophy

Anscombe, G.E.M. and P.T. Geach. Three Philosophers. Cornell University Press, 1961.
Bianchi, E. The Feminine Symptom. Fordham University Press, 2014.
Boeri, M. D. “Plato and Aristotle on What Is Common to Soul and Body. Some Remarks on a Complicated Issue.” Soul and Mind in Greek Thought. Psychological Issues in Plato and Aristotle, edited by M.D. Boeri, Y.Y. Kanayama, and J. Mittelmann, Springer, 2018, pp. 153–176.
Boylan, M. “Aristotle: Biology.” Internet Encyclopedia of Philosophy, https://www.iep.utm.edu/aris-bio.
Cook, K. “The Underlying Thing, the Underlying Nature and Matter: Aristotle’s Analogy in Physics I 7.” Apeiron, vol. 22, no. 4, 1989, pp. 105–119.
Hoinski, D. and R. Polansky. “Aristotle on Beauty in Mathematics.” Dia-noesis, October 2016, pp. 37–64.
Humphreys, J. “Abstraction and Diagrammatic Reasoning in Aristotle’s Philosophy of Geometry.” Apeiron, vol. 50, no. 2, April 2017, pp. 197–224.
Humphreys, J. “Aristotelian Imagination and Decaying Sense.” Social Imaginaries, vol. 5, no. 1, Spring 2019, pp. 37–55.
Ibn Bajjah. Ibn Bajjah’s ‘Ilm al-Nafs (Book on the Soul). Translated by M.S.H. Ma’Sumi, Karachi: Pakistan Historical Society, 1961.
Ibn Rushd. Long Commentary on the De Anima of Aristotle. Translated by R.C. Taylor, Yale University Press, 2009.
Jiminez, E. R. “Mind in Body in Aristotle.” The Bloomsbury Companion to Aristotle, edited by C. Baracchi, Bloomsbury, 2014.
Jiminez, E. R. Aristotle’s Concept of Mind. Cambridge University Press, 2017.
Katz, E. “An Absurd Accumulation: Metaphysics M.2, 1076b11–36.” Phronesis, vol. 59, no. 4, 2014, pp. 343–368.
Marx, W. Introduction to Aristotle’s Theory of Being as Being. The Hague: Martinus Nijhoff, 1977.
Mayr, E. The Growth of Biological Thought. Harvard University Press, 1982.
Menn, S. “The Aim and the Argument of Aristotle’s Metaphysics.” Humboldt-Universität zu Berlin, 2013, www.philosophie.hu-berlin.de/de/lehrbereiche/antike/mitarbeiter/menn/contents.
Nakahata, M. “Aristotle and Descartes on Perceiving That We See.” The Journal of Greco-Roman Studies, vol. 53, no. 3, 2014, pp. 99–112.
Sachs, J. “Aristotle: Metaphysics.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-met.
Sharvy, R. “Aristotle on Mixtures.” The Journal of Philosophy, vol. 80, no. 8, 1983, pp. 439–457.
Waterlow, S. Nature, Change, and Agency in Aristotle’s Physics: A Philosophical Study. Clarendon, 1982.
Winslow, R. Aristotle and Rational Discovery. New York: Continuum, 2007.

iv. Practical Philosophy

Angier, T. Techne in Aristotle’s Ethics: Crafting the Moral Life. London: Continuum, 2010.
Baracchi, C. Aristotle’s Ethics as First Philosophy. Cambridge University Press, 2008.
Berns, L. “Rational Animal-Political Animal: Nature and Convention in Human Speech and Politics.” The Review of Politics, vol. 38, no. 2, 1976, pp. 177–189.
Clayton, E. “Aristotle: Politics.” Internet Encyclopedia of Philosophy, www.iep.utm.edu/aris-pol.
Contreras, K.E. “The Rational Expression of the Soul in the Aristotelian Psychology: Deliberating Reasoning and Action.” Eidos, vol. 29, 2018, pp. 339–365 (in Spanish).
Ortiz de Landázuri, M.C. “Aristotle on Self-Perception and Pleasure.” Journal of Ancient Philosophy, vol. VI, issue 2, 2012.
Segvic, H. From Protagoras to Aristotle. Princeton University Press, 2008.
Strauss, L. The City and Man. University of Chicago Press, 1964.
Weinman, M. Pleasure in Aristotle’s Ethics. London: Continuum, 2007.

v. Aristotle’s Influence

Alwishah, A. and J. Hayes, editors. Aristotle and the Arabic Tradition. Cambridge University Press, 2015.
Anscombe, G.E.M. “Modern Moral Philosophy.” Philosophy, vol. 33, no. 124, 1958, pp. 1–19.
Arendt, H. The Human Condition. 2nd ed., University of Chicago Press, 1958.
Cartwright, N. and J. Pemberton. “Aristotelian Powers: Without Them, What Would Modern Science Do?” Powers and Capacities in Philosophy: The New Aristotelianism, edited by R. Groff and J. Greco, Routledge, 2013, pp. 93–112.
Franklin, J. “Aristotelianism in the Philosophy of Mathematics.” Studia Neoaristotelica, vol. 8, no. 1, 2011, pp. 3–15.
McInerny, R. Aquinas Against the Averroists: On There Being Only One Intellect. Purdue University Press, 2002.
Umphrey, S. Natural Kinds and Genesis. Lanham: Lexington Books, 2016.

Author Information

Justin Humphreys
Email: jhh@sas.upenn.edu
University of Pennsylvania
U. S. A.

David Hume: Moral Philosophy

Although David Hume (1711-1776) is commonly known for his philosophical skepticism and empiricist theory of knowledge, he also made many important contributions to moral philosophy. Hume’s ethical thought grapples with questions about the relationship between morality and reason, the role of human emotion in thought and action, the nature of moral evaluation, human sociability, and what it means to live a virtuous life. As a central figure in the Scottish Enlightenment, Hume developed an ethical theory that variously influenced, was influenced by, and faced criticism from thinkers such as Shaftesbury (1671-1713), Francis Hutcheson (1694-1745), Adam Smith (1723-1790), and Thomas Reid (1710-1796). Hume’s ethical theory continues to be relevant for contemporary philosophers and psychologists interested in topics such as metaethics, the role of sympathy and empathy within moral evaluation and moral psychology, and virtue ethics.

Hume’s moral thought carves out numerous distinctive philosophical positions. He rejects the rationalist conception of morality whereby humans make moral evaluations, and understand right and wrong, through reason alone. In place of the rationalist view, Hume contends that moral evaluations depend significantly on sentiment or feeling. Specifically, it is because we have the requisite emotional capacities, in addition to our faculty of reason, that we can determine that some action is ethically wrong, or a person has a virtuous moral character. As such, Hume sees moral evaluations, like our evaluations of aesthetic beauty, as arising from the human faculty of taste. Furthermore, this process of moral evaluation relies significantly upon the human capacity for sympathy, or our ability to partake of the feelings, beliefs, and emotions of other people. Thus, for Hume there is a strong connection between morality and human sociability.

Hume’s philosophy is also known for a novel distinction between natural and artificial virtue. Regarding the latter, we find a sophisticated account of justice in which the rules that govern property, promising, and allegiance to government arise through complex processes of social interaction. Hume’s account of the natural virtues, such as kindness, benevolence, pride, and courage, is explained with rhetorically gripping and vivid illustrations. The picture of human excellence that Hume paints for the reader equally recognizes the human tendency to praise the qualities of the good friend and those of the inspiring leader. Finally, the overall orientation of Hume’s moral philosophy is naturalistic. Instead of basing morality on religious and divine sources of authority, Hume seeks an empirical theory of morality grounded on observation of human nature.

Hume’s moral philosophy is found primarily in Book 3 of The Treatise of Human Nature and his Enquiry Concerning the Principles of Morals, although further context and explanation of certain concepts discussed in those works can also be found in his Essays Moral, Political, and Literary. This article discusses each of the topics outlined above, with special attention given to the arguments he develops in the Treatise.

1. Hume’s Rejection of Moral Rationalism
2. Hume’s Moral Sense Theory
a. The Moral Sense
b. The General Point of View
3. Sympathy and Humanity
a. Sympathy
b. Humanity
4. Hume’s Classification of the Virtues and the Standard of Virtue
5. Justice and the Artificial Virtues
6. The Natural Virtues
7. References and Further Reading

1. Hume’s Rejection of Moral Rationalism

Many philosophers have believed that the ability to reason marks a strict separation between humans and the rest of the natural world. Views of this sort can be found in thinkers such as Plato, Aristotle, Aquinas, Descartes, and Kant. One of the more philosophically radical aspects of Hume’s thought is his attack on this traditional conception. For example, he argues that the same evidence we have for thinking that human beings possess reason should also lead us to conclude that animals are rational (T 1.3.16, EHU 9). Hume also contends that the intellect, or “reason alone,” is relatively powerless on its own and needs the assistance of the emotions or “passions” to be effective. This conception of reason and emotion plays a critical role in Hume’s moral philosophy.

One of the foremost topics debated in the seventeenth and eighteenth centuries about the nature of morality was the relationship between reason and moral evaluation. Hume rejected a position known as moral rationalism. The moral rationalists held that ethical evaluations are made solely upon the basis of reason without the influence of the passions or feelings. The seventeenth- and eighteenth-century moral rationalists include Ralph Cudworth (1617-1688), Samuel Clarke (1675-1729), and John Balguy (1688-1748). Clarke, for instance, writes that morality consists in certain “necessary and eternal” relations (Clarke 1991[1706]: 192). He argues that it is “fit and reasonable in itself” that one should preserve the life of an innocent person and, likewise, unfit and unreasonable to take someone’s life without justification (Clarke 1991[1706]: 194). The very relationship between myself, a rational human being, and this other individual, another rational human being who is innocent of any wrongdoing, implies that it would be wrong of me to kill this person. The moral truths implied by such relations are just as evident as the truths implied by mathematical relations. It is just as irrational to (a) deny the wrongness of killing an innocent person as it would be to (b) deny that three multiplied by three is equal to nine (Clarke 1991[1706]: 194). As evidence, Clarke points out that both (a) and (b) enjoy nearly universal agreement. Thus, Clarke believes we should conclude that both (a) and (b) are self-evident propositions discoverable by reason alone. Consequently, it is in virtue of the human ability to reason that we make moral evaluations and recognize our moral duties.

a. The Influence Argument

Although Hume rejects the rationalist position, he does allow that reason has some role to play in moral evaluation. In the second Enquiry Hume argues that, although our determinations of virtue and vice are based upon an “internal sense or feeling,” reason is needed to ascertain the facts required to form an accurate view of the person being evaluated and, thus, is necessary for accurate moral evaluations (EPM 1.9). Hume’s claim, then, is more specific. He denies that moral evaluation is the product of “reason alone.” It is not solely because of the rational part of human nature that we can distinguish moral goodness from moral badness. Not “every rational being” can make moral evaluations (T 3.1.1.4). Purely rational beings that are devoid of feelings and emotion, if any such beings exist, could not understand the difference between virtue and vice. Something other than reason is required. Below is an outline of the argument Hume gives for this conclusion at T 3.1.1.16. Call this the “Influence Argument.”

1. Moral distinctions can influence human actions.
2. “Reason alone” cannot influence human actions.
3. Therefore, moral distinctions are not the product of “reason alone.”
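
The logical shape of this outline can be checked mechanically. The following Lean 4 fragment is a minimal sketch rather than Hume’s own formulation: the predicate names (Moral, Influences, FromReasonAlone) are hypothetical labels introduced here for illustration, and the second premise is read, as an interpretive assumption, as the claim that nothing produced by reason alone can influence action.

```lean
-- Hypothetical formalization of the Influence Argument.
-- `Judgment` stands for any evaluation or distinction the mind can make.
theorem influence_argument {Judgment : Type}
    (Moral Influences FromReasonAlone : Judgment → Prop)
    -- Premise 1: moral distinctions can influence human actions.
    (p1 : ∀ j, Moral j → Influences j)
    -- Premise 2 (assumed reading): products of reason alone cannot influence actions.
    (p2 : ∀ j, FromReasonAlone j → ¬ Influences j) :
    -- Conclusion: moral distinctions are not products of reason alone.
    ∀ j, Moral j → ¬ FromReasonAlone j :=
  fun j hMoral hReason => p2 j hReason (p1 j hMoral)
```

On this reading the argument is valid, so any disagreement with its conclusion has to target the truth of one of the two premises.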

Let us begin by considering premise (1). Notice that premise (1) uses the term “moral distinctions.” By “moral distinction” Hume means evaluations that differentiate actions or character traits in terms of their moral qualities (T 3.1.1.3). Unlike the distinctions we make with our pure reasoning faculty, Hume claims moral distinctions can influence how we act. The claim that some action, X, is vicious can make us less likely to perform X, and the opposite in the case of virtue. Those who believe it is morally wrong to kill innocent people will, consequently, be less likely to kill innocent people. This does not mean moral evaluations motivate decisively. One might recognize that X is a moral duty, but still fail to do X for various reasons. Hume only claims that the recognition of moral right and wrong can motivate action. If moral distinctions were not practical in this sense, then it would be pointless to attempt to influence human behavior with moral rules (T 3.1.1.5).

Premise (2) requires a more extensive justification. Hume provides two separate arguments in support of (2), which have been termed by Rachel Cohon as the “Divide and Conquer Argument” and the “Representation Argument” (Cohon 2008). These arguments are discussed below.

b. The Divide and Conquer Argument

Hume reminds us that the justification for premise (2) of the Influence Argument was already established earlier at Treatise 2.3.3 in a section entitled “Of the influencing motives of the will.” Hume begins this section by observing that many believe humans act well by resisting the influence of our passions and following the demands of reason (T 2.3.3.1). For instance, in the Republic Plato (427–347 B.C.E.) outlines a conception of the well-ordered soul in which the rational part rules over the soul’s spirited and appetitive parts. Or, consider someone who knows that eating another piece of cake is harmful to her health, and values her health, but still eats another piece of cake. Such situations are often characterized as letting passion or emotion defeat reason. Below is the argument that Hume uses to reject this conception.

1. Reason is either demonstrative or probable.
2. Demonstrative reason alone cannot influence the will (or influence human action).
3. Probable reason alone cannot influence the will (or influence human action).
4. Therefore, “reason alone” cannot influence the will (or influence human action).

This argument is referred to as the “Divide and Conquer Argument” because Hume divides reasoning into two types, and then demonstrates that neither type of reasoning can influence the human will by itself. From this, it follows that “reason alone” cannot influence the will.
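
The divide-and-conquer structure is an instance of reasoning by cases, which can be made explicit in Lean 4. This is only an illustrative sketch: the names Demonstrative, Probable, and MovesWill are hypothetical labels, and the first premise is assumed to mean that every exercise of reason falls under at least one of the two types.

```lean
-- Hypothetical formalization of the Divide and Conquer Argument.
theorem divide_and_conquer {Reasoning : Type}
    (Demonstrative Probable MovesWill : Reasoning → Prop)
    -- Premise 1: reason is either demonstrative or probable.
    (p1 : ∀ r, Demonstrative r ∨ Probable r)
    -- Premise 2: demonstrative reason alone cannot influence the will.
    (p2 : ∀ r, Demonstrative r → ¬ MovesWill r)
    -- Premise 3: probable reason alone cannot influence the will.
    (p3 : ∀ r, Probable r → ¬ MovesWill r) :
    -- Conclusion: reason alone cannot influence the will.
    ∀ r, ¬ MovesWill r :=
  -- Case analysis on the kind of reasoning involved.
  fun r => (p1 r).elim (p2 r) (p3 r)
```

The proof term shows that the conclusion depends on the division in the first premise being exhaustive; a third kind of reasoning not covered by the other two premises would block the inference.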

The first type of reasoning Hume discusses is demonstrative reasoning that involves “abstract relations of ideas” (T 2.3.3.2). Consider demonstratively certain judgments such as “2+2=4” or “the interior angles of a triangle equal 180 degrees.” This type of reason cannot motivate action because our will is only influenced by what we believe has physical existence. Demonstrative reason, however, only acquaints us with abstract concepts (T 2.3.3.2). Using Hume’s example, mathematical demonstrations might provide a merchant with information about how much money she owes to another person. Yet, this information only matters because she has a desire to square her debt (T 2.3.3.2). It is this desire, not the demonstrative reasoning itself, which provides the motivational force.

Why can probable reasoning not have practical influence? Probable reasoning involves making inferences on the basis of experience (T 2.3.3.1). An example of this is the judgments we make of cause and effect. As Hume established earlier in the Treatise, our judgments of cause and effect involve recognizing the “constant conjunction” of certain objects as revealed through experience (see, for instance, T 1.3.6.15). Since probable reasoning can inform us of what actions have a “constant conjunction” with pleasure or pain, it might seem that probable reasoning could influence the will. However, the fundamental motivational force does not arise from our ability to infer the relation of cause and effect. Rather, the source of our motivation is the “impulse” to pursue pleasure and avoid pain. Thus, once again, reason simply plays the role of discovering how to satisfy our desires (T 2.3.3.3). For example, my belief that eating a certain fruit will cause good health seems capable of motivating me to eat that fruit (T 3.3.1.2). However, Hume argues that this causal belief must be accompanied by some passion, specifically the desire for good health, for it to move the will. We would not care about the fact that eating the fruit contributes to our health if health were not a desired goal. Thus, Hume sketches a picture in which the motivational force to pursue a goal always comes from passion, and reason merely informs us of the best means for achieving that goal (T 2.3.3.3).

Consequently, when we say that some passion is “unreasonable,” we mean either that the passion is founded upon a false belief or that passion impelled us to choose the wrong method for achieving our desired end (T 2.3.3.7). In this context Hume famously states that it is “not contrary to reason to prefer the destruction of the whole world to the scratching of my finger” (T 2.3.3.6). It can be easy to misunderstand Hume’s point here. Hume does not believe there is no basis for condemning the person who prioritizes scratching her finger. Hume’s point is simply that reason itself cannot distinguish between these choices. A being that felt completely indifferent toward both the suffering and well-being of other human beings would have no preference for what outcome results (EPM 6.4).

The second part of Hume’s thesis is that, because “reason alone” cannot motivate actions, there is no real conflict between reason and passion (T 2.3.3.1). The view that reason and passion can conflict misunderstands how each functions. Reason can only serve the ends determined by our passions. As Hume explains in another well-known quote, “Reason is, and ought only to be the slave of the passions” (T 2.3.3.4). Reason and passion have fundamentally different functions and, thus, cannot encroach upon one another. Why, then, do we commonly describe succumbing to temptation as a failure to follow reason? Hume explains that the operations of the passions and reason often feel similar. Specifically, both the calm passions that direct us toward our long-term interest and the operations of reason exert themselves calmly (T 2.3.3.8). Thus, the person who possesses “strength of mind,” or what is commonly called “will power,” is not the individual whose reason conquers her passions. Instead, being strong-willed means having a will that is primarily influenced by calm instead of violent passions (T 2.3.3.10).

c. The Representation Argument

The second argument in support of premise (2) of the “Influence Argument” is found in both T 3.1.1 and T 2.3.3. This argument is commonly referred to as the “Representation Argument.” It is expressed most succinctly at T 3.1.1.9. The argument has two parts. The first part of the argument is outlined below.

1. That which is an object of reason must be capable of being evaluated as true or false (or be “truth-apt”).
2. That which is capable of being evaluated as true or false (or is “truth-apt”) must be capable of agreement (or disagreement) with some relation of ideas or matter of fact.
3. Therefore, that which can neither agree (nor disagree) with any relation of ideas or matter of fact cannot be an object of reason.

The first portion of the argument establishes what reason can (and cannot) accomplish. Premise (1) relies on the idea that the purpose of reason is to discover truth and falsehood. In fact, in an earlier Treatise section Hume describes truth as the “natural effect” of our reason (T 1.4.1.1). So, whatever is investigated or revealed through reason must be the sort of claim that it makes sense to evaluate as true or false. Philosophers call such claims “truth-apt.” What sorts of claims are truth-apt? Only those claims which can agree (or disagree) with some abstract relation of ideas or fact about existence. For instance, the claim that “the interior angles of a triangle add up to 180 degrees” agrees with the relation of ideas that makes up our concept of triangle. Thus, such a claim is true. The claim that “China is the most populated country on planet Earth” agrees with the empirical facts about world population and, thus, can also be described as true. Likewise, the claims that “the interior angles of a triangle add up to 200 degrees” or that “the United States is the most populated country on planet Earth” do not agree with the relevant ideas or existential facts. Yet, because it is appropriate to label each of these as false, both claims are still “truth-apt.” From this, it follows that something can only be an object of reason if it can agree or disagree with a relation of ideas or matter of fact.

Is that which motivates our actions “truth-apt” and, consequently, within the purview of reason? Hume addresses that point in the second part of the Representation Argument:

4. Human “passions, volitions, and actions” (PVAs) can neither agree (nor disagree) with any relation of ideas or matter of fact.

5. Therefore, PVAs cannot be objects of reason (or reason cannot produce action).

Why does the argument talk about “passions, volitions, and actions” (PVAs) in premise (4)? PVAs are the component parts of motivation. Passions cause desire or aversion toward a certain object, which results in the willing of certain actions. Thus, the argument hinges on premise (4)’s claim that PVAs can never agree or disagree with relations of ideas or matters of fact. Hume’s justification for this claim is again found at T 2.3.3.5 from the earlier Treatise section “Of the Influencing Motives of the Will.” Here Hume argues that for something to be truth-apt it must have a “representative quality” (T 2.3.3.5). That is, it must represent some type of external reality. The claim that “the interior angles of a triangle equal 180 degrees” represents a fact about our concept of a triangle. The claim that “China is the most populated country on planet Earth” represents a fact about the current population distribution of Earth. Hume argues the same cannot be said of passions such as anger. The feeling of anger, just like the feeling of being thirsty or being ill, is not meant to be a representation of some external object (T 2.3.3.5). Anger, of course, is a response to something external. For example, one might feel anger in response to a friend’s betrayal. However, this feeling of anger is not meant to represent my friend’s betrayal. A passion or emotion is simply a fact about the person who feels it. Consequently, since reason only deals with what is truth-apt, it follows that (5) PVAs cannot be objects of reason.
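
Taken together, the two parts of the Representation Argument form a short chain of conditionals. The Lean 4 fragment below is a sketch under stated assumptions: the predicate names (ObjectOfReason, TruthApt, CanAgree, PVA) are hypothetical labels, and CanAgree abbreviates “capable of agreement or disagreement with some relation of ideas or matter of fact.”

```lean
-- Hypothetical formalization of the Representation Argument.
theorem representation_argument {Item : Type}
    (ObjectOfReason TruthApt CanAgree PVA : Item → Prop)
    -- Premise 1: objects of reason must be truth-apt.
    (p1 : ∀ x, ObjectOfReason x → TruthApt x)
    -- Premise 2: what is truth-apt can agree or disagree with
    -- some relation of ideas or matter of fact.
    (p2 : ∀ x, TruthApt x → CanAgree x)
    -- Premise 4: passions, volitions, and actions cannot so agree or disagree.
    (p4 : ∀ x, PVA x → ¬ CanAgree x) :
    -- Conclusion 5: passions, volitions, and actions are not objects of reason.
    ∀ x, PVA x → ¬ ObjectOfReason x :=
  fun x hPVA hObj => p4 x hPVA (p2 x (p1 x hObj))
```

The intermediate conclusion (3) does not appear as a separate hypothesis because it is the contrapositive of premises (1) and (2) combined; the proof term uses that combination directly.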

d. Hume and Contemporary Metaethics

Hume’s moral philosophy has continued to influence contemporary philosophical debates in metaethics. Consider the following three metaethical debates.

Moral Realism and Anti-Realism: Moral realism holds that moral statements, such as “lying is morally wrong,” describe mind-independent facts about the world. Moral anti-realism denies that moral statements describe mind-independent facts about the world.

Moral Cognitivism and Noncognitivism: Moral cognitivism holds that moral statements, such as “lying is morally wrong,” are capable of being evaluated as true or false (or are “truth-apt”). Moral noncognitivism denies that such statements can be evaluated as true or false (or can be “truth-apt”).

Moral Internalism and Externalism: Moral internalism holds that someone who recognizes that it is one’s moral obligation to perform X necessarily has at least some motive to perform X. Moral externalism holds that one can recognize that it is one’s moral obligation to perform X and simultaneously not have any motive to perform X.

While there is not just one “Humean” position on each of these debates, many contemporary meta-ethicists who see Hume as a precursor take a position that combines anti-realism, noncognitivism, and internalism. Much of the support for reading Hume as an anti-realist comes from consideration of his moral sense theory (which is examined in the next section). Evidence for an anti-realist reading of Hume is often found at T 3.1.1.26. Hume claims that, for any vicious action, the moral wrongness of the action “entirely escapes you, as long as you consider the object.” Instead, to encounter the moral wrongness you must “turn your reflexion into your own breast” (T 3.1.1.26). The wrongness of murder, taking Hume’s example, does not lie in the act itself as something that exists apart from the human mind. Rather, the wrongness of murder lies in how the observer reacts to the murder or, as we will see below, the painful sentiment that such an act produces in the observer.

The justification for reading Hume as an internalist comes primarily from the Influence Argument, which relies on the internalist idea that moral distinctions can, by themselves, influence the will and produce action. The claim that Hume is a noncognitivist is more controversial. Support for reading Hume as a noncognitivist is sometimes found in the so-called “is-ought” paragraph. There Hume warns us against deriving a conclusion that we “ought, or ought not” do something from the claim that something “is, and is not” the case (T 3.1.1.27). There is significant debate among Hume scholars about what Hume means to say in this passage. According to one interpretation, Hume is denying that it is appropriate to derive moral conclusions (such as “one should give to charity”) from any set of strictly factual or descriptive premises (such as “charity relieves suffering”). This is taken to imply support for noncognitivism by introducing a strict separation between facts (which are truth-apt) and values (which are not truth-apt).

Some have questioned the standard view of Hume as a noncognitivist. Hume does think (as seen in the Representation Argument) that the passions, which influence the will, are not truth-apt. Does the same hold for the moral distinctions themselves? Rachel Cohon has argued, to the contrary, that moral distinctions describe statements that are evaluable as true or false (Cohon 2008). Specifically, they describe beliefs about what character traits produce pleasure and pain in human spectators. If this interpretation is correct, then Hume’s metaethics remains anti-realist (moral distinctions refer to facts about the minds of human observers), but it can also be cognitivist. That is because the claim that human observers feel pleasure in response to some character trait represents an external matter of fact and, thus, can be denominated true or false depending upon whether it represents this matter of fact accurately.

2. Hume’s Moral Sense Theory

Hume claims that if reason is not responsible for our ability to distinguish moral goodness from badness, then there must be some other capacity of human beings that enables us to make moral distinctions (T 3.1.1.4). Like his predecessors Shaftesbury (1671-1713) and Francis Hutcheson (1694-1745), Hume believes that moral distinctions are the product of a moral sense. In this respect, Hume is a moral sentimentalist. It is primarily in virtue of our ability to feel pleasure and pain in response to various traits of character, and not in virtue of our capacity of “reason alone,” that we can distinguish between virtue and vice. This section covers the major elements of Hume’s moral sense theory.

a. The Moral Sense

Moral sense theory holds, roughly, that moral distinctions are recognized through a process analogous to sense perception. Hume explains that virtue is that which causes pleasurable sensations of a specific type in an observer, while vice causes painful sensations of a specific type. While all moral approval is a sort of pleasurable sensation, this does not mean that all pleasurable sensations qualify as instances of moral approval. Just as the pleasure we feel in response to excellent music is different from the pleasure we derive from excellent wine, so the pleasure we derive from viewing a person’s character is different from the pleasure we derive from inanimate objects (T 3.1.2.4). So, moral approval is a specific type of pleasurable sensation, only felt in response to persons, with a particular phenomenological quality.

Along with the common experience of feeling pleasure in response to virtue and pain when confronted with vice (T 3.1.2.2), Hume also thinks this view follows from his rejection of moral rationalism. Everything in the mind, Hume argues, is either an impression or idea. Hume understands an impression to be the first, and most forceful, appearance of a sensation or feeling in the human mind. An idea, by contrast, is a less forceful copy of that initial impression that is preserved in memory (T 1.1.1.1). Hume holds that all reasoning involves comparing our ideas. This means that moral rationalism must hold that we arrive at an understanding of morality merely through a comparison of ideas (T 3.1.1.4). However, since Hume has shown that moral distinctions are not the product of reason alone, moral distinctions cannot be made merely through comparison of ideas. Therefore, if moral distinctions are not made by comparing ideas, they must be based upon our impressions or feelings.

Hume’s claim is not that virtue is an inherent quality of certain characters or actions, and that when we encounter a virtuous character we feel a pleasurable sensation that constitutes evidence of that inherent quality. If that were true, then the moral status of some character trait would be inferred from the fact that we are experiencing a pleasurable sensation. This would conflict with Hume’s anti-rationalism. Hume reiterates this point, stating that spectators “do not infer a character to be virtuous, because it pleases: But in feeling that it pleases [they] in effect feel that it is virtuous” (T 3.1.2.3). Because moral distinctions are not made through a comparison of ideas, Hume believes it is more accurate to say that morality is a matter of feeling rather than judgment (T 3.1.2.1). Since virtue and vice are not inherent properties of actions or persons, what constitutes the virtuousness (or viciousness) of some action or character must be found within the observer or spectator. When, for example, you determine that some action or character trait is vicious, this just means that your (human) nature is constituted such that you respond to that action or character trait with a feeling of disapproval (T 3.1.1.26). One’s ability to see the act of murder, not merely as a cause of suffering and misery, but as morally wrong, depends upon the emotional capacity to feel a painful sentiment in response to this phenomenon. Thus, Hume claims that the quality of “vice entirely escapes you, as long as you consider the object” (T 3.1.1.26). Virtue and vice exist, in some sense, through the sentimental reactions that human observers have toward various “objects.”

This provides the basis for Hume’s comparison between moral evaluation and sense perception, which lies at the foundation of his moral sense theory. Just like the experiences of taste, smell, sight, hearing, and touch produced by our physical senses, virtue and vice exist in the minds of human observers instead of in the actions themselves (T 3.1.1.26). Here Hume appeals to the primary-secondary quality distinction. Sensory qualities and moral qualities are both observer-dependent. Just as there would be no appearance of color if there were no observers, so there would also be no such thing as virtue or vice without beings capable of feeling approval or disapproval in response to human actions. Likewise, a human being who lacked the required emotional capacities would be unable to understand what the rest of us mean when we say that some trait is virtuous or vicious. For instance, imagine a psychopath who has the necessary reasoning ability to understand the consequences of murder, but lacks aversion toward it and, thus, cannot determine or recognize its moral status. In fact, the presence of psychopathy, and the inability of psychopaths to understand moral judgments, is sometimes taken as an objection to moral rationalism.

Furthermore, our moral sense responds specifically to some “mental quality” (T 3.3.1.3) of another person. We can think of a “mental quality” as a disposition one has to act in certain ways or as a character trait. For example, when we approve of the courageous individual, we are approving of that person’s willingness to stand resolute in the face of danger. Consequently, actions can only be considered virtuous derivatively, as signs of another person’s mental dispositions and qualities (T 3.3.1.4). A single action, unlike the habits and dispositions that make up our character, is fleeting and may not accurately represent that character. Only settled character traits are sufficiently “durable” to determine our evaluations of others (T 3.3.1.5). For this reason, Hume’s ethical theory is sometimes seen as a form of virtue ethics.

b. The General Point of View

Hume posits an additional requirement that some sentiment must meet to qualify as a sentiment of moral approval (or disapproval). Imagine that a professor unfairly shows favor toward one student by giving her an “A” for sub-standard work. In this case, it is not difficult to imagine the student being pleased with the professor’s actions. However, if she were honest, that student would likely not think she was giving moral approval of the professor’s unfair grading. Instead, she is evaluating the influence the professor’s actions have upon her perceived self-interest. This case suggests that there is an important difference between the evaluations we make of other people based upon how they influence our interests, and the evaluations we make of others based upon their moral character.

This idea plays a significant role in Hume’s moral theory. Moral approval only occurs from a perspective in which the spectator does not take her self-interest into consideration. Rather, moral approval occurs from a more “general” vantage point (T 3.1.2.4). In the conclusion to the second Enquiry Hume makes this point by distinguishing the languages of morality and self-interest. When someone labels another “his enemy, his rival, his antagonist, his adversary,” he is evaluating from a self-interested point of view. By contrast, when someone labels another with moral terms like “vicious or odious or depraved,” she is inhabiting a general point of view where her self-interest is set aside (EPM 9.6). Speaking the language of morality, then, requires abstracting away from one’s personal perspective and considering the wider effects of the conduct under evaluation. This unbiased point of view is one aspect of what Hume refers to as the “general” (T 3.3.1.15) or “common” (T 3.3.1.30, EPM 9.6) point of view. Furthermore, he suggests that the ability to transcend our personal perspective, and adopt a general vantage point, ties human beings together as “the party of humankind against vice and disorder, its common enemy” (EPM 9.9). Thus, Hume’s theory of moral approval is related in important ways to his larger goal of demonstrating that moral life is an expression of human sociability.

The general vantage point from which moral evaluations are made does not just exclude considerations of self-interest. It also corrects for other factors that can distort our moral evaluations. For instance, adoption of the general point of view corrects our natural tendency to give greater praise to those who exist in close spatial-temporal proximity. Hume notes that someone might feel a stronger degree of praise for her hardworking servant than she feels for the historical representation of Marcus Brutus (T 3.3.1.16). From an objective point of view, Brutus merits greater praise for his moral character. However, we are acquainted with our servant and frequently interact with him. Brutus, on the other hand, is only known to us through historical accounts. Temporal distance causes our immediate, natural feelings of praise for Brutus to be less intense than the approval we give to our servant. Yet, this variation is not reflected in our moral evaluations. We do not judge that our servant has a superior moral character, and we do not automatically conclude that those who live in our own country are morally superior to those living in foreign countries (T 3.3.1.14). So, Hume needs some explanation of why our considered moral evaluations do not match our immediate feelings.

Hume responds by explaining that, when judging the quality of someone’s character, we adopt a perspective that discounts our specific spatial-temporal location or any other special resemblance we might have with the person being evaluated. Hume tells us that this vantage point is one in which we consider the influence that the person in question has upon his or her contemporaries (T 3.3.3.2). When we evaluate Brutus’ character, we do not consider the influence that his qualities have upon us now. As a historical figure who no longer exists, Brutus’ virtuous character does not provide any present benefit. Instead, we evaluate Brutus’ character based upon the benefits it had for those who lived in Brutus’ own time. We recognize that if we had lived in Brutus’ own time, and were a fellow Roman citizen with him, then we would express much greater praise and admiration for his character (T 3.3.1.16).

Hume identifies a second type of correction that the general point of view is responsible for as well. Hume observes that we have the capacity to praise someone whose character traits are widely beneficial, even when unfortunate external circumstances prevent those traits from being effective (T 3.3.1.19). For example, we might imagine a generous, kind-hearted individual whose generosity fails to make much of an impact on others because she is of modest means. Hume claims, in these cases, our considered moral evaluation is not influenced by such external circumstances: “Virtue in rags is still virtue” (T 3.3.1.19). At the same time, we might be puzzled how this could be the case since we naturally give stronger praise to the person whose good fortune enables her virtuous traits to produce actual benefits (T 3.3.1.21). Hume makes a two-fold response here. First, because we know that (for instance) a generous character is often correlated with benefits to society, we establish a “general rule” that links these together (T 3.3.1.20). Second, when we take up the general point of view, we ignore the obstacles of misfortune that prevent this virtuous person’s traits from achieving their intended goal (T 3.3.1.21). Just as we discount spatial-temporal proximity, so we also discount the influence of fortune when making moral evaluations of another’s character traits.

So, adopting the general point of view requires spectators to set aside a multitude of considerations: self-interest, demographic resemblance, spatial-temporal proximity, and the influence of fortune. What motivates us to adopt this vantage point? Hume explains that doing so enables us to discuss the evaluations we make of others. If we each evaluated from our personal perspective, then a character that garnered the highest praise from me might garner only mild praise from you. The general point of view, then, provides a common basis from which differently situated individuals can arrive at some common understanding of morality (T 3.3.1.15). Still, Hume notes that this practical solution may only regulate our language and public judgments of our peers. Our personal feelings often prove too entrenched. When our actual sentiments are too resistant to correction, Hume notes that we at least attempt to conform our language to the objective standard (T 3.3.1.16).

In addition to explaining why it is that we adopt the general point of view, one might also think that Hume owes us an explanation of why this perspective constitutes the standard of correctness for moral evaluation. In one place Hume states that the “corrections” we make to our sentiments from the general point of view are “alone regarded, when we pronounce in general concerning the degrees of vice and virtue” (T 3.3.1.21). Nine paragraphs later Hume again emphasizes that the sentiments we feel from the general point of view constitute the “standard of virtue and morality” (T 3.3.1.30). What gives the pronouncements we make from the general point of view this authoritative status?

Hume scholars are divided on this point. One possibility, developed by Geoffrey Sayre-McCord, is that adopting the general point of view enables us to avoid the practical conflicts that inevitably arise when we judge character traits from our individual perspectives (Sayre-McCord 1994: 213-220). Jacqueline Taylor, focusing primarily on the second Enquiry, argues that the normative authority of the general point of view stems from the fact that it is the product of a process of social deliberation and negotiation requiring the virtues of good judgment (Taylor 2002). Rachel Cohon argues that evaluations issuing from the general point of view are most likely to form true ethical beliefs (Cohon 2008: 152-156). In a somewhat similar vein, Kate Abramson argues that the general point of view enables us to correctly determine whether some character trait enables its possessor to act properly within the purview of her relationships and social roles (Abramson 2008: 253). Finally, Phillip Reed argues that, to the contrary, the general point of view does not constitute Hume’s “standard of virtue” (Reed 2012).

3. Sympathy and Humanity

a. Sympathy

We have seen that, for Hume, a sentiment can qualify as a moral sentiment only if it is not the product of pure self-interest. This implies that human nature must possess some capacity to get outside of itself and take an interest in the fortunes and misfortunes of others. When making moral evaluations we approve qualities that benefit the possessor and her associates, while disapproving of those qualities that make the possessor harmful to herself or others (T 3.3.1.10). This requires that we can take pleasure in that which benefits complete strangers. Thus, moral evaluation would be impossible without the capacity to partake of the pleasure (or pain) of any being that shares our underlying human nature. Hume identifies “sympathy” as the capacity that makes moral evaluation possible by allowing us to take an interest in the public good (T 3.3.1.9). The idea that moral evaluation is based upon sympathy can also be found in the work of Hume’s contemporary Adam Smith (1723-1790). However, the account of sympathy found in Smith’s work also differs in important ways from what we find in Hume.

Because of the central role that sympathy plays in Hume’s moral theory, his account of sympathy deserves further attention. Hume tells us that sympathy is the human capacity to “receive” the feelings and beliefs of other people (T 2.1.11.2). That is, it is the process by which we experience what others are feeling and thinking. This process begins by forming an idea of what another person is experiencing. This idea might be formed through observing the effects of another’s feeling (T 2.1.11.3). For instance, from my observation that another person is smiling, and my prior knowledge that smiling is associated with happiness, I form an idea of the other’s happiness. My idea of another’s emotion can also be formed prior to the other person feeling the emotion. This occurs through observing the usual causes of that emotion. Hume provides the example of someone who observes surgical instruments being prepared for a painful operation. He notes that this person would feel terrified for the person about to suffer through the operation even though the operation had not yet begun (T 3.3.1.7). This is because the observer already established a prior mental association between surgical instruments and pain.

Since sympathy causes us to feel the sentiments of others, simply having an idea of another’s feeling is insufficient. That idea must be converted into something with more affective potency. Our idea of what another feels must be transformed into an impression (T 2.1.11.3). The reason this conversion is possible is that the only difference between impressions and ideas is the intensity with which they are felt in the mind (T 2.1.11.7). Recall that impressions are the most forceful and intense whereas ideas are merely “faint images” of our impressions (T 1.1.1.1). Hume identifies two facts about human nature which explain what causes our less vivacious idea of another’s passion to be converted into an impression and, notably, become the very feeling the other is experiencing (T 2.1.11.3). First, we always experience an impression of ourselves which is not surpassed in force, vivacity, and liveliness by any other impression. Second, because we have this lively impression of ourselves, Hume believes it follows that whatever is related to that impression must receive some share of that vivacity (T 2.1.11.4). From these points, it follows that our idea of another’s impression will be enlivened if that idea has some relation to ourselves.

Hume explains the relationship between our idea of another’s emotion and ourselves in terms of his more general conception of how the imagination produces associations of ideas. Hume understands the association of ideas as a “gentle force” that explains why certain mental perceptions repeatedly occur together. He identifies three such ways in which ideas become associated: resemblance (the sharing of similar characteristics), contiguity (proximity in space or time), and causation (roughly, the constant conjunction of two ideas in which one idea precedes another in time) (T 1.1.4.1). Hume appeals to each of these associations to explain the relationship between our idea of another’s emotion and our impression of self (T 2.1.11.6). However, resemblance plays the most important role. Although individual human beings differ from one another, there is also an underlying commonality or resemblance among all members of the human species (T 2.1.11.5). For example, when we form an idea of another’s happiness, we implicitly recognize that we ourselves are also capable of that same feeling. That idea of happiness, then, becomes related to ourselves and, consequently, receives some of the vivacity that is held by the impression of our self. In this way, our ideas of how others feel become converted into impressions and we “feel with” our fellow human beings.

Although sympathy makes it possible for us to care for others, even those we have no close or immediate connection with, Hume acknowledges that it does not do so in an entirely impartial or egalitarian manner. The strength of our sympathy is influenced both by the universal resemblance that exists among all human beings and by more parochial types of resemblance. We will sympathize more easily with those who share various demographic similarities such as language, culture, citizenship, or place of origin (T 2.1.11.5). Consequently, when the person we are sympathizing with shares these similarities we will form a stronger conception of their feelings, and when such similarities are absent our conception of their feeling will be comparatively weaker. Likewise, we will have stronger sympathy with those who live in our own city, state, country, or time, than we will with those who are spatially or temporally distant. In fact, it is this aspect of sympathy which prompts Hume to introduce the general point of view (discussed above). It is our natural sympathy that causes us to give stronger praise to those who exist in closer spatial-temporal proximity, even though our considered moral evaluations do not exhibit such variation. Hume poses this point as an objection to his claim that our moral evaluations proceed from sympathy (T 3.3.1.14). Hume’s appeal to the general point of view allows him to respond to this objection. Moral evaluations arise from sympathetic feelings that are corrected by the influence of the general point of view.

b. Humanity

While sympathy plays a crucial role in Hume’s moral theory as outlined in the Treatise, explicit mentions of sympathy are comparatively absent from the Enquiry. In place of Hume’s detailed description of sympathy, we find Hume appealing to the “principle of humanity” (EPM 9.6). He understands this as the human disposition that produces our common praise for that which benefits the public and common blame for that which harms the public (EPM 5.39). The principle of humanity explains why we prefer seeing things go well for our peers instead of seeing them go badly. It also explains why we would not hope to see our peers suffer if that suffering in no way benefited us or satisfied our resentment from a prior provocation (EPM 5.39). Like sympathy, then, Hume uses humanity to explain our concern for the well-being of others. However, Hume’s discussion of humanity in the Enquiry does not appeal (at least explicitly) to the cognitive mechanism that underlies Hume’s account of sympathy, and he even expresses skepticism about the possibility of explaining this mechanism. So, the Enquiry does not discuss how our idea of another’s pleasures and pains is converted into an impression. This does not necessarily mean that sympathy is absent from the Enquiry. For instance, in Enquiry Section V Hume describes having the feelings of others communicated to us (EPM 5.18) and details how sharing our sentiments in a social setting can strengthen our feelings (EPM 5.24, EPM 5.35).

As he did with sympathy in the Treatise, Hume argues that the principle of humanity makes moral evaluations possible. It is because we naturally approve of that which benefits society, and disapprove of that which harms society, that we see some character traits as virtuous and others as vicious. Hume’s justification for this claim follows from his rejection of the egoists (EPM 5.6). Here Hume has in mind those like Thomas Hobbes (1588-1679) and Bernard Mandeville (1670-1733), who each believed that our moral judgments are the product of self-interest. On their view, the qualities we consider virtuous are those that serve our interests, and the qualities we consider vicious are those that do not. Hume gives a variety of arguments against this position. He contends that egoism cannot explain why we praise the virtues of historical figures (EPM 5.7) or recognize the virtues of our enemies (EPM 5.8). If moral evaluations are not the product of self-interest, then Hume concludes that they must be caused by some principle which gives us real concern for others. This is the principle of humanity. Hume admits that the sentiments produced by this principle might often be unable to overpower the influence that self-interest has on our actions. However, this principle is strong enough to give us at least a “cool preference” for that which is beneficial to society, and provides the foundation upon which we distinguish between virtue and vice (EPM 9.4).

4. Hume’s Classification of the Virtues and the Standard of Virtue

Since Hume thinks virtuous qualities benefit society, while vicious qualities harm society, one might conclude that Hume should be placed within the utilitarian moral tradition. While Hume’s theory has utilitarian elements, he does not think evaluations of virtue and vice are based solely upon considerations of collective utility. Hume identifies four different “sources” of moral approval, or four different effects of character traits that produce pleasure in spectators (T 3.3.1.30). Hume generates these categories by combining two different types of benefit that traits can have (usefulness and immediate agreeability) with two different types of beneficiary that a trait can have (the possessor of the trait herself and other people) (EPM 9.1). Below is an outline of the four resulting sources of moral approval.

We praise traits that are useful to others. For example, justice (EPM 3.48) and benevolence (EPM 2.22).
We praise traits that are useful to the possessor of the trait. For example, discretion or caution (EPM 6.8), industry (EPM 6.10), frugality (EPM 6.11), and strength of mind (EPM 6.15).
We praise traits with immediate agreeability to others. For example, good manners (EPM 8.1) and the ability to converse well (EPM 8.5).
We praise traits that are immediately agreeable to the possessor. For example, cheerfulness (EPM 7.2) and magnanimity (EPM 7.4-7.18).

What does Hume mean by “immediate agreeability”? Hume explains that immediately agreeable traits please (either the possessor or others) without “any further thought to the wider consequences that trait brings about” (EPM 8.1). Although being well-mannered has beneficial long-term consequences, Hume believes we also praise this trait because it is immediately pleasing to company. As we shall see below, this distinction implies that a trait can be praised for its immediate agreeability even if the trait has harmful consequences more broadly.

There is disagreement amongst Hume scholars about how this classification of virtue is related to Hume’s definition of what constitutes a virtue, or what is termed the “standard of virtue.” That is, what is the standard which determines whether some character trait counts as a virtue? The crux of this disagreement can be found in two definitions of virtue that Hume provides in the second Enquiry.

First Definition: “personal merit consists altogether in the possession of mental qualities, useful or agreeable to the person himself or to others” (EPM 9.1).

Second Definition: “It is the nature, and, indeed, the definition of virtue, that it is a quality of the mind agreeable to or approved of by every one who considers or contemplates it” (EPM 8.n50).

The first definition suggests that virtue is defined in terms of its usefulness or agreeableness. On this basis, we might interpret Hume as believing that a trait fails to qualify as a virtue if it is neither useful nor agreeable. This interpretation is also supported by places in the text where Hume criticizes approval of traits that fail to meet the standard of usefulness and agreeableness. One prominent example is his discussion of the religiously motivated “monkish virtues.” There he criticizes those who praise traits such as “[c]elibacy, fasting, penance, mortification, self-denial, humility, silence, solitude” on the grounds that these traits are neither useful to society nor agreeable to their possessors (EPM 9.3). The second definition, however, holds that what determines whether some character trait warrants the status of virtue is the ability of that trait to generate spectator approval. On this view, some trait is a virtue if it garners approval from a general point of view, and the sources of approval (usefulness and agreeability) simply describe those features of character traits that human beings find praiseworthy.

5. Justice and the Artificial Virtues

The four-fold classification of virtue discussed above deals with the features of character traits that attract our approval (or disapproval). However, in the Treatise Hume’s moral theory is primarily organized around a distinction between two ways in which we come to approve (or disapprove) of some character trait. Hume tells us that some virtues are “artificial” whereas other virtues are “natural” (T 3.1.2.9). In this context, the natural-artificial distinction tracks whether the entity in question results from the plans or designs of human beings (T 3.1.2.9). On this definition, a tree would be natural whereas a table would be artificial. Unlike the former, the latter required some process of human invention and design. Hume believes that a similar type of distinction is present when we consider different types of virtue. There are natural virtues like benevolence, and there are artificial virtues like justice and rules of property. In addition to justice and property, Hume also classifies the keeping of promises (T 3.1.2.5), allegiance to government (T 3.1.2.8), laws of international relations (T 3.1.2.11), chastity (T 3.1.2.12), and good manners (T 3.1.2.12) as artificial virtues.

The designs that constitute the artificial virtues are social conventions or systems of cooperation. Hume describes the relationship between artificial virtues and their corresponding social conventions in different ways. The basic idea is that we would neither have any motive to act in accordance with the artificial virtues (T 3.2.1.17), nor would we approve of artificially virtuous behavior (T 3.2.1.1), without the relevant social conventions. No social scheme is needed for us to approve of an act of kindness. However, the very existence of people who respect property rights, and our approval of those who respect property rights, requires some set of conventions that specify rules regulating the possession of goods. As we will see, Hume believes the conventions of justice and property are based upon collective self-interest. In this way, Hume uses the artificial-natural virtue distinction to carve out a middle position in the debate between egoists (like Hobbes and Mandeville), who believe that morality is a product of self-interest, and moral sense theorists (like Shaftesbury and Hutcheson), who believe that our sense of virtue and vice is natural to human nature. The egoists are right that some virtues are the product of collective self-interest (the artificial virtues), but the moral sense theorists are also correct insofar as other virtues (the natural virtues) have no relation to self-interest.

a. The Circle Argument

In Treatise 3.2.1 Hume provides an argument for the claim that justice is an artificial virtue (T 3.2.1.1). Understanding this argument requires establishing three preliminary points. First, Hume uses the term “justice,” at least in this context, to refer narrowly to the rules that regulate property. So, his purpose here is to prove that the disposition to follow the rules of property is an artificial virtue. That is, it would make no sense to approve of those who are just, nor to act justly, without the appropriate social convention. Second, Hume uses the concept of a “mere regard to the virtue of the action” (T 3.2.1.4) or a “sense of morality or duty” (T 3.2.1.8). This article uses the term “sense of duty.” The sense of duty is a specific type of moral motivation whereby someone performs a virtuous action only because she feels it is her ethical obligation to do so. For instance, imagine that someone has a job interview and knows she can improve her chances of success by lying to the interviewers. She might still refrain from lying, not because this is what she desires, but because she feels it is her moral obligation. She has, thus, acted from a sense of duty.

Third, a crucial step in Hume’s argument involves showing that a sense of duty cannot be the “first virtuous motive” to justice (T 3.2.1.4). What does it mean for some motive to be the “first motive?” It is tempting to think that Hume uses the phrase “first motive” as a synonym for “original motive.” Original motives are naturally present in the “rude and more natural condition” of human beings prior to modern social norms, rules, and expectations (T 3.2.1.9). For example, parental affection provides an original motive to care for one’s children (T 3.2.1.5). As we will see, Hume does not believe that the sense of duty can be an original motive to justice. One can only act justly from a sense of duty after some process of education, training, or social conditioning (T 3.2.1.9). However, while Hume does believe that many first motives are original in human nature, it cannot be his position that all first motives are original in human nature. This is because he does not believe there is any original motive to act justly, but he does think there is a first motive to act justly. Therefore, it is best to understand Hume’s notion of the first motive to perform some action as whatever motive (whether original or arising from convention) first causes human beings to perform that action.

With these points in place, let us consider the basic structure of Hume’s reasoning. His fundamental claim is that there is no original motive that can serve as the first virtuous motive of just actions. That is, there is nothing in the original state of human nature, prior to the influence of social convention, that could first motivate someone to act justly. While in our present state a “sense of duty” can serve as a sufficient motive to act justly, human beings in our natural condition would be bewildered by such a notion (T 3.2.1.9). However, if no original motive can be found that first motivates justice, then it follows that justice must be an artificial virtue. This is implied by Hume’s definition of artificial virtue. If the first motive for some virtue is not an original motive, then that virtue must be artificial.

Against Hume, one might argue that human beings have a natural “sense of justice” and that this serves as an original motive for justice. Hume rejects this claim with an argument commonly referred to as the “Circle Argument.” The foundation of this argument is the previously discussed claim that when making a moral evaluation of an action, we are evaluating the motive, character trait, or disposition that produced that action (T 3.2.1.2). Hume points out that we often retract our blame of another person if we find out they had the proper motive, but they were prevented from acting on that motive because of unfortunate circumstances (T 3.2.1.3). Imagine a good-hearted individual who gives money to charity. Suppose also that, through no fault of her own, her donation fails to help anyone because the check was lost in the mail. In this case, Hume argues, we would still praise this person even though her donation was not beneficial. It is the willingness to help that garners our praise. Thus, the moral virtue of an action must derive completely from the virtuous motive that produces it.

Now, assume for the sake of argument that the first virtuous motive of some action is a sense of duty to perform that action. What would have to be the case for a sense of duty to be a virtuous motive that is worthy of praise? At minimum, it would have to be true that the action in question is already virtuous (T 3.2.1.4). It would make no sense to claim that there is a sense of duty to perform action X, but also hold that action X is not virtuous. Unfortunately, this brings us back to where we began. If action X is already virtuous prior to our feeling any sense of duty to perform it, then there must likewise already be some other virtuous motive that explains action X’s status as a virtue. Thus, since some other motive must already be able to motivate just actions, a sense of duty cannot be the first motive to justice. Therefore, our initial assumption causes us to “reason in a circle” (T 3.2.1.4) and, consequently, must be false. From this, it follows that an action cannot be virtuous unless there is already some motive in human nature to perform it other than our sense, developed later, that performing the action is what is morally right (T 3.2.1.7). The same, then, would hold for the virtue of justice. This does not mean that a sense of duty cannot motivate us to act justly (T 3.2.1.8), nor does it necessarily mean that a sense of duty cannot be a praiseworthy motive. Hume’s point is simply that a sense of duty cannot be what first motivates us to act virtuously.

Having dispensed with the claim that a sense of duty can be an original motive, Hume then considers (and rejects) three further candidates for original motives that one might claim could provide the first motive to justice. These are: (i) self-interest, (ii) concern for the public interest, and (iii) concern for the interests of the specific individual in question. Hume does not deny that each of these is an original motive in human nature. Instead, he argues that none of them can adequately account for the range of situations in which we think one is required to act justly. Hume notes that unconstrained self-interest causes injustice (T 3.2.1.10), that there will always be situations in which one can act unjustly without causing any serious harm to the public (T 3.2.1.11), and that there are situations in which the individual concerned will benefit from our acting unjustly toward her. For example, this individual could be a “profligate debauchee” who would only harm herself by keeping her possessions (T 3.2.1.13). Consequently, if there is no original motive in human nature that can produce just actions, it must be the case that justice is an artificial virtue.

b. The Origin of Justice

Thus far Hume has established that justice is an artificial virtue, but has still not identified the “first motive” of justice. Hume begins to address this point in the next Treatise section entitled “Of the origin of justice and property.” We will see, however, that Hume’s complete account of what motivates just behavior goes beyond his comments here. Hume begins his account of the origin of justice by distinguishing two questions.

Question 1: What causes human beings in their natural, uncultivated state to form conventions that specify property rights? That is, how do the conventions of justice arise?

Question 2: Once the conventions of justice are established, why do we consider it a virtue to follow the rules specified by those conventions? In other words, why is justice a virtue?

Answering Question 1 requires determining what it is about the “natural” human condition (prior to the establishment of modern, large-scale society) that motivates us to construct the specific rules, norms, and social expectations associated with justice. Hume does this by outlining an account of how natural human beings come to recognize the benefits of establishing and preserving practices of cooperation.

Hume begins by claiming that the human species has many needs and desires it is not naturally equipped to meet (T 3.2.2.2). Human beings can only remedy this deficiency through societal cooperation that provides us with greater power and protection from harm than is possible in our natural state (T 3.2.2.3). However, natural humans must also become aware that societal cooperation is beneficial. Fortunately, even in our “wild uncultivated state,” we already have some experience of the benefits that are produced through cooperation. This is because the natural human desire to procreate, and care for our children, causes us to form family units (T 3.2.2.4). The benefits afforded by this smaller-scale cooperation provide natural humans with a preview of the benefits promised by larger-scale societal cooperation.

Unfortunately, while our experience with living together in family units shows us the benefits of cooperation, various obstacles remain to establishing it on a larger scale. One of these comes from familial life itself. The conventions of justice require us to treat others equally and impartially. Justice demands that we respect the property rights of those we love and care for just as we respect the property rights of those whom we do not know. Yet, family life only strengthens our natural partiality and makes us place greater importance on the interests of our family members. This threatens to undermine social cooperation (T 3.2.2.6). For this reason, Hume argues that we must establish a set of rules to regulate our natural selfishness and partiality. These rules, which constitute the conventions of justice, allow everyone to use whatever goods they acquire through their labor and good fortune (T 3.2.2.9). Once these social norms are in place, it then becomes possible to use terms such as “property, right, [and] obligation” (T 3.2.2.11).

This account further supports Hume’s claim that justice is an artificial virtue. Justice remedies specific problems that human beings face in their natural state. If circumstances were such that those problems never arose, then the conventions of justice would be pointless. Certain background conditions must be in place for justice to originate. John Rawls (1921-2002) refers to these conditions as the “circumstances of justice” (Rawls 1971: 126n). The remedy of justice is required because the goods we acquire are vulnerable to being taken by others (T 3.2.2.7), resources are scarce (T 3.2.2.7), and human generosity is limited (T 3.2.2.6). Regarding scarcity and human generosity, Hume explains that our circumstances lie at a mean between two extremes. If resources were so prevalent that there were enough goods for everyone, then there would be no reason to worry about theft or establish property rights (EPM 3.3). On the other hand, if scarcity were too extreme, then we would be too desperate to concern ourselves with the demands of justice. Nobody worries about acting justly after a shipwreck (EPM 3.8). In addition, if humans were characterized by thoroughgoing generosity, then we would have no need to restrain the behavior of others through rules and restrictions (EPM 3.6). By contrast, if human beings were entirely self-interested, without any natural concern for others, then there could be no expectation that others would abide by any rules that are established (EPM 3.9). Justice is only possible because human life is not characterized by these extremes. If human beings were characterized by universal generosity, then justice could be replaced with “much nobler virtues, and more valuable blessings” (T 3.2.2.16).

Another innovative aspect of Hume’s theory is that he does not believe the conventions of justice are based upon promises or explicit agreements. This is because Hume believes that promises themselves only make sense if certain human conventions are already established (T 3.2.2.10). Thus, promises cannot be used to explain how human beings move from their natural state to establishing society and social cooperation. Instead, Hume explains that the conventions of justice arise from “a general sense of common interest” (T 3.2.2.10) and that cooperation can arise without explicit agreement. Once it is recognized that everyone’s interest is served when we all refrain from taking the goods of others, small-scale cooperation becomes possible (T 3.2.2.10). In addition to allowing for a sense of security, cooperation serves the common good by enhancing our productivity (T 3.2.5.8). Our understanding of the benefits of social cooperation becomes more acute by a gradual process through which we steadily gain more confidence in the reliability of our peers (T 3.2.2.10). None of this requires an explicit agreement or promise. He draws a comparison with how two people rowing a boat can cooperate by an implicit convention without an explicit promise (T 3.2.2.10).

Although the system of norms that constitutes justice is highly advantageous and even necessary for the survival of society (T 3.2.2.22), this does not mean that society gains from each act of justice. An individual act of justice can make the public worse off than it would have otherwise been. For example, justice requires us to pay back a loan to a “seditious bigot” who will use the money destructively or wastefully (T 3.2.2.22). Artificial virtues differ from the natural virtues in this respect (T 3.3.1.12). This brings us to Hume’s second question about the virtue of justice. If not every act of justice is beneficial, then why do we praise obedience to the rules of justice? The problem is especially serious for large, modern societies. When human beings live in small groups the harm and discord caused by each act of injustice is obvious. Yet, this is not the case in larger societies where the connection between individual acts of justice and the common good is much weaker (T 3.2.2.24).

Consequently, Hume must explain why we continue to condemn injustice even after society has grown larger and more diffuse. On this point Hume primarily appeals to sympathy. Suppose you hear about some act of injustice that occurs in another city, state, or country, and harms individuals you have never met. While the bad effects of the injustice feel remote from our personal point of view, Hume notes that we can still sympathize with the person who suffers the injustice. Thus, even though the injustice has no direct influence upon us, we recognize that such conduct is harmful to those who associate with the unjust person (T 3.2.2.24). Sympathy allows our concern for justice to expand beyond the narrow bounds of the self-interested concerns that first produced the rules.

Thus, it is self-interest that motivates us to create the conventions of justice, and it is our capacity to sympathize with the public good that explains why we consider obedience to those conventions to be virtuous (T 3.2.2.24). Furthermore, we can now better understand how Hume answers the question of what first motivates us to act justly. Strictly speaking, the “first motive” to justice is self-interest. As noted previously, it was in the immediate interest of early humans living in small societies to comply with the conventions of justice because the integrity of their social union hinged upon absolute fidelity to justice. As we will see below, this is not the case in larger, modern societies. However, all that is required for some motive to be the first motive to justice is that it is what first gives humans some reason to act justly in all situations. The fact that this precise motive is no longer present in modern society does not prevent it from being what first motivates such behavior.

c. The Obligation of Justice and the Sensible Knave

Given that justice is originally founded upon considerations of self-interest, it may seem especially difficult to explain why we consider it wrong of ourselves to commit injustice in larger modern societies where the stakes of non-compliance are much less severe. Here Hume believes that general rules bridge the gap. Hume uses general rules as an explanatory device at numerous points in the Treatise. For example, he explains our propensity to draw inferences based upon cause and effect through the influence of general rules (T 1.3.13.8). When we consistently see one event (or type of event) follow another event (or type of event), we automatically apply a general rule that makes us expect the former whenever we experience the latter. Something similar occurs in the present context. Through sympathy, we find that sentiments of moral disapproval consistently accompany unjust behavior. Thus, through a general rule, we apply the same sort of evaluation to our own unjust actions (T 3.2.2.24).

Hume believes our willingness to abide by the conventions of justice is strengthened through other mechanisms as well. For instance, politicians encourage citizens to follow the rules of justice (T 3.2.2.25) and parents encourage compliance in their children (T 3.2.2.26). Thus, the praiseworthy motive that underlies compliance with justice in large-scale societies is, to a large extent, the product of social conditioning. This fact might make us suspicious. If justice is an artificial virtue, and if much of our motivation to follow its rules comes from social inculcation, then we might wonder whether these rules deserve our respect.

Hume recognizes this issue. In the Treatise he briefly appeals to the fact that having a good reputation is largely determined by whether we follow the rules of property (T 3.2.2.27). Theft, and the unwillingness to follow the rules of justice, does more than anything else to establish a bad reputation for ourselves. Furthermore, Hume claims that our reputation in this regard requires that we see each rule of justice as having absolute authority and never succumb when we are tempted to act unjustly (T 3.2.2.27). Suppose Hume is right that our moral reputation hangs on our obedience to the rules of justice. Even if true, it is not obvious that this requires absolute obedience to these rules. What if I can act unjustly without being detected? What if I can act unjustly without causing any noticeable harm? Is there any reason to resist this temptation?

Hume takes up this question directly in the Enquiry, where he considers the possibility of a “sensible knave.” The knave recognizes that, in general, justice is crucial to the survival of society. Yet, the knave also recognizes that there will always be situations in which it is possible to act unjustly without harming the fabric of society. So, the knave follows the rules of justice when he must, but takes advantage of those situations where he knows he will not be caught (EPM 9.22). Hume responds that, even if the knave is never caught, he will lose out on a more valuable form of enjoyment. The knave forgoes the ability to reflect pleasurably upon his own conduct for the sake of material gain. In making this trade, Hume judges that knaves are “the greatest dupes” (EPM 9.25). The person who has traded the peace of mind that accompanies virtue in order to gain money, power, or fame has traded away that which is more valuable for something much less valuable. The enjoyment of a virtuous character is incomparably greater than the enjoyment of whatever material gains can be attained through injustice. Thus, justice is desirable from the perspective of our own personal happiness and self-interest (EPM 9.14).

Hume admits it will be difficult to convince genuine knaves of this point. That is, it will be difficult to convince someone who does not already value the possession of a virtuous character that justice is worth the cost (EPM 9.23). Thus, Hume does not intend to provide a defense of justice that can appeal to any type of being or provide a reason to be just that makes sense to “all rational beings.” Instead, he provides a response that should appeal to those with mental dispositions typical of the human species. If the ability to enjoy a peaceful review of our conduct is nearly universal in the human species, then Hume will have provided a reason to act justly that can make some claim upon nearly every human being.

6. The Natural Virtues

After providing his Treatise account of the artificial virtues, Hume moves to a discussion of the natural virtues. Recall that the natural virtues, unlike the artificial virtues, garner praise without the influence of any human convention. Hume divides the natural virtues into two broad categories: those qualities that make a human great and those that make a human good (T 3.3.3.1). Hume consistently associates a cluster of qualities with each type of character. The great individual is confident, has a sense of her value, worth, or ability, and generally possesses qualities that set her apart from the average person. She is courageous, ambitious, able to overcome difficult obstacles, and proud of her achievements (EPM 7.4, EPM 7.10). By contrast, the good individual is characterized by gentle concern for others. This person has the types of traits that make someone a kind friend or generous philanthropist (EPM 2.1). Elsewhere, Hume explains the distinction between goodness and greatness in terms of the relationship we would want to have with the good person or the great person: “We cou’d wish to meet with the one character in a friend; the other character we wou’d be ambitious of in ourselves” (T 3.3.4.2).

Alexander of Macedonia exemplifies an extreme case of greatness. Hume recounts how Alexander responded when his general Parmenio advised him to accept the peace offering made by the Persian King Darius III: “So would I too […] were I Parmenio” (EPM 7.5). There are certain constraints that apply to the average person that Alexander does not think apply to himself. This is consistent with the fact that the great individual has a strong sense of self-worth, self-confidence, and even a sense of superiority.

a. Pride and Greatness of Mind

Given the characteristics Hume associates with greatness, it should not be a surprise that Hume begins the Treatise section entitled “Of Greatness of Mind” by discussing pride (T 3.3.2). Those qualities and accomplishments that differentiate one from the average person are also those qualities most likely to make us proud and inspire confidence. Thus, Hume notes that pride forms a significant part of the hero’s character (T 3.3.2.13). However, Hume faces a problem—how can a virtuous character trait be based upon pride? He observes that we blame those who are too proud and praise those with enough modesty to recognize their own weaknesses (T 3.3.2.1). If we commonly find the pride of others disagreeable, then why do we praise the boldness, confidence, and prideful superiority of the great person?

Hume must explain when pride is praiseworthy, and when it is blameworthy. In part, Hume believes expressions of pride become disagreeable when the proud individual boasts about qualities she does not possess. This results from an interplay between the psychological mechanisms of sympathy and comparison. Sympathy enables us to adopt the feelings, sentiments, and opinions of other people and, consequently, participate in that which affects another person. Comparison is the human propensity for evaluating the situation of others in relation to ourselves. It is through comparison that we make judgments about the value of different states of affairs (T 3.3.2.4). Notice that sympathy and comparison are each a stance or attitude we can take toward those who are differently situated. For example, if another individual has secured a desirable job opportunity (superior to my own), then I might sympathize with the benefits she reaps from her employment and participate in her joy. Alternatively, I might also compare the benefits and opportunities her job affords with my own lesser situation. The result of this would be a painful feeling of inferiority or jealousy. Thus, each of these mechanisms has an opposite tendency (T 3.3.2.4).

What determines whether we will respond with sympathy or comparison to another’s situation? This depends upon how lively our idea of the other person’s situation is. Hume supports this by considering three different scenarios (T 3.3.2.5). First, imagine someone is sitting safely on a beach. Taken by itself, this fact would not provide much enjoyment or satisfaction. This individual might try to imagine some other people who are sailing through a dangerous storm to make her current safety more satisfying by comparison. Yet, since this is an acknowledged fiction, and Hume holds that ideas we believe are true have greater influence than mere imaginations (T 1.3.7.7), doing so would produce neither sympathy nor comparison. Second, imagine that the individual on the beach could see, far away in the distance, a ship sailing through a dangerous storm. In this case, the idea of their precarious situation would be more lively. Consequently, the person on the beach could increase her satisfaction with her own situation by comparison. Yet, it is crucial that this idea of the suffering experienced by those in danger does not become too lively. In a third scenario Hume imagines that those in danger of shipwreck were so close to shore that the observer could see their expressions of fear, anxiety, and suffering. In this case, Hume holds that the idea would be too lively for comparison to operate. Instead, we would fully sympathize with the fear of the passengers and we would not gain any comparative pleasure from their plight.

From this example, Hume derives the following principle: comparison occurs whenever our idea of another’s situation is lively enough to influence our passions, but not so lively that it causes us to sympathize (T 3.3.2.5). Hume uses this principle to explain why we are offended by those who are proud of exaggerated accomplishments. When someone boasts about some quality she does not actually have, Hume believes our conception of her pride has the intermediate liveliness that allows for comparison. Our conception of her pride gains liveliness from her presence directly before us (the enlivening relation of contiguity in space and time). Yet, because we do not believe her claims about her merit, our conception of her pride is not so lively that it causes us to sympathize (T 3.3.2.6). Consequently, we disapprove of someone’s exaggerated arrogance because it makes us compare ourselves unfavorably against the pretended achievements and accomplishments of the conceited individual (T 3.3.2.7).

Importantly, Hume does not categorically condemn pride. Justified pride in real accomplishments is both useful (T 3.3.2.8) and agreeable to the possessor (T 3.3.2.9). However, direct expressions of pride, even if based on legitimate accomplishments, still cause disapproval. Recall that sympathizing with another’s pride requires that we believe their self-evaluation matches their actual merit. Yet, it is difficult for us to have such a belief. This is because we know that people are likely to overestimate the value of their own traits and accomplishments. The consequence is that, as a “general rule,” we are skeptical that another person’s pride is well-founded, and we blame those who express pride directly (T 3.3.2.10). It is because boasting and outward expressions of pride cause discomfort through drawing us into unfavorable comparisons that we develop rules of good manners (T 3.3.2.10). Just as we create artificial rules of justice to preserve the harmony of society, so artificial rules of good manners preserve the harmony of our social interactions. Among these unspoken rules is a prohibition against directly boasting about our accomplishments in the presence of others. However, if others infer indirectly through our actions and comportment that we feel pride, then our pride can garner approval (T 3.3.2.10). Thus, Hume believes that pride can be a virtuous trait of character provided it is based upon actual accomplishments and is not overtly expressed (T 3.3.2.11).

Hume uses these points to combat attacks on the worth of pride from two different fronts. First, there are those “religious declaimers” who criticize pride and favor the Christian view, which instead prizes humility (T 3.3.2.13). These religious moralists hold, not just that humility requires us to avoid directly boasting about our accomplishments, but that humility requires sincerely undervaluing our character and accomplishments (T 3.3.2.11). Here Hume seems to have in mind something like the view that we should keep in mind the weakness of our own intellect in comparison to that of God. Or, perhaps, that proper worship of God requires that one humble oneself before the divine with an appropriate sense of relative worthlessness. Hume argues that such conceptions do not accurately represent the common regard we pay to pride (T 3.3.2.13).

The second criticism of pride comes from those who charge that the pride of the great individual often causes personal and social harm. The concern is that praising pride and self-assurance can overshadow the more valuable virtues of goodness. This can be seen most clearly in Hume’s discussion of military heroism. The military hero may cause great harm by leaving the destruction of cities and social unrest in his wake. Yet, despite this acknowledged harm, Hume claims that most people still find something “dazzling” about the military hero’s character that “elevates the mind” (T 3.3.2.15). The pride, confidence, and courage of the hero seem, at least temporarily, to blind us to the negative consequences of the hero’s traits. This pride is not communicated directly, but it is communicated indirectly through observing the hero overcoming daunting challenges. As a result, those who admire the military hero participate via sympathy in the pleasure the military hero derives from his own pride and self-assured courage, and this causes us to overlook the negative consequences of the hero’s actions (T 3.3.2.15).

This passage provides additional confirmation that Hume’s ethics cannot be placed neatly into the utilitarian or consequentialist moral tradition. Just as the religious moralist fails to recognize the common praise given to warranted pride in one’s accomplishments, so the consequentialist fails to recognize the human tendency to praise certain traits of character without considering their social utility. Hume’s ethics reminds us of the value of human greatness. In this vein, he writes that the heroes of ancient times “have a grandeur and force of sentiment, which astonishes our narrow souls, and is rashly rejected as extravagant and supernatural” (EPM 7.17). Likewise, Hume contends that if the ancients could see the extent to which virtues like justice and humanity predominate in modern times, they would consider them “romantic and incredible” (EPM 7.18). Hume’s ethical theory attempts to give proper credit to the qualities of greatness prized by the ancients, as well as the qualities of goodness emphasized by the moderns.

b. Goodness, Benevolence, and the Narrow Circle

Hume turns to a discussion of goodness in a Treatise section entitled “Of Goodness and Benevolence.” Under the heading of “goodness,” Hume lists the following traits: “generosity, humanity, compassion, gratitude, friendship, fidelity, zeal, disinterestedness, liberality, and all those other qualities, which form the character of the good and benevolent” (T 3.3.3.3). Again, these traits are united by their tendency to make us considerate friends, generous philanthropists, and attentive caregivers.

Hume explains that we praise such qualities both because of their tendency to promote the good of society and because of their immediate agreeability to those who possess them. Generosity, of course, is socially useful insofar as it benefits other people. Hume also sees the gentle virtues of goodness as correctives to the destructive excesses of greatness, ambition, and courage (T 3.3.3.4). A complication here is that evaluating another’s generosity depends significantly upon the scope of beneficiaries we take into consideration. Praise for socially useful traits comes from sympathizing with the pleasure that is caused to those who benefit from them. How far should our sympathy extend when making this evaluation? How wide is the scope of potential beneficiaries we must consider when judging whether someone is generous or selfish? For example, if we interpret this scope more narrowly, then we might think that the person who takes good care of her children, helps her friends in need, and pushes for positive change in local politics exhibits admirable generosity with her time, energy, and attention. By contrast, if we interpret the scope more expansively, then the fact that she fails to make any positive impact on the many people who are suffering all over the world will count against her.

Hume answers that when judging another’s generosity, because we do not expect “impossibilities” from human nature, we limit our view to the agent’s “narrow circle” (T 3.3.3.2). Broadly, Hume’s claim is that we limit our focus to those people that the agent can reasonably be expected to influence. A more detailed explanation of this point requires answering two further questions. First, what is the “impossibility” we do not expect of others? Second, just how “narrow” is the “narrow circle” that Hume believes we focus on when evaluating generosity?

Let’s begin with the first question. Given Hume’s statement that recognition of the “impossibility” comes from our knowledge of human nature (T 3.3.3.2), we might think that Hume is making a claim about the naturally confined altruism of human beings. We do not expect that the generous person will be beneficial to those who live far away because human beings rarely concern themselves with those who are spatio-temporally distant or with whom they infrequently interact (T 3.3.3.2). This reading fits naturally with Hume’s previously discussed claim that the strength of sympathy is influenced by our relation to the person sympathized with. It also coheres well with Hume’s claim, emphasized in his discussion of the “circumstances of justice,” that human beings are naturally selfish (although not completely selfish).

An alternative reading, however, holds that the “impossibility” Hume identifies is not primarily the human inability to care about distant strangers. Hume sometimes discusses the possibility of “extensive sympathy” that enables us to care about those who are distant and unrelated (T 3.3.6.3). This suggests Hume might have some other sort of “impossibility” in mind. One candidate would be the “impossibility” of undertaking effective action outside one’s “narrow circle.” In support of this reading, Hume mentions being “serviceable and useful within one’s sphere” (T 3.3.3.2). Perhaps Hume’s point is just that, given human motivational structure and the practical realities of human life, it is unreasonable to expect someone to have a significant impact beyond the sphere of her daily interactions. We should note, however, that the practical boundaries to acting effectively outside one’s “narrow circle” are significantly more relaxed today than they were in Hume’s time.

Moving to the second question, how we understand the “impossibility” of expecting benevolence outside of one’s “narrow circle” may depend upon just how close the boundaries of the “narrow circle” are drawn. Many of the ways Hume refers to the agent’s proper sphere of influence suggest he did not think of it as simply a tightly bound group of personal acquaintances and close relations. In a few passages Hume suggests that we consider all those who have “any” connection or association with the agent (T 3.3.1.18; T 3.3.1.30; T 3.3.3.2). Each of these passages leaves open the possibility that the agent’s “sphere” may be much more expansive than the phrase “narrow circle” would immediately suggest.

The proper sphere of influence may also depend upon the role, position, and relationships that the person in question inhabits. In one place, Hume claims that a perfect moral character is one that is not deficient in its relationships with others (T 3.3.3.9). In the second Enquiry Hume imagines a virtuous individual, Cleanthes, whose excellent character is evidenced by the fact that his qualities enable him to perform all his various personal, social, and professional roles (EPM 9.2). Thus, how “narrow,” or expansive, one’s circle is may depend upon the extent to which that person’s attachments and position make her conduct matter to others. For example, when evaluating the character traits of an elected public official we would consider a wider sphere of influence than we would when considering the same traits in most private citizens.

Benevolence is not only praised for its utility to others. Hume also discusses how it is immediately agreeable to the benevolent individual herself. This is a feature that is found in all emotions associated with love, just as it is a feature of all emotions associated with hatred to be immediately disagreeable (T 3.3.3.4). Mirroring his discussion of military heroism, Hume points out that we cannot help but praise benevolence, generosity, and humanity even when excessive or counter-productive (T 3.3.3.6). We say that someone is “too good” as a way of laying “kind” blame upon them for a harmful act with good-hearted intentions (EPM 7.22). Thus, the virtue of benevolence is praised, at least to some extent, in all its forms (T 3.3.3.6; EPM 2.5). However, Hume notes that we react much more harshly to excesses of anger. While not all forms of anger should be criticized (T 3.3.3.7), excessive anger or cruelty is the worst vice (T 3.3.3.8). Whereas cruelty is both immediately disagreeable and harmful, the harms of excessive benevolence can at least be compensated by its inherent agreeability.

c. Natural Abilities

Hume’s ethics is based upon the idea that virtues are mental traits of persons that garner praise. The resulting “catalogue of virtues” (T 3.3.4.2), then, paints a portrait of what human beings believe to be the ideal member of their species. One might argue that this approach to ethics is fundamentally flawed because a mental trait can garner praise without being a moral quality. For example, the rare ability to learn and understand complex concepts is often seen as a natural talent. Such talent is admirable, but is it a moral virtue? Does it not make more sense to feel pity for someone who lacks some natural ability instead of blaming her for failing her moral duty?

Hume’s position is that there is not a significant difference between the supposed categories of moral virtue and natural ability. To understand his view, we need to answer the following question: why must a virtuous trait be a mental quality or disposition? It is not because other types of traits do not garner the approval of spectators. Hume discusses our approval of sex appeal (T 3.3.5.2), physical fitness (T 3.3.5.3), and health (T 3.3.5.4). He also recognizes how the same principle of sympathy which produces approval of virtue also produces our approval of these physical attributes and our admiration for the wealthy (T 3.3.5.6). Instead, the reason virtue is limited to mental qualities is that virtue is supposed to constitute personal merit, or the set of qualities, dispositions, and characteristics that we specifically admire in persons (EPM 1.10). The implication, then, is that the qualities of the mind constitute who we are as persons. So, while Hume does not deny that there is such a thing as bodily merit, he does not see bodily merit as the proper scope of moral philosophy.

If the “catalogue of virtues” is a list of the mental traits we admire in persons, then the catalogue must include certain qualities not normally placed in the category of moral virtue and vice. Common usage of the terms “virtue” and “vice” is narrower than the set of those qualities that we find admirable about persons (EPM App. 4.1). For example, it is common to think of an extraordinary genius as someone with an exceptional talent (instead of a virtue), or of a person who is especially lacking in common sense as having some type of defect (instead of a vice). Despite this common language convention, Hume emphasizes that intelligence and common sense are still mental qualities that we admire in persons. Consequently, Hume states that he will leave it to the “grammarians” to decide where to draw the line between virtue, talent, and natural ability (T 3.3.4.4; EPM App. 4.1). It is not a distinction Hume believes is philosophically important since, regardless of precisely where the line is drawn, natural abilities like understanding and intelligence are undoubtedly characteristics we praise in persons. Hume quips that nobody, no matter how “good-natured” and “honest,” could be considered virtuous if he is an “egregious blockhead” (EPM App. 4.2).

Hume faced criticism from contemporaries on this point. For example, James Beattie (1735-1803) argued that, while it is entirely appropriate to blame someone for failing to act with generosity or justice, it would be entirely inappropriate to blame someone because they lack beauty or intelligence (Beattie 1773: 294). Beattie holds that some quality can only be considered a moral virtue if it is within our control to develop or, at least, act in ways that are consistent with it. Hume anticipates this objection. He agrees that it would be inappropriate to blame someone for a natural lack of intelligence. Yet, he denies that this shows that natural abilities such as intelligence should not be considered part of personal merit. The reason we do not blame someone for their natural defects is that doing so would be pointless. We blame the person who is unjust, or unkind, because these behavior patterns and dispositions can be changed through exerting social pressure. However, we cannot shame someone into being more intelligent (T 3.3.4.4). Yet, we still think a penetrating mind is a quality possessed by the ideal person. So, while those who lack some natural ability are not to blame, this lack still influences our evaluation of their personal merit.

This issue is important for the overall plausibility of Hume’s account of the natural virtues. Specifically, the question of natural abilities has an important connection with the role greatness should play in the catalogue of virtue. Beattie claims that he wants nothing to do with the term “great man.” This is because the person who possesses the natural abilities of Hume’s “great man” is better able to cause destruction and harm. Here we should recall Hume’s description of the military hero. For this reason, Beattie holds that virtue is concerned with the qualities of the “good man” that can be acquired by anyone and tend to the good of society (Beattie 1773: 296). If Beattie is correct that the qualities of greatness are natural abilities, then Hume’s attempt to include both goodness and greatness within the catalogue of virtue requires him to provide a satisfactory defense of this point.

7. References and Further Reading

a. Hume’s Works

Hume, David (2007 [1739-1740]) A Treatise of Human Nature: A Critical Edition, ed. David Fate Norton and Mary J. Norton. Oxford: Clarendon Press.
- Cited in text as “T” followed by Book, part, section, and paragraph numbers.
Hume, David (2000 [1748]) An Enquiry concerning Human Understanding: A Critical Edition, ed. Tom L. Beauchamp. Oxford: Clarendon Press.
- Cited in text as “EHU” followed by section and paragraph.
Hume, David (1998 [1751]) An Enquiry concerning the Principles of Morals: A Critical Edition, ed. Tom L. Beauchamp. Oxford: Clarendon Press.
- Cited in text as “EPM” followed by section and paragraph.
Hume, David (1987) Essays Moral, Political, and Literary, revised edition, ed. Eugene F. Miller. Indianapolis: Liberty Fund.
- Cited in text as “EMPL” followed by the page number.

b. Further Reading

Baier, Annette (1991) A Progress of Sentiments. Cambridge: Harvard University Press.
- An account of the Treatise that emphasizes the continuity between Hume’s ethics and his epistemology, metaphysics, and skepticism.
Botros, Sophie (2006) Hume, Reason, and Morality: A Legacy of Contradiction.
- Focuses on Hume’s theory of motivation, and arguments against the moral rationalist, and develops an account of why these arguments are still relevant for contemporary metaethical debates.
Bricke, John (1996) Mind and Morality: An Examination of Hume’s Moral Psychology. New York: Oxford University Press.
- Discusses Hume’s theory of agency, the will, and defends a noncognitivist interpretation of Hume on moral evaluation.
Cohon, Rachel (2008) Hume’s Morality: Feeling and Fabrication. New York: Oxford University Press.
- Argues against “standard” views of Hume’s moral philosophy by arguing that Hume’s philosophy is both non-realist and cognitivist. Also includes novel and influential interpretations of the artificial virtues.
Darwall, Stephen (1995) The British Moralists and the Internal ‘Ought.’ Cambridge: Cambridge University Press.
- Places Hume’s theory in its historical context and situates Hume as a member of an empirical, naturalist tradition in ethics alongside thinkers such as Hobbes, Locke, and Hutcheson.
Gill, Michael (2006) The British Moralists on Human Nature and the Birth of Secular Ethics. Cambridge: Cambridge University Press.
- Provides further historical context for Hume’s place within seventeenth and eighteenth-century moral philosophy with a particular focus on the way in which the British moralists founded morality on human nature and disentangled morality from divine and religious sources.
Harrison, Jonathan (1976) Hume’s Moral Epistemology. Oxford: Clarendon Press.
Harrison, Jonathan (1981) Hume’s Theory of Justice. Oxford: Clarendon Press.
- Each of these works provides a detailed, textual, and critical commentary on the major arguments Hume puts forward in service of his metaethical views and his conception of justice.
Herdt, Jennifer (1997) Religion and Faction in Hume’s Moral Philosophy. Cambridge: Cambridge University Press.
- An account of sympathy that focuses on its connection to human sociability and the tendency that sympathy has for allowing human beings to overcome faction and division.
Mackie, J.L. (1980) Hume’s Moral Theory. London: Routledge.
- Situates Hume’s moral theory within the context of his predecessors and successors and provides critical discussion of the main doctrines of Hume’s ethical thought: Hume’s anti-rationalism, sentimentalism, and a detailed discussion and critique of Hume’s artificial-natural virtue distinction.
Mercer, Philip (1972) Sympathy and Ethics: A Study of the Relationship between Sympathy and Morality with Special Reference to Hume’s Treatise. Oxford: Clarendon Press.
- Provides critical, detailed commentary on Hume’s account of sympathy and its relationship to his moral philosophy.
Norton, David Fate (1982) David Hume: Common-Sense Moralist, Sceptical Metaphysician. Princeton: Princeton University Press.
- Discusses the relation between Hume’s epistemology and ethics. Puts forward the view that Hume was only skeptical regarding the former, but was a realist about morality.
Reed and Vitz (eds.) (2018) Hume’s Moral Philosophy and Contemporary Psychology. New York: Routledge.
- A collection of essays that discusses the relevance of Hume’s moral philosophy for a wide array of topics in psychology. These topics include mental illness, the situationist critique of virtue ethics, character development, sympathy, and the methodology of Hume’s science of human nature, among others.
Swanton, Christine (2015) The Virtue Ethics of Hume and Nietzsche. Malden, MA: Wiley Blackwell.
- Argues that Hume should be placed within the tradition of virtue ethics. Includes discussion of how a virtue theoretic interpretation can be reconciled with his rejection of rationalism and his sentimentalism, as well as the problem of why justice is a virtue.

c. Other Works Cited

Abramson, Kate (2008) “Sympathy and Hume’s Spectator-centered Theory of Virtue.” In Elizabeth Radcliffe (ed.), A Companion to Hume. Malden, MA: Blackwell Publishing.
Beattie, James (1773) An essay on the nature and immutability of truth, in opposition to sophistry and scepticism. The third edition. Dublin, MDCCLXXIII. Eighteenth Century Collections Online. Gale.
Clarke, Samuel (1991 [1706]) A Discourse of Natural Religion. Indianapolis: Hackett Publishing Company.
Rawls, John (1971) A Theory of Justice. Cambridge: Harvard University Press.
Reed, Philip (2012) “What’s Wrong with Monkish Virtues? Hume on the Standard of Virtue.” History of Philosophy Quarterly 29.1: 39-53.
Sayre-McCord, Geoffrey (1994) “On Why Hume’s ‘General Point of View’ Isn’t Ideal–and Shouldn’t Be.” Social Philosophy and Policy 11.1: 202-228.
Taylor, Jacqueline (2002) “Hume on the Standard of Virtue.” The Journal of Ethics 6: 43-62.

Author Information

Ryan Pollock
Email: pollocrc@gmail.com
Grand Valley State University
U. S. A.

Robert Boyle (1627-1691)

Robert Boyle was one of the most prolific figures in the scientific revolution and the leading scientist of his day. He was a proponent of the mechanical philosophy, which sought to explain natural phenomena in terms of matter and motion rather than by appealing to Aristotelian substantial forms and qualities. He was a champion of experimental science, claiming that theory should conform to observation and advocating openness in the publication of experimental results, the replication of experiments for empirical corroboration, and the importance of recording even those experiments that failed, at a time when these ideas were revolutionary. He defended and developed the distinction between primary and secondary qualities and supported it with detailed experimental evidence. With the help of his colleague Robert Hooke (1635-1703), he designed and improved an air pump capable of creating and sustaining a vacuum and used it to perform many famous experiments, investigating things like respiration, disease, combustion, sound, and air pressure. He discovered Boyle’s law, which states that, at constant temperature, the volume and pressure of a gas are inversely proportional. He used empirical evidence to refute both the four-element theory of Aristotle and the more recent three-principle theory of Paracelsus (1493-1541). Finally, many historians of science consider him to be the father of modern chemistry.

This article focuses on the philosophical significance of Boyle’s work, but it is important to note that Boyle was a polymath with diverse interests ranging from animal husbandry to underwater respiration, from the study of ancient languages to finding ways of extending the human lifespan. Furthermore, Boyle had both the intellect and the financial resources to pursue such a wide research agenda. Focusing on his philosophy, or even his chemistry, runs the risk of ignoring the true complexity of his thought. Nevertheless, much of Boyle’s work has had enduring philosophical significance.

Life
Natural Philosophy
Philosophy of Science
Substance Dualism
Causation
God
Ethics
Casuistry
References and Further Reading

1. Life

Robert Boyle was born on the 25th of January, 1627, at Lismore Castle, County Waterford, Ireland. He was the fourteenth child of Richard Boyle, the first Earl of Cork, who had come to Ireland from Canterbury, essentially penniless, in 1588. By the time of Boyle’s birth, through a series of shrewd and sometimes shady real estate ventures, Cork had become the wealthiest man in Ireland. This incredible wealth can be seen in Boyle’s lavish upbringing and education. After the death of his mother in 1630, Boyle’s daily care and supervision went to a local Irish woman, known today only as Nurse Allen. Allen raised Boyle, teaching him the Irish language, until his eighth year when he was sent away, along with his brother Francis, for a formal education at Eton.

After only three years at Eton, Cork decided to send Boyle, along with his brother Francis, on a grand tour of the continent under the tutelage of Isaac Marcombes. Marcombes was a renowned teacher from Switzerland and had just returned from a similar tour in which he had tutored Boyle’s older brothers. Boyle spent most of the tour in Geneva, at Marcombes’s home, where he studied a variety of subjects, including French, Latin, Italian, geometry, Roman history, philosophy, tennis, fencing, and horseback riding.

During his initial stay in Geneva in 1641, Boyle had a life-changing experience. One night during a terrible storm, he thought the Day of Judgment had come and that he had wasted his life on trivial pursuits. Boyle made an oath that he would dedicate himself to the Christian service of humanity if he was allowed to survive. The next morning, after the storm had passed, the young Boyle swore the oath again to demonstrate his sincerity. For the rest of his life he dedicated himself to various charitable endeavors. Even much of his later scientific work was directly motivated by what Boyle perceived as his religious duty. This event also led Boyle to a renewed dedication to his studies, as well as a lifelong aversion to swearing oaths. Later in life, for example, he declined the presidency of the Royal Society because it required swearing an oath. He even wrote a treatise, A Free Discourse Against Customary Swearing (1695).

During the grand tour, Boyle also travelled in France and Italy. The travellers tried to visit Galileo, and Boyle studied Italian to read Galileo’s works in preparation, but the great scientist died before Boyle could meet him. The grand tour came to an end when Boyle received the news that his father had died. After sufficient finances were secured, Boyle returned to England and eventually settled at the family estate at Stalbridge, where he devoted himself to writing moral treatises and chivalric romances, a common literary form at the time.

It is hard to determine when Boyle developed a serious interest in natural philosophy, but a few events are noteworthy. The Boyle scholar Michael Hunter puts it in the early 1650s, warning against an interpretation that makes it seem inevitable that Boyle would become a scientist. However, we should not ignore events in Boyle’s life that indicate an early interest in natural philosophy, and the more one looks into the matter, the more a steady interest in natural philosophy becomes apparent. While it was not inevitable that Boyle would become a scientist, neither is it surprising.

Boyle had been familiar with the work of Aristotle, Bacon, and Galileo since his days at Eton. As early as 1646, events in Boyle’s life show an increasing interest in chemistry. An important letter to his sister Katherine Ranelagh (1615-1691) from May of that year shows that Boyle made a serious attempt to design and construct a chemical laboratory at Stalbridge. The attempt was unsuccessful, since an essential furnace was delivered “crumbled into as many pieces as we are into sects!” But the attempt itself is sufficient evidence of a serious interest in natural philosophy. Nevertheless, ethics was still Boyle’s primary philosophical concern during this period.

A trip to Leiden to attend his brother’s wedding in 1648 is also pertinent because at that time there was a thriving intellectual community of natural philosophers, with multiple schools of anatomy and the controversial mechanical philosophy of René Descartes (1596-1650) being discussed all over Holland. During this trip Boyle visited the University of Leiden and viewed an experiment on the nature of light in which the image of the city was projected onto the wall in the room of a high tower. This event may be the cause of the once-common view that Boyle studied there. However, these early experiences pale in importance next to a conversion experience Boyle had in the early 1650s, when he essentially became a scientist. Boyle had found a way to combine his interests in natural philosophy with his pledge to dedicate his life to philanthropic pursuits.

He was encouraged in these endeavors by his older sister and best friend, the Lady Katherine Ranelagh (1615-1691). His relationship with Ranelagh would be the closest one of his life. Ranelagh was an important natural philosopher in her own right, respected and consulted by her contemporaries, who found a way to pursue her scientific research within the confines of the strict gender norms of seventeenth-century England. Later in life, at her London estate on Pall Mall, she would become Boyle’s intellectual companion, editor, and most trusted collaborator. In the early 1650s, her main contribution to Boyle’s philosophical development was her participation in the Hartlib Circle.

Samuel Hartlib (1600-1662) was a German polymath who moved to England in 1628 and recruited intellectuals and experts in all sorts of fields for a variety of religious philanthropic endeavors, including projects in medicine, public education, agriculture, animal husbandry, and translations of the Bible into other languages (Boyle would eventually help with projects to translate the Bible into Irish, Malay, and Algonquin). The members of the circle included Heinrich Appelius, Friedrich Clodius, Cheney Culpeper, John Dury, Theodore Haack, Godofred Hotton, Joachim Hubner, Katherine Ranelagh, Johann Moriaen, John Pell, William Petty, Johann Rulicius, John Sadler, George Starkey, and Benjamin Worsley. Hartlib, like Marin Mersenne (1588-1648), had a vast network of correspondence, with so many individuals that it is hard to establish a comprehensive list. However, it is important to note that by the time Boyle began participating in the circle, Ranelagh was already an established member. Furthermore, Ranelagh was a very important member: of the 766 names mentioned in Hartlib’s correspondence, hers is the sixth most frequently mentioned. The group’s activities were inspired by the utopian writings of Francis Bacon (1561-1626), and it is Bacon who would have the single greatest influence on Boyle’s philosophy. The Hartlib Circle became a prototype of the modern scientific research society. It was eventually replaced by formal scientific societies, such as the Royal Society, of which Boyle was a founding member.

In 1652, Boyle briefly returned to Ireland to settle matters involving his inheritance. Although he was in Ireland only a short time, Hartlib recruited Boyle to work on a number of projects. Boyle was asked to create a Baconian natural history of Ireland and to research new agricultural and animal husbandry techniques there, but these projects never got off the ground. By now, Boyle’s primary interest was to learn the empirically oriented chemistry of Jean Baptiste Van Helmont (1580-1644). He was being helped in this endeavor through correspondence with the American alchemist George Starkey (1628-1665). However, unable to establish a chemical laboratory in Ireland, Boyle spent his time reading up to 12 hours a day and learning anatomy from William Petty (1623-1687), who had learned anatomy in Leiden before teaching it at Oxford and then following Cromwell to Ireland as Physician General.

Boyle’s serious investigations into natural philosophy really began when he became Starkey’s pupil. In Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (2002), William Newman and Lawrence Principe present a detailed analysis of Starkey’s influence on Boyle’s chemical education. They suggest using the term Chymistry to refer to the general group of issues concerning alchemy and chemistry in the early modern period, noting that the terms were then often used synonymously, while they have very different connotations in contemporary discourse.

Starkey was greatly influenced by Van Helmont, and Boyle eventually replicated many of Van Helmont’s experiments. By the time Boyle returned to England he was thoroughly absorbed in natural philosophy, wasting little time in moving to Oxford, networking with other scientists, and establishing the laboratory for which he is now famous. From this point, and for the rest of his life, Boyle was constantly conducting experiments. His published works, correspondence, and work notes—many of which survive—became full of detailed accounts of them. Boyle spent this important part of his career in one of the most thriving intellectual environments in the world at the time, working on a variety of projects involving both chemical analysis as well as experiments involving medicine, pneumatics, and hydraulics. He became involved with a group of like-minded, anti-Aristotelian, natural philosophers, which included John Locke (1632-1704) and eventually Isaac Newton (1643-1727), who regarded Boyle’s work on pneumatics as a paradigm of science.

The natural philosopher Robert Hooke (1635-1703) began his career as Boyle’s laboratory assistant. Together, they made improvements on the air-pump design made by Otto von Guericke (1602-1686), and produced a machine capable of evacuating most of the air from an observable glass chamber. They did a large number of experiments with it, and by presenting these to noble and socially influential audiences, they produced useful publicity for the scientific activities of the Royal Society. Inspired by Bacon’s conception of science, Boyle developed and used new technological instruments that enabled detailed, replicable observations which he thought revealed the hidden structure of the natural world.

Boyle also became close friends with the young John Locke, who went to Oxford in 1652 to study medicine. They even worked on a few medical projects together. Boyle had a significant influence on Locke’s philosophical development, including his distinction between primary and secondary qualities, and the difference between real and nominal essences.

Some of Boyle’s scientific claims were criticized by Thomas Hobbes (1588-1679), and the two philosophers became involved in a heated public debate over the role of experimental observation in natural philosophy. Poor health caused Boyle to move to London in 1668. There he lived with his sister Katherine for the rest of his life. For over twenty years they worked together on various projects in medicine, natural philosophy, and philanthropy. They received many important visitors who would come to witness his famous experiments.

Boyle died of grief a week after the death of his beloved sister, on December 31, 1691. Locke was the executor of his estate. Boyle left funds to establish a series of annual lectures to defend Christianity against objections to its basic tenets. The lectures continue to this day.

2. Natural Philosophy

Boyle considered natural philosophy to be an important part of philosophy. He believed God gave humans three books to aid in their salvation: “the book of scripture,” “the book of conscience,” and “the book of nature.” In works such as Of the Study of the Book of Nature and Some Considerations Touching the Usefulness of Experimental Natural Philosophy (1663), Boyle argues that the natural world had not only been intentionally designed by God, but had been designed specifically to be understood, at least in part, by rational human minds. He believed that humans equipped with reason could make use of detailed observation, under controlled experimental conditions, to uncover the hidden structure of nature. Boyle’s efforts to bring chemistry out of the disreputable shadows of alchemy, as well as all sorts of other projects he undertook in natural philosophy, were justified as part of the theologically acceptable study of the natural world, God’s great automaton, the study of which Boyle believed too many people neglected. Boyle saw it as a religious duty to investigate natural phenomena and publish the knowledge he gained for the benefit of humanity. This Baconian approach to science can be seen throughout his research, including his chemical analyses of medicines, his investigations of air pressure, his study of human anatomy, his invention of the friction match, his efforts to expand the human lifespan, and even his work to advance agriculture and animal husbandry techniques.

Boyle seems to have spent nearly equal time doing experimental natural philosophy, studying the Bible as well as the ancient languages associated with it, and analyzing his own conscience. He put the same intellectually rigorous effort, aided by significant financial resources, into all three. Boyle’s entire philosophy, his metaphysics, his epistemology, and his ethics, are all intertwined with these three religiously motivated projects. Though Boyle is known today mostly for his work in various areas of natural philosophy, these achievements cannot be fully appreciated without understanding their place in Boyle’s religion.

It is important to emphasize that Boyle’s approach to natural philosophy, though influenced by Descartes, is more explicitly indebted to philosophers such as Francis Bacon and Pierre Gassendi (1592-1655). In the article “Parcere Nominibus: Boyle, Hooke and the Rhetorical Interpretation of Descartes” (1994), Edward Davis explores Descartes’s influence on Boyle during the 1660s, under the influence of Hooke, who taught Boyle Cartesian philosophy. However, it is misleading to describe Boyle as a Cartesian.

Descartes’s influence both occurred earlier and was also less extensive than this view implies. Boyle read Descartes’s Passions of the Soul in 1648, before his association with Hooke. While this has been downplayed as a minor work compared to Descartes’s Meditations on First Philosophy and the Principles of Philosophy, it does give an accurate and succinct presentation of Descartes’s philosophy, including his mechanical account of the human body. Furthermore, along with the works of Galileo and Gassendi, it represents one of Boyle’s earliest exposures to the mechanical philosophy. And while Boyle does later present many of his views in Cartesian terms and agrees with his basic dualistic and theist ontology, there are fundamental differences between their philosophies, such as their views on the essence of matter, the possibility of a vacuum, the role of experiment in science, and the possibility and extent of knowledge based on experience. On the other hand, Boyle had been exposed to Bacon’s conception of science since his time at Eton. The provost of Eton, Henry Wotton, was Bacon’s cousin. Furthermore, Bacon’s influence can be seen in the work of many of the members of the Hartlib Circle. Thus, it is more accurate to say that in natural philosophy Boyle was primarily a Baconian who agreed with Gassendi on many important issues, Descartes on others, and often expressed his ideas in Cartesian terms.

In works such as A Discourse on Things Above Reason (1681) and On the High Veneration Man’s Intellect owes to God (1684), Boyle distinguishes between demonstrative rational arguments and what can be inductively inferred from experience. Like Bacon, Boyle believed that theory should conform to observation. He tried to avoid premature metaphysical speculation, with mixed results, in favor of theories that could be tested by experiment. He agreed with Bacon’s claim in Novum Organum (1620) that the hidden structure of the natural world is too subtle to be penetrated by the Aristotelian, deductive approach to science, and that technology can aid in our investigation of the natural world. Boyle thought this approach yielded new scientific information that could potentially be used for the benefit of humanity. Boyle tried to put into practice something like the science Bacon envisioned in works such as Novum Organum (1620) and New Atlantis (1627).

Many areas of Boyle’s philosophy are intimately connected to his natural philosophy, including his rejection of scholastic Aristotelianism, his acceptance of the corpuscular mechanical philosophy, his work in chemistry, alchemy, medicine, and pneumatics, as well as his philosophical views regarding the nature of knowledge, perception, substance, real and nominal essences, causation, and alternative possible worlds.

a. Rejection of Aristotelianism

Central to Boyle’s natural philosophy is his general rejection of scholastic Aristotelianism. In works such as About the Excellency and Grounds of the Mechanical Hypothesis (1674), he rejects Aristotle’s theory of motion as the actualization of a potential, as well as his distinction between natural and unnatural motion, holding that the local motion involved in the mechanical interactions of corpuscles was inherently more intelligible. He also rejected the scholastic notion of substantial form and used controlled experiments to investigate the Aristotelian terrestrial elements, forms, and qualities. For example, Boyle was the first philosopher to write an entire book about cold, a property the scholastics claimed to be one of the four primary qualities of matter, but had actually only discussed in the most general terms. Boyle’s book included all sorts of experiments he had conducted on the nature of cold, each described in meticulous detail.

Boyle rejected the scholastics’ deductive syllogistic approach to science. He agreed with Bacon that the natural world was too complex for the categorical syllogism to penetrate. He thought that scientific progress requires an inductive method that posits a hypothesis that can then be tested by experiment involving multiple controlled observations. Because the theories could be modified in light of new empirical evidence, Boyle believed the experimental method was fundamentally superior to the scholastic syllogistic model of science.

Boyle’s rejection of scholastic Aristotelianism in works such as The Sceptical Chymist (1661) and The Origin of Forms and Qualities (1666) was also based in part on his acceptance of the mechanical philosophy. This early modern philosophical movement sought to explain natural phenomena in terms of matter and motion, rather than, for example, the composition and proportion of Aristotelian terrestrial elements. Boyle thought mechanical explanations were inherently more intelligible than explanations based on the elemental model because they appealed to properties which themselves were more intelligible, such as size, shape, and motion, rather than to ultimately obscure causes such as real qualities or substantial forms. For Boyle, generation, corruption, and alteration could all be explained mechanically, as various types of interaction between microscopic particles of matter he called corpuscles.

This rejection of the elemental model of explanation also extended to other theories of natural philosophy that were popular in his day, such as the alchemical theory of Paracelsus (1493-1541), involving three chemical “principles”: salt, sulfur, and mercury, as well as even the more recent five-element theories of chemists such as Nicolas Le Fevre (1615-1669). When fire-analysis experiments revealed that some compound bodies could be reduced to five, rather than only four, homogenous elements, some natural philosophers thought this was evidence of a fifth element. Boyle rejected the elemental explanatory model altogether. Instead, he argued that there was only one kind of material substance, and what appear at the macroscopic level to be different elements are actually structural modifications of this universal matter’s mechanical properties.

In a similar way, Boyle also rejected the Aristotelian notion of natural motion. In Book 8 of the Physics, Aristotle argued that each element has a natural location in the universe and a natural tendency to return to this location. This was used to explain such things as why rocks fall and smoke rises. In contrast, Boyle argued that all matter was essentially passive and insensible, lacking any tendencies or dispositions beyond its mechanical properties. Matter can be acted upon but contains no internal force, source of motion, substantial form, or disposition.

The traditional Aristotelian qualities of hot, cold, wet, and dry could be mechanically explained in a similar way. For example, Boyle thought that heat was not a primary quality of matter, but instead a property that is reducible to a particular type of rapid corpuscular motion. The conception of heat as molecular motion is a direct descendant of this view. In a similar way, Boyle thought the power of a key to open a lock is not due to some real quality, substantial form, or occult power of the key; rather, it is an emergent power, ultimately reducible to the size, shape, and motion of the key and the lock, which Boyle called their mechanical affections.

It is important to note that Boyle’s objections against Aristotelian natural philosophy were usually directed more toward the views of his contemporary scholastics, such as Julius Caesar Scaliger (1484-1558), than those of Aristotle himself, for whom he had great respect. Boyle’s approach to ethics, for example, shows this respect was more than lip service, since it provides what is essentially an Aristotelian analysis of the causes of moral virtue. It is also important not to conflate the mechanical philosophy, the corpuscular hypothesis (Boyle’s own version of the mechanical philosophy), and the experimental philosophy (the method by which Boyle often tested theories).

b. The Mechanical Philosophy

Boyle coined the term Mechanical Philosophy and used it to describe any attempt to explain natural phenomena in terms of matter and motion, rather than in terms of substantial forms, real properties, or occult qualities. For Boyle, this included the work of a wide variety of philosophers that otherwise differed in many respects. His list of mechanical philosophers included the ancient atomists Democritus, Leucippus, Epicurus, and Lucretius—names synonymous with atheism at the time—as well as his contemporaries Galileo, Descartes, Gassendi, Hobbes, Locke, and Newton.

Boyle’s own corpuscular version of the mechanical philosophy makes him both an empirical representationalist and an indirect realist. Though Galileo’s The Assayer (1623) is likely the first early modern work to raise the influential distinction between primary and secondary qualities, Boyle developed this distinction and made it an important part of his natural philosophy. In the Origin of Forms and Qualities, among other works, he argued that our senses provide a representation of an independently existing, external physical world, which is ultimately composed of material particles moving through empty space. Boyle held that these corpuscles have mechanical affections, properties such as size, shape, and motion, which are the primary qualities of matter, real properties that exist in any bit of material substance, no matter how small. The secondary qualities we perceive, such as color, sound, taste, odor, and warmth, are mental perceptions that are produced by these primary qualities causally interacting with our sense organs, but do not actually exist as real qualities in the object of perception itself. Thus, perception involves information about the external world entering the brain as a result of the causal interaction between the conscious perceiver and the object perceived.

Boyle used the term “corpuscle” to describe the microscopic material particles, and their clusters, of which he believed the material world was composed. Boyle thought God has the power to infinitely divide matter, even if this is beyond our rational comprehension, but the actual physical world is composed of minima or prima naturalia, microscopic particles of matter which never are, as a matter of fact, divided. These basic corpuscles interact and combine to form larger and larger clusters until they form the ordinary macroscopic material substances with which we are familiar.

On the surface, Boyle’s mechanical philosophy seems very similar to Descartes’s, but their views differ in several important respects. Briefly looking at their differences helps us understand the uniqueness of Boyle’s view. Descartes argued that the “attribute,” or essence, of matter is extension in space. He also held that there is no real distinction between a substance and its attribute. Just as there is no body that lacks extension, Descartes held that there is no extension that lacks body. He therefore held that the universe was a plenum, completely filled with material substance. He even thought that the famous mercury vacuum created by Evangelista Torricelli (1608-1647), while devoid of air, was filled with “subtle matter” (particles small enough to penetrate the pores of the glass) and that we could deduce the existence of such particles from the nature of matter itself.

Boyle agreed that all matter was extended in space, but he was not committed to Descartes’s elegant metaphysical system. Boyle thought theory had to be subordinate to observation. Extension, rather than being the essence of matter of which all other properties are mere modifications, is only another empirically manifest mechanical affection like size, shape, texture, arrangement, and solidity. For Boyle, empty space is not only logically possible but also empirically corroborated by experiments like those performed by Torricelli, by Otto von Guericke (1602-1686), and by Boyle himself with Robert Hooke. Boyle also thought motion in empty space was more intelligible than motion in a plenum, which Descartes had to resort to a complex theory of circular motion to explain. Until there is empirical evidence to support the existence of subtle matter, Boyle believed its postulation violated Ockham’s razor.

Boyle believed that mechanical explanations were inherently more intelligible than those of the Aristotelians or the Paracelsians because they involved easily understandable concepts like size, shape, and motion. He thought the local motion involved in the mechanical interaction of corpuscles is inherently more intelligible than the Aristotelian conception of motion as the actualization of a potential. Boyle thought the appeal to substantial forms in natural philosophy produced explanations that were vacuous when compared to mechanical explanations. The Paracelsians seemed no better, appealing to vague notions such as the “archeus,” “astral beings,” and “blas.” Furthermore, being firmly rooted in alchemy, they were often secretive and intentionally obscure. However, explanations that appealed only to mechanical properties were clear, intelligible, and often had the advantage of being empirically testable.

In About the Excellency and Grounds of the Mechanical Hypothesis (1674), Boyle points out that no one appeals to substantial forms when mechanical explanations are available, as, for example, when one is shown how the moon is eclipsed by the shadow caused by the position of the earth relative to the sun. Likewise, there is no reason to appeal to witchcraft to explain how a concave mirror can project the image of a man into the air, once catoptrics is understood. Boyle thought Aristotelians and Paracelsians failed to realize that this mechanical approach can be applied to natural phenomena in general.

Boyle was interested in occult qualities, natural phenomena in which the effect is observable, but the cause is not, such as magnetic and electrical attraction. Boyle thought such phenomena could be explained mechanically in terms of corpuscular effluvia, the emission of small corpuscular clusters. In A Discourse of Things above Reason (1681), though, Boyle also recognized that some phenomena cannot be mechanically explained. These included the miracles featured in the Bible, as well as more traditional philosophical problems such as whether or not matter is infinitely divisible, how mind-body interaction is possible, and how human free will and moral responsibility can be compatible with divine foreknowledge. Perhaps these might be explained by future philosophical investigation, but they resist straightforward mechanical explanation.

The influence of the mechanical philosophy can be seen throughout Boyle’s other intellectual endeavors and provides his basic approach to philosophy. This influence is apparent in his metaphysical views on the nature of substance and causation, his defense of the corpuscular hypothesis, his epistemological views on the role of experiment in scientific explanation and the limits of reason, and his theological views on the importance of studying the book of nature and its potential for medicine.

c. Chemistry

Boyle is considered by many to be the father of modern experimental chemistry. Through years of diligent work he became a skilled chemist. His interest and work in chemistry lasted from the early 1650s to the end of his life. His social status and efforts to show that natural philosophy was a theologically acceptable pursuit did much to make the science of chemistry socially respectable. Boyle’s most important contribution to chemistry is his systematic critique of both the Aristotelian and Paracelsian theories of natural philosophy.

In The Sceptical Chymist (1661), Boyle points out the limitations of fire analysis as a universal method of separating compound substances into their homogenous components, a method many Aristotelians and Paracelsians used. For example, a green stick burned in open fire seems to separate into four homogenous parts, demonstrating its compound nature: The smoke was the element of air being separated, the hissing and snapping of the sap indicated the water element, the quantity of fire grew as the stick burned, and the remaining ash was the element of earth that was left. Paracelsians had a similar explanation, separating the stick into the chemical principles of salt, sulfur, and mercury.

Boyle thought the separation could be better explained by the rapid mechanical bombardment of corpuscles from the fire onto the structure of the corpuscles composing the stick, setting them in motion. Chemical analysis revealed that the smoke and ash are not homogenous elements but are compound bodies themselves. Some compound substances, such as gold, could be burned for extended periods at extreme temperatures without separating into other homogenous substances. Furthermore, chemical distillation of other compound substances, such as raisins, could produce five homogenous substances.

Boyle was able to chemically sublimate several substances, such as sulfur, turning them from a solid state to a gas and back without going through a liquid phase. Boyle thought such experiments had serious consequences for the elemental model since, according to it, the release of a gas involved the separation of an element or chemical principle, which would require a diminution of the whole. If a substance can be turned back and forth from a solid to a gas again and again without any sign of disintegration, then such a diminution clearly has not taken place. The only alternative explanation on the elemental model would be that the substance has transmuted back and forth into different elements. However, if this is the case, then neither can be considered a true element.

Inspired by Bacon’s utopian model of science, Boyle tried to compile “experimental histories” of different substances. Some of these projects led to completed works, such as An Essay about the Origin and Virtues of Gems (1672) and Short Memoirs for the Natural Experimental History of Mineral Waters (1685). Others, such as the Philosophical History of Minerals, never came to fruition, though much of the research was completed. These projects were records of chemical experiments and other empirical observations concerning the given substance. The goal was to create a sort of publicly accessible database of the chemical analysis of every known substance. Boyle prioritized substances such as the traditional Aristotelian elements and Paracelsian chemical principles, “noble” metals like gold, and bodily fluids such as blood, due to their potential medical value. Concerning salt, a basic chemical principle according to the Paracelsians, Boyle claimed to be able to distinguish three different kinds, each of which he could chemically produce.

Boyle believed colors were caused by the mechanical properties of material corpuscles. In works such as Experiments and Considerations Touching Colours (1664) and New Experiments Concerning the Relation between Light and Air (1668), Boyle presents a chemical analysis of colors and light. He also analyzed samples of phosphorus he had acquired, a substance which produces light chemically. Boyle achieved significant success in these endeavors, though this pales in comparison to the success of later philosophers on the nature of color. This line of investigation also led Boyle to discover things not directly related to color, such as a reliable method of distinguishing an acid from a base.

Developing an interpretation of a laboratory accident of Hennig Brandt, in 1680 Boyle saturated some coarse paper in phosphorus and drew a stick coated with sulfur across it, creating a steady flame. This was the first friction match. The creation of a reliable and eventually safe way to easily produce fire was a major technological advancement that changed the world.

Boyle spent the last twenty years of his life engaged, often with the help of Ranelagh, in the chemical analysis of medical recipes. These efforts did much to bring chemistry out of the shadows of alchemy and into the light of social respectability. Throughout his work in chemistry, Boyle advocated openness in the publication of experimental results, including even those experiments that were unsuccessful. Nonetheless, there were exceptions to this openness involving alchemy.

d. Alchemy

Many of the early modern philosophers, most notably Isaac Newton, had a significant interest in alchemy, and Boyle was no exception. Lawrence Principe in The Aspiring Adept: Robert Boyle and his Alchemical Quest (1998), and William Newman and Lawrence Principe in Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (2002), present a detailed analysis of Boyle’s alchemical pursuits, though one should also read Hunter’s account. The early Boyle scholars Henry Miles and Thomas Birch actually destroyed much of Boyle’s work in alchemy, fearing it would tarnish his reputation as a scientist. During his lifetime, however, Boyle’s interest in alchemy was extensive and well known. Though Boyle often tried to distance chemistry from its alchemical association, many of his projects in natural philosophy were clearly alchemical.

Boyle’s alchemical endeavors were motivated by three goals: to uncover the hidden nature of physical reality, to find “extraordinary and noble medicines,” and to acquire accurate accounts of supernatural events that might help convince religious skeptics. Boyle expressed an interest in finding the philosopher’s stone as early as 1646, though he mentions it more as a humorous exaggeration than a current project. In a letter to Ranelagh in May of that year, he complains that he is not destined to find the philosopher’s stone, since his initial attempts at chemical analysis had been so unsuccessful.

Boyle believed it was possible to transmute one substance into another, and this included the traditional alchemical quest of turning lead into gold. He believed the possibility of transmutation directly followed from the mechanical philosophy. If there is only one universal type of matter, and the differences between the macroscopic substances we perceive are the result of structural differences at the microscopic level, then it follows that causing changes in the structure and arrangement of corpuscles might cause substantial changes at the macroscopic level. Since gold and lead have similar macroscopic properties, there might be only a subtle difference between them at the microscopic level.

Boyle claimed to have witnessed the transmutation of lead into gold on more than one occasion. As early as 1652 he claimed to have acquired a quantity of philosopher’s mercury, a substance believed to be required for gold transmutation. Boyle also claimed to have turned gold into a base metal, using a powder given to him by a mysterious stranger. In some works, Boyle describes successful transmutation experiments on other substances. Boyle spent a great deal of time, effort, and financial resources in these pursuits, which included searching for the elixir of life, a medicine capable of curing all diseases and extending the human lifespan.

Boyle was hoodwinked on more than one occasion by charlatans who claimed to have alchemical knowledge or the rare substances required for his alchemical pursuits. The most notable incident involved a con man named Georges Pierre. Boyle eventually realized he was being had, and there is evidence that he was aware of the danger of such scams and viewed them as an unfortunate but necessary risk in the pursuit of alchemical knowledge, a risk that his unique wealth allowed him to take.

Influenced by Bacon’s utopian conception of science, Boyle thought scientific information, including his own detailed reports of chemical experiments, should be made public for the benefit of humanity. This allowed his experiments to be reproduced and the knowledge acquired to be used to help people, especially in areas such as medicine, where the benefit to the public was obvious and immediate. There were limits to this support of scientific openness, though. For example, Boyle was concerned that the publication of instructions for turning lead into gold could collapse the world economy, bringing social chaos. Upon Boyle’s death, Newton, also a dedicated alchemist, made attempts to obtain Boyle’s alchemical notes regarding the transmutation of lead to gold. Boyle had anticipated this and left detailed instructions in his will to prevent it.

Furthermore, despite Boyle’s support of scientific openness as well as his aversion to taking oaths, Boyle often employed secrecy in his alchemical pursuits. The sort of secrecy involved here, however, was a part of the cost of networking with other alchemists to share recipes and other experimental data, and was considered common practice in the world of alchemy. Most alchemists were secretive, and would exchange recipes and materials only if their secrets were kept. Boyle was justified in believing that, if he had refused to make such promises, other alchemists would not have shared their work with him. Nevertheless, this is a notable exception to his otherwise deep aversion to taking oaths, as well as his Baconian belief that scientific data should be open to the public for the benefit of humanity.

e. Medicine

Boyle also had a deep interest in medicine. Though he never formally studied it, much of his research in natural philosophy was either directly medical in nature or motivated by medical goals, both practical and theoretical. He nonetheless distrusted physicians, after an event in his youth in which he became gravely ill when a physician at Eton gave him the wrong medicine by mistake. Furthermore, he generally rejected their Galen-based theories in favor of mechanical ones. He noted that chemical remedies often worked better than the Galenist practice of bloodletting, and that many of Galen’s views were based on claims about human anatomy that turned out to be incorrect. Boyle thought most patients were better off not seeking a doctor’s treatment.

At the same time, Boyle knew, respected, and was respected by many of the leading physicians of his day. Boyle’s London neighbor was Thomas Sydenham (1624-1689), one of the greatest physicians of the day. Sydenham read Boyle’s work and liked it so much that he dedicated his own book, Methodus Curandi Febres (1666), to him. He sometimes even asked Boyle and Ranelagh to accompany him on house calls. Boyle’s medical work was so respected that Oxford gave him an honorary Doctorate of Medicine, the only degree he ever received.

Boyle’s work in medicine is entwined with his work in natural philosophy. While the two should not be conflated, as Boyle worked on many nonmedical projects in natural philosophy, neither can be fully understood apart from the other. The development of Boyle’s interest in medicine coincided with his interest in natural philosophy in general, beginning around 1646, increasing in the mid-1650s, and lasting the rest of his life.

One of Boyle’s earliest published works was a collection of medical recipes entitled An Invitation to a Free and Generous Communication of Secrets and Receits in Physick (1655). Though Boyle worked on medical projects throughout his scientific career, a renewed interest in medicine began in the late 1660s. He would go on to steadily publish books on medical topics for the rest of his life, including Memoirs for the Natural History of Human Blood (1684), Of the Reconcileableness of Specifick Medicines to the Corpuscular Philosophy (1685), Some Receipts of Medicines (1688), Medicina Hydrostatica (1690), Experimenta et Observationes Physicae (1691), and Medical Experiments (1692).

Boyle worked with Locke on a few medical projects that are worth noting. Though early twenty-first-century scholars remember Locke primarily for his work in epistemology and political philosophy, he considered himself first and foremost a physician. Boyle and Locke collaborated for several years to create a Baconian experimental history of human blood. This was part of a larger project of Boyle’s to create records of experimental observations regarding every known substance, with priority given to substances, such as blood, with potential value to medicine. Their work was interrupted while Locke was travelling or Boyle was ill, but their persistence resulted in the publication of Memoirs for the Natural History of Human Blood (1684).

A second medical project with Locke was the collection of data for testing the miasma theory of disease. This is particularly noteworthy because this theory proposes the mechanical explanation that disease is caused by noxious vapors moving in the air. The theory holds that these vapors act as a contagion, penetrating the bodies of those who come in contact with them through respiration. Boyle believed the contagions were composed of corpuscles and might originate deep underground, being released by human activity such as mining. Boyle and Locke hypothesized that these noxious corpuscular emanations were then spread far and wide by the wind. Believing disease and weather were linked, they collected data from physicians across the country on both the weather and the patients they had treated, looking for correlations. While this was a relatively minor project compared to some of Boyle’s other achievements, it is noteworthy since it attempted to use empirical data to test a mechanical explanation. One should not conflate the mechanical philosophy with the experimental philosophy, but the points where they intersect provide insight into Boyle’s philosophy.

Another medical collaboration in which Boyle participated was the race to find a cure for the Great Plague of 1665-1666, an epidemic of bubonic plague which killed a fourth of London’s population, including Boyle’s former mentor George Starkey. Boyle’s belief in the miasma theory convinced him to leave London during this time. Despite this, Boyle was still part of a general effort to cure the plague that included Ranelagh, Sydenham, Locke, and many others. Boyle’s particular efforts primarily consisted of developing medical recipes he hoped would be useful to plague victims, which he then sent to Henry Oldenburg (1619-1677).

Boyle spent the last twenty years of his life engaged in medical research with his sister, Katherine Ranelagh. Through their vast network of correspondents, they would find medical recipes which they would then chemically analyze. Through medical research, Boyle found the clearest way to wed his passion for natural philosophy with his philanthropic goals.

Although he sometimes exaggerated his poor health, Boyle also suffered from very real and serious ailments including malaria, edema, seizures, kidney stones, toothaches, and deteriorating eyesight. He also suffered throughout his life from melancholy and complained of imaginative fits he described as “ravings.” During these episodes, he was carried away by his imagination, making it difficult to work. Boyle considered these ravings both a medical condition and a moral defect and spent years seeking a remedy. Since Boyle distrusted doctors and was an expert chemist, he often treated these illnesses with his own concoctions, sometimes making his condition worse. In 1670, Boyle suffered a severe stroke that left him partially paralyzed. He eventually recovered most of the mobility he had lost and continued working on his experiments.

f. Pneumatics

In 1643, Evangelista Torricelli, a friend and advocate of Galileo, filled a glass tube with mercury, turned it upside down, and placed it in a basin of mercury. The level of mercury in the tube lowered, but some mercury remained in the tube, suspended by the weight of the air—the air pressure—pressing down on the surface of the mercury in the basin. Since the tube was airtight, Torricelli reasoned that the area in the tube above the mercury must be a vacuum. Through Marin Mersenne and his vast correspondence network, news of the experiment quickly spread throughout Europe.
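The balance Torricelli observed can be illustrated with a short calculation. What follows is a minimal sketch in modern terms, not a reconstruction of Torricelli’s own reasoning. It assumes the hydrostatic relation P = ρgh, and the constants (standard atmospheric pressure, the densities of mercury and water, standard gravity) as well as the function name column_height are assumptions introduced here for illustration only.

```python
# Minimal sketch: height of a liquid column supported by air pressure.
# Assumes the modern hydrostatic relation P = rho * g * h. All constants
# are standard reference values, not figures taken from Torricelli.

STANDARD_ATMOSPHERE_PA = 101_325.0  # assumed air pressure at sea level, in pascals
STANDARD_GRAVITY = 9.80665          # metres per second squared
MERCURY_DENSITY = 13_595.0          # kg per cubic metre, near 0 degrees Celsius
WATER_DENSITY = 1_000.0             # kg per cubic metre, approximate


def column_height(pressure_pa: float, density: float,
                  gravity: float = STANDARD_GRAVITY) -> float:
    """Return the column height in metres that the given pressure can support."""
    if density <= 0 or gravity <= 0:
        raise ValueError("density and gravity must be positive")
    return pressure_pa / (density * gravity)


if __name__ == "__main__":
    mercury = column_height(STANDARD_ATMOSPHERE_PA, MERCURY_DENSITY)
    water = column_height(STANDARD_ATMOSPHERE_PA, WATER_DENSITY)
    print(f"mercury column: {mercury:.3f} m")
    print(f"water column:   {water:.2f} m")
```

Under these assumed values the mercury column comes out at roughly three quarters of a metre, whereas a water column would need to be more than ten metres tall, which suggests why mercury made the experiment practical in a glass tube.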

Otto von Guericke (1602-1686) heard of the Torricelli experiment and designed a pump capable of evacuating a receiver whose two hemispheres were then held together so firmly by the external air pressure that sixteen horses could not pull them apart. Boyle had been interested in the nature of respiration for some time, so when he and Hooke, then Boyle’s laboratory assistant, heard of von Guericke’s impressive feat, they set about creating their own air pump. Boyle designed an improved model which featured a chamber made of glass, allowing direct observation of the phenomena within the evacuated receiver. Boyle first approached the scientific instrument-maker Ralph Greatorex (1625-1675) to build it, but when Greatorex failed, Hooke took up the difficult challenge and succeeded.

From the spring through the fall of 1659, Boyle and Hooke performed dozens of experiments using the air pump and published the results in New Experiments Physico-Mechanical Touching the Spring of the Air and its Effects (1660). In this book, Boyle provides extremely detailed presentations of 43 of the experiments, giving compelling evidence for such claims as that air is a distinct substance from space, that air is elastic and has a spring, and that air pressure is so powerful that a glass vial of water placed in the receiver explodes when the air is removed. They demonstrated that air is required for phenomena such as combustion, respiration, and sound. They even placed a Torricellian barometer in the receiver, showing that the mercury does not remain suspended in the vacuum. Spring of the Air established Boyle’s scientific reputation. With its success, Boyle went from being an amateur gentleman interested in natural philosophy to being the leading scientist of the day.

The book highlighted Boyle’s genius for developing experiments that revealed important scientific information, and he also included detailed critiques of the other theories he had studied concerning the nature of air. The detail of his analysis astounded even other natural philosophers such as Henry Power (1623-1668), who claimed, “I never read any tract in all my life, wherein all things are so curiously and critically handled, the experiments so judiciously, and accurately tried, and so candidly and intelligently delivered.” It also influenced Newton, who saw it as a paradigm of scientific research.

At many of the early meetings of the Royal Society, Boyle was asked to replicate some of the experiments. Unlike other natural philosophers, Boyle had the financial resources to conduct the experiments and to repair the temperamental air pump when it broke. He even had an additional air pump made at considerable expense, which he gave to the Royal Society on May 15, 1661.

The book was also controversial, and it remains so to this day. Steven Shapin and Simon Schaffer explore the social construction of science, using the controversy between Hobbes and Boyle over the air pump experiments as their focal point in their influential book Leviathan and the Air Pump: Hobbes, Boyle and the Experimental Life (1985). However, one should also read Hunter’s account. The Jesuit priest Francis Linus (1595-1675) tried to replicate some of the experiments and offered an alternative Aristotelian interpretation of the results, defending the view that nature abhors a vacuum in Treatise on the Inseparable Nature of Bodies (1661). Christiaan Huygens (1629-1695) also reported that he could not replicate some of the experiments. Boyle praised Linus for his use of experiment but pointed out the defects in his experimental practice in A Defense of the Doctrine Touching the Spring and Weight of the Air (1662). He added that further experiments with a J-shaped tube corroborated his claim that the reciprocal proportion between the pressure and volume of air was constant. This became known as Boyle’s Law.

This is controversial because Boyle appealed to experiments with the J tube actually performed by other natural philosophers like Henry Power and Richard Towneley (1629-1704). Furthermore, it was Hooke, rather than Boyle, who worked to find the precise numerical relation between air volume and pressure, while Boyle was more interested in the philosophical significance of the proportion being reciprocal and constant.
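In modern notation, which is not Boyle’s own, the reciprocal proportion between pressure and volume can be sketched as follows. The symbols P, V, and k, and the explicit restriction to a fixed quantity of air at constant temperature, are assumptions added here for illustration and do not appear in Boyle’s texts.

```latex
% Boyle's Law in modern form: pressure times volume is constant
% (assumed: a fixed quantity of air held at constant temperature).
\[
  P\,V = k
\]
% Comparing two states of the same trapped air, as in a J-tube:
\[
  P_1 V_1 = P_2 V_2
  \quad\Longrightarrow\quad
  V_2 = V_1\,\frac{P_1}{P_2}
\]
% Worked case: doubling the pressure halves the volume.
\[
  P_2 = 2P_1 \quad\Longrightarrow\quad V_2 = \tfrac{1}{2}\,V_1
\]
```

On this reading, the claim that the proportion is “reciprocal and constant” amounts to saying that the product of the two quantities does not change as the trapped air is compressed or allowed to expand.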

Even more significant was a series of objections raised by Boyle’s fellow mechanical philosopher Thomas Hobbes, upon which Leviathan and the Air Pump focuses. Hobbes offered a contrary mechanical interpretation that was consistent with observation. As Descartes had done in interpreting the Torricelli experiment, Hobbes suggested that subtle matter was passing through microscopic pores in the glass so that the receiver was full of matter and not a true vacuum. Since it is possible to give an alternative mechanical explanation consistent with observation, Hobbes argued one cannot use experiments to decide between them. Furthermore, since multiple mechanical interpretations are possible for any experimental observation, observations are never completely independent of theory.

In An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662), Boyle replied by distinguishing between “matters of fact,” which can be tested, and mere “hypotheses,” which result from metaphysical speculation. It is possible that subtle matter penetrated the glass, but until there is empirical evidence to support this, positing the existence of subtle matter violates Ockham’s razor. Notably, by the early 21st century compelling evidence had emerged that an evacuated receiver contains billions of subatomic particles, such as neutrinos, far smaller than the pores of the glass.

Boyle also was motivated by a desire to show a theistic alternative to the equally mechanical materialism of Hobbes, Gassendi, and the ancient atomists, which was then strongly associated with atheism. For a time, Hobbes’s name was almost synonymous with atheism. Boyle had tried to show, since the early 1650s, that a mechanical philosophy could be compatible with Christianity.

In the end, Boyle wrote some ten books concerning his work with the air pump: New Experiments Physico-Mechanical Touching the Spring of the Air and its Effects (1660); A Defense of the Doctrine Touching the Spring and Weight of the Air (1662); An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662); New Experiments Concerning the Relation between Light and Air (1668); A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1669); New Pneumatical Experiments about Respiration (1670); Of a Discovery of the Admirable Rarefaction of Air (1670); Flame and Air (1672); A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1680); and The General History of Air (1692). Eventually, though, his attention shifted to medical chemistry.

3. Philosophy of Science

Boyle was well known for his views on the role of experimental evidence in natural philosophy. Boyle’s philosophy of science was primarily influenced by Bacon. In Novum Organum (1620) and New Atlantis (1627), Bacon had challenged natural philosophers to employ an inductive scientific method based on the careful application of technology to make detailed empirical observations, instead of relying on the syllogistic approach favored in the Scholastic tradition, which made deductive inferences from universal principles. Bacon argued that if the universal principles themselves turned out to be false, the conclusions deduced from them would be unjustified. Rather than trying to anticipate what nature should be like according to reason, natural philosophers should make detailed observations of what nature is actually like. They should then interpret these observations and form inductive generalizations about the natural world. This approach to science allows observational evidence to have epistemic priority over theory, so that theories can be modified in the face of new empirical evidence. Bacon envisioned a future “history of qualities,” a sort of publicly accessible scientific database of empirical observations.

Boyle took this challenge seriously and developed an experimental method that used detailed observation, aided by new technology, to reveal nature’s hidden structure. This approach is apparent in his work in pneumatics, his chemical research to create experimental histories of substances, and his projects on cold, air, light, color, minerals, and gems. Many of these projects never came to fruition, but on some he worked steadily for years. For instance, Boyle’s natural history of Ireland never even got off the ground, but his empirical approach to the study of blood was fruitful and eventually led to medical advances which now routinely save lives. It is also important to note that this collection of empirical data is not the blind data collection of the “narrow inductivist conception of scientific inquiry” criticized by Carl Hempel in Philosophy of Natural Science (1966). Boyle prioritized the experimental investigation of substances with obvious benefit to society, and Boyle’s empirical data collection was hypothesis driven.

Boyle’s commitment to the mechanical philosophy was consistent with his views on the role of experiment in science. Boyle would often develop mechanical explanations of phenomena that served as hypotheses, which he would then design experiments to test. He thought that testability was important in hypothesis development as well as in determining what questions science should pursue. He had a genuine talent for creating experiments designed to test theories, and in many cases this provided new scientific information. Following Bacon, Boyle tried to resist non-empirical metaphysical speculation and modify theories in the light of new experimental evidence. The results are mixed, but when he did engage in metaphysical speculation, such as in his treatment of the arguments for body-to-body occasionalism, he prefaced his remarks by noting that none of the theories he discussed could be empirically tested.

Comparison with Descartes on the role of experiment in natural philosophy is instructive. Experimental observation played a much different role for Boyle than it did for Descartes. Descartes is famous for conducting ingenious experiments, but rather than being used to test or falsify a hypothesis, they often played a part in the reduction of a complex scientific question to more basic ones. In Rules for the Direction of the Mind (1628) and Discourse on the Method (1637), Descartes describes a scientific method that involves reducing a problem to more and more fundamental problems until a problem is reached that is so basic that a self-evident intuition solves it. One can then use this intuitive solution in a series of deductive inferences, solving the problems until one reaches a solution to the original one.

Furthermore, for Descartes, empirical observation was not a reliable method of testing hypotheses, since he believed the senses provide only confused modes of thought. The only properties of matter about which we can be certain, for Descartes, are the geometric properties of extended space. He believed this method of science could achieve the same level of certainty as mathematics since it restricted itself to clear and distinct deductions from matter’s geometric properties. For Descartes, physics is applied geometry.

By contrast, Boyle thought theory must be epistemically subordinate to observation, so he used experiments to test a theory. Instead of using them in a reductive process of finding self-evident intuitions, he designed experiments specifically to falsify or corroborate a claim. In this way, claims such as “air is needed for respiration” could be empirically supported, while claims such as “air is identical to space” could be refuted. For Boyle, scientific knowledge was more likely to be inductively inferred than geometrically deduced.

Concerning Boyle’s general epistemology, in works such as A Discourse of Things above Reason (1681), Boyle distinguishes between things that can be known by reason and things that can be known through experience. Boyle also believed that at least some ideas are innate. Examples of innate ideas include the belief that contradictories cannot both be true, that the whole is greater than the part, and that every natural number is either odd or even.

Furthermore, Boyle believed that some truths are beyond a human’s capacity to understand. These are things which are true, and our intellect has sufficient cause to assent to them based on experience, authentic testimony, or mathematical demonstration, but when it reflects on them, it finds itself at a strange disadvantage. Boyle includes three kinds of beliefs in his taxonomy of things above reason.

The first kind he labels “incomprehensible” since it includes belief in things beyond our comprehension. For example, our finite minds cannot grasp the infinite nature of God. Boyle thinks we can comprehend that God exists and some of the things that God is not, but we cannot fully understand the boundless nature of his perfections. Boyle declares this to be truly supra-intellectual.

Boyle calls the second kind of thing above reason “inexplicable.” This includes beliefs for which we are unable to conceive of their manner of existing, or how the predicate can be applied to the subject. Boyle gives examples such as the infinite divisibility of matter and the incommensurability of the diagonal of a square with the length of its sides.

Boyle calls the final kind of thing above reason “unsociable,” but it might better be labeled “incompatible.” This class includes true propositions that seem incompatible with other propositions known to be true. For example, human free will seems to be incompatible with God’s foreknowledge of future events, but necessary for moral responsibility. Mind and body are distinct substances, but they seem to causally interact. Boyle thought these were real problems and had real solutions but were likely beyond a human’s finite capacity to understand, though he also thought philosophers should continue to try.

Like Descartes, Boyle believed that we could have knowledge of things that are beyond our capacity to clearly imagine, such as the mathematical properties of a chiliagon. We can demonstrate necessary truths about a 1000-sided object and show it has different properties than a 1001-sided object. Despite this, the images our minds form of these shapes are indistinguishable.
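As a brief illustration of the kind of demonstrable difference at issue, consider two standard formulas for a convex polygon with n sides. The choice of these two properties is an assumption made here for illustration; neither Boyle nor Descartes is being quoted.

```latex
% Sum of interior angles of a convex n-gon: (n - 2) * 180 degrees
\[
  S(n) = (n-2)\cdot 180^\circ, \qquad
  S(1000) = 179{,}640^\circ, \qquad
  S(1001) = 179{,}820^\circ
\]
% Number of diagonals of a convex n-gon: n(n - 3)/2
\[
  D(n) = \frac{n(n-3)}{2}, \qquad
  D(1000) = 498{,}500, \qquad
  D(1001) = 499{,}499
\]
```

The two figures differ in exactly calculable ways even though no mental image could tell them apart.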

Boyle also distinguished between real and nominal essences, which, along with his work on primary and secondary qualities, influenced Locke’s epistemology. In A Free Enquiry into the Vulgarly Received Notion of Nature (1686), Boyle begins by listing all the ways the term “nature” is used. He then distinguishes the “notional” sense, which is the way we choose to use words, from the way nature really is. Boyle also discusses the distinction in the Origin of Forms and Qualities (1666).

4. Substance Dualism

Boyle was a substance dualist, postulating that the universe consists of two types of substance: purely material corpuscles and nonphysical, conscious souls. Boyle accepted Descartes’s definition of substance as a type of entity that was not ontologically dependent on anything but God, whereas a mode is ontologically dependent on a substance. Shape, for example, cannot exist on its own, but is ontologically dependent on the bit of matter that has it.

Boyle’s dualism was influenced by Descartes, especially after his work with Robert Hooke, who taught him Cartesian philosophy, but there are important differences between their similar metaphysics. Descartes held that spatial extension was the “Attribute,” or essence, of matter, while thought was the essence of mind. Accordingly, all true properties of matter were modifications of extension, such as size, shape, and motion. In a similar way, since thought is the essential attribute of mental substance, all properties of mind are modes or types of thought.

Although Boyle agreed that thought was mental and matter was extended, he was not committed to Descartes’s elegant, rationally deduced substance-attribute-mode model. The mechanical affections Boyle associated with matter were derived from experience. For example, Boyle included solidity as another empirically based mechanical affection, but it is not clear how one can explain it as a mode of Cartesian spatially extended matter.

Boyle saw that bodies need some minimal force of resistance for mechanical interaction to be possible, though he emphasized such a force was nothing like a rational disposition or internal source of motion. Boyle also believed God gave matter the power to transfer motion upon collision, another potential problem for Descartes, since modes should not be able to transfer.

Likewise, for Descartes, the existence of a void or vacuum in space—that is, an area of space containing no matter whatsoever—is logically impossible. Since the attribute of body is extension, and there is no real distinction between a substance and its attribute, any extended area of space must contain body. Boyle’s views on the nature of the material world were more influenced by Bacon and Gassendi. He believed the elegance of a metaphysical system is not as important as its correspondence to empirical observation. He thought the air pump experiments supported the idea that a vacuum in space, devoid of all matter, was logically possible, and the existence of a vacuum should be posited until there was empirical evidence for the presence of matter in the evacuated receiver.

A final difference between Boyle’s dualism and that of Descartes was Boyle’s belief in animal consciousness. Descartes thought animals lacked a soul and were merely incredibly complex, divinely designed machines. Although they behaved as if they suffered, nonhuman animals lacked any conscious mental states. Descartes performed many animal dissections, including vivisections. Boyle saw the scientific need for vivisection since some anatomical features are only observable in living bodies. He even performed some during his sojourn in Ireland during the early 1650s. He gave up the practice, though, because of the observable suffering it caused. Boyle even had a preference for free-range chicken, but this may have been as much about flavor as chicken flourishing.

Boyle believed much instinctual behavior in nonhuman animals is purely mechanical, such as involuntary blinking when an eyelash is touched by a feather. Although he believed nonhuman animals were capable of conscious sensations, he thought they lacked rationality. Like other natural phenomena, nonhuman animal behavior sometimes seems rational, but, contrary to the scholastic Aristotelians, he thought the material world contained no rational dispositions.

5. Causation

Fundamental to Boyle’s philosophy is the belief that matter is passive, having no internal power, force, source of motion, or substantial form beyond the primary qualities of size, shape, solidity, and motion. He rejected the scholastic tendency to see intelligent dispositions everywhere in nature, such as the view that nature abhors a vacuum, or the view that an element has an internal disposition to move toward a natural location in the universe. Boyle acknowledged that the regularity seen in the natural world makes it sometimes seem like there is rational behavior, such as the regular motion of celestial bodies, or the tendencies of chemical substances to repeatedly behave in uniform ways. Despite this, he rejected the view that matter had power beyond its mechanical properties and sought to demonstrate how natural phenomena could be explained in terms of the motion of particles obeying certain laws of motion which he believed God had established. In works such as The Christian Virtuoso (1690), Boyle argued that the regularities we see in nature are a manifestation of God’s power and that divine volitions cause the laws of nature.

Boyle believed that the ultimate cause of motion is God, who created bodies, set them in motion, and maintained the laws of motion by divine will. God does grant matter certain basic powers such as solidity and the power to transfer motion to other bodies upon collision, but these are to be understood as unconscious mechanical properties rather than anything like mental dispositions or the internal sources of motion invoked by scholastic Aristotelian natural philosophy.

Boyle was aware of, and even sympathetic to, occasionalism, the view that God is the cause of anything that requires a cause. However, he never explicitly endorsed it. He does speak of it favorably in folios 38 to 40 of volume 10 of the Boyle Papers. While not explicitly endorsing it, Boyle presents three arguments intended to show that body-to-body occasionalism is not in itself absurd. Boyle does not here discuss mind-body occasionalism, but rather how God causally interacts with matter to create the natural world.

This is a minor discussion in his vast corpus, and should not be given undue emphasis. Its relevance to Boyle’s views on causation, though, makes it worthy of inclusion here. Boyle generally tried to avoid non-empirical metaphysical speculation or metaphysical system building, and he begins by pointing out that the issue cannot be settled by any testable experiment. Boyle then explicitly appeals to Ockham’s razor. Since God’s concurrence by itself is sufficient to cause the motion of bodies, it is superfluous, and even potentially impious, to attribute such power to finite bodies. If God wills a body to be in location a, and later wills it to be in location b, this alone is sufficient to move it. Attribution of a second cause to matter itself is not necessary.

Boyle’s second argument anticipates the philosophy of David Hume (1711-1776) by claiming that causation itself never appears to the senses. The power of one body to move another body is not directly observable. We only perceive that when one body hits another there follows a motion in the second body. This point is essential to Hume’s formulation of the problem of induction, supporting the claim that our belief in causation cannot be justified as a matter of fact. For Boyle, the fact that the power of causation is not manifest to the senses shows that it could be God. Therefore, occasionalism cannot be ruled out as absurd.

Boyle’s third argument is that it might not be even possible to conceive of one body communicating motion to another. If finite bodies are collections of modes ontologically dependent on the attribute of extension, for example, they should not be able to cause motion in another body. It thus should not be possible for us to conceive of a body transferring its motion to another body on collision. Occasionalism, therefore, cannot be ruled out as absurd since it actually seems more comprehensible than attributing the power of causation to finite bodies.

Boyle incorrectly labels Descartes as a sort of deist. Deists believed that, after the initial divine causal impulse, the universe ran of its own accord, obeying the laws of motion without the constant intervention of God. However, Descartes believed that God is constantly involved in creating the world through one continuous divine act. Boyle was aware of the similar body-body occasionalism of Louis De La Forge, in which God creates motion by recreating an object in different locations at different times. Boyle, however, seems to have preferred what Peter Anstey has described as “nomic occasionalism.” According to this type of body-body occasionalism, bodies are not totally passive but have basic, mechanical powers, such as solidity and the power to transfer their motion to other bodies upon collision. On this view, God causes the initial motion, preserves and conserves that motion, and determines the direction and speed of bodily motions before and after collisions. Like many of his contemporaries, Boyle believed that the laws of nature are divine volitions. In the case of miracles, though, God can suspend a law of nature, a further manifestation of divine power. Yet again, Boyle was cautious and hesitant to proclaim nomic occasionalism over deism, or the so-called cinematic occasionalism of De La Forge, pointing out that none of these views can be easily empirically tested.

In any case, it seems clear that Boyle’s occasionalism was confined to body-to-body interaction. Boyle thought that human minds were capable of genuine causal agency. This agency played an essential role in his views on the nature of moral responsibility, as well as his theological views about what is necessary for salvation. Our souls are connected to our bodies and somehow causally interact with them. Here again, Boyle is hesitant to commit himself to any specific theory beyond what can be experimentally tested. He believed that how mind-body interaction is possible, as well as how free will is consistent with divine foreknowledge, are likely mysteries beyond the ability of reason to solve.

6. God

By now it should be clear that the single most important influence on Boyle’s philosophy was his personal religious beliefs. His contributions to philosophy, chemistry, pneumatics, and medicine can all be interpreted as the development and fulfillment of a lifelong religious quest. Boyle thought there were three true books of wisdom, the “book of scripture,” the “book of nature,” and the “book of conscience.” He thought all three were important and spent nearly equal amounts of time and energy on each.

Boyle was christened at the chapel at Lismore Castle in Ireland as an infant and brought up as an Anglican Protestant, though he was greatly influenced by Puritanism. The terrible storm Boyle witnessed on his grand tour with Isaac Marcombes was a transformative experience for him, and many of his philosophical projects can be seen as attempts to fulfill the oath he took to survive it.

Boyle thought that, of the traditional arguments for the existence of God, the teleological argument was the strongest. Boyle acknowledged that the existence of God could not be rationally demonstrated, but he believed the natural world abounded with empirical evidence of God’s power and wisdom. He thought the incredible complexity and order of the universe was evidence of God’s existence. The vastness of the universe, and the speed with which the earth and celestial objects move, Boyle saw as evidence of God’s unbounded power. He thought that God’s constant concurrence was needed to sustain the universe’s existence.

He was particularly amazed by the human body and the bodies of nonhuman animals, which he interpreted as divinely constructed machines. Internal organs were smaller machines ingeniously and exquisitely designed to work together to sustain the life of the animal. Ignorant of natural selection, Boyle thought the incredible complexity of their mechanical structure was compelling evidence of God’s existence. In one early letter, Boyle claimed to have learned more about God’s creation dissecting fishes than in all the books he had read. At a macroscopic level, he thought that the climates of the different regions of the earth, and other geological features were intentionally designed to sustain the lives of various animals.

Boyle also used the famous clock at Strasbourg as an analog to “this great automaton the world.” He thought the universe itself was intentionally designed by God to be understood by rational creatures, though parts of this creation are beyond human comprehension. Boyle believed that, since the universe was a manifestation of God’s greatness, one should study the book of nature as an aid to salvation.

Boyle also had a basic modal semantics. He believed God has the power to create alternative universes with different laws of nature. Boyle interpreted these possible worlds as potential divine creations. In addition to possible alternative creations of God, in Of the High Veneration Man’s Intellect Owes to God (1685), Boyle claims the size of the actual universe is so great that distant regions of space might have other areas, the size of our observable universe, that contain different planets and creatures, and even might have different laws of nature.

In the traditional theological debate between divine voluntarism, which holds that God’s will is prior to his reason, and divine intellectualism, which holds that God’s reason is prior to his will, Boyle has often been regarded as an important early modern voluntarist, but the label needs qualification. Boyle believed it was rash to claim that God’s acts had to conform to our finite conception of reason, and he generally rejected the a priori approach to theology advanced by many intellectualists. There is no way for us to deduce a priori which of the countless possible worlds God chose to create. Boyle thought we could learn about God’s magnificent creation through empirical observation. The problem with placing God’s reason above his will was that we are limited by our finite understanding of a priori truths. The ultimate contingency of the laws of nature calls for their empirical investigation, rather than a priori deduction. On the other hand, Boyle did not think God did things arbitrarily. He thought everything happened according to God’s divine plan, even if we could not completely understand it. Boyle’s rejection of intellectualism has more to do with the limits of our finite reason than with a priority of God’s will over his reason.

Boyle believed everyone had the capacity for salvation. Boyle, Ranelagh, and other members of the Hartlib Circle collaborated on a number of projects to make the Bible available to more people, including overseeing the publication of translations of the Bible into Irish, Malay, and Algonquin. This has allowed much of the Algonquin language to be preserved. Such projects were controversial at the time, but Boyle saw them as part of his religious duty.

Boyle spent years mastering ancient Biblical languages to further his understanding of the Bible, including Greek, Syriac, Aramaic, and Arabic. He learned Hebrew to read the Torah and sought out Jewish scholars for advice on his translations. He argued for religious toleration, though he thought Christianity held the only path to salvation.

Boyle believed in the existence of supernatural creatures such as angels, demons, and witches. In Of the High Veneration Man’s Intellect Owes to God (1685), he claimed that angels, both good and evil, are rational but completely incorporeal, and that there could be as many species of angels and demons as there are nonhuman animals, with subtle moral differences between them. On the other hand, he also believed that most witch trials were unjust and not cases of real witchcraft. He tried to apply his empirical scientific method to the investigation of supernatural phenomena by creating a sort of database of reliable accounts of supernatural events, just as his Baconian histories of qualities were records of reliable experimental observations of natural substances. Boyle was convinced that enough reliable accounts of supernatural phenomena would make skepticism about Christianity seem unreasonable. He even saw to the publication of what he believed to be a true account of a poltergeist: Perreaud’s Devil of Mascon (1658). He also tried to investigate what he thought to be a reliable account of precognition.

Despite a lifetime of religious pursuits, Boyle also had significant religious doubts. These doubts troubled him, and throughout his life he sought spiritual guidance from friends, family, and clergy. He worried that his wealth had been taken from Ireland unjustly and that his philanthropic endeavors were inadequate. He also feared that he had committed a sin against the Holy Ghost by ignoring opportunities to repent for self-acknowledged sins.

Boyle intended to write a book about atheism, but it was never completed. He left a substantial endowment in his will to start a series of annual lectures defending the existence of God and the basic tenets of Christianity against the dangers of atheism he perceived. The sermons started in 1692 and lasted steadily until 1935, after which time they were given frequently, but sporadically. Since 2005, they have been given every year once again.

7. Ethics

Although Boyle is best known for his scientific endeavors, he was also fundamentally concerned with ethics. His earliest attempts at philosophy were in ethics, and ethics dominated his philosophy throughout the years he spent at his estate in Stalbridge during the 1640s, following his return to England from the grand tour with Isaac Marcombes. At some point during the late 1640s to early 1650s, Boyle had a conversion experience in which the focus of his work shifted permanently to natural philosophy. Nonetheless, he never abandoned his ethical concerns.

His most extensive ethical work is the Aretology, a systematic study of virtue. Written between 1645 and 1647 and never published during his lifetime, the treatise defends the claim that the key to human flourishing is the attainment of “felicity,” which Boyle understood as a supreme, sufficient, contenting happiness, ultimately achievable only after the death of the body and the contact of the soul with the divine. Felicity is the goal of eudaimonia because Boyle believes it is the only thing that is good in itself. Boyle rejects pleasure, honor, wealth, and even knowledge as approaches to achieving felicity, arguing instead that “to the palace of felicity the only highway is virtue.” This warrants the systematic study of moral virtue to which the title refers.

Boyle begins by claiming that the proper subject of moral virtue must be the rational soul rather than the affections of the senses. He then adopts a basically Aristotelian causal analysis of moral virtue, complemented with dashes of stoicism. Thus, the final cause of virtue is felicity, as we have seen. The material cause of virtue is the human soul. The formal cause of virtue is what Boyle terms “mediocrity,” the Aristotelian idea that a moral virtue is a mean between a vice of deficiency and a vice of excess, which one obtains only through habitual repetition until it becomes part of one’s character. The efficient cause of virtue is the most complex. Boyle sees it as a combination of God, the capacity that God gave us to develop virtue, mental habit, and living in accordance with right reason.

Boyle was greatly influenced by stoicism, having read the classic works under Isaac Marcombes. This influence is apparent throughout his moral treatises. Boyle’s ethics was also heavily influenced by Johann Alsted (1588-1638), a German Calvinist.

8. Casuistry

Boyle was a dedicated casuist, believing that a detailed analysis of his own conscience was just as important as the study of nature or the study of the Bible, and he devoted just as much of his time and effort to it. He was as meticulous in the analysis of his own conscience as he was in chemical analysis, scrutinizing his behavior, taking detailed notes, and discussing them regularly with close friends and spiritual advisors such as Ranelagh, Locke, Gilbert Burnet (1642-1715), and Edward Stillingfleet (1635-1699).

Boyle’s intense examination of his own conscience likely goes back to the conversion experience he had during the night of the terrible storm on his grand tour, but it was probably also influenced by his study of stoicism. Boyle even provided a stipend for Robert Sanderson to help him publish his Lectures on Human Conscience, a book based on a series of lectures that Sanderson gave at Oxford in the 1640s. It is considered a classic in the field of casuistry.

Throughout his life, Boyle also suffered from manic fits he described as “ravings,” in which his imagination seemed to run away beyond his control, ravishing his attention. He found these fits of restless fancy disturbing and debilitating, and he made all sorts of efforts to treat these episodes both medically and by developing coping mechanisms to calm himself when the fits occurred.

Boyle scrutinized his daily moral behavior. For example, Boyle sometimes had to make promises of secrecy to obtain new alchemical recipes. This not only involved taking an oath, but also ran counter to his general advocacy of openness in experimental data. These sorts of tensions gave Boyle and his spiritual advisors plenty of material to analyze. A full understanding of Boyle’s thought has to appreciate his equal dedication to the study of the book of nature, the book of scripture, and the book of conscience.

9. References and Further Reading

a. Recent Editions of Boyle’s Works

The Works of Robert Boyle (Pickering & Chatto, 1999-2000), ed. Michael Hunter and Edward B. Davis.
- This fourteen-volume set is the definitive edition of Boyle’s work.
Selected Philosophical Papers of Robert Boyle (Hackett, 1991), ed. M.A. Stewart.
- An excellent paperback edition of some of Boyle’s most important works.
A Free Enquiry into the Vulgarly Received Notion of Nature (Cambridge, 1996), ed. Edward B. Davis and Michael Hunter.
- A paperback edition of this important later work by Boyle, with a good introduction and chronology.
The Works of the Honourable Robert Boyle (Rivington, 1772), ed. Thomas Birch.
- This was the classic edition, but has been surpassed by the Hunter and Davis edition.

b. Chronological List of Boyle’s Publications

An Invitation to a free and generous Communication of Secrets and Receits in Physick (1655)
Some Motives and Incentives to the Love of God (Seraphic Love) (1659)
New Experiments Physico-Mechanical, touching the Spring of the Air and its Effects (1660)
Certain Physiological Essays (1661)
The Sceptical Chymist (1661)
Some Considerations touching the Style of the Scriptures (1661)
A Defense of the Doctrine Touching the Spring and Weight of the Air (1662)
An Examen of Mr. T. Hobbes his Dialogus Physicus De Natura Aeris (1662)
Some Considerations Touching the Usefulness of Experimental Natural Philosophy (1663)
Experiments and Considerations Touching Colours (1664)
New Experiments and Observations Touching Cold (1665)
Occasional Reflections upon Several Subjects (1665)
Hydrostatical Paradoxes (1666)
The Origin of Forms and Qualities (1666)
New Experiments Concerning the Relation between Light and Air (1668)
A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air and their Effects (1669)
Of Absolute Rest in Bodies (1669)
New Pneumatical Experiments about Respiration (1670)
Cosmical Qualities (1670)
Of a Discovery of the Admirable Rarefaction of Air (1670)
The Usefulness of Natural Philosophy, II (1671)
An Essay about the Origin and Virtues of Gems (1672)
Flame and Air (1672)
Essays of Effluviums (1673)
The Saltness of the Sea (1673)
The Excellency of Theology Compared with Natural Philosophy (1674)
About the Excellency and Grounds of the Mechanical Hypothesis (1674)
Some Considerations about the Reconcileableness of Reason and Religion (1675)
Experiments, Notes, Etc., about the Mechanical Origin of Qualities (1675)
Of a Degradation of Gold Made by an Anti-Elixir (1678)
Experiments and Notes about the Producibleness of Chemical Principles (1680)
A Continuation of New Experiments Physico-Mechanical Touching the Spring and Weight of the Air, and their Effects (1680)
The Aerial Noctiluca (1680)
New Experiments and Observations, made upon the icy Noctiluca (1682)
A Discourse of Things Above Reason (1681)
Memoirs for the Natural History of Human Blood (1684)
Experiments and Considerations about the Porosity of Bodies (1684)
Of the High Veneration Man’s Intellect owes to God (1684)
Short Memoirs for the Natural Experimental History of Mineral Waters (1685)
An Essay of the Great Effects of Even Languid and Unheeded Motion (1685)
Of the Reconcileableness of Specifick Medicines to the Corpuscular Philosophy (1685)
A Free Enquiry into the Vulgarly Received Notion of Nature (1686)
The Martyrdom of Theodora and of Didymus (1687)
A Disquisition about the Final Causes of Natural Things (1688)
Some Receipts of Medicines (1688)
Medicina Hydrostatica (1690)
The Christian Virtuoso (1690)
Experimenta et Observationes Physicae (1691)
The General History of Air (1692)
Medicinal Experiments (1692)
A Free Discourse against Customary Swearing (1695)
The Christian Virtuoso, The Second Part (1744)

c. Correspondence

The Correspondence of Robert Boyle (Pickering & Chatto, 2001), ed. Michael Hunter, Antonio Clericuzio, and Edward B. Davis.
- This six-volume edition of Boyle’s correspondence is the standard in the field and a companion to the Pickering & Chatto edition of The Works of Robert Boyle.

d. Work Diaries

Boyle diligently kept diaries of his experimental work starting in the 1640s. Thanks to the work of Michael Hunter and Charles Littleton, these are available online at http://www.bbk.ac.uk/boyle/workdiaries/.

e. Biographies

Hunter, Michael. Boyle: Between God and Science (Yale, 2009).
- This is the best biography of Boyle to date, and includes important recent discoveries in Boyle studies.
Hunter, Michael. Robert Boyle by Himself and His Friends (Cambridge, 1994).
- This edited volume of biographical and autobiographical essays about Boyle is noteworthy for the inclusion of fragments from William Wotton’s lost Life of Boyle.
Maddison, R.E.W. The Life of the Honourable Robert Boyle (Taylor & Francis, 1969).
- This is another biography of Boyle with excellent coverage of Boyle’s Oxford period, but it covers Boyle’s early life by reprinting Boyle’s own account as presented in the autobiographical An Account of Philaretus During his Minority (also included in Hunter 1994 above).
Masson, Flora. Robert Boyle: A Biography (Constable and Company, 1914).
- An early biography of Boyle with many notable anecdotes.

f. Selected Works on Boyle

Alexander, Peter. Ideas, Qualities, and Corpuscles: Locke and Boyle on the External World (Cambridge, 1985).
- This is an exploration of Boyle’s profound influence on John Locke.
Anstey, Peter. The Philosophy of Robert Boyle (Routledge, 2000).
- This is the first book-length treatment of Boyle’s philosophy.
Anstey, Peter. “Boyle Against Thinking Matter,” in Late Medieval and Early Modern Corpuscular Matter Theories, Edited by Christoph Luthy, John Murdoch, and William Newman (Brill, 2001).
Baxter, Roberta. Skeptical Chemist: The Story of Robert Boyle (Morgan Reynolds Publishing, 2006).
Boas, Marie. Robert Boyle and Seventeenth-Century Chemistry (Cambridge, 1958).
Boas-Hall, Marie. Robert Boyle on Natural Philosophy (Indiana University Press, 1965).
DiMeo, Michelle. “‘Such a Sister Became Such a Brother’: Lady Ranelagh’s Influence on Robert Boyle,” Intellectual History Review 25.1 (2015), pp. 21-36.
Eaton, William. Boyle on Fire: The Mechanical Revolution in Scientific Explanation (Continuum, 2005).
- This work explores the lasting influence of Boyle’s philosophy of science.
Harwood, John. The Early Essays and Ethics of Robert Boyle (Southern Illinois University Press, 1991).
- This is the only book that presents a detailed analysis of Boyle’s ethics.
Hunter, Michael. Robert Boyle Reconsidered (Cambridge, 1994).
- This edited volume of essays brought about a new appreciation of the significance of Boyle’s natural philosophy.
Hunter, Michael. “How Boyle became a Scientist,” History of Science 33.1 (1995), pp. 59-103.
- This article is a detailed account of how Boyle became a scientist.
Hunter, Michael. Robert Boyle 1627-1691: Scrupulosity and Science (Boydell, 2000).
- This work is an in-depth exploration of the relationship between Boyle’s religious views and his natural philosophy. It includes Hunter’s essay, “How Boyle became a Scientist.”
Hunter, Michael. Boyle Studies: Aspects of the Life and Thought of Robert Boyle (Ashgate, 2015).
Kuslan, Louis, and A. Harris Stone. Robert Boyle: The Great Experimenter (Prentice-Hall, 1970).
- Although written for children, this short book is an excellent introduction to Boyle’s natural philosophy, with detailed explanations of several of his most important experiments.
Jacob, J.R. Robert Boyle and the English Revolution: A Study in Social and Intellectual Change (Burt Franklin, 1977).
Newman, William, and Lawrence Principe. Alchemy Tried in the Fire: Starkey, Boyle, and the Fate of Helmontian Chymistry (University of Chicago, 2002).
Principe, Lawrence. The Aspiring Adept: Robert Boyle and His Alchemical Quest (Princeton, 1998).
Sargent, Rose-Mary. The Diffident Naturalist: Robert Boyle and the Philosophy of Experiment (University of Chicago, 1995).
Wojcik, Jan W. Robert Boyle and the Limits of Reason (Cambridge University Press, 2002).

g. Other Important Works

Ben-Chaim, Michael. Experimental Philosophy and the Birth of Empirical Science (Routledge, 2004).
Bourke, Evan. “Female Involvement, Membership, and Centrality: A Social Network Analysis of the Hartlib Circle,” Literature Compass 14.4 (2017).
Davis, Edward. Creation, Contingency, and Early Modern Science: The Impact of Voluntaristic Theology on Seventeenth Century Natural Philosophy (PhD Dissertation, Indiana University, 1984).
Duddy, Thomas. A History of Irish Thought (Routledge, 2002).
Frank, Robert G. Harvey and the Oxford Physiologists: A Study of Scientific Ideas (University of California Press, 1980).
Garber, Daniel. Descartes’ Metaphysical Physics (University of Chicago Press, 1992).
Garber, Daniel. Descartes Embodied: Reading Cartesian Philosophy through Cartesian Science (Cambridge University Press, 2000).
Harrison, Peter. “Voluntarism and Early Modern Science,” History of Science 40.1 (2002), pp. 63-89.
Harrison, Peter. The Fall of Man and the Foundations of Science (Cambridge University Press, 2007).
Hempel, Carl. The Philosophy of Natural Science (Prentice Hall, 1966).
Klaaren, Eugene. Religious Origins of Modern Science (William B. Eerdmans Publishing Company, 1977).
Osler, Margaret. Divine Will and the Mechanical Philosophy: Gassendi and Descartes on Contingency and Necessity in the Created World (Cambridge University Press, 1994).
Webster, Charles. The Great Instauration: Science, Medicine, and Reform 1626-1660 (Holmes and Meier Publishers, 1975).

Author Information

William Eaton
Email: weaton@georgiasouthern.edu
Georgia Southern University
U. S. A.

Reduction and Emergence in Chemistry

Most talk of reduction and emergence figures in discussions about the relation between different physical theories, or between physics and biology. The aim of this article is to present a different perspective through which to examine reduction and emergence; namely, the perspective of chemistry’s relation to physics.

Very broadly, reduction is associated with the idea that the sciences are hierarchically ordered and unified. As a universal thesis, reductionism takes physics to be the most fundamental science in the sense that the laws and postulates of all other sciences can, at least in principle, be derived from and explained by physics. Metaphysically, this implies that things like molecules, cells, chairs and consciousness are nothing more than the physical stuff of which they are made. On the other hand, emergence is often associated with the idea that the special sciences and their postulated entities, properties, and so forth are somehow novel and partially autonomous from physics. On this view, while the special sciences comply with physical laws, they are nevertheless autonomous, and their postulated entities are over and above physical ones. In this context, one cannot explain away molecules, cells and their respective properties by reference only to physical stuff.

The philosophy of chemistry examines in detail whether reduction, emergence, or some other notion correctly characterises chemistry’s relation to physics and, in particular, to quantum mechanics. The philosophy of chemistry illuminates possible ways of thinking of chemistry’s relation to physics, but also of reduction and emergence. Moreover, understanding chemistry’s relation to physics has important implications for how one understands the relation between other sciences. For example, biology often refers to chemical entities and processes in order to explain biological phenomena. Given this, examining chemistry’s relation to physics contributes to understanding biology’s relation to physics. Furthermore, the notions of reduction and emergence are associated with more general philosophical questions about the unity or disunity of the sciences, but also about the very nature and structure of the world. Examining reduction and emergence with respect to chemistry can contribute to these issues. A case in point is the nature and reality of entities and properties in special sciences. For example, if chemical entities are reduced to those of physics, then one could formulate an argument against the existence of chemical entities. On the other hand, if chemical entities somehow emerge from physical ones, then this may suffice to support the reality of chemical entities and of their respective properties.

Introduction
The Significance of This Topic in the Philosophy of Chemistry
Reduction in Chemistry
Emergence in Chemistry
Beyond Reduction and Emergence
1. Unity without Reduction
2. Pluralism
Conclusion
References and Further Reading

1. Introduction

What one means by reduction and emergence can vary extensively, and there are positions which argue for an understanding of chemistry’s relation to physics in a manner that goes beyond the dilemma between reduction and emergence. Nevertheless, all positions can be understood as addressing at least one of two distinct, yet often overlapping, questions:

The question of the relation of the formalism of chemistry to that of physics. This is an epistemic question because it focuses on the relation between theories of chemistry and theories of physics.
The question of the relation of the entities, properties, and so forth that are postulated by chemistry to the entities and so forth that are postulated by physics. This is a metaphysical question because it concerns the nature of chemical entities, properties, and so forth.

Chemistry’s relation to physics is examined with respect to different theories, concepts, entities, properties and phenomena of chemistry and of physics (Hendry 2012; van Brakel 2014). Given this, ‘to speak of “the relation between chemistry and physics” is nonsense: a whole variety of possible intertheoretical relations have to be addressed’ (van Brakel 2014: 34). Both chemistry and physics, understood as scientific disciplines, encompass various sub-disciplines and theories which have, among other things, distinct explanatory and heuristic goals. In light of this, various theories have been examined in the context of chemistry’s relation to physics, including: (a) the relation between thermodynamics and statistical mechanics (Hendry 2012: 369; Needham 2009); (b) the relation of chemistry to quantum mechanics; and, (c) the relation of organic chemistry to quantum chemistry (Goodwin 2013).

Given the above, it is not surprising that the relation between chemistry and physics involves examining the relation between different sets of entities, properties, and so forth that the relevant theories postulate. For example, chemistry’s relation to quantum mechanics has been examined with respect to (a) chemical elements and the periodic table (Scerri 2012b: 75-76); (b) molecular structure (Hendry 2010b; Weininger 1984; Woolley 1976); (c) orbitals (Villani et al. 2018); (d) chemical reaction rates (Hettema 2017: 69-86); and (e) the chemical bond (Hendry 2008; Weisberg 2008). Another feature of chemistry’s relation to physics concerns how macroscopic substances are related to their constituents (van Brakel 2014: 34). A further feature involves the relation between the ‘vernacular and scientific use of substance names’ (van Brakel 2014: 34).

While none of the above features of chemistry’s relation to physics are independent from each other, each of them deserves its own article, as each involves addressing issues unique to its specific domain of inquiry. Given this, as well as the fact that reduction and emergence are mostly investigated with respect to chemistry’s relation to quantum mechanics, this article reviews reduction and emergence in the context of how chemistry and its postulated chemical entities relate to quantum mechanics and its postulated entities.

Before presenting the existing views on chemistry’s relation to quantum mechanics, it is useful to briefly specify the subject matter of the two relevant sciences. Chemistry is concerned with the composition and transformation of matter into new substances. It achieves the description, explanation, and prediction of the composition and reaction of matter by reference to entities, properties, and so forth that the theory postulates. In other words, chemistry uses concepts which are characteristic of the chemical description and which allegedly refer to entities, properties, and so forth that determine how matter is composed and reacts. Phenomena that are within the purview of chemistry are the rusting of metals, the properties of atoms and molecules, the boiling of water and the volatility of mercury. Quantum mechanics is the non-relativistic theory that describes microscopic systems (Palgrave Macmillan Ltd 2004: 1863). It is distinct from relativistic quantum mechanics and from quantum field theory. Quantum mechanics achieves the description, explanation, and prediction of microscopic systems by reference to entities and properties that the theory postulates. Phenomena that are within the purview of quantum mechanics are black-body radiation, the double-slit experiment, and the behaviour of a free particle under a magnetic field.

Note that quantum chemistry plays a very important role in understanding the relation between chemistry and quantum mechanics. In the Dictionary of Physics quantum chemistry is defined as the ‘branch of theoretical chemistry in which the methods of quantum mechanics are applied to chemical problems’ (Palgrave Macmillan Ltd 2004: 1845; see also Gavroglu and Simões 2012). In the literature on chemistry’s relation to quantum mechanics, it is not clear whether quantum chemistry is regarded as part of the higher-level theory or the lower-level one (that is, chemistry and quantum mechanics respectively). For example, Goodwin (2013) refers to the relation of quantum chemistry to quantum mechanics, implicitly suggesting that quantum chemistry is the higher-level (chemical) theory. On the other hand, there are philosophers of chemistry who compare the explanatory and predictive success of quantum chemistry with that of chemistry proper, thus implicitly suggesting that quantum chemistry is the lower-level theory.

2. The Significance of This Topic in the Philosophy of Chemistry

According to some members of the philosophy of chemistry community, chemistry is a special science that has not been considered in much detail with respect to its relation with other sciences, including physics (Scerri and Fisher 2015: 3). This is because the philosophy of science and the philosophy of physics take the relation between chemistry and physics to be an unproblematic relation of subordination of the former to the latter (for example van Brakel 2014: 13; Bensaude-Vincent 2008: 16). Epistemically, this broadly means that the descriptions, explanations, and predictions of phenomena that are provided by chemistry can at least in principle be derived from the theories of physics. Metaphysically, this broadly means that the entities, properties, and so forth that are postulated by chemistry are nothing over and above physical entities and properties.

There are two main reasons why physics may be considered ‘ontologically prior’ to chemistry (Hendry 2012: 367). First, if one takes physics to examine those things that make up chemical entities and properties, then this establishes the priority of physics in virtue of the existence of a mereological relation between chemical and physical entities (Hendry 2012: 367). Secondly, physics is considered a universal science in the sense that it sets out, at least in principle, to describe, explain, and predict everything in the world, and not just some subset of phenomena, like chemistry does (Hendry 2012: 367). Dirac’s famous quote is indicative of this stance towards chemistry and of chemistry’s status compared to physics:

The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. (1929: 714)

In light of this, some members of the community take the investigation of chemistry’s relation to physics to be a central issue in the philosophy of chemistry, as the answer that one gives with respect to this issue determines whether, and in what sense, chemistry is an autonomous scientific discipline (Chang 2015; Lombardi and Labarca 2005). For example, Chang states that

the relationship between physics and chemistry is one of the perennial foundational issues in the philosophy of chemistry. It concerns the very existence and identity of chemistry as an independent scientific discipline. Chemistry is also the most immediate territory that physics must conquer if its “imperialistic” claim to be the foundation for all sciences is to have any promise. (Chang 2015: 193)

Some members of the philosophy of chemistry community take the investigation of chemistry’s relation to physics to be central not only for establishing the autonomy of chemistry, but also for ensuring the legitimacy of the philosophy of chemistry as a worthwhile and autonomous field of philosophy (in particular see Lombardi and Labarca 2005; Lombardi and Labarca 2007; Scerri and Fisher 2015; Schummer 2014a: 1-2; van Brakel 1999). For example, Scerri and Fisher state that

the philosophy of chemistry had been mostly ignored as a field, in contrast to that of physics and, later, biology. This seems to have been due to a rather conservative, and at times implicitly reductionist, philosophy of physics whose voice seemed to speak for the general philosophy of science. It has taken an enormous effort by dedicated scholars around the globe to get beyond the idea that chemistry merely provides case studies for established metaphysical and epistemological doctrines in the philosophy of physics. These efforts have resulted in both definitive declarations of the philosophy of chemistry to be an autonomous field of inquiry and a number of edited volumes and monographs. (2015: 3)

Lombardi and Labarca state something similar regarding the ‘traditional assumption’ of reduction:

This traditional assumption not only deprives the philosophy of chemistry of legitimacy as a field of philosophical inquiry, but also counts against the autonomy of chemistry as a scientific discipline: whereas physics turns out to be a ‘fundamental’ science that describes reality in its deepest aspects, chemistry is conceived as a mere ‘phenomenological’ science, that only describes phenomena as they appear to us. (2005: 126)

Given the above, it is no surprise that chemistry’s relation to physics has received such attention in the philosophy of chemistry. This does not mean that all philosophers who investigate the relation of chemistry to physics do so with the intention of defending the legitimacy of the philosophy of chemistry or the autonomy of chemistry. In fact, many examine the question of chemistry’s relation to physics because they take it to be relevant to the investigation of other philosophical issues, such as the reality of chemical entities and the relation between biology and physics. For example, Needham believes that views regarding biology’s reduction to physics, as they are discussed in the philosophy of mind and biology, presuppose the successful reduction of chemistry to physics (Needham 1999: 169). Therefore, the question of the relation of chemistry to physics is central not only for chemistry and the philosophy of chemistry in the manner outlined above, but also for the sciences and general philosophy as well.

3. Reduction in Chemistry

Discussion of reduction with respect to chemistry primarily occurs in the context of the distinction between epistemological and ontological reduction. In the philosophy of chemistry, epistemological reduction requires ‘that the laws of chemistry be derivable from those of physics’ (Hendry and Needham 2007: 339). Ontological reduction ‘requires only that chemical properties are determined by “more fundamental” properties’ (Hendry and Needham 2007: 339). By and large, this distinction is accepted in the literature, though there are philosophers that argue that this distinction is not helpful in spelling out correctly the relation between the two theories (Needham 2010: 169; Hettema 2012b: 164). It is worth noting that Hendry and Needham prefer using the term ‘intertheoretic reduction’ instead of ‘epistemological reduction’ as they think that the former term captures best the sort of reduction that is investigated; namely a reduction which ‘involves logical relationships between theories, rather than knowledge’ (Hendry and Needham 2007: 339).

a. Epistemological, or Intertheoretic, Reduction

Discussion of epistemological, or intertheoretic, reduction primarily happens in the context of Nagel’s account of reduction. In the philosophy of chemistry, a Nagelian reduction is understood as requiring, at least in principle, the derivation or deduction of chemistry from quantum mechanics (Needham 2010: 164; Hettema 2017: 7). A Nagelian reduction consists of two ‘formal’ requirements, namely the ‘connectability and derivability’ of the two theories (Scerri 1994: 160; see also Hettema 2017: 7). Moreover, the reduction of chemistry to quantum mechanics would fall under the cases of heterogeneous reductions. This is because ‘some typically chemical terms cannot be found in the quantum mechanical language’, thus requiring the existence of bridge laws (Scerri 1994: 160; see also Primas 1983: 5). A successful reduction would allegedly be sufficiently supported if the chemical properties of atoms and molecules can, at least in principle, be calculated by quantum mechanics ‘entirely from first principles, without recourse to any experimental input whatsoever’ (Scerri 1994: 162). Note that the latter form of quantum mechanics is often referred to as ‘ab initio quantum mechanics’ (Scerri 1994; Schwarz 2007).

In the philosophy of chemistry there has been debate on what the appropriate criteria are for a successful Nagelian reduction of chemistry to physics (see for example Hettema 2012a; 2017; Needham 1999; 2010; Scerri 1994). For example, Hettema claims that the use of the term ‘Nagelian’ with reference to the aforementioned understanding of reduction is to an extent misleading because Nagel was not so strict in his account of reduction:

Reduction is too often conceived of as a straightforward derivation or deduction of the laws and concepts of the theory to be reduced to a reducing theory, notwithstanding Nagel’s insistence that heterogeneous reduction simply does not work that way. (Hettema 2017: 1-2; see also Hettema 2012b: 146; Dizadji-Bahmani, Frigg and Hartmann 2010; Fazekas 2009; Klein 2009; Nagel 1979; van Riel 2011)

While Nagel’s account of reduction is the most widely discussed account in the philosophy of chemistry, there are other accounts from philosophy. They include Oppenheim’s and Putnam’s account of micro-reduction (Oppenheim and Putnam 1958; Hendry 2012: 368-369). Very briefly, according to this account of reduction, a theory T₁ micro-reduces a theory T₂ if (i) the phenomena that are explained by T₂ can be explained by T₁; and (ii) T₁ describes the parts of the entities, properties, and so forth that are postulated by T₂. According to Hendry, if ‘the micro reductive explanation takes the form of a deduction’, then Oppenheim’s and Putnam’s account is a kind of Nagelian reduction (Hendry 2012: 369).

Nagel, Oppenheim and Putnam take chemistry’s relation to physics to be a paradigmatic case of their respective accounts of reduction (Hendry 2012: 369). A large part of the philosophy of chemistry literature, though not all of it, discusses reduction by investigating whether these accounts of reduction correctly apply to chemistry’s relation to quantum mechanics. Popper’s understanding of reduction has also been investigated in the context of chemistry’s relation to quantum mechanics (Scerri 1998; Needham 1999).

The epistemological reduction of chemistry to quantum mechanics is primarily examined by looking at how quantum mechanics, via the Schrödinger equation, describes the chemical properties of atoms and molecules. Given this, it is useful to briefly present how quantum chemistry employs the Schrödinger equation in order to describe the chemical properties of atoms and molecules. This sub-section henceforth focuses on the non-relativistic Schrödinger equation since this is the one that is standardly employed for the description of atoms and molecules and that is discussed with respect to chemistry’s relation to quantum mechanics.

The Schrödinger equation is the ‘equation of motion for the wave function’ which describes ‘the state of a quantum-mechanical system, and (more generally) for the corresponding state-vector’ (Palgrave Macmillan Ltd 2004: 2029). The solutions of the time-dependent Schrödinger equation (Ψ(x,t)) are (potentially) the wavefunctions of the system under examination (that is of an electron, atom, molecule and so forth).

The generic form of the time-dependent Schrödinger equation is the following:

iħ ∂Ψ(x,t)/ ∂t = – (ħ²/2m)(∂²Ψ(x,t)/∂x²) + VΨ(x,t),

where

∂: partial derivative

Ψ(x,t): a system’s wavefunction

ħ: the reduced Planck constant (Planck’s constant divided by 2π)

m: the system’s mass

x: position

t: time

V: potential energy

i: imaginary unit (square root of negative one)

If one assumes that a system’s potential energy is independent of time, then it is possible to solve the Schrödinger equation using the method of separation of variables (Griffiths 2005: 24). In this context, the resulting solutions are wavefunctions of the following form (Griffiths 2005: 24):

Ψ(x,t) = ψ(x)φ(t),

where

ψ: a function of position

φ: a function of time

Based on the ability to separate the variables of the Schrödinger equation, it is possible to formulate the time-independent Schrödinger equation, which is an equation independent of time and whose solutions are a system’s time-independent wavefunctions, ψ(x). These wavefunctions correspond to the stationary states of the system under examination.
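The step from the time-dependent to the time-independent equation can be made explicit. The following is a sketch of the standard textbook derivation, which substitutes Ψ(x,t) = ψ(x)φ(t) into the time-dependent equation given above. The separation constant E is introduced here for illustration; it is the quantity that is later identified with the total energy of a stationary state.

```latex
% Substitute \Psi(x,t) = \psi(x)\varphi(t), assuming V depends on x only
\begin{align}
i\hbar\,\psi(x)\,\frac{d\varphi(t)}{dt}
  &= -\frac{\hbar^{2}}{2m}\,\varphi(t)\,\frac{d^{2}\psi(x)}{dx^{2}}
     + V(x)\,\psi(x)\,\varphi(t) \\
% Divide by \psi(x)\varphi(t): the left side depends only on t, the right
% side only on x, so both must equal a constant, written E
i\hbar\,\frac{1}{\varphi(t)}\frac{d\varphi(t)}{dt}
  &= -\frac{\hbar^{2}}{2m}\,\frac{1}{\psi(x)}\frac{d^{2}\psi(x)}{dx^{2}}
     + V(x) = E \\
% Time part
\frac{d\varphi(t)}{dt} &= -\frac{iE}{\hbar}\,\varphi(t)
  \quad\Rightarrow\quad \varphi(t) = e^{-iEt/\hbar} \\
% Position part: the time-independent Schr\"odinger equation
-\frac{\hbar^{2}}{2m}\,\frac{d^{2}\psi(x)}{dx^{2}} + V(x)\,\psi(x)
  &= E\,\psi(x)
\end{align}
```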

The time-independent Schrödinger equation does not yield a unique solution (that is, one wavefunction) (Griffiths 2005: 27). It yields an infinite number of solutions (ψ₁(x), ψ₂(x), …), each of which corresponds to a different state of the system under examination. In accordance with the superposition principle, any linear combination of the solutions of the time-independent Schrödinger equation is also regarded as a wavefunction that represents a possible state of the system (Griffiths 2005: 27).

The stationary state of a system, through its wavefunction ψ(x), provides useful information about the total state of the system, Ψ(x,t). First, the probability density ∣Ψ(x,t)∣² equals ∣ψ(x)∣². This means that knowledge of just the stationary state of a system, through the solution of the time-independent Schrödinger equation, provides the probability of finding the system in a particular region of space. Secondly, it is possible to calculate the expectation value of any dynamical variable of a state of the system through the stationary state of the system alone (Griffiths 2005: 26). Stationary states are states of definite total energy, E (Griffiths 2005: 26). Each solution to the time-independent Schrödinger equation is associated with a particular allowed total energy of the system (E₁, E₂, …). The wavefunction that is associated with the minimum total energy corresponds to the ground state of the system, whereas the wavefunctions whose total energies are larger correspond to the excited states of the system.
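The time-independence of the probability density of a stationary state can be checked numerically. The following is a minimal sketch, assuming a textbook system that the discussion above does not itself invoke: a particle in a one-dimensional box, in units where ħ, the mass and the box length all equal 1. The function names are hypothetical and chosen only for illustration. The sketch also shows that a superposition of two stationary states, although a possible state of the system, is not itself stationary.

```python
import cmath
import math

# Assumed natural units for illustration: hbar = mass = box length = 1.
HBAR = 1.0
MASS = 1.0
BOX_LENGTH = 1.0


def energy(n):
    """Allowed total energy E_n of the particle in a box (n = 1, 2, ...)."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * MASS * BOX_LENGTH ** 2)


def psi(n, x):
    """Time-independent wavefunction psi_n(x) inside the box."""
    return math.sqrt(2.0 / BOX_LENGTH) * math.sin(n * math.pi * x / BOX_LENGTH)


def full_wavefunction(n, x, t):
    """Stationary state Psi_n(x, t) = psi_n(x) * exp(-i * E_n * t / hbar)."""
    return psi(n, x) * cmath.exp(-1j * energy(n) * t / HBAR)


def stationary_density(n, x, t):
    """Probability density |Psi_n(x, t)|^2 of a single stationary state."""
    return abs(full_wavefunction(n, x, t)) ** 2


def superposition_density(x, t):
    """Density of an equal-weight superposition of the n = 1 and n = 2 states."""
    total = (full_wavefunction(1, x, t) + full_wavefunction(2, x, t)) / math.sqrt(2.0)
    return abs(total) ** 2


if __name__ == "__main__":
    x = 0.3
    for t in (0.0, 0.1, 0.5):
        print(t, stationary_density(1, x, t), superposition_density(x, t))
```

In this sketch the first density is the same at every time, whereas the second varies with time because the two components carry different energies.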

The time-independent Schrödinger equation for an isolated molecule provides an infinite number of solutions (that is, wavefunctions), each of which corresponds to a different stationary state of the molecule. For example, a stable isolated molecule, in virtue of being stable, is said to be in the ground state. From this, it follows that it is represented by the wavefunction that is associated with the system’s ground state and that it has the minimum total energy.

The Hamiltonian operator plays a central role in the solution of the time-independent Schrödinger equation for quantum systems and isolated molecules in particular. When the system under examination is an isolated molecule, the Hamiltonian operator corresponds to the total energy of the molecule (that is, its eigenvalues are the total energy of each state of the molecule); hence it is called the molecular Hamiltonian. In principle, the molecular Hamiltonian operator includes all the factors that determine the kinetic and potential energy of the molecule. That is, it should take into account the kinetic energy of each nucleus and electron in the system, the repulsion between each pair of electrons and between each pair of nuclei, and the attraction between each electron and each nucleus.
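The terms just listed can be written out as follows. This is a sketch of the standard non-relativistic Coulomb form of the molecular Hamiltonian; the notation is an assumption introduced here for illustration, with indices A and B labelling nuclei (masses M_A, charges Z_A e), indices i and j labelling electrons (mass m_e), and r and R denoting the corresponding inter-particle distances.

```latex
\begin{equation}
\hat{H} =
  % kinetic energy of each nucleus
  -\sum_{A} \frac{\hbar^{2}}{2M_{A}} \nabla_{A}^{2}
  % kinetic energy of each electron
  -\sum_{i} \frac{\hbar^{2}}{2m_{e}} \nabla_{i}^{2}
  % repulsion between each pair of electrons
  +\sum_{i<j} \frac{e^{2}}{4\pi\varepsilon_{0}\, r_{ij}}
  % repulsion between each pair of nuclei
  +\sum_{A<B} \frac{Z_{A} Z_{B}\, e^{2}}{4\pi\varepsilon_{0}\, R_{AB}}
  % attraction between each electron and each nucleus
  -\sum_{i}\sum_{A} \frac{Z_{A}\, e^{2}}{4\pi\varepsilon_{0}\, r_{iA}}
\end{equation}
```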

Because of the mathematical complexity involved in the formulation of the Hamiltonian operator, atomic and molecular systems are examined within the framework of the Born-Oppenheimer approximation (henceforth BO approximation; also referred to as the adiabatic approximation). The BO approximation is a ‘(r)epresentation of the complete wavefunction as a product of an electronic and a nuclear part Ψ(r,R) = Ψ_e(r,R) Ψ_N(R)’ (IUPAC 2014: 179). The validity of the BO approximation is ‘founded on the fact that the ratio of electronic to nuclear mass […] is sufficiently small and the nuclei, as compared to the rapidly moving electrons, appear to be fixed’ (IUPAC 2014: 179).

Within the BO approximation, one can in principle formulate the Hamiltonian operator by positioning the nuclei at all the possible fixed positions. Each set of nucleonic positions corresponds to different quantum states of the system (hence to different wavefunctions) and to different values of the total energy, E, of the atom or molecule. However, in practice this process is not followed. By having prior knowledge of the quantum system that is under examination—for example, by knowing the chemical and structural properties of the examined molecule—only particular nucleonic conformations are considered when constructing the Hamiltonian operator.

The BO approximation is a feature of quantum mechanics which plays a central role in the investigation of chemistry’s relation to quantum mechanics (Bishop 2010: 173; van Brakel 2014: 31-33; Woolley 1976; 1978; 1991; 1998; Woolley and Sutcliffe 1977; Sutcliffe and Woolley 2012). It has often been invoked as putative empirical evidence for the rejection of chemistry’s reduction to quantum mechanics as well as for the support of the emergence of chemistry (see next sections). Solving the equation outside the BO approximation in order to describe atomic and molecular properties is currently investigated in chemistry and quantum chemistry (for example Tapia 2006). This implies that there are features of quantum mechanics which may further contribute to our understanding of chemistry’s relation to quantum mechanics (for example Woolley 1991).

Note that even when the nucleonic conformation is fixed in the manner represented by the BO approximation, calculating the solution of the Schrödinger equation remains a complicated task. Each nucleonic conformation is compatible with different quantum states of the system (and thus different wavefunctions). This is compatible with chemistry’s understanding of atoms and molecules because, even if the nuclei are fixed at particular positions, the electrons may behave in more than one possible way within an atom or molecule.

In light of the above, the Schrödinger equation is not solved analytically for all atoms and molecules. As Hendry states:

There is an exact analytical solution to the non-relativistic Schrödinger equation for the hydrogen atom and other one-electron systems, but these are special cases on account of their simplicity and symmetry properties. (Hendry 2010a: 212)

Instead, researchers have developed various approximate methods in order to solve it, most of which employ the BO approximation. In general, the development of computation has led to the proliferation of complex computational methods that solve the equation by following different mathematical strategies and by making different assumptions. These methods include the Valence Bond Approach, the Molecular Orbital Approach, the Hartree-Fock Method and Configuration Interaction.
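To give a sense of what an approximate solution looks like, the following is a minimal sketch of a textbook variational estimate for the ground state of the helium atom. It is not one of the methods named above, but it rests on the same variational idea that underlies several of them. It assumes a trial wavefunction built from two hydrogen-like 1s orbitals with an adjustable effective nuclear charge, and uses the standard closed-form expression for the resulting energy in hartree atomic units. The function names and the choice of a golden-section search are hypothetical, introduced only for illustration.

```python
import math


def trial_energy(effective_charge, nuclear_charge=2.0):
    """Energy expectation value (in hartree) of the two-electron trial state.

    Terms: electronic kinetic energy, electron-nucleus attraction and
    electron-electron repulsion, for 1s orbitals with the given effective charge.
    """
    kinetic = effective_charge ** 2
    attraction = -2.0 * nuclear_charge * effective_charge
    repulsion = 0.625 * effective_charge
    return kinetic + attraction + repulsion


def golden_section_minimum(func, lower, upper, tolerance=1e-10):
    """Locate the minimum of a unimodal function on [lower, upper]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    while abs(b - a) > tolerance:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return (a + b) / 2.0


if __name__ == "__main__":
    best_charge = golden_section_minimum(trial_energy, 1.0, 2.0)
    print("effective nuclear charge:", best_charge)
    print("estimated ground-state energy (hartree):", trial_energy(best_charge))
```

Under these assumptions the minimum lies at an effective charge of 27/16, giving an energy of about −2.848 hartree. By the variational principle this is an upper bound on the true ground-state energy, which is usually quoted as roughly −2.90 hartree, so even this simple case yields an approximate rather than an exact result.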

Based on the above, there are philosophers who argue in favour of the epistemological reduction of chemistry to quantum mechanics. For example, Schwarz argues that ab initio quantum mechanics can in principle derive all ‘well-defined numerical properties’ of the chemical elements (Schwarz 2007: 168). Ab initio quantum mechanics refers to quantum mechanical methods that are ‘independent of any experiment other than the determination of fundamental constants. The methods are based on the use of the full Schrödinger equation to treat all the electrons of a chemical system’ (IUPAC 2014: 5).

While Schwarz does not examine chemistry’s relation to quantum mechanics in terms of a particular philosophical account of reduction (such as Nagel’s account of reduction), he advocates some sort of reductive relation between chemistry and quantum mechanics. He claims that the ‘difficulty’ of ab initio quantum mechanics to (in practice) derive certain chemical properties is due to the fact that ‘basic qualitative chemical concepts are so vaguely defined’ and ‘fuzzy’ (Schwarz 2007: 172, 174). Given the above, he believes that the periodic system is in a ‘transition phase’ from a primarily ‘empirical model of chemistry’ to ‘an understandable model based in physical theory’ (Schwarz 2007: 173).

The epistemological reduction of chemistry to quantum mechanics is alternatively supported by Bader’s Quantum Theory of Atoms in Molecules (QTAIM) (Bader 1990; Bader and Matta 2013; Matta and Boyd 2007; Matta 2013). The QTAIM provides a topological analysis of electron density through which one derives information regarding atomic and bonding properties. The QTAIM provides experimentally verifiable information regarding the properties of large molecules, by reconstructing their properties from ‘smaller fragments’ (Matta 2013). It is a scientific theory which ‘demonstrates that every measurable property of a system, finite or periodic, can be equated to a sum of contributions from its composite atoms’ (Bader 1990).

Bader takes the QTAIM to provide correct descriptions, explanations and predictions of the chemical properties of matter (Bader 1990: vi; see also Bader and Matta 2013; Causá et al. 2014; Hettema 2012a; Hettema 2013). While Bader does not explicitly talk about the reduction of chemistry to quantum mechanics in philosophical terms, his account is regarded in the philosophy of chemistry as representing ‘a proper, (reductionist) basis for chemistry’ (Hettema 2013: 311). This is because, according to Bader and Matta, the QTAIM allegedly supports the claim that ‘chemistry is physics’ (Bader and Matta 2013: 254). However, Hettema argues that while Bader’s view of the QTAIM suggests that the QTAIM is related to chemistry in a manner that closely resembles Kemeny and Oppenheim’s reductive eliminativist account, the QTAIM fails to be a reductive theory of this sort (Hettema 2013). Moreover, Arriaga, Fortin and Lombardi argue that while the QTAIM manages to ‘provide a rigorous definition of the chemical bond and of atoms in a molecule, it appeals to concepts that are unacceptable in the quantum-chemical context’, thus failing to sufficiently support the reduction of chemistry to quantum mechanics (Arriaga et al. 2019: 125). Van Brakel makes a similar point, arguing that the QTAIM works only after postulating facts from chemistry (van Brakel 2014: 32), thus rendering it insufficient for the support of chemistry’s reduction to quantum mechanics.

b. Antireductionism with Respect to Chemistry

Many members of the philosophy of chemistry community reject the epistemological reduction of chemistry to quantum mechanics, as understood in terms of the aforementioned accounts. As Hettema states:

The idea that chemistry stands in a reductive relationship to physics still is a somewhat unfashionable doctrine in the philosophy of chemistry. (2017: 1)

Indeed, there are alternative and often incompatible positions in the philosophy of chemistry which argue, either explicitly or implicitly, against the reduction of chemistry to quantum mechanics. These antireductionist views can be divided into two main camps (Scerri 2007b). First are those positions which reject the reduction of chemistry tout court (Schummer 1998; Schummer 2014b; van Brakel 2000). That is, they ‘deny the whole enterprise’ of reducing chemistry to quantum mechanics on grounds that have to do with the unique methodological, classificatory or other epistemological features of chemistry (Scerri 2007b: 70). Philosophers that follow this antireductionist approach support, either implicitly or explicitly, the irreducibility of chemistry by arguing that chemistry, in virtue of being a science of substances which employs unique classificatory tools and concepts, cannot be reduced to a science which looks at the micro-constituents of those substances and which disregards the classificatory or methodological tools and concepts that are of interest to chemists.

In the second camp are those positions which examine in detail how quantum mechanics describes, predicts, and explains particular chemical entities, properties, and so forth (such as the chemical bond, molecular structure, orbitals and the periodic system). They consider how quantum mechanics describes particular chemical properties and through this analysis they implicitly or explicitly argue against the reduction of chemistry to quantum mechanics (Bogaard 1978; González et al. 2019; Hendry 1998; 1999; 2010a; 2012; Ramsey 1997; Scerri 1994; 1998; Woolley 1976; 1978; 1985; 1998; Woolley and Sutcliffe 1977; Weininger 1984; Woody 2000).

For example, Scerri evaluates the manner in which the Schrödinger equation is solved so as to yield accurate results about the properties of atoms and molecules. He acknowledges that ab initio quantum mechanics has yielded relatively accurate results regarding the ground-state energy of particular atoms, and that quantum mechanics has succeeded in providing a mathematical analysis of chemical phenomena and in generating sufficiently accurate quantitative values of chemical properties such as bond strength and dipole moments (2007b; 2012). However, he holds that this does not sufficiently support the reduction of chemistry to quantum mechanics (Scerri 1994: 164). Specifically, the approximate methods that are employed for the solution of the Schrödinger equation (and without which a solution cannot be provided) involve the use of ad hoc assumptions which, in virtue of being ad hoc and reliant ‘on experimental data’, undermine the thesis that chemistry is reduced in a Nagelian manner to quantum mechanics (Scerri 1994: 165-168; see also Scerri 1991: 320-321). Note that Hofmann (1990) presents how models and approximations have been employed throughout the history of quantum mechanics for the description of chemical properties; see also Gavroglu and Simões (2012).

Scerri invokes the periodic table and the electronic configuration model as examples that support the failure of chemistry’s reduction to quantum mechanics (Scerri 2007b: 74; Scerri 2012b: 79-80; Scerri 1991).

Before presenting Scerri’s argument, it is useful to briefly define the chemical terms that his and subsequent analyses invoke. The electronic configuration is ‘a distribution of the electrons of an atom or a molecular entity over a set of one-electron wavefunctions called orbitals, according to the Pauli principle’ (IUPAC 2014: 317). An orbital, whether atomic or molecular, is a ‘(w)avefunction depending explicitly on the spatial coordinates of only one electron’ (IUPAC 2014: 1034). An atomic orbital is a ‘(o)ne-electron wavefunction obtained as a solution of the Schrödinger equation for an atom’ (IUPAC 2014: 124). Given that orbitals depend on the spatial coordinates of electrons, the electronic configuration of an atom provides a representation of the distribution of electrons in the atom. This is particularly important in chemistry because it serves as a basis for the explanation and prediction of the type of bonds that are formed between atoms.

With respect to the periodic table, then, Scerri’s claim is broadly the following. The manner in which chemical elements are ordered in the periodic table is partially explained by, and could be regarded as derived from, quantum mechanics because quantum mechanics specifies the electronic configuration of the atoms of each element (Scerri 2012b: 75). However, there are certain features of the periodic table, such as the length of its periods, which are not deducible from quantum mechanics (Scerri 2012b: 77-78). Therefore, the derivation of the periodic table from quantum mechanics, and thus the reduction of chemistry, cannot be sufficiently supported.

Moreover, a Nagelian reduction ‘requires axiomatised versions of the theory to be reduced as well as the reducing theory’, which at least with respect to chemistry cannot possibly be argued for (Scerri 2006: 124). A similar point is made by Hettema regarding Nagelian reduction: ‘chemistry is a field, whereas reduction tends to be a relation between individual theories, or between laws and theories’ (Hettema 2017: 1). Furthermore, quantum mechanics does not provide on its own ‘a conceptual understanding of chemical phenomena’ (Scerri 2007b: 74). Instead, chemists employ chemical models and theories in order to formulate sufficient descriptions, explanations, and predictions of chemical phenomena and properties. Another problem for the reduction of chemistry is that quantum mechanics is symmetric under time inversion, and thus cannot provide an explanation of why chemical entities evolve in time the way they do. It can only provide a ‘reductive description’ of chemical properties independent of time (Scerri 2007b: 78). In fact, while quantum mechanics provides numerical values for particular chemical properties, it does not provide a complete explanation of a system’s chemical behaviour (Scerri 2007b: 78).

Scerri also rejects the success of an approximate reduction of chemistry to quantum mechanics (1994; 1998). By approximate reduction, Scerri refers to Putnam’s analysis of reduction, which permits the reducing theory to be approximately and not exactly true (Scerri 1994: 161). That is, ‘the relationships postulated by the theory hold not exactly, but with a certain specifiable degree of error’ (Putnam 1965: 206-207). In this context, reduction is not undermined if ab initio quantum mechanics provides only approximate results of the value of atomic and molecular properties, as long as these results are accompanied by a specifiable degree of error. However, Scerri rejects approximate reduction as the errors ‘are seldom computed by independent ab initio criteria’ (Scerri 1994: 168). Scerri also examines approximate reduction in relation to Popper’s analysis of the reduction of chemistry. In this context, Scerri draws a very similar conclusion with respect to the approximate reduction of chemistry (Scerri 1998: 42).

Based on all the above, Scerri concludes that the reduction of chemistry is ambiguous since, depending on what the set criteria for a successful reduction are, chemistry’s reduction to quantum mechanics ‘is both successful and unsuccessful’ (Scerri 2007b: 76; Scerri 2012b: 80).

Other philosophers also argue that chemistry has failed to epistemically reduce to quantum mechanics by pointing out similar issues with respect to the quantum mechanical description of chemical phenomena (see Bogaard 1978; Hendry 1998; Hendry 2010b: 183; Primas 1983; Woolley 1976; 1998; Woolley and Sutcliffe 1977). For example, Primas argues that quantum mechanics is ‘incorrect and should be revised, partly because [it] seems incapable of rendering a robust account of concepts such as molecular shape’ (Hettema 2017: 53, see also Primas 1983). Bogaard points out that chemists disregard a number of features of the behaviour of subatomic particles when specifying an atom’s or molecule’s Schrödinger equation. These features include (a) the behaviour of subatomic particles (namely protons and neutrons); (b) the energetic contribution of the movement of the nuclei; and, (c) relativistic effects (Bogaard 1978: 346). Moreover, the fact that the Schrödinger equation is ‘adapted’ so as to provide an accurate description of each particular system challenges the view that quantum mechanics can, even in principle, deduce complete explanations of chemical phenomena (Bogaard 1978).

González et al. (2019) argue that there is a tension between the theoretical postulates of quantum mechanics and how molecular structure is understood in chemistry. In particular, Heisenberg’s uncertainty principle implies that a ‘quantum “particle” is not an individual in the traditional sense, since it has properties—those represented by its observables—that have no definite value’ (González et al. 2019: 36). Such a metaphysical understanding of quantum particles comes in contrast to chemistry’s understanding of molecular structure, which is defined ‘in terms of the spatial relation of the nuclei conceived as individual localised objects’ (González et al. 2019: 43). The failure of chemistry’s reduction is further supported by the fact that the Schrödinger equation cannot be solved analytically without the use of approximations and models (for example Bogaard 1978: 347; González et al. 2019; Hendry 2010b). These approximations and models are based on ‘theoretical assumptions drawn from chemistry’, thus rendering the quantum chemical description of complex atoms and molecules in a ‘loose relationship to exact atomic and molecular Schrödinger equations’ (Hendry 2010b: 183).

Lastly, Chang offers three considerations against reduction. First, he argues that since its advent, quantum chemistry has been practiced in a manner that requires the use of pre-quantum, chemical knowledge (Chang 2015; 2017). The views of Linus Pauling, one of the main founders of quantum chemistry, allegedly corroborate this argument, as Pauling took quantum chemistry to be ‘a direct continuation of nineteenth-century organic structural chemistry’ (Chang 2015: 197-198). Secondly, Chang claims that physics consists of many different branches and that the relation of those branches with more fundamental physical theories has not been decisively shown to be reductionist. In light of this, and given that chemistry’s relation to physics is examined in the context of a physical theory (that is, quantum mechanics) which is not the most fundamental theory in physics, one should not assume chemistry to be unproblematically reduced to physics (Chang 2015: 200; Chang 2017: 365). Thirdly, Chang looks at how chemistry is done in practice and claims from this that chemistry is very far from being ‘submitted’ to physics (Chang 2015: 201). This claim allegedly undermines the reduction of chemistry to quantum mechanics since quantum mechanics has never in practice been sufficient for the description, explanation or prediction of phenomena that are within the purview of chemistry (Chang 2015: 201-202).

c. Ontological Reduction

In light of the above objections against the epistemological reduction of chemistry, there are philosophers who have investigated whether it is possible to support chemistry’s ontological reduction to quantum mechanics in a manner that is consistent with the failure of chemistry’s epistemological reduction. Most notable is Le Poidevin, who formulated a detailed account for the ontological reduction of chemical properties which does not depend on the success of an epistemic reduction of chemistry to quantum mechanics. In fact, Le Poidevin accepts that chemistry has not been epistemically reduced to quantum mechanics and maintains that, despite this, chemical elements can be taken to be ontologically reduced to physical properties. He claims that the argument for the ontological reduction of chemical elements can be generalised to all chemical properties in the following manner:

Chemical properties reduce to those properties variation in which is discrete, and combinations of which constitute the series of physically possible chemical properties. (Le Poidevin 2005: 132)

In particular, he holds that the discreteness of chemical elements as specified via the periodic table supports a combinatorial argument for their ontological reduction. According to this argument, ‘a finite number of fundamental entities combine together to give a discrete set of composite elements’ (Scerri 2007a: 929).

Le Poidevin’s argument is based on two premises. The first is the ‘combinatorial criterion for ontological reduction’, which states that

a property type F is ontologically reducible to a more fundamental property type G if the possibility of something’s being F is constituted by a recombination of actual instances of G, but the possibility of something’s being G is not constituted by a recombination of actual instances of F. (Le Poidevin 2005: 132)

The second premise concerns the ‘discreteness of chemical ordering’: ‘between any two elements there is a finite number of physically possible intermediate elements’ (Le Poidevin 2005: 132).

According to Le Poidevin, the combinatorial criterion for the ontological reduction of chemical properties is preferable to existing physicalist accounts regarding the ontological reduction of special science properties because it overcomes two insurmountable problems of physicalism. The first problem is the ‘vacuity problem’, according to which physicalism is in danger of becoming a trivial thesis depending on what one takes to be included in the domain of physics (Le Poidevin 2005: 121-122). The second problem is the ‘asymmetry problem’, according to which the supervenience relation, as postulated by physicalism, does not necessitate an asymmetric relation between higher and lower-level properties (Le Poidevin 2005: 122).

Scerri, Hendry and Needham are sympathetic towards Le Poidevin’s argument for the ontological reduction of chemical elements (Scerri 2007b: 76; Hendry and Needham 2007: 340). As Hendry and Needham state, the combinatorial argument establishes that ‘the discreteness of the elements is explained by the nomologically required discrete variation in a physical quantity, namely nuclear charge’ (Hendry and Needham 2007: 34). However, all of them hold that there are certain problematic features in Le Poidevin’s account.

First, the argument is allegedly not well-supported for all chemical properties. Scerri doubts that the combinatorial argument can be generalised so as to apply to all chemical properties because, unlike chemical elements, most chemical properties are not discrete (such as the solubility and acidity of elements) (Scerri 2007a: 929). Similarly, Hendry and Needham argue that the combinatorial argument is only investigated with respect to chemical elements, thus disregarding a large part of chemistry. This is a central shortcoming of Le Poidevin’s account because there are particular features of chemistry and of quantum mechanics which are often regarded as posing unique challenges to chemistry’s reduction to quantum mechanics. For example, the structure of molecules is a chemical property which some argue is not in principle derivable from quantum mechanics (Hendry and Needham 2007: 341-342). This is regarded as problematic for the reduction of chemistry to quantum mechanics, whether epistemic or ontological. Another issue is how chemistry describes the rate of chemical reactions. Kinetic theory and thermodynamics play a fundamental role in explaining and describing the rate of chemical reactions, and thus need to be considered in the context of chemistry’s relation to quantum mechanics (Hendry and Needham 2007: 343-344). These are problems that concern particular chemical properties and which need to be tackled if any account of (ontological) reduction is to be well-supported for all chemical properties.

Secondly, Scerri holds that Le Poidevin’s attempt to circumvent any talk about the epistemic reduction between the two relevant theories is illusory. Le Poidevin holds that a ‘periodic ordering is a classification rather than a theory’, thus rendering his account of ontological reduction ‘theory-neutral’ (Le Poidevin 2005: 131). However, Scerri disagrees on this point as he takes reference to the periodic table to inevitably require the investigation of how chemistry and quantum mechanics are epistemically related (Scerri 2007a: 929). Hendry and Needham take this point a step further by suggesting that reference to a theory cannot be avoided when specifying the micro-constituents of chemical elements (Hendry and Needham 2007: 344). In fact, they argue that there is ‘a close evidential connection’ between epistemological and ontological reduction; one cannot entirely avoid the investigation of inter-theoretic reduction when seeking to provide sufficient empirical support to ontological reduction (Hendry and Needham 2007: 351).

Another objection to Le Poidevin’s account is that the combinatorial argument, even if correct, does not succeed in establishing the ontological reduction of chemistry to physics. The asymmetric relation that Le Poidevin allegedly establishes via his combinatorial argument establishes ‘only an asymmetrical relationship between the (actual) physical and the (merely possible) chemical’ (Hendry and Needham 2007: 349). Given this, such a relation does not preclude the possibility of chemical properties having novel causal powers, thus rendering Le Poidevin’s account consistent with non-reductive (metaphysical) accounts (such as emergentist accounts) (Hendry and Needham 2007: 350).

Hendry also offers independent support to the claim that chemistry fails to ontologically reduce to quantum mechanics, outside his critique of Le Poidevin’s account. Specifically, he assumes that ontological reduction involves the acceptance of the causal completeness of physics (Hendry 2010b: 187). Given this, it follows that ontological reduction is committed to the claim that only physical entities, properties, and so forth possess novel causal powers (Hendry 2010b: 187). Based on this understanding of ontological reduction, he argues that what he calls the ‘symmetry problem’ undermines the tenability of ontological reduction. The symmetry problem arises from the fact that, for any atom or molecule, the arbitrary solutions of the Schrödinger equation are spherically symmetrical (Hendry 2010b: 186). This contrasts with the asymmetry that polyatomic molecules exhibit and that chemistry invokes in order to explain many of their chemical properties, such as the acidic behaviour and boiling point of the hydrogen chloride molecule (Hendry 2010b: 186). The symmetry problem allegedly challenges the ontological reduction of chemistry because it undermines the tenability of the causal completeness of physics, namely the principle that every physical effect has a physical cause (Hendry 2010b: 187). This is because:

(a) quantum mechanics is consistent with the view that the asymmetry of molecules ‘is not conferred by the molecule’s physical basis according to physical laws’ (Hendry 2010b: 187); and
(b) the symmetry problem ‘removes much of the empirical support that is claimed for’ the causal completeness of physics (Hendry 2010b: 187).

Lastly, it should be noted that there are positions which argue for the ontological autonomy of chemistry in a manner that is implicitly or explicitly incompatible with the ontological reduction of chemistry to quantum mechanics. This includes Lombardi and Labarca (2005) and Schummer (2014b) (see subsection 5b).

d. Alternative Forms of Reduction

Despite the arguments against chemistry’s epistemological and ontological reduction to quantum mechanics, there are philosophers who attempt to establish reduction. For example, Hettema states that ‘the widespread rejection of reduction by philosophers of chemistry might have been premature’ (Hettema 2012b: 147). Hettema argues that, contrary to how Nagel’s account of reduction has been understood and argued against in the philosophy of chemistry, Nagel was in fact not so strict about the requirements for reduction (Hettema 2014: 193; see also Hettema 2012a). In light of this, Hettema proposes ‘a suitable paraphrase of the Nagelian reduction programme’ which is ‘reinforced by a modern notion of both connectibility and derivability’ (Hettema 2017: 24) (italics are in the original text). Hettema’s position is a reductive account which advocates the existence of autonomous areas. Characterising Hettema’s account as a form of reduction is justified given the quotes just mentioned. Nevertheless, it should be noted that Hettema often refers to his proposal as one that advocates a form of unity (for example Hettema 2012b; 2017). In order to explicate his proposal, Hettema analyses the development of the reaction rate theory and presents, among other things, Eyring’s theory of absolute reaction rates (2017: 71-81; see also Hettema 2012b) (see also subsection 5a).

Needham has also investigated reduction and identified those aspects of Nagelian reduction which should be amended for a more convincing defence of chemistry’s reduction to physics to be achieved. As Needham states:

Chemistry is, perhaps, so entwined with physics that what would be left after removal of physics is but a pale shadow of modern chemistry. It is, perhaps, not even clear what the removal of physics from chemistry would amount to. (Needham 2010: 163)

Needham identifies the weaknesses of Nagelian reduction and examines whether historical developments in chemistry and physics are consonant with how reduction tells us that two theories are related (2010: 170). Based on such an analysis, he argues that it is possible to understand Nagelian reduction in a way that permits and takes into account the use of approximations in science (Needham 2010: 168-169).

4. Emergence in Chemistry

The emergence of chemistry was first discussed and defended by the British Emergentists, who wrote before the advent of quantum mechanics. With the development of quantum mechanics and quantum chemistry, the emergence of chemistry, as the British Emergentists advocated it, was mostly rejected in philosophy. In the contemporary literature, however, the emergence of chemistry from quantum mechanics has been reformulated and supported on new grounds. Perhaps the most detailed and widely discussed account of emergence with respect to chemistry is Robin Hendry’s account of the strong emergence of molecular structure, though there are also alternative understandings of emergence within the philosophy of chemistry.

a. British Emergentism in Chemistry

British Emergentism refers to a group of philosophers in the 19th and 20th centuries which is regarded as the first to provide a detailed and coherent philosophical account of emergence. Among the examples that British Emergentists invoked in order to support the existence of emergence is that of chemistry, and in particular of chemical bonding. J. S. Mill argued that ‘the different actions of a chemical compound will never, undoubtedly, be found to be the sums of the actions of its separate elements’ (quoted in McLaughlin 1992: 28; see also Mill 1930). C. D. Broad also advocated the emergence of chemistry on the grounds that it is not ‘theoretically possible to deduce the characteristic behaviour of any element from an adequate knowledge of the number and arrangement of the particles in its atom, without needing to observe a sample of that substance’ (Broad 1925: 70; see also McLaughlin 1992: 47; Hendry 2006: 176-180; Hendry 2010a: 210; Hendry 2010b: 185).

The putative empirical evidence that emergentists invoked in support of the emergence of chemical bonding is the failure to deduce the chemical behaviour of elements from the entities and properties that constitute those chemical elements. On their view, since one does not describe and predict how chemical elements bond to each other by reference only to the entities that compose them, chemical bonding is an emergent chemical property which exerts downward causal powers on the entities that constitute the relevant chemical elements (Scerri 2007a: 921).

The British Emergentists’ argument for the emergence of chemical bonding was formulated before the advent of quantum mechanics. According to McLaughlin, once quantum mechanics contributed to the understanding of atomic and molecular properties, including the chemical bond, the emergence of chemical bonding was no longer justified in the manner that British Emergentism advocated:

Quantum mechanical explanations of chemical bonding suffice to refute central aspects of Broad’s Chemical Emergentism: Chemical bonding can be explained by properties of electrons, and there are no fundamental chemical forces. (McLaughlin 1992: 49; see also Scerri 2007a)

Scerri, on the other hand, argues that McLaughlin is mistaken to reject the emergence of chemistry. He disputes McLaughlin’s claims that

(i) there was no complete or adequate theory of chemical bonding before the advent of quantum mechanics; and
(ii) quantum mechanics provided a complete theory of chemical bonding (Scerri 2007a: 922-923; see also Scerri 2012a).

In fact, Scerri claims that the quantum mechanical theory of chemical bonding should be viewed as continuous with, and as enhancing, Lewis’s theory of chemical bonding (Scerri 2007a: 922-923). The advent of quantum mechanics does not refute pre-quantum chemical theories of bonding, but rather offers a deeper understanding of chemical bonding. Chemistry remains vital in the description and explanation of the chemical behaviour of elements because quantum mechanics cannot by itself offer a complete account of chemical bonding and of the overall chemical behaviour of elements. While quantum mechanics provides quantitative information regarding particular chemical properties of elements and compounds, it ‘cannot predict what compounds will actually form’ (Scerri 2007a: 924). Quantum mechanics can provide neither an explanation of how atoms and molecules evolve in time nor a complete explanation of their overall chemical behaviour (Scerri 2007b: 78). These two characteristics of quantum mechanics, apart from blocking the possibility of a ‘complete’ reduction of chemistry, also allegedly support the claim that chemical entities and properties emerge at a level ‘over and above what one would expect from the constituents of the system’ (Scerri 2007b: 77; see also Llored 2012: 254). What Scerri means by emergence is, however, unclear, since he specifies this notion only in contrast to physicalism and does not provide a detailed account of the emergence of chemistry.

b. Strong Emergence

Hendry formulates one of the most detailed and widely discussed accounts of emergence regarding chemistry. Hendry’s account focuses on a metaphysical understanding of emergence that has direct implications for the metaphysical relation between chemical and quantum mechanical entities and properties, as well as for the nature of molecular structure. His account of strong emergence is formulated in terms of downward causation, and the putative empirical evidence that supports his position is drawn from the manner in which quantum mechanics and chemistry each describe molecular structure.

According to Hendry, the structure of a molecule strongly emerges from its quantum mechanical entities in the sense that it exhibits downward causal powers. Specifically, ‘the emergent behaviour of complex systems must be viewed as determining, but not being fully determined by, the behaviour of their constituent parts’ (Hendry 2006: 180).

Strong emergence is supported by the ‘counternomic criterion for downward causation’ (Hendry 2010b: 189). According to this criterion, ‘a system exhibits downward causation if its behavior would be different were it determined by the more basic laws governing the stuff of which it is made’ (Hendry 2010b: 189). The manner in which quantum mechanics describes a molecule’s structure allegedly satisfies the counternomic criterion and thus supports the view that molecular structure strongly emerges.

In order to support this claim, Hendry makes a distinction between ‘resultant’ and ‘configurational’ Hamiltonians. A molecule’s resultant Hamiltonian takes into account all the intra-molecular interactions and is constructed using as input only fundamental physical interactions and the values of the physical properties of the entities (such as masses, charges, and so forth) (Hendry 2010a: 210-211). Given the resultant Hamiltonian, the so-called ‘Coulombic Schrödinger equation’ is constructed, which is a complete and exact description of the relevant molecule. In practice, however, the resultant Hamiltonian is never used for the solution of the Schrödinger equation, primarily because of the equation’s mathematical complexity. Nevertheless, if the Coulombic Schrödinger equation were to be solved, it would not distinguish between different molecular structures (specifically those of isomers), and it would not explain the symmetry properties of a molecule. Instead, quantum explanations of molecular structure are based on the construction of ‘configurational Hamiltonians’ for the solution of the Schrödinger equation of a molecule (Hendry 2010a: 210-211). Configurational Hamiltonians are constructed on the basis of ad hoc assumptions which impose on the Schrödinger equation the molecular structure that is supposed to be derived from that equation. This situation satisfies the counternomic criterion because we did not recover a molecule’s ‘structure from the “resultant” Hamiltonian, given the charges and masses of the various electrons and nuclei; rather we viewed the motions of those electrons and nuclei as constrained by the molecule of which they are part’ (Hendry 2006: 183).
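
The contrast between the two kinds of Hamiltonian can be written out in symbols. What follows is a sketch using standard textbook forms, not a reproduction of Hendry’s own notation. It assumes a molecule whose nuclei are labelled A, B (masses M_A, charges Z_A e, positions R_A) and whose electrons are labelled i, j (mass m_e, positions r_i). Identifying the configurational Hamiltonian with the clamped-nucleus Hamiltonian of the Born-Oppenheimer approach, with nuclear positions fixed at chosen values R_A^0, is likewise an assumption made here for illustration.

```latex
% Resultant (Coulomb) Hamiltonian: built only from masses, charges and Coulomb interactions.
\begin{align}
\hat{H}_{\mathrm{res}} ={}&
  -\sum_{A}\frac{\hbar^{2}}{2M_{A}}\nabla_{A}^{2}
  -\sum_{i}\frac{\hbar^{2}}{2m_{e}}\nabla_{i}^{2} \notag\\
 &+\frac{e^{2}}{4\pi\varepsilon_{0}}\left[
   \sum_{A<B}\frac{Z_{A}Z_{B}}{|\mathbf{R}_{A}-\mathbf{R}_{B}|}
  -\sum_{i,A}\frac{Z_{A}}{|\mathbf{r}_{i}-\mathbf{R}_{A}|}
  +\sum_{i<j}\frac{1}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}\right]
\end{align}

% Configurational Hamiltonian (assumed clamped-nucleus form): the nuclear kinetic
% term is dropped and the nuclear positions R_A^0 are put in by hand as parameters.
\begin{align}
\hat{H}_{\mathrm{conf}}\bigl(\{\mathbf{R}_{A}^{0}\}\bigr) ={}&
  -\sum_{i}\frac{\hbar^{2}}{2m_{e}}\nabla_{i}^{2} \notag\\
 &+\frac{e^{2}}{4\pi\varepsilon_{0}}\left[
   \sum_{A<B}\frac{Z_{A}Z_{B}}{|\mathbf{R}_{A}^{0}-\mathbf{R}_{B}^{0}|}
  -\sum_{i,A}\frac{Z_{A}}{|\mathbf{r}_{i}-\mathbf{R}_{A}^{0}|}
  +\sum_{i<j}\frac{1}{|\mathbf{r}_{i}-\mathbf{r}_{j}|}\right]
\end{align}
```

The first expression mentions only masses and charges, so it is the same for any two molecules built from the same nuclei and electrons. The second contains the chosen nuclear positions as parameters, and this is the point at which structural information is supplied rather than derived.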

Hendry presents two examples that illustrate that the counternomic criterion is satisfied with respect to molecular structure. The first example concerns isomers (see also Bishop 2010: 172-173). Isomers are sets of molecules that contain the same number and kind of atoms, but whose atoms are arranged differently. This means that isomers differ only in terms of their structure. Isomers have distinct chemical descriptions and they are invoked for the explanation of a variety of physical and chemical phenomena. If one describes an isomer by means of its resultant Hamiltonian, then the resulting Coulombic Schrödinger equation is identical to the Coulombic Schrödinger equations that describe the other relevant isomers (Hendry 2017: 153). If, on the other hand, one describes an isomer by means of its configurational Hamiltonian, then the Schrödinger equation that is subsequently constructed is not identical to those that describe the other relevant isomers. According to Hendry, this means that this example satisfies the counternomic criterion. He thinks it illustrates that the molecule’s behaviour, as this is described ‘by the more basic laws governing the stuff of which it is made’ (that is, via the resultant Hamiltonian), is different from its behaviour as this is described by assuming certain chemical properties (namely, its structure) via the configurational Hamiltonian.
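
The isomer point can be made concrete with a small sketch. The pair chosen here, ethanol and dimethyl ether (both C2H6O), is an illustrative assumption and is not named in the text above; the dictionary layout and the function names `resultant_inputs` and `bond_types` are likewise hypothetical. The sketch only compares the inputs from which a resultant Hamiltonian would be built (nuclear charges and electron count, assuming neutral molecules and the same isotopes) with the bond connectivity that a structural description adds. It does not solve any Schrödinger equation.

```python
from collections import Counter

# Atomic numbers (nuclear charges) for the elements used below.
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8}

# Each molecule is a list of atoms plus a set of bonds (pairs of atom indices).
# The bond sets encode chemical structure; they are connectivity only, not geometry.
ETHANOL = {
    "atoms": ["C", "C", "O", "H", "H", "H", "H", "H", "H"],
    "bonds": {(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)},
}
DIMETHYL_ETHER = {
    "atoms": ["C", "O", "C", "H", "H", "H", "H", "H", "H"],
    "bonds": {(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (2, 6), (2, 7), (2, 8)},
}


def resultant_inputs(molecule):
    """Return the structure-free inputs: multiset of nuclear charges and electron count."""
    charges = Counter(ATOMIC_NUMBER[a] for a in molecule["atoms"])
    n_electrons = sum(ATOMIC_NUMBER[a] for a in molecule["atoms"])  # neutral molecule
    return charges, n_electrons


def bond_types(molecule):
    """Count bonds by the pair of elements they join, e.g. 'C-H'."""
    atoms = molecule["atoms"]
    return Counter(
        "-".join(sorted((atoms[i], atoms[j]))) for i, j in molecule["bonds"]
    )


if __name__ == "__main__":
    # Same nuclei and electrons, so the same inputs to a resultant Hamiltonian.
    print(resultant_inputs(ETHANOL) == resultant_inputs(DIMETHYL_ETHER))
    # Different connectivity, which is what distinguishes the two isomers chemically.
    print(bond_types(ETHANOL))
    print(bond_types(DIMETHYL_ETHER))
```

On this toy representation the two molecules are indistinguishable by charges and electron count alone, and differ only once the bond sets are taken into account, which mirrors the role that imposed structure plays in the configurational description.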

The second example that Hendry takes as empirical support for downward causation involves the symmetry properties of molecules. Similarly to the case of isomers, one cannot derive the different chemical symmetry properties from the relevant resultant Hamiltonian because ‘the only force appearing in molecular Schrödinger equations is the electrostatic or Coulomb force: other forces are negligible at the relevant scales. But the Coulomb force has spherical symmetry’ (Hendry 2017: 154).

As is the case with other accounts of strong emergence in philosophy of science, Hendry’s account of strong emergence overcomes the overdetermination problem by postulating that there are certain quantum mechanical effects which do not have purely quantum mechanical causes (Wilson 2015: 353). That is, accounts of strong emergence deny the causal completeness of the physical (CCP), which states that ‘every lower-level physically acceptable effect has a purely lower-level physically acceptable cause’ (Wilson 2015: 352). Instead of the CCP, Hendry proposes an alternative principle, namely the ‘ubiquity of physics’ (UP):

Under the ubiquity of physics, physical principles constrain the motions of particular systems though they may not fully determine them. (Hendry 2010b: 188)

This principle acts as a substitute for the causal completeness of the physical (CCP), which Hendry rejects and which is incompatible with his notion of strong emergence. UP allows for the physical principles (as these are formulated via the physical laws and theories) to ‘apply universally without accepting that they fully determine the motions of the systems they govern’ (Hendry 2010b: 188). According to Hendry, unlike UP, the CCP is not well supported by physics itself, and he follows Bishop in thinking that physics ‘does not imply its own causal closure’ (Bishop 2006: 45). Note that, given the rejection of the CCP, strong emergence, as understood by Hendry, is incompatible not only with some form of epistemic reduction but also with reductive and non-reductive physicalism.

A critique of Hendry’s account in the philosophy of chemistry literature is provided by Scerri, who argues that the putative empirical evidence that Hendry invokes in support of strong emergence is merely a ‘theoretical rather than ontological issue’ (Scerri 2012a: 25).

c. Alternative Forms of Emergence

There are alternative accounts of emergence with respect to chemistry. These are mostly accounts which focus on the unique epistemological features of chemistry and propose an understanding of emergence that is primarily epistemic, rather than metaphysical. For example, Bishop and Atmanspacher (2006) formulate an account of ‘contextual emergence’ which they take to successfully apply in two separate cases: namely to the case of molecular structure and to that of temperature (see also Bishop 2010). With respect to molecular structure, they argue that quantum mechanics provides necessary but not sufficient conditions for the description of molecular structure. This implies that reduction is not the appropriate account to correctly specify the relation between the two relevant descriptions. In order to derive a lower-level (that is, quantum mechanical) description of molecular structure, one introduces sufficient conditions by specifying the particular context in which the relevant lower-level system is considered. This allegedly supports the claim that molecular structure is a novel property which is not derivable by the quantum mechanical description alone but rather emerges from it (Bishop and Atmanspacher 2006: 1774; see also Bishop 2010: 176-177; Llored 2012: 248).

Furthermore, Llored presents ‘a relational form of emergence which pays attention to the constitutive role of the modes of intervention and to the co-definition of the levels of organization’ (Llored 2012: 245). This is not a metaphysical account of emergence; as Llored states, his proposed account is ‘agnostic’ with respect to the ontology of chemistry and rather focuses on ‘what chemists do in their daily work’ (Llored 2012: 245). In particular, Llored looks at how ‘from the Twenties to nowadays, quantum chemical methods have been constitutively concerned with the links between the molecule and its parts’ (2012: 257) (italics are in the original text). Among other things, he presents and analyses the debate between Linus Pauling and Robert Mulliken who both ‘focused on the description and the understanding of the molecule, its reactivity, and thus its transformations’ (Llored 2012: 257). Llored argues that his proposed account of emergence is not one which advocates an asymmetric relation between higher and lower-level properties. Rather, both chemical and quantum mechanical properties ‘co-emerge’ (Llored 2014: 156). Chemical phenomena are understood ‘as relative to a certain experimental context, with no possibility of separating them from this context’ (Llored 2014: 156; see also Llored and Harré 2014).

5. Beyond Reduction and Emergence

Very few accounts consider the relation of chemistry to quantum mechanics without invoking some form of reduction or emergence. In fact, if we are to understand epistemic reduction and strong emergence as the two extremes of a spectrum of inter-theoretic accounts, then there is a variety of positions that have remained to this day relatively unexplored with respect to chemistry. Nevertheless, there are some philosophers who consider the possibility of understanding chemistry’s relation to quantum mechanics without reference to reduction or emergence. This section distinguishes between two main camps. First are those accounts which consider unity without reduction. Secondly, there are accounts which support the autonomy of chemistry without reference to some form of emergence.

a. Unity without Reduction

Two philosophers of chemistry have primarily examined chemistry’s relation to quantum mechanics in terms of unity without reduction. First, Needham examines unity without reduction by presenting Pierre Duhem’s ‘scheme’ of ‘unity without reduction’ (Needham 2010: 166). He states that

unity surely does not require reduction, intuitively understood as the incorporation of one theory within another. […] Consistency, requiring the absence of contradiction, and more generally in the sense of the absence of conflicts, tensions and barriers within scientific theory, would provide weaker, though apparently adequate, grounds for unity. (Needham 2010: 163)

According to Duhem’s scheme of unity, ‘(m)icroscopic principles complement macroscopic theory in an integrated whole, with no presumption of primacy of the one over the other’ (Needham 2010: 167). This implies that Duhem’s understanding of unity is incompatible with reductionism in the sense that it rejects the claim that physics is the most fundamental science.

Moreover, Needham argues that positions on unity can be distinguished into four groups:

(i) Unity in virtue of reduction, with no autonomous areas,

(ii) unity in virtue of consistency and not reduction, but still no autonomy because of interconnections,

(iii) unity in virtue of consistency and not reduction, with autonomous areas, and

(iv) disunity. (Needham 2010: 163-164)

Hettema engages in the discussion of unity with respect to chemistry and evaluates Needham’s scheme of unity (2017). In particular, Hettema takes the first form of unity to assume a form of ‘reductionism in which derivation is strict and reduction postulates are identities’ (Hettema 2017: 277). Regarding the second form of unity, Hettema argues that it faces certain challenges. For example, in this form of unity ‘the nature of the “interconnections” is (..) not well specified in Needham’s scheme’ (Hettema 2017: 277). Moreover, ‘the theories of chemistry and physics are not as strongly dependent on each other as implied (though not stated) in position (ii) in the scheme’ (Hettema 2017: 277-278). Hettema rejects the third form of unity because it allegedly disregards the ‘idea that one science may fruitfully explain aspects of another’ (Hettema 2017: 278).

As already mentioned, Hettema proposes a novel account of reduction regarding the relation between chemistry and quantum mechanics (see subsection 3d). In the broader context of unity, Hettema takes his account to propose a form of unity that Needham’s scheme does not capture. Specifically, Hettema’s account does not support ‘a form of unity in virtue of reduction with no autonomous areas’ (in line with (i)) because, unlike (i), it requires neither strict derivation nor the existence of identity relations between the reduced and reducing theory. Moreover, Hettema’s account does not advocate unity without reduction either. While he acknowledges that his account shares common features with non-reductive accounts of unity in the philosophy of science literature, he maintains that his account proposes a ‘naturalised Nagelian reduction’ (Hettema 2012b: 143).

Interestingly, there are two features that his account allegedly shares with certain non-reductive accounts of unity. First, Hettema takes his account of reduction to be compatible with an understanding of theories as ‘interfield theories’ which ‘use concepts and data from neighbouring fields’ (in line with Darden and Maull 1977) (Hettema 2012b: 160). In this context, absolute reaction rate theory is characterised as an interfield theory ‘where the theories comprising the interfield are in turn reductively connected’ (Hettema 2012b: 168). There is no one-to-one relation between the reduced and reducing theory; rather there is a ‘net of theories’ where ‘connective and derivative links of a Nagelian sort exist between all these theoretical approaches’ (Hettema 2012b: 168). As a result, the overall reduction of chemistry is specified in terms of a network of different theories that are reductively connected to one another (Hettema 2012b: 171). Secondly, Hettema takes his account to be compatible with Bokulich’s non-reductive account of ‘interstructuralism’, according to which two theories are related in virtue of the ‘structural continuities and correspondences’ between them (Bokulich 2008: 173; Hettema 2012b: 163). Indeed, Hettema identifies structural continuities in the case of the absolute reaction rate theory (Hettema 2012b: 171).

Lastly, Seifert (2017) advocates unity without reduction, arguing that chemistry and quantum mechanics are unified in a non-reductive manner because they exhibit particular epistemic and metaphysical inter-connections.

b. Pluralism

The autonomy of chemistry from quantum mechanics has been defended without reference to emergence in the form of pluralist accounts. Accounts of pluralism that have not been explicitly investigated with respect to chemistry’s relation to quantum mechanics, such as Chang’s (2012), are not presented here. Among the accounts that have been so investigated, Lombardi and Labarca argue for a ‘Kantian-rooted ontological pluralism’ which is based on Putnam’s account of internalist realism (Lombardi 2014b: 23; see also Lombardi and Labarca 2005; Putnam 1981). They claim that while the epistemological reduction of chemistry is in general rejected in the philosophy of chemistry, the ontological reduction of chemistry is more or less accepted (Lombardi and Labarca 2005: 132-133). They take the acceptance of chemistry’s ontological reduction to imply an antirealist or eliminativist view of chemical ontology and to undermine philosophy of chemistry’s relevance when it comes to investigating metaphysical issues (Lombardi and Labarca 2005: 134). In this context, they argue that a hierarchical view of ontology, where everything is grounded on more fundamental physical entities, should be replaced by a view of the world where ‘different but equally objective theory-dependent ontologies interconnected by nomological, non-reductive relationships’ coexist (Lombardi and Labarca 2005: 146; Lombardi 2014b).

There are various objections against this account of ontological pluralism (Needham 2006; Manafu 2013; Hettema 2014: 195-196; see also Lombardi and Labarca 2006). For example, Manafu argues that Lombardi and Labarca have insufficiently argued for the ‘equal’ reference of concepts that are postulated by different theories. This is because if a theory is reduced to, superseded by, or merely has different theoretical virtues from another theory, then it is not necessary that such a theory employs concepts that actually refer to things that exist (Manafu 2013: 227).

Schummer also argues in favour of a pluralist position. He claims that chemistry’s relation to physics should be understood in accordance with methodological pluralism (2014b). Chemistry and each of its sub-disciplines have distinct subject matters, pose different research questions and employ distinct methods and concepts. Even when it comes to concepts that are employed by both chemistry and physics, such as ‘molecule’ and ‘molecular structure’, Schummer argues that these concepts frequently have different meanings in each of the two disciplines and are employed in the context of radically distinct models, methods and research goals (Schummer 2014b: 260).

6. Conclusion

Given how chemistry’s relation to quantum mechanics has been investigated in the philosophy of chemistry so far, it is possible to draw the following conclusions. First, in the first decades of the 21st century, the philosophy of chemistry persistently argued that chemistry’s relation to quantum mechanics is not a reductive relation, as philosophers and physicists such as Nagel and Dirac commonly supposed. Another point drawn from this analysis is that one cannot correctly spell out the relation between the two sciences unless one takes into account the role of approximations, assumptions, models and idealisations in the two sciences.

Moreover, it is evident that more can be said about chemistry’s relation to quantum mechanics. There is substantial material from the philosophy of science which has not been considered with respect to chemistry and which could contribute to a richer and more accurate understanding of the relation between the two sciences. For example, given the alleged failure of Nagelian reduction, it would be interesting to examine whether a different understanding of epistemic reduction applies to the case of chemistry. Alternative accounts of epistemic reduction that take into account the unique models, idealisations, and practices that the special sciences employ would contribute to formulating a novel understanding of the relation of chemistry with quantum mechanics. Also, it is worth investigating whether chemistry and quantum mechanics are unified in a way that neither requires some form of epistemic or ontological reduction, nor collapses to a strongly emergent or pluralist worldview. Lastly, there are various understandings of pluralism which have not been applied to the case of chemistry and which could further support general accounts of pluralism in the sciences. All in all, more can be said about chemistry’s relation to quantum mechanics which can fruitfully contribute to one’s analysis of reduction, unity, pluralism and emergence.

7. References and Further Reading

Arriaga, J. A. J., S. Fortin, and O. Lombardi. 2019. ‘A new chapter in the problem of the reduction of chemistry to physics: The Quantum Theory of Atoms in Molecules’, Foundations of Chemistry, 21(1): 125-136
Bader, Richard. 1990. Atoms in Molecules: A Quantum Theory (Oxford: Oxford University Press)
Bader, R. F. W., and C. F. Matta. 2013. ‘Atoms in molecules as non-overlapping, bounded, space-filling open quantum systems’, Foundations of Chemistry, 15: 253-276
Bensaude-Vincent, Bernadette. 2008. Essais d’histoire et de philosophie de la chimie (Paris: Presses Universitaires de Paris Ouest)
Bishop, Robert C. 2006. ‘The Hidden Premise in the Causal Argument for Physicalism’, Analysis, 66: 44-52
Bishop, Robert C. 2010. ‘Whence chemistry?’, Studies in History and Philosophy of Modern Physics, 41: 171-177
Bishop, R. C., and H. Atmanspacher. 2006. ‘Contextual emergence in the description of properties’, Foundations of Physics, 36(12): 1753-1777
Bogaard, Paul A. 1978. ‘The Limitations of Physics as a Chemical Reducing Agent’, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 1978(2): 345-356
Bokulich, Alisa. 2008. Reexamining the Quantum-Classical Relation: Beyond Reductionism and Pluralism (Cambridge: Cambridge University Press)
Broad, C. D. 1925. The Mind and Its Place in Nature (London: Routledge and Kegan Paul)
Causá, M., A. Savin, and B. Silvi. 2014. ‘Atoms and bonds in molecules and chemical explanations’, Foundations of Chemistry, 16(1): 3-26
Chang, Hasok. 2012. Is Water H2O? Evidence, Realism and Pluralism (Dordrecht: Springer)
Chang, Hasok. 2015. ‘Reductionism and the Relation Between Chemistry and Physics’, in Relocating the History of Science, ed. by T. Arabatzis et al., Boston Studies in the Philosophy and History of Science, Vol. 312 (Dordrecht: Springer) pp. 193-209
Chang, Hasok. 2017. ‘What History Tells Us about the Distinct Nature of Chemistry’, Ambix, 64(4): 360-374, DOI: 10.1080/00026980.2017.1412135
Darden, L., and N. Maull. 1977. ‘Interfield Theories’, Philosophy of Science, 44(1): 43-64
Dirac, Paul. 1929. ‘The quantum mechanics of many-electron systems’, Proceedings of the Royal Society of London, Series A, Containing Papers of a Mathematical and Physical Character, 123(792): 714-733
Dizadji-Bahmani, F., R. Frigg, and S. Hartmann. 2010. ‘Who’s Afraid of Nagelian Reduction?’, Erkenntnis, 73: 393-412
Fazekas, Peter. 2009. ‘Reconsidering the Role of Bridge Laws in Inter-Theoretical Reductions’, Erkenntnis, 71: 303-322
Gavroglu, K., and A. Simões. 2012. Neither Physics nor Chemistry. A History of Quantum Chemistry (Cambridge MA: MIT Press)
González, J. C. M., S. Fortin, and O. Lombardi. 2019. ‘Why molecular structure cannot be strictly reduced to quantum mechanics’, Foundations of Chemistry, 21(1): 31-45
Goodwin, William. 2013. ‘Quantum Chemistry and Organic Theory’, Philosophy of Science, 80(5): 1159-1169
Griffiths, David J. 2005. Introduction to Quantum Mechanics, 2nd edn (USA: Pearson Education International)
Hendry, Robin F. 1998. ‘Models and Approximations in Quantum Chemistry’, Poznan Studies in the Philosophy of Science and the Humanities, 63: 123-142
Hendry, Robin F. 1999. ‘Molecular Models and the Question of Physicalism’, HYLE, 5: 117-134
Hendry, Robin F. 2006. ‘Is there Downwards Causation in Chemistry?’, in Philosophy of Chemistry: Synthesis of a New Discipline, ed. by Davis Baird, Eric Scerri and Lee McIntyre, Boston Studies in the Philosophy of Science, Vol. 242 (Dordrecht: Springer) pp. 173-189
Hendry, Robin F. 2008. ‘Two Conceptions of the Chemical Bond’, Philosophy of Science, 75(5): 909-920
Hendry, Robin F. 2010a. ‘Emergence vs. Reduction in Chemistry’, in Emergence in Mind, ed. by Cynthia Macdonald and Graham Macdonald (Oxford: Oxford University Press) pp. 205-221
Hendry, Robin F. 2010b. ‘Ontological reduction and molecular structure’, Studies in History and Philosophy of Modern Physics, 41: 183-191
Hendry, Robin F. 2012. ‘Reduction, emergence and physicalism’, in Philosophy of Chemistry, ed. by Andrea Woody, Robin F. Hendry and Paul Needham (Amsterdam: Elsevier) pp. 367-386
Hendry, Robin F. 2017. ‘Prospects for Strong Emergence in Chemistry’, in Philosophical and Scientific Perspectives on Downward Causation, ed. by Michele P. Paoletti and Francesco Orilia (New York: Routledge) pp. 146-163
Hendry, R. F., and P. Needham. 2007. ‘Le Poidevin on the Reduction of Chemistry’, The British Journal for the Philosophy of Science, 58(2): 339-353
Hettema, Hinne. 2012a. Reducing Chemistry to Physics: Limits, Models, Consequences (North Charleston SC: Createspace)
Hettema, Hinne. 2012b. ‘The Unity of Chemistry and Physics: Absolute Reaction Rate Theory’, HYLE: International Journal for Philosophy of Chemistry, 18(2): 145-173
Hettema, Hinne. 2013. ‘Austere quantum mechanics as a reductive basis for chemistry’, Foundations of Chemistry, 14: 311-326
Hettema, Hinne. 2014. ‘Linking chemistry with physics: a reply to Lombardi’, Foundations of Chemistry, 16: 193-200
Hettema, Hinne. 2017. The Union of Chemistry and Physics: Linkages, Reduction, Theory Nets and Ontology (Springer International Publishing)
Hofmann, James R. 1990. ‘How the Models of Chemistry Vie’, PSA 1990, 1: 405-419
IUPAC. 2014. Compendium of Chemical Terminology: Gold Book, Version 2.3.3, Available at: http://goldbook.iupac.org/pdf/goldbook.pdf [accessed 3/05/2018]
Klein, Colin. 2009. ‘Reduction Without Reductionism: A Defence of Nagel on Connectability’, The Philosophical Quarterly, 59(234): 39-53
Le Poidevin, Robin. 2005. ‘Missing Elements and Missing Premises: A Combinatorial Argument for the Ontological Reduction of Chemistry’, The British Journal for the Philosophy of Science, 56: 117-134
Llored, Jean-Pierre. 2012. ‘Emergence and quantum chemistry’, Foundations of Chemistry, 14(1): 245-274
Llored, Jean-Pierre. 2014. ‘Whole-Parts Strategies in Quantum Chemistry: Some Philosophical Mereological Lessons’, HYLE, 20(1): 141-163
Llored, J. P., and R. Harré. 2014. ‘Developing the mereology of chemistry’, in Mereology and the Sciences, ed. by C. Calosi and P. Graziani (London: Springer) pp. 189-212
Lombardi, Olimpia. 2014a. ‘Linking chemistry with physics: arguments and counterarguments’, Foundations of Chemistry, 16(3): 181-192
Lombardi, Olimpia. 2014b. ‘The Ontological Autonomy of the Chemical World: Facing the Criticisms’, in Philosophy of Chemistry, Boston Studies in the Philosophy and History of Science, Vol. 306, ed. by E. Scerri and L. McIntyre (Dordrecht: Springer)
Lombardi, O., and M. Labarca. 2005. ‘The Ontological Autonomy of the Chemical World’, Foundations of Chemistry, 7(2): 125-148
Lombardi, O., and M. Labarca. 2006. ‘The Ontological Autonomy of the Chemical World: A response to Needham’, Foundations of Chemistry, 8: 81-92
Lombardi, Olimpia. 2007. ‘The Philosophy of Chemistry as a New Resource for Chemical Education’, Journal of Chemical Education, 84(1): 187-192
Manafu, Alexandru. 2013. ‘Internal realism and the problem of ontological autonomy: a critical note on Lombardi and Labarca’, Foundations of Chemistry, 15: 225-228
Matta, Chérif F. 2013. ‘Special issue: Philosophical aspects and implications of the quantum theory of atoms in molecules (QTAIM)’, Foundations of Chemistry, 15(3): 245-251
Matta, C. F., and R. J. Boyd (eds). 2007. The Quantum Theory of Atoms in Molecules: From Solid State to DNA and Drug Design (Weinheim: Wiley-VCH)
McLaughlin, Brian. 1992. ‘The Rise and Fall of British Emergentism’, in Emergence or Reduction? Essays on the Prospect of a Non-Reductive Physicalism, ed. by A. Beckermann, H. Flohr and J. Kim (Berlin: Walter de Gruyter) pp. 49-93
Mill, John S. 1930. A System of Logic Ratiocinative and Inductive (London: Longmans, Green and Co.)
Nagel, Ernest. 1979. The Structure of Science: Problems in the Logic of Scientific Explanation, 3rd edn (Hackett Publishing)
Needham, Paul. 1999. ‘Reduction and abduction in chemistry- a response to Scerri’, International Studies in the Philosophy of Science, 13(2): 169-184
Needham, Paul. 2006. ‘Ontological Reduction: a comment on Lombardi and Labarca’, Foundations of Chemistry, 8(1): 73-80
Needham, Paul. 2009. ‘Reduction and Emergence: A Critique of Kim’, Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 146(1): 93-116
Needham, Paul. 2010. ‘Nagel’s analysis of reduction: Comments in defence as well as critique’, Studies in History and Philosophy of Modern Physics, 41: 163-170
Oppenheim, P., and H. Putnam. 1958. ‘The unity of science as a working hypothesis’, in Minnesota Studies in the Philosophy of Science vol. 2, ed. by H. Feigl et al. (Minneapolis: Minnesota University Press) pp.3-36
Palgrave Macmillan Ltd. 2004. Dictionary of Physics (UK: Palgrave Macmillan)
Primas, Hans. 1983. Chemistry, Quantum Mechanics and Reductionism, 2nd edn (Berlin: Springer)
Putnam, Hilary. 1965. ‘How Not to Talk About Meaning’, in Boston Studies in Philosophy of Science, vol. II, ed. by R.S. Cohen and M. Wartofsky (New York: Humanities press) pp. 206-207
Putnam, Hilary. 1981. Reason, Truth and History. Cambridge: Cambridge University Press
Ramsey, Jeffry L. 1997. ‘Molecular Shape, Reduction, Explanation and Approximate Concepts’, Synthèse, 111: 233–251
Scerri, Eric. 1991. ‘The Electronic Configuration Model, Quantum Mechanics and Reduction’, British Journal for the Philosophy of Science, 42: 309-325
Scerri, Eric. 1994. ‘Has Chemistry Been at Least Approximately Reduced to Quantum Mechanics?’, PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 1994: 160-170
Scerri, Eric. 1998. ‘Popper’s naturalised approach to the reduction of chemistry’, International Studies in the Philosophy of Science, 12(1): 33-44, DOI: 10.1080/02698599808573581
Scerri, Eric. 2006. ‘Normative and Descriptive Philosophy of Science and the Role of Chemistry’, in Philosophy of Chemistry: Synthesis of a New Discipline, ed. by Davis Baird, Eric Scerri and Lee McIntyre, Boston Studies in the Philosophy of Science, Vol. 242 (Dordrecht: Springer) pp.119- 128
Scerri, Eric. 2007a. ‘Reduction and Emergence in Chemistry- Two Recent Approaches’, Philosophy of Science, 74(5): 920-31
Scerri, Eric. 2007b. ‘The Ambiguity of Reduction’, HYLE, 13(2): 67-81
Scerri, Eric. 2012a. ‘Top-down causation regarding the chemistry-physics interface: a sceptical view’, Interface Focus, 2: 20-25
Scerri, Eric. 2012b. ‘What is an element? What is the periodic table? And what does quantum mechanics contribute to the question?’, Foundations of Chemistry, 14: 69-81
Scerri, E., and G. Fisher (ed.). 2015. Essays in the Philosophy of Chemistry (Oxford: Oxford University Press)
Schummer, Joachim. 1998. ‘The Chemical Core of Chemistry I: A Conceptual Approach’, HYLE- International Journal for Philosophy of Chemistry, 4: 129-162
Schummer, Joachim. 2014a. ‘Editorial: Special Issue on ‘General Lessons from Philosophy of Chemistry’ on the occasion of the 20th Anniversary of HYLE’, HYLE- International Journal for Philosophy of Chemistry, 20: 1-10
Schummer, Joachim. 2014b. ‘The Methodological Pluralism of Chemistry and Its Philosophical Implications’, in Philosophy of Chemistry: Growth of a New Discipline, ed. by Eric Scerri E. and Lee McIntyre (Dordrecht: Springer) pp.57-72
Seifert, Vanessa A. 2017. ‘An alternative approach to unifying chemistry with quantum mechanics’, Foundations of Chemistry, 19(3): 209-222
Sutcliffe B. T., and R. G. Woolley 2012. ‘Atoms and Molecules in Classical Chemistry and Quantum Mechanics’, in Handbook of the Philosophy of Science. Volume 6: Philosophy of Chemistry, ed. by Robin F. Hendry, Paul Needham and Andrea I. Woody (Amsterdam: Elsevier BV) pp. 387-426
Tapia, O. 2006. ‘Can Chemistry Be Derived from Quantum Mechanics? Chemical Dynamics and Structure’, Journal of Mathematical Chemistry, 39(3/4): 637-639
van Brakel, Jaap. 1999. ‘On the Neglect of the Philosophy of Chemistry’, Foundations of Chemistry, 1(2): 111-174
van Brakel, Jaap. 2000. Philosophy of Chemistry. Between the Manifest and the Scientific Image (Leuven: Leuven University Press)
van Brakel, Jaap. 2014. ‘Philosophy of Science and Philosophy of Chemistry’, HYLE- International Journal for Philosophy of Chemistry, 20: 11-57
van Riel, Raphael. 2011. ‘Nagelian Reduction beyond the Nagel Model’, Philosophy of Science, 78(3): 353-375
Weininger, Stephen J. 1984. ‘The Molecular Structure Conundrum: Can Classical Chemistry Be Reduced to Quantum Chemistry?’, Journal of Chemical Education, 61: 939-944
Weisberg, Michael. 2008. ‘Challenges to the Structural Conceptions of Chemical Bonding’, Philosophy of Science, 75: 932–46
Wilson, Jessica. 2015. ‘Metaphysical Emergence: Weak and Strong’, in Metaphysics in Contemporary Physics, ed. by Tomasz Bigaj and Christian Wüthrich (Poznan Studies in the Philosophy of the Sciences and the Humanities) pp. 345-402
Woolley, R. G. 1976. ‘Quantum theory and molecular structure’, Advances in Physics, 25(1): 27-52
Woolley, R. G. 1978. ‘Must a Molecule Have a Shape?’, Journal of the American Chemical Society, 100(4): 1073-1078
Woolley, R. G. 1985. ‘The Molecular Structure Conundrum’, Journal of Chemical Education, 62: 1082-1085
Woolley, R. G. 1991. ‘Quantum Chemistry Beyond the Born-Oppenheimer Approximation’, Journal of Molecular Structure (Theochem), 230: 17-46
Woolley, R. G. 1998. ‘Is there a quantum definition of a molecule?’, Journal of Mathematical Chemistry, 23: 3–12
Woolley, R. G., and B. T. Sutcliffe. 1977. ‘Molecular Structure and the Born-Oppenheimer Approximation’, Chemical Physics Letters, 45(2): 393-398
Woody, Andrea I. 2000. ‘Putting Quantum mechanics to Work in Chemistry: The Power of Diagrammatic Representation’, Philosophy of Science, 67: S612-S627

Author Information

Vanessa A. Seifert
Email: vs14902@bristol.ac.uk
University of Bristol
United Kingdom

The Axiology of Theism

The existential question about God asks whether God exists, but the axiology of theism addresses the question of what value-impact, if any, God’s existence does (or would) have on our world and its inhabitants. There are two prominent answers to the axiological question about God. Pro-theism is the view that God’s existence does (or would) add value to our world. Anti-theism, by contrast, is the view that God’s existence does (or would) detract from the value of our world. Philosophers have observed that the answer to the axiological question may vary depending on its target and scope. For instance, assessments of God’s value-impact could be made from an impersonal perspective without reference to individuals, or from a personal perspective with reference to the value-impact of God only for a particular person or persons. Axiological assessments can also take into account one, some, or all of the purported advantages and downsides of God’s existence.

No general consensus has emerged in the literature regarding the correct answer(s) to the axiological question about God. Some philosophers argue that the answer to the question is obvious, or that the very question itself is unintelligible. For instance, it might be unintelligible to the many theists who hold that if God does not exist then nothing else would exist. On that view, it is impossible to compare a world with God to a world without God. The most promising argument in support of anti-theism in the literature is the Meaningful Life Argument, which suggests that God’s existence would make certain individuals’ lives worse, for those individuals have life plans so intimately connected with God’s non-existence that their lives would lose meaning if it turned out that God exists. The most promising argument for pro-theism is best understood as a cluster of arguments pointing to many of the purported advantages of God’s existence, including divine intervention (that is, God performing miracles that help people) and the impossibility of gratuitous evil on theism. Additionally, some pro-theists claim that since God is infinitely good, any state of affairs with God is also infinitely good. To date, the literature has focused on comparing the axiological value of theism (especially Christianity) to atheism (especially naturalism). Future work will likely include axiological assessments of other religious and non-religious worldviews.

1. The Axiological Question about God
2. Is the Axiological Question Intelligible?
3. Different Answers to the Axiological Question
4. Arguments for Pro-Theism
5. Arguments for Anti-Theism
    a. The Meaningful Life Argument
    b. The Goods of Atheism Argument
6. Connections to the Existence of God
7. Future Directions
    a. Exploration of Different Answers
    b. Other Worldviews
8. References and Further Reading

1. The Axiological Question about God

A perennial topic in the philosophy of religion is the existential question of whether God exists. Arguments in support of theism include the ontological, cosmological, teleological, and moral arguments. Arguments in support of atheism, on the other hand, include the arguments from evil, from no best world, and from divine hiddenness. Many of these arguments and topics have a rich philosophical history and sophisticated versions of them continue to be discussed in the literature. The importance of the existential question is obvious: God’s existence is tied to the truth value of the theistic religions. It is of little surprise, then, that philosophers of religion have spilled so much ink over these topics.

This article does not discuss the existential question of whether God exists. Rather, it examines the axiological question about the value-impact of God’s existence. Some brief remarks by Thomas Nagel are often credited as the starting point in the literature (Kahane 2011, 679; Kraay and Dragos 2013, 159; Penner 2015, 327). In his book The Last Word, Thomas Nagel quips: “I hope there is no God! I don’t want there to be a God; I don’t want the universe to be like that” (1997, 130). Nagel is an atheist who thinks he is rational in his atheism. He thinks that in light of the evidence, atheism is the correct answer to the existential question about God. Yet here he expresses a desire or preference for the non-existence of God. Reflections on this brief quote from Nagel have led to the emergence of discussion about the axiological question in the philosophy of religion. While it is clear Nagel is expressing a preference, philosophers initially wanted to know whether it could be developed into an axiological position.

One interesting aspect of this question is that it seems to be conceptually distinct from the existential question about God. For instance, it seems perfectly consistent for an atheist who denies that God exists to simultaneously believe that God’s existence would be good, though some have denied this claim (for example, Schellenberg 2018). It also seems consistent for a theist who is convinced that God exists to hold that there are negative consequences of God’s existence. Finally, it’s worth pointing out that the axiological question has come to be understood as a comparative question about the difference in value between different possible worlds or states of affairs (that is, between God worlds and God-less worlds).

2. Is the Axiological Question Intelligible?

In explaining what the axiological question is asking, Guy Kahane writes in an early and influential piece that

We are not asking theists to conceive of God’s death—to imagine that God stopped existing. And given that theists believe that God created the universe, when we ask them to consider His inexistence we are not asking them to conceive an empty void […] I will understand the comparison to involve the actual world [where God exists] and the closest possible world where [God does not exist] (Kahane 2011, 676).

While this makes clear the relevant comparison that Kahane and others have in view, some have suggested that the axiological question itself is unintelligible (Kahane 2012, 35-37; Mugg 2016). The worry is based on the fact that on a standard (Lewis/Stalnaker) semantics, counterpossibles, that is, counterfactual conditionals with impossible antecedents, are trivially true. God is typically understood as a necessary being. This means that if God exists, then God exists in every possible world (that is, in every possible state of affairs). Given this, the statement ‘God does not exist’ is necessarily false on theism, and any counterfactual with that statement as its antecedent is a counterpossible. Now, consider the following conditional: If God did not exist, then the world would be better (or worse). Given theism, this conditional is trivially true whichever consequent is chosen, because there is no way that the antecedent could be true while the consequent is false. This is because there is no way for the antecedent to be true on theism. If this worry is correct, then cross-world axiological judgements are uninformative at best, and possibly unintelligible or impossible at worst. Notice that the same applies to atheism if the view in mind has it that there is no possible world in which God exists (that is, necessitarian atheism, the view that God necessarily does not exist).
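
The triviality worry can be set out schematically. The following is only a sketch: the symbols G (“God exists”), B (“the world is better”), and W (the set of possible worlds) are labels assumed here rather than taken from the literature cited, and the truth condition is stated in a Lewis-style form. Stalnaker’s formulation differs in detail but also makes conditionals with impossible antecedents come out true.

```latex
% Requires amsmath and amssymb. All symbols below are assumed labels for this sketch:
% G = "God exists", B = "the world is better", W = the set of possible worlds,
% A \boxright C = "if A were the case, C would be the case".
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}
\begin{align*}
& A \boxright C \text{ is true at } w \iff \text{either (i) no world in } W \text{ makes } A \text{ true, or} \\
& \qquad \text{(ii) some world where } A \wedge C \text{ holds is closer to } w
  \text{ than any world where } A \wedge \neg C \text{ holds.} \\[4pt]
& \text{Necessitarian theism: } \Box G, \text{ so no world in } W \text{ makes } \neg G \text{ true.} \\
& \text{By clause (i): } \neg G \boxright B \text{ is true, and } \neg G \boxright \neg B \text{ is true as well.}
\end{align*}
```

Because both conditionals come out true, the semantics by itself gives the necessitarian theist no basis for preferring one axiological verdict to the other. The same reasoning applies to the necessitarian atheist, with G and its negation exchanged.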

One approach to this objection suggests that this type of axiological comparison is possible as a result of a process called cognitive decoupling. This occurs when an agent extracts information from a representation and then performs computations on it in isolation. Certain information is ‘screened off’ and thus not used in the reasoning process. Likewise, “[t]hose beliefs that are allowed into the reasoning process, along with suppositions, are ‘cognitively quarantined’ from the subject’s beliefs” (Mugg 2016, 448). Consider:

Bugs Bunny might pick up a hole off the ground and throw it on a wall. It is not metaphysically possible to pick up a hole, but we are able to suppose that Bugs has picked up the hole and recognize that Bugs can now jump through the wall. Thus, we can imagine an impossible state of affairs and make judgments about what would obtain within that state of affairs. In representing the impossible state of affairs, we screen out those beliefs that would lead to outright contradiction (Mugg 2016, 449).

In this context, cognitive decoupling occurs in situations in which, “when considering a counterfactual, subjects can screen out those beliefs that (with the antecedent of the counterfactual) imply contradictions” (Mugg 2016, 449). A theist who holds that God necessarily exists could address the axiological question by engaging in cognitive decoupling. This means that when addressing the axiological question, she ‘screens off’ her belief that God necessarily exists (and conversely, a necessitarian atheist could screen off her belief that God necessarily doesn’t exist). This proposal raises a number of questions, including how we can be confident that we have ‘screened off’ the appropriate beliefs, and also whether the comparison made when engaging in cognitive decoupling is relevantly similar to the real-world comparison needed to answer the axiological question.

Another proposal for dealing with this objection suggests that this worry about counterpossibles arises only when the comparison in question is understood as one between metaphysically possible worlds. But, so the proposal goes, when the relevant comparison is one between epistemically possible worlds, the counterpossible problem doesn’t apply (Mawson 2012; see also Chalmers 2011). After all, the theist who believes that God exists of metaphysical necessity holds that there are no metaphysically possible worlds where God doesn’t exist. But for a state of affairs to be epistemically possible for such a theist, she only needs to concede that it could obtain, for all she knows. Thus, the theist just needs to concede that, for all she knows, God may not exist. A helpful analogy comes by way of reflecting on the idea that water is H₂O. While there are no metaphysically possible worlds where water is not H₂O, for all one knows, water is not H₂O. Hence, there are epistemically possible worlds where water is not H₂O (Chalmers 2011, 60-62). For all the necessitarian theist knows, atheism is true, while for all the necessitarian atheist knows, theism is true. Thus, regardless of whether the comparison between metaphysically possible worlds is intelligible, the comparison between epistemically possible worlds is perfectly intelligible.

Yet another reply to the counterpossible problem holds that value can intelligibly be assigned to metaphysical impossibilities (Kahane 2012, 36-37). For if it is possible to assign a value to a metaphysical impossibility, then perhaps the theist who thinks that atheism is metaphysically impossible could still assign a value to the relevant counterpossibles. Consider, for instance, that a mathematical proof could rightly be called beautiful or elegant even if it turns out to be invalid. Of course, it’s controversial whether it’s appropriate to talk of the beauty of an invalid proof. If such judgments turn out not to be appropriate, then many of our value assignments will be merely apparent, not factual (Kahane 2012, 37). We will think we are making a factual value judgment when in fact we are not.

To conclude this section, it’s worth noting that the literature on the axiology of theism often treats rational preference as supervening on axiological judgments (that are understood to be objective). But it is an open question whether an agent’s rational preference need always correspond to correct axiological judgments. Perhaps it could be rational for an agent to prefer a worse state of affairs to a better one, or to disprefer a better state of affairs to a worse one. Kahane (2011) appears to think this is a genuine possibility. I won’t dwell on this issue, but it’s worth keeping in mind as one explores this topic. We’re now in a position to examine different answers that can be proposed to the axiological question.

3. Different Answers to the Axiological Question

While some have attempted to address worries about the intelligibility of the axiological question, many philosophers have simply proceeded directly to attempting to answer the question (presumably because they are either unaware of the problem or implicitly assume that it has a reasonable solution). No consensus as to the correct answer to the axiological question has emerged in the literature (and seems unlikely to anytime soon). What has become clear, though, is that there are a great number of different possible answers one could offer to the axiological question.

The two main general positions that have been taken up in the literature are pro-theism and anti-theism. Pro-theism is, roughly, the view that it would be good if God were to exist. Anti-theism, on the other hand, holds that it would be bad if God were to exist. There are, however, other potential answers which haven’t received as much attention. For instance, the neutralist about the axiological question holds that God’s existence has (or would have) a neutral impact on the value of the world. The quietist holds that the axiological question cannot (in principle) be answered. Finally, the agnostic holds that the axiological question might be answerable, but we are currently unable to answer it. Much more remains to be said about the plausibility of these three latter positions. (For more on these answers see Kraay 2018, 10-18.)

There are numerous specific variants of these answers to the question. There is a difference between personal and impersonal judgements about the axiological question. The former focus on the axiological implications of God’s existence with respect to individual persons, while the latter focus on such implications without any reference to God’s value-impact on persons. Additionally, there are narrow and broad (or wide) judgements about the axiological question. The former refer to just one advantage (or downside) of God’s existence (or non-existence), while the latter refer to the axiological consequences of God’s existence or non-existence overall. These two distinctions (personal/impersonal and narrow/broad) combine to form at minimum sixty possible answers to the axiological question when applied to the five general answers stated above and to the three existential positions of theism, atheism, and agnosticism. Klaas J. Kraay’s (2018, 9) helpful chart enables us to visualize all of these different possibilities:

Axiological Positions

              Pro-Theism                          Anti-Theism   Neutralism   Agnosticism   Quietism
              Impersonal        Personal
              Narrow   Wide     Narrow   Wide
Theism
Atheism
Agnosticism

The first column contains all of the sub-divisions relevant to pro-theism. The other general answers can be subdivided in precisely the same way. Likewise, inasmuch as there are additional general answers to the axiological question beyond the five offered here, this chart will increase in size. These distinctions are important for a number of different reasons. For instance, later we will see that some have claimed that defending wide personal/impersonal anti-theism is a very difficult, if not impossible, task. Another interesting idea that has emerged in the literature thus far is that someone can be a narrow personal anti-theist and a wide personal/impersonal pro-theist (Lougheed 2018c). In other words, someone could hold that it would be a bad thing for her, in certain respects, if God exists, while acknowledging that it would be a good thing overall if God exists.
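
The figure of sixty possible answers can be reconstructed with some simple arithmetic. This is a sketch that assumes the count comes from crossing the two perspectives (personal, impersonal) and the two scopes (narrow, wide) with the five general answers and the three rows of the chart (theism, atheism, agnosticism). The category labels in the block are introduced here for illustration.

```latex
% Requires amsmath. The labels are assumed names for the chart's dimensions.
\begin{align*}
\text{perspectives} \times \text{scopes} &= 2 \times 2 = 4 \\
4 \times \text{general answers} &= 4 \times 5 = 20 \\
20 \times \text{existential positions} &= 20 \times 3 = 60
\end{align*}
```

On this reconstruction, each further general answer added to the chart would contribute another twelve cells (four sub-divisions across three rows).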

4. Arguments for Pro-Theism

This section outlines three different considerations that speak in favour of pro-theism.

a. The Infinite Value Argument

One argument for pro-theism appeals to the idea that God is infinitely valuable (for discussion see Van Der Veen and Horsten 2013). The thought is that if God is infinitely valuable, then any world in which God exists is itself infinitely valuable, because God confers infinite value on every world in which God exists. From this it follows that any theistic world is more valuable than an atheistic world (or at least not worse if atheistic worlds can be infinitely valuable). There are at least two areas in need of further development regarding this line of argument. First, more work has to be done to show how God’s infinite value can sensibly be thought to make a world (assuming theism is true) infinitely valuable. There is a vast literature on the divine attributes, but the idea of God’s infinite value has been neglected (at least in the contemporary literature). What is it to say God is infinite? How is an abstract concept, infinity, supposed to accurately describe God’s value? Second, the claim that all theistic worlds have the same infinitely high value appears to violate very basic modal and moral intuitions. Consider two worlds in which God exists, one of which includes a genocide that the other does not, and which are otherwise identical. Surely the world without the genocide is, all else being equal, axiologically superior to the one that includes it, yet on the present argument both worlds have the same infinite value.
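
The second difficulty can be illustrated with a toy calculation. It is a sketch under strong assumptions that the argument’s defenders need not accept: that the value of a world is the sum of God’s value and the value of everything else in it, and that this sum is computed in the extended real numbers. The symbols V, a, and b are introduced here purely for illustration.

```latex
% Requires amsmath. Assumptions of this sketch: world value is additive, and
% infinite value behaves like +\infty in the extended real numbers.
% w_1 = the world without the genocide, w_2 = the otherwise identical world with it.
\begin{align*}
V(w) &= V_{\text{God}} + V_{\text{rest}}(w), \qquad V_{\text{God}} = +\infty \\
V(w_1) &= +\infty + a = +\infty \\
V(w_2) &= +\infty + b = +\infty \qquad \text{for any finite } a > b
\end{align*}
```

On these assumptions the finite difference between a and b disappears from the totals, which is why the two worlds come out tied. A defender of the argument might respond by rejecting additivity or by adopting a different way of comparing infinite values.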

b. The Morally Good Agents Argument

The Morally Good Agents Argument is another argument in favour of pro-theism. Here is a thought experiment motivating this argument. Imagine that Carl’s car breaks down on the highway. Carl has no phone to call for help, and he doesn’t know anything about car mechanics. First, consider a case in which Susan, a morally good agent, discovers Carl on the side of the highway and offers help. She calls a tow truck for Carl, and when she discovers Carl doesn’t have his wallet, she pays for the tow herself. Second, consider a case in which no one pulls over to assist Carl. He attempts to flag down cars, but no one stops. Although Carl is in poor health, he has no choice but to attempt to walk to the nearest gas station for assistance. These two cases are designed to show that morally good agents tend to add value to states of affairs. If the point generalizes, then a world with morally good agents is better than one without such agents, all else being equal (Penner and Lougheed 2015, 56).

Now consider two additional scenarios. Imagine that George sees Carl attempting to flag down vehicles. George attempts to pull over in order to assist Carl, but his brakes fail and he crashes into Carl, killing him on impact. Or consider Tom, who sees a truck crash into Carl’s car and then drive away. Carl’s car is now on fire with Carl trapped inside. Tom calls 911 but knows that the paramedics won’t arrive in time to save Carl. Tom tries to open the door to save Carl, but he isn’t strong enough to pry the bent door open. The idea behind these two additional cases is to acknowledge that morally good agents, despite good intentions, don’t necessarily have the power to do good. Of course, this doesn’t apply to God. Since God is all-powerful, God won’t be constrained or unable to add value to states of affairs in ways that other morally good agents might be constrained. Inasmuch as it makes sense to think that morally good agents add value to states of affairs, God adds value to states of affairs. All else being equal, then, a world with God is better than a world without God (Penner and Lougheed 2015, 57-58).

There are a number of objections to this line of argument which attempt to show that not all else is equal. One reason to think God’s existence isn’t valuable (at least for certain individuals) is based on the idea that God violates everyone’s privacy. If God exists, then there is a sense in which God automatically violates our privacy (that is, if God is all-knowing, then God knows all of our mental states/thoughts). Without a justifying reason to violate a person’s privacy, this is a respect in which God’s existence is a bad thing, for part of what’s involved in people forming trusting relationships with each other is that they choose what information about themselves they reveal. But this type of choice is impossible for individuals to make in the case of God. (The issue of privacy will be discussed further in section 5a below.) The question remains, however, whether this worry, assuming it really is a downside, is enough to outweigh all of the goods associated with theism. Another objection invokes a worry about an inverted moral spectrum. Suppose that what we think is good is actually bad according to God, and vice versa. If this is right, then, while it might still be technically true that God is a morally good agent (and adds value), it would make little sense to think we ought to prefer that God exist (Penner and Lougheed 2015, 68).

c. The Goods of Theism Argument

The Goods of Theism Argument represents a family of arguments (some quite informally expressed) that focus on highlighting specific goods of theism. This style of argument need not deny that there are genuine goods associated with atheism. Rather, the goods identified in connection to theism are taken to outweigh any goods associated with atheism. Also, some might acknowledge that these goods need not make it rational for certain individuals, in certain respects, to prefer theism. But, so the thought goes, these goods do show that theism is better than atheism overall.

Various theistic goods that have been identified in the literature include objective meaning or purpose, an afterlife, and cosmic justice. For perhaps only God can be the source of objective meaning, and without God every human life would ultimately be meaningless (Cottingham 2005, 37-57; Metz 2019, 9-21). In addition, theism is often associated with the existence of an afterlife, which is connected to the idea that God’s existence ensures that there will be final justice. Many who are wronged on earth are not compensated for being wronged. Those who perpetrate evil often seem to go unpunished. However, God’s existence is good because God will ensure that everyone receives their due. This could be a logical consequence of the existence of a perfect being. The pro-theist need not be committed to the specific details of how this good is instantiated (Lougheed 2018a).

Perhaps one of the most important putative advantages of theism is that if God exists, there are no instances of gratuitous evil. For many theists hold that the existence of gratuitous evil is logically impossible if God exists (Kraay and Dragos 2013, 166; McBrayer 2010). This is because God would ensure that evil only occurs to achieve some otherwise unobtainable good or that every victim of evil will receive just compensation. Notice that there is no pressure on the pro-theist to explain how certain apparent instances of gratuitous evil are not in fact gratuitous (though this is a problem when defending the existence of God). For the pro-theist is merely claiming that if God exists, then there is no gratuitous evil. She isn’t claiming that in fact there is no gratuitous evil. That there is no gratuitous evil if God exists appears to be a very strong consideration in favour of pro-theism.

One worry for this general line of argument is about whether the goods mentioned here are goods that only obtain on theism. If it could be shown that these goods obtain on atheism (or other religious and non-religious worldviews) then they would be of little help in demonstrating that a world with God is more valuable than one without God (Kahane 2018). A more pressing worry, however, is not whether these goods also obtain on naturalism, but whether theism is exclusively what’s required for them to obtain. Perhaps a very good, very powerful, very knowledgeable being who is only slightly lesser than God could ensure that all the goods in question obtain. If this is right, then theism isn’t required for these goods to obtain. For even if such a being existed, atheism would technically be true since God does not exist in this scenario. This is one area where it becomes problematic for the axiology of theism literature to use ‘naturalism’ and ‘atheism’ interchangeably.

5. Arguments for Anti-Theism

This section examines two important arguments for anti-theism.

a. The Meaningful Life Argument

Perhaps the most widely discussed argument for anti-theism is an argument which has come to be known as the Meaningful Life Argument. Guy Kahane is responsible for first gesturing at this argument, and his discussion is what sparked much recent interest in the axiological question about God. Kahane takes his cue from well-known objections to utilitarianism raised by Bernard Williams. Williams argues that utilitarianism is so demanding that it requires individuals to sacrifice things which give them meaning (1981, 14). The problem, then, is that utilitarianism is so demanding that, to follow it, one’s own life would cease to have meaning (or at least one would have to stop pursuing those things which confer meaning on one’s life). According to Kahane, this worry about utilitarianism has a parallel in the present context: he claims that theism might be too demanding in the way that utilitarianism is too demanding. It could require that certain individuals give up things which confer meaning on their lives. Kahane writes:

If a striving for independence, understanding, privacy and solitude is so inextricably woven into my identity that its curtailment by God’s existence would not merely make my life worse but rob it of meaning, then perhaps I can reasonably prefer that God not exist—reasonably treat God’s existence as undesirable without having to think of it as impersonally bad or as merely setting back too many of my interests. The thought is that in a world where complete privacy is impossible, where one is subordinated to a superior being, certain kinds of life plans, aspirations, and projects cannot make sense… Theists sometime claim that if God does not exist, life has no meaning. I am now suggesting that if God does exist, the life of at least some would lose its meaning (Kahane 2011, 691-692).

This is the first statement of the Meaningful Life Argument. Note that these thoughts only defend narrow personal anti-theism: according to this argument, it would be worse, in certain respects and for certain individuals, if it turns out that God exists.

The merits of this argument have been debated. For instance, it has been objected that we are often mistaken about what constitutes a meaningful life (Penner 2015, 335). Consider that we often pursue some end thinking it will fulfill us. But when we achieve that end, we often find we are no more fulfilled than we were before. In other words, we often end up thinking we’ve pursued the wrong end. Since we’re highly fallible with respect to what goods contribute to a meaningful life, we should not be confident in using such judgements to support personal anti-theism. Others have countered that for this objection to succeed, one would have to deny that the goods Kahane mentions such as independence, understanding, privacy, and solitude could contribute to an individual’s meaningful life (Lougheed 2017). But most of us don’t want to deny that these are goods. Still, it seems likely that there are quantitative and qualitative differences between how these goods are instantiated on theism compared to atheism. It remains to be seen whether such differences can be articulated in a way that answers the objection and hence supports personal anti-theism.

Additionally, it has been observed from the very beginning of the debate over the Meaningful Life Argument that for a good like privacy to be successfully harnessed in support of anti-theism, it needs to be shown that it is intrinsically valuable, yet little has been said in this regard (Kahane 2011, 684). Something is intrinsically valuable if it is valuable in and of itself. Something is extrinsically valuable if it is valuable only because of what we can get from it. Consider that if privacy is only extrinsically valuable, it might turn out not to matter if God violates our privacy, that is, if God always knows where we are, what we are doing, and what we are thinking. Also, consider that this issue is at the very heart of whether personal forms of anti-theism can be defended. For if the anti-theist and pro-theist both agree that privacy is intrinsically valuable, then in order to defend personal anti-theism, it need only be shown that God violates our privacy (as opposed to also explaining why it matters if our privacy is violated). Thus, providing a case for why goods associated with atheism such as privacy are intrinsically valuable would greatly strengthen the case for narrow personal anti-theism.

Finally, a closely related but less developed argument for anti-theism appeals to considerations about dignity to defend personal anti-theism (Kahane 2011, 688-689). Imagine that parents decide to have a child merely in order for the child to become an accomplished musician, or professional athlete, or simply for more help on the farm. The idea here is that a child should have the freedom to choose their own life path. A parent should support a child in doing this inasmuch as possible (and inasmuch as the life path in question is morally permissible). To have a child in order to fulfill some end other than their own fundamentally violates the dignity of the child. It treats the child as a means rather than solely as an end (Lougheed 2017, 350-351). The parallel case, of course, is supposed to be with respect to God’s relationship with humans. Many theistic traditions hold that humans were created solely to fulfill God’s purposes for them. If this is true, then humans aren’t permitted to pursue their own ends; they are obliged to pursue the ends God has set for them. Hence, the existence of God violates the dignity of humans. The next step in developing this line of argument is to provide more details about the conception of dignity this argument requires in order to be successful (Lougheed 2017, 351).

b. The Goods of Atheism Argument

The Goods of Atheism Argument emerged after the Meaningful Life Argument, and it is also best understood as a cluster of arguments. It has been observed that goods associated with atheism need not necessarily be connected to meaning in order to justify narrow personal anti-theism. With respect to goods such as privacy, autonomy, and understanding, it has seemed to some that a world without God could be better for certain individuals, at least when only considering those specific goods. For if goods such as privacy and autonomy are intrinsically valuable, then they don’t need to be connected to meaning in order to support personal forms of anti-theism (Lougheed 2018c). Of course, given the many advantages associated with theism (for example, no gratuitous evil), it is difficult to see how this line of argument could ever justify broad versions of anti-theism. It also remains an open question whether an individual could value these goods enough to justify personal anti-theism in the absence of their being connected to her life pursuits and hence meaning.

6. Connections to the Existence of God

This section explores connections that have been drawn between the axiological question about God and the existential question of whether God exists.

a. Divine Hiddenness

Most of the work connecting the axiological and existential questions about God concerns the argument from divine hiddenness for atheism. This argument runs roughly as follows. If God exists, then a relationship with God is one of the greatest goods possible. Because of this fact, if God exists there would be no instances of non-culpable, non-resistant, non-belief among those capable of a relationship with God. For belief that God exists is a necessary requirement for a relationship with God. Yet there appear to be instances of non-culpable, non-resistant, non-belief. Or at the very least, it is more likely that such individuals exist than that God exists. Thus, it’s probable that God doesn’t exist (Schellenberg 2006; 2015).

One line of argument in the literature attempts to demonstrate that reflections on the axiological consequences of theism and atheism can be used to object to arguments from divine hiddenness. Assume that an actual good obtaining is axiologically equivalent to the experience of the same good (even when that good doesn’t actually obtain). This is intuitive when one considers that from a first-person perspective there is no difference between a good actually obtaining and the mere experience of that same good (Lougheed 2018a). They’re both experienced in exactly the same way from the first-person perspective. Now consider some goods often used to defend personal forms of anti-theism: privacy, independence, and autonomy. The key move in the argument is to suggest that these atheistic goods can be experienced in a theistic world where God is hidden. For example, consider the atheistic good of total and complete privacy. One can experience this good in a world where God hides. Indeed, many devoutly religious individuals sometimes report feeling alone and unable to feel God’s presence. Likewise, in a world where God hides one also gets many theistic goods. Maybe God intervenes and does a miracle to help someone, but the cause of the help is sufficiently unclear. So, it’s possible to doubt that God performed a miracle, and hence possible to doubt that God exists. Therefore, in a world where God hides, one is able to experience atheistic goods and also the theistic goods since they actually obtain. But atheistic goods cannot be experienced in a world where God isn’t hidden. If God’s existence were obvious (along with some of the divine attributes), for example, then one could not ever have the experience of total and complete privacy (even if it turns out to be, in some sense, an illusion). Finally, in an atheistic world no theistic goods obtain. Thus, a world where God is hidden is axiologically superior to an atheistic world, but more importantly, it’s also superior to a world where God isn’t hidden. These considerations serve to support the idea that God might hide in order to maximize the axiological value of the world (Lougheed 2018a).

One line of thought attempts to complete the axiological solution to divine hiddenness by showing that theistic goods do indeed obtain in a world where God hides. On the one hand, it’s clear that theistic goods obtain in a world where God hides simply because this is a logical consequence of God’s existence. On the other hand, it’s not clear that the experience of theistic goods such as forming a relationship with God, cosmic justice, or the afterlife is the same in both worlds. Indeed, the experience of such goods might be so different that the axiological assessment of them ought to differ too. At best, then, we aren’t in a good position to tell whether a world where God hides is axiologically superior to a world where God isn’t hidden. This suggests that the axiological solution to divine hiddenness is at best incomplete (Lougheed 2018b).

One objection to the axiological solution to divine hiddenness attempts to show that it’s intelligible to say that many of the goods typically associated with theism can be experienced in a world where God does not exist (even if they don’t actually obtain). For instance, an afterlife and divine intervention are goods that could both be experienced in a world where God doesn’t exist (Hendricks and Lougheed 2019). Also consider that a world in which God doesn’t exist is consistent with there being an extremely powerful being who is only slightly less powerful than God. This less powerful being could intervene to help humans and bring an afterlife, and so forth. Such a being might not be possible on naturalism, but it is perfectly consistent with atheism. One of the benefits of the discussion of divine hiddenness and the axiology of theism is that it has brought into focus the goods associated with both theism and atheism, along with how we should understand the value of the experience of such goods. It seems that this is just the beginning of such discussions and much more work remains to be done on this topic.

b. Problem of Evil

One version of the problem of evil, known as the evidential (or probabilistic) problem of evil, suggests that if it’s probable that gratuitous evil exists, then it’s probable that God doesn’t exist. This is because the existence of God is taken to be logically incompatible with the existence of gratuitous evil. Some have suggested that if an individual endorses this or related arguments from evil, then she must also endorse pro-theism. This is because if she accepts the problem of evil then she believes that certain world bad-making properties (for example, gratuitous evil) are incompatible with God’s existence. But if God exists, then those bad-making properties would not exist, and hence the world would be better. So, the atheist who endorses the problem of evil as a reason for atheism must, in order to be consistent, also be a pro-theist (Penner and Arbour 2018).

c. Anti-Theism entails Atheism

Finally, some have argued that if anti-theism is true, then atheism is true. Since God is perfectly good, God must always bring about the better over the worse. However, if anti-theism is true, then there are ways in which God doesn’t always bring about the better. But if God doesn’t always bring about the better over the worse then God doesn’t exist. So, the truth of anti-theism implies the truth of atheism. More strongly, it has been suggested that any negative feature associated with theism (for example, a lack of certain types of privacy) is evidence for atheism. This is because it is logically impossible that there be any negative features associated with a God who is omnibenevolent (Schellenberg 2018).

7. Future Directions

As noted, pro-theism and anti-theism are by far the two broad answers to the axiological question that have received the most attention in the literature to date. Given that much of contemporary philosophy of religion is focused on Christian theism, it isn’t surprising that many of the advantages and drawbacks associated with theism are also most clearly associated with typically Christian conceptions of God. In light of this, it seems that minority views deserve more attention in their own right. Additionally, comparative axiological analyses of other religious and non-religious worldviews would further expand the debate.

a. Exploration of Different Answers

As noted earlier, there are at least three additional answers to the axiological question worthy of further consideration. The first is quietism. One reason to hold quietism was alluded to earlier, in Section 2. The necessitarian theist thinks there are no worlds where God doesn’t exist, and the necessitarian atheist thinks that there are no worlds where God exists. Given these views and given that the axiological question is a question about comparative judgments, one might think that it’s impossible to make the relevant comparison. As mentioned above, one way around this counterpossible worry might be to think of the comparison as one between epistemically possible worlds as opposed to metaphysically possible worlds. Another reason for quietism might be that worlds are somehow fundamentally incommensurable with one another and hence can’t be compared (Kraay 2018, 13). Consider that what makes an apple taste good is wholly different from what makes cheese taste good. It doesn’t make sense to compare them axiologically even though they’re both foods. This is a simple example intended to motivate incommensurability (Kraay 2011; Penner 2014).

The second additional answer to the axiological question is agnosticism. This view holds that while the axiological question is perhaps in principle answerable, we aren’t currently in a good position to discover the answer. Hence, we should suspend judgment about the answer to the axiological question. One way of motivating this view is through scepticism about whether we have all of the relevant information required in order to make cross-world value judgments. Not only that, we might worry that even if we could identify particular good-making and bad-making features of a specific world, we don’t know how to combine those features so as to discover the overall value of that world. So, the agnostic holds that we aren’t in a good position to make value judgments about worlds, though such judgments are in principle possible (Kraay 2018, 10-11).

The third additional answer to the axiological question is neutralism. This involves the claim that God’s existence does not make an axiological difference to worlds. Perhaps God is valuable but shouldn’t be factored into assessments of world value. Or maybe one believes the axiological values of theism and atheism are precisely identical (Kraay 2018, 14). Quietism, agnosticism, and neutralism are surely not the only additional answers to the axiological question, but they represent a starting place for further research into different perspectives on the axiology of theism.

b. Other Worldviews

While the axiological question has only been asked about theism (and atheism), there is no in-principle reason why it couldn’t also be asked about other religious and non-religious worldviews. Indeed, the name ‘axiology of theism’ gives away the rather narrow focus of the literature so far. And it’s narrower still in focusing not just on ‘theism’ in general but on ‘monotheism’ in particular. There are numerous ways the current debate could be expanded. For instance, pantheism considers God and the Universe to be one. The axiological question might not make sense with respect to pantheism (or might need to be reconstructed) since world value apart from God makes little sense if pantheism is true. Panentheism considers the universe to be a proper part of God and thus suffers from a similar worry. Or consider that on a polytheistic religion such as Hinduism the axiological question can be asked with respect to many different Gods. Many of the different deities of Hinduism each have their own unique axiological value. Furthermore, one can explore whether it makes sense to assess the value of each deity separately or whether they need to be assessed together. Finally, consider that it’s far from clear that there is a concept of evil in Buddhism. At the very least, the Buddhist understanding of evil is quite different from how the Judeo-Christian tradition understands it. This brings into focus the question of whether it’s possible to make objective axiological judgments without somehow depending on the values of what one is supposed to be assessing in the first place. These concerns are raised only to show that the axiological question is quite far-ranging, and that much work remains to be done not only in assessing the value of theism and atheism, but also the values of other religious and non-religious worldviews.

8. References and Further Reading

Azadegan, E. (2019) “Antitheism and Gratuitous Evil.” The Heythrop Journal 60 (5): 671-677.
- Argues that personal anti-theism is a form of gratuitous evil.
Cottingham, John. (2005) The Spiritual Dimension: Religion, Philosophy and Human Value. Cambridge: Cambridge University Press.
Chalmers, David. (2011) “The Nature of Epistemic Space.” In Epistemic Modality, Andy Egan and Brian Weatherson (eds.). Oxford: Oxford University Press, pp. 60-106.
- Provides a model of epistemic possibility.
Davis, S.T. (2014) “On Preferring that God Not Exist (or that God Exist): A Dialogue.” Faith and Philosophy 31: 143-159.
- A simply written dialogue discussing different ways of defending both anti-theism and pro-theism.
Dumsday, T. (2016) “Anti-Theism and the Problem of Divine Hiddenness.” Sophia 55: 179-195.
Hedberg, T., and Huzarevich, J. (2017) “Appraising Objections to Practical Apatheism.” Philosophia 45: 257-276.
Hendricks, P. and Lougheed, K. (2019) “Undermining the Axiological Solution to Divine Hiddenness.” International Journal for Philosophy of Religion 86: 3-15.
- Argues that theistic goods could be experienced in a world where God doesn’t exist.
Kahane, G. (2011) “Should We Want God to Exist?” Philosophy and Phenomenological Research 82: 674-696.
- This paper is responsible for starting the axiology of theism literature and is the first statement of the Meaningful Life Argument for anti-theism.
Kahane, G. (2012) “The Value Question in Metaphysics.” Philosophy and Phenomenological Research 85: 27-55.
Kahane, G. (2018) “If There Is a Hole, It Is Not God-Shaped.” In Kraay, K. [Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 95-131.
- Argues that God isn’t required to get many of the theistic goods mentioned by pro-theists.
Kraay, K.J. Ed. (2018) Does God Matter? Essays on the Axiological Consequences of Theism. Routledge.
- This is the only edited collection on the axiological question and contains essays addressing a wide variety of issues from well-known philosophers of religion.
Kraay, K.J. (2018). “Invitation to the Axiology of Theism.” In Kraay, K.J.[Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 1-36.
- An extremely detailed survey chapter of the current debate including helpful prompts for further discussion.
Kraay, K.J. (2011) “Incommensurability, Incomparability, and God’s Choice of a World.” International Journal for Philosophy of Religion 69 (2): 91-102.
Kraay, K.J. and Dragos, C. (2013) “On Preferring God’s Non-Existence.” Canadian Journal of Philosophy 43: 153-178.
- Responsible for identifying many of the more fine-grained answers to the axiological question.
Linford, D. and Megill, J. (2018) “Cognitive Bias, the Axiological Question, and the Epistemic Probability of Theistic Belief.” In Ontology of Theistic Beliefs: Meta-Ontological Perspectives. Ed. Mirosław Szatkowski. Berlin: de Gruyter.
Lougheed, K. (2017) “Anti-Theism and the Objective Meaningful Life Argument.” Dialogue 56: 337-355.
- Defends the Meaningful Life Argument against Penner (2015).
Lougheed, K. (2018a) “The Axiological Solution to Divine Hiddenness.” Ratio 31: 331-341.
- Argues that a world where God hides is more valuable than a world where God’s existence is obvious and a world where God doesn’t exist.
Lougheed, K. (2018b) “On the Axiology of a Hidden God.” European Journal for Philosophy of Religion 10: 79-95.
- Argues that we cannot tell whether a world where God hides is more valuable than a world where God’s existence is obvious.
Lougheed, K. (2018c). “On How (Not) to Argue for the Non-Existence of God.” Dialogue: Canadian Philosophical Review 1-23.
- Argues that pro-theism is not easier to defend than anti-theism.
Luck, M. and Ellerby, N. (2012) “Should we Want God Not to Exist?” Philo 15: 193-199.
Mawson, T. (2012) “On Determining How Important it is Whether or Not there is a God.” European Journal for Philosophy of Religion 4: 95-105.
McBrayer, J. (2010). “Skeptical Theism.” Philosophy Compass 5: 611-623.
McLean, G.R. (2015) “Antipathy to God.” Sophia 54: 13-24.
Metz, T. (2019). God, Soul and the Meaning of Life. Cambridge University Press.
- An introduction to different theories of what constitutes a meaningful life.
Mugg, Joshua (2016) “The Quietist Challenge to the Axiology of God: A Cognitive Approach to Counterpossibles.” Faith and Philosophy 33: 441-460.
- Applies a theory from the philosophy of mind to solve the worries about whether the axiological question is intelligible.
Penner, M.A. (2018) “On the Objective Meaningful Life Argument: A Reply to Kirk Lougheed.” Dialogue 57: 173-182.
- Replies to Lougheed (2017).
Penner, M.A. (2015) “Personal Anti-Theism and the Meaningful Life Argument.” Faith and Philosophy 32: 325-337.
- Develops Kahane (2011) into a more detailed version of the Meaningful Life Argument for anti-theism, but ultimately rejects it.
Penner, M.A. (2014) “Incommensurability, incomparability, and rational world-choice.” International Journal for Philosophy of Religion 75 (1): 13-25.
Penner, M.A. and Arbour, B.H. (2018) “Arguments from Evil and Evidence for Pro-Theism.” In Kraay, K.J. [Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 192-202.
Penner, M.A. and Lougheed, K. (2015) “Pro-Theism and the Added Value of Morally Good Agents.” Philosophia Christi 17: 53-69.
- Argues that God’s existence adds value to the world since God is a morally good agent.
Rescher, N. (1990) “On Faith and Belief.” In Human Interests. Stanford: Stanford University Press, 166-178.
- The first time the axiology of God’s existence is explicitly mentioned in the contemporary literature.
Schellenberg, J.L. (2006). Divine Hiddenness and Human Reason. Cornell University Press.
- This book represents the first statement of the argument from divine hiddenness as discussed in the contemporary literature.
Schellenberg, J.L. (2015) The Hiddenness Argument: Philosophy’s New Challenge to Belief in God. Oxford University Press.
- A statement on divine hiddenness intended to be accessible to a wide audience.
Schellenberg, J.L. (2018) “Triple Transcendence, the Value of God’s Existence, and a New Route to Atheism.” In Kraay, K.J.[Ed.] Does God Matter? Essays on the Axiological Consequences of Theism. Routledge, 181-191.
Van Der Veen, J. and Horsten, L. (2013) “Cantorian Infinity and Philosophical Concepts of God.” European Journal for Philosophy of Religion 5: 117-138.
Williams, B. (1981) “Persons, Character and Morality,” in Moral Luck. Cambridge: Cambridge University Press.

Author Information

Kirk Lougheed
Email: philosophy@kirklougheed.com
Concordia University of Edmonton
Canada

John Wisdom (1904-1993)

Between 1930 and 1956, John Wisdom set the tone in analytic philosophy in the United Kingdom. Nobody expressed this better than J. O. Urmson in his Philosophical Analysis: Its Development Between the Two World Wars (1956) where, after Bertrand Russell and Ludwig Wittgenstein, Wisdom is the most frequently quoted philosopher. Wisdom was the leading figure of the Cambridge School of Therapeutic Analysis (which included other thinkers such as B. A. Farrell, G. A. Paul, M. Lazerowitz, and Norman Malcolm); the other major British school of analytic philosophy was that of ordinary language philosophy centered primarily at Oxford University.

Wisdom adopted the positions of both G. E. Moore and Wittgenstein, but he rejected the radical critique of metaphysics levelled by the Wittgenstein-inspired Vienna Circle. In contrast to Wittgenstein, Wisdom was not a philosopher of language: he maintained that most significant philosophical problems originate not with language but, in the first instance, as a result of our encounter with problems of the real world. From this standpoint, Wisdom introduced into analytic philosophy the discourse on the meaning of life and on problems of philosophy of religion. Be this as it may, prior to the appearance of Wittgenstein’s Philosophical Investigations (1953), Wisdom’s published works were read as indicators of the directions that Wittgenstein’s thought was taking following the latter’s return to philosophy in 1929.

By the 1960s, Wisdom’s influence had radically diminished. This was due largely to the ascendancy of exact philosophy of language and analytic metaphysics. This development, together with increasing emphasis on the power of scientific knowledge and its techniques, largely overshadowed the exploration of philosophical puzzles, human understanding (“apprehension”), and techniques of deliberation, which were Wisdom’s three chief theoretical concerns.

1. Biography
2. Interpretation, Analysis, and Incomplete Symbols
  a. Interpretation and Analysis
  b. The Task of Analytic Philosophers
3. Logical Constructions
4. The Metaphysical Turn
5. Other Minds
6. What is Philosophy?
7. Philosophy of Religion
8. References and Further Reading
  a. Primary Sources
    i. Books
    ii. Papers
  b. Secondary Sources

1. Biography

(Arthur) John Terence Dibben Wisdom was born to the family of a clergyman in Leyton, Essex, on December 9, 1904. He attended the Aldeburgh Lodge School and the Monkton Combe School in Somerset. In 1921, he became a member of Fitzwilliam House, Cambridge, where he read philosophy and attended lectures by G. E. Moore, C. D. Broad, and J. M. E. McTaggart. Wisdom received his Bachelor of Arts in 1924, after which he worked for five years at the National Institute of Industrial Psychology. In 1929, he married the South African singer Molly Iverson. The couple had a son, Thomas, born in 1932, before separating during the Second World War. Between 1929 and 1934, Wisdom was a Lecturer in the Department of Logic and Metaphysics at the University of St. Andrews and a colleague of G. F. Stout. After the publication of his Interpretation and Analysis (1931) and the series of five articles on “Logical Constructions” (1931-1933), Wisdom was named Lecturer in Philosophy at Cambridge and a Fellow of Trinity College. This afforded him the opportunity to acquire firsthand knowledge of Ludwig Wittgenstein’s philosophical work.

Between 1948 and 1950, Wisdom delivered two series of Gifford Lectures on “The Mystery of the Transcendental” and “The Discovery of the Transcendental” that were never published (Ayers 2004). In 1950, Wisdom married Pamela Elspeth Stain, a painter. From 1950 to 1951, he served as president of the Aristotelian Society. In 1952, he was named Professor in Philosophy at Cambridge. Following his retirement from Cambridge in 1968, Wisdom spent four years teaching at the University of Oregon. Wisdom returned to Cambridge in 1972, and six years later was elected Honorary Fellow of Fitzwilliam College. He died in Cambridge on September 12, 1993.

2. Interpretation, Analysis, and Incomplete Symbols

a. Interpretation and Analysis

In his first book in 1931, Wisdom maintains that interpretation and analysis are two kinds of definition. Interpretation is a one-act paraphrase of a word or a phrase, a presentation of its meaning that remains at the same “level,” as when one links a word to its synonyms. By contrast, analysis “unpacks” the meaning at a deeper level (1931, p. 17). St. Augustine effectively captured the difference between interpretation and analysis in his famed reply to the question “What is time?”: “I know well enough what it is, provided that nobody asks me; but if I am asked what it is and try to explain, I am baffled” (Confessions, Book 11). Wisdom reads Augustine as communicating that he knows the interpretation of “time” but not its analysis. Problems arise because the two forms of definition are often difficult to distinguish in practice since elements of analysis tend to find their way into interpretations, with the result that the two categories sometimes overlap (p. 17).

A central theme in Interpretation and Analysis is Jeremy Bentham’s notion of fictitious entities. According to Bentham:

A fictitious entity is an entity to which, though by the grammatical form of the discourse employed in speaking of it, existence be ascribed, yet in truth and reality existence is not meant to be ascribed. (Bentham 1837, viii. p. 197)

The difference between objects of reality and fictional entities is that the latter are not components of facts. They have, as Bentham put it, only “verbal reality.”

Preserving individual perceptions and corporeal substances in his ontology, Bentham declares all other items “fictitious entities.” Such are the 10 predicaments of Aristotle, but also the color red. Similarly, Wisdom holds that persons, animals, and unicorns are individuals, while events and qualities are not. But concepts like “nations” are both individuals and fictitious entities.

b. The Task of Analytic Philosophers

Following Moore, Wisdom maintains that the business of analytic philosophy is to obtain a clear and precise grasp of a phrase’s meaning. A significant part of Moore’s work consists in trying to find the answer to questions like “What do we mean when we say: ‘This is a blackboard’?” (p. 8). However, following another of his teachers, Broad, Wisdom takes analysis to be only one practice of philosophy. There is also a speculative philosophy, which is fully on a par with analytic philosophy. The task of analytic philosophers is to clarify the propositions of speculative philosophy (compare to Broad 1924). Wisdom dedicates to this task a special book, Problems of Mind and Matter (1934a), in which he investigates G. F. Stout’s Mind & Matter (1931), which explores three notions: the “mental,” the “material,” and “psychology.”

Wisdom argues against the claim that language is the subject matter of analytic philosophy. He admits that “one of the best clues to the analysis of facts is the [analysis of the] sentence which expresses it” (1931, p. 64), but he insists that he does not really want to say that every philosophical proposition is bad grammar. In other places, Wisdom is more explicit: “The work of an analytic philosopher is not work on language. Indeed, all his results could be stated in many other systems of symbols” (p. 15) (compare to § 4.2). This point suggests that the findings and formulations of the analytic philosopher might be useful to the special sciences. For example, an analysis of the concept of “rent” can be used in political economy. What analytic philosophers strive for above all is clarity and precision everywhere, not only in philosophy.

3. Logical Constructions

Wisdom discusses his doctrine of logical construction, a term introduced by Bertrand Russell, in a series of five articles that appeared in Mind from 1931 to 1933. For a number of years, the philosophical community considered these essays to be “the most wholehearted of all attempts to set out the logical assumptions implicit in philosophical analysis” (Passmore 1966, p. 365).

a. The Tasks of Philosophical Analysis

What distinguishes the analytic philosopher from the translator? Wisdom holds that the difference is one of diverse paraphrastic intentions. Just as the statement of the liar does not differ from the statement of the ignorant, the philosopher and the translator often speak the same words, but they intend different things.

That the analytic philosopher’s task closely approximates that of the translator reveals that the philosopher’s aim is not to learn new facts but to acquire a deeper insight into the ultimate structure of the facts. Such analysis is worth doing, in Wisdom’s view, since we may perfectly well know the facts but may possess no knowledge about their essential structure whatsoever (1931-3, p. 169-70) (see § 2.1).

The latter claim is directed, in particular, against the Vienna Circle (compare to Stebbing 1933) inasmuch as, while Wisdom rejects metaphysical entities (for example, sense-data), at the same time he embraces metaphysics as a discipline studying the ultimate meaning, the structure of things.

b. Sketching Versus Picturing

Wisdom rejects the idea of the early Moore that propositions exist. This move appears to follow his reluctance to connect analysis to the world as an ontological entity. Wisdom also rejects Wittgenstein’s statement that “propositions” “picture” facts. This is confirmed by the fact that while “a sentence requires a speaker, a picture… requires an artist” (p. 62). Further justifying this position, he argues that when we write one sentence twice, we write two sentences, while the fact that these sentences “sketch” remains one and the same.

Instead of picturing, Wisdom maintains that language “sketches” facts (p. 56). By the act of “sketching,” one makes each element of a sentence “name” an element of the fact, while the order of elements in the sentence “shows” the form of the elements of the fact: it shows the “shape” of the fact. Wisdom calls the replacement of the components of facts by elements of the sentence “docketing” (p. 51).

Wisdom assesses sentences on a scale of “good expression” of facts. The sentences that best express a fact feature elements of the same spatial order as the elements of the fact. Importantly, the sentences of ordinary language are not identical with the spatial form of the facts they express but rather with something from a different logical level that might be derived from spatial form (p. 62). To avoid confusion, Wisdom suggests that when, for example, we report a red patch on this white sheet of paper, we would be more precise to say “this red” instead of “this is red.”

c. Types of Analysis

A fact can be about (can sketch) another fact only if it is of the same order. Wisdom regards a fact to be of the “first order”—that is, its elements qualify as “ultimate elements”—if it is not a fact about a fact: in other words, if it features no element like “community,” or character like “machine,” or any other Benthamite fictitious entities. Wisdom also distinguishes “first derivative” facts: “If one supposes it to be a fact that some object is red, then the first derivative will be the fact that the object is characterized by red” (Urmson 1956, p. 81). The first derivative facts are logical constructions.

Since the ways facts can be about other facts can be of different orders, there are correspondingly different types of analysis. Wisdom discriminates between material, philosophical, and logical analysis (1934b, p. 16). Logical analysis assesses “functors.” Philosophical analysis, by contrast, serves a constructive role, making primary sentences of secondary sentences. Its objective is to render secondary facts ostensive, thereby yielding insight into their structure. Philosophical cognition can be defined as insight into structure, regardless of how one achieves that insight. It employs the method of what Wisdom identifies as “ostentation.”

The scientist undertakes material analyses. Analyses of this sort are even more ostensive than those Wisdom classifies as philosophical. This is no surprise, since material analysis is a same-level analysis, whereas philosophical analysis makes a translation into a new level. Despite this clear difference between the two types of analysis, scientists often perform philosophical analysis, while philosophers for their part commonly engage in material analysis, for example, when they attempt to define “good” in naturalistic ethics.

d. Ostentation, Instead of Reference

Philosophers have always made use of the method of “ostentation.” Wisdom sees, for example, Bentham employing it under the guise of “paraphrase,” Russell under the guise of “logical construction” and “incomplete symbol.” Unfortunately, the method has never been analyzed in detail.

Wisdom defines “ostentation” as “a species of substitution” (1933, p. 1) by means of which one more clearly states the facts to which sentences refer. Each meaningful sentence ostensively “locates” facts, albeit with different success. Sentences containing general names, for instance, do not locate facts as successfully as do sentences with individual names.

The importance of the introduction of the notion of ostentation is that with its help, Wisdom avoids resorting to the use of L. S. Stebbing’s “absolute specific sense-qualities” (1933-4, p. 26). While Stebbing believed that the aim of analysis is “to know what precisely there is in the world” (1932-3, p. 65), Wisdom saw the task of analytic philosophy as exploring the ultimate structure of the facts.

4. The Metaphysical Turn

a. Philosophical Perplexity

Between 1934 and 1937, Wisdom regularly attended Wittgenstein’s classes in Cambridge. The impact of this encounter is clearly evident in “Philosophical Perplexity,” where Wisdom proclaims:

I can hardly exaggerate the debt I owe to [Wittgenstein] and how much of the good in this work is his—not only in treatment of this philosophical difficulty and that but in the matter of how to do philosophy. (1936, p. 36 n.)

In the paper, Wisdom underlines his old position that philosophical statements provide no new information. Their point is different from that of the factual propositions. The task of philosophical propositions is:

… the illumination of the ultimate structure of facts, that is the relations between different categories of being or (we must be in the mode) the relations between different sub-languages within a language. (1936, p. 37)

What is new in “Philosophical Perplexity” is the suggested (Wittgensteinian) tolerance toward the opposing claims philosophers make. If, for example, one philosopher maintains that philosophical statements are verbal, and another that they are not verbal, we can affirm that they both are right.

Wisdom pays special attention to the sentences that the neo-positivists dismiss as meaningless. Typical examples of such sentences are: “God exists,” “Humans are immortal,” and “I know what is going on in my friend’s mind”; these are sentences that give rise to traditional philosophical problems. Wisdom insists that it is misleading to call them all “meaningless,” at least because each proposition of this sort exhibits a meaninglessness of a different kind (compare to § 5.3). Nonsensical in different respects are propositions such as that two plus three is six and that one can play chess without the queen.

Puzzles of this sort can be solved “by reflecting upon the peculiar manner in which those sentences work,” in other words, by reflecting on their style, not on their subject. Wisdom’s “mnemonic slogan” now is: “It’s not the stuff, it’s the style that stupefies” (p. 38). Foregrounding style as a substantive philosophical concern, Wisdom initiates a move to discriminate between the “content” of a proposition and what we actually want to say with it—its “point.”

b. Philosophical “Statements” as both Misleading and Illuminating

Wisdom maintains that we often cannot say of a philosophical theory why it is false, although we feel that it is theoretically poor. Actually, the philosopher cannot say why a philosophical statement is false, simply because philosophical “statements” are not, properly speaking, statements but rather recommendations for elucidating some matter.

What misleads in philosophical “statements” is, above all, that they have a non-verbal air (compare to § 2.2). Philosophers often maintain, for example, that they can never know what is going on in other minds, as if they are dreaming of a world in which this were possible. This complaint is misleading, argues Wisdom, since it implies likeness that does not exist and conceals likeness that does.

Wisdom further claims that “philosophical theories are illuminating in a corresponding way, namely when they suggest or draw attention to a terminology which reveals likeness and differences concealed by ordinary language” (p. 41). In other words, by struggling with a philosophical puzzle, we can make progress by alternately shifting from provocation to resolution (p. 42).

The conclusion Wisdom reaches is that accepting a theory or a point of view might lead one not only to adopt different theoretical positions but also to acquire a novel cognitive stance of a general kind. Importantly, cognitive differences are possible inasmuch as every judgement is also a decision. Even “a man who says that 1 plus 1 makes 2 does not really make a statement,” declares Wisdom, “he registers a decision” (1938, p. 53) (compare to § 5.2).

c. Descriptive Metaphysics

Just as with the propositions of mathematics, and the statements of psychoanalysis, ethics, poetry, and literature, it is difficult to define metaphysical claims. Apparently, metaphysics is closer to logic, understood as a discipline of a priori definitions. This is the conclusion that Moore reached studying Plato and Aristotle and that Russell came to as well in his study of logic and mathematics. Wisdom finds that by contrast with the logician, “the metaphysician looks for the definition of the indefinable” (1938, p. 60). Thus, metaphysics is not a kind of analysis—analysis is a function of logic. Rather, the ends of metaphysics are achieved in a “game of analyses.” When we define metaphysical questions and sentences, we are articulating the goals of play in the game.

To put it otherwise, the metaphysician is not aiming at analysis as such: “What metaphysicians want, or really want, is not definition but description” (p. 65). If we, nevertheless, would like to speak of analysis instead of descriptions in metaphysics, we should stipulate that the metaphysician is striving to analyze the unanalyzable.

5. Other Minds

Over a period of three years, beginning in 1940, Wisdom published a series of eight papers in Mind under the title “Other Minds” (1952a). The publication was the most important philosophical event in Britain during the Second World War, which explains why the opening discussion at the Joint Session of the Aristotelian Society and Mind Association in 1946 was on “Other Minds” at which Wisdom and J. L. Austin presented their positions (compare to Austin 1946).

a. Philosophical Quasi-Doubts and their Therapy

In these papers, Wisdom holds that philosophy is based on ever-recurring doubts. However, when we try to discuss these doubts, they “turn to dust” (1952, p. 6). Why is this? To answer this question, we need to discriminate between natural doubts about some fact of which we have no knowledge, and philosophical doubts. Philosophical doubts are less doubts in the normative sense than concerns over “logical irregularities.”

Wisdom differentiates three kinds of philosophical doubts: (i) Some doubts stem from the infinite corrigibility of statements about people and things, for example, “Smith believes that flowers feel.” (ii) A second sort are “inner-outer doubts.” When assailed by such concerns, we know all the data of a case but nevertheless doubt what is going on “in Smith’s head.” This state of mind figures in circumstances where, for example, we see that a driver stops at a red light but do not in fact know whether he sees the red light. (iii) Wisdom’s third class of doubt involves thoughts such as whether a zebra without stripes is still a zebra and whether a man can fulfill a promise by mistake.

Quasi-doubts of these kinds are doubts about predication. They all hinge on the problem of determining whether S is P. Wisdom detects three sources of the problem: (i) Infinity of the criterion of whether S is P. This engenders doubts of the kind evinced by questions such as “Are the taps closed?” and “Is this love?” (ii) A second source is conflict of criteria as to whether S is P. We see this in questions like “Can you play chess without the queen?” and “Are tomatoes fruits or vegetables?” (iii) Wisdom’s third source is hesitation by leap of criteria that determine whether S is P—the “leap” being from the inner to the outer, from the present to the past, from the actual to the potential.

Wisdom takes his position from psychoanalytic therapy, whereby “the treatment is the diagnosis and the diagnosis is the description, the very full description, of the symptoms” (p. 2 n.). The philosophical difficulty is eliminated only when the philosopher himself comprehensively describes his question, not in abstract general terms but narratively, by telling stories about it. Wisdom’s conclusion is that ultimately “every philosophical question, when it isn’t half asked, answers itself; when it is fully asked, answers itself” (ibid.). This is the main principle of his therapeutic analysis (compare to § 5.4).

b. Contemplating Possibilities

Wisdom also maintains that instead of speaking of metaphysical doubt, it is more correct to speak of contemplating possibilities (p. 6, 33). When I am pondering a philosophical puzzle “rival images are before me… two alternatives, two possibilities” (p. 14) and, in a process of deliberating on them, I understand the puzzle. Such contemplation aims at judgement, at decision (compare to § 4.2). In fact, “all philosophical doubts are requests for decision” (p. 3 n.), not for information.

As contemplation of possibilities, philosophical knowledge is clearly a priori. According to Wisdom, philosophical knowledge is the most general knowledge, more general than mathematical knowledge. That is why the “ignorance” in philosophy is not bona fide ignorance; the “doubt” in it is not genuine doubt. The philosophical pseudo-ignorance is usually combined with the perfect knowledge of the object. Moreover, observes Wisdom, “to grasp how philosophy though not logic is a priori and though a priori is not logic takes one far towards dissolving its difficulties” (p. 20).

c. The Logic of Philosophical “Statements”

According to Wisdom, the philosophical question is neither a logical proposition nor an empirical warning. It is a question of the form “Aren’t we really all mad?” or an exclamation like “We are all sinners!” Such phrases are requests for notational reform. They are not an appeal to search for new facts.

Like all conflicts in philosophy, the “conflict between Sceptics and Phenomenalists,” avers Wisdom, “is removed not by proving the one [side] being wrong and the other right, but by investigating certain of the cases of each one’s saying what he does” (p. 56). One can do this by means of “careful description” of the usage of the competing phrases (compare to § 4.3). Wisdom perceives this method as being similar to that of the writers, who blend technique “with the detailed description of the concrete occasion” (p. 57).

Meaningless statements of belief, however, are different in type. This is evident in the contrast between, for example, the statement that in the dead man there is still something alive, and the statement that the clock is moved by a leprechaun, both of which differ typologically from the statement that a particular man now exists in a body other than his own. In this connection, Wisdom notes that “there is more difference between the grammar of ʻcurly wolfʼ and ʻpretence wolfʼ than there is between the grammar of ʻcurly wolfʼ and ʻinvisible wolfʼ” (p. 25; compare to p. 68). Moreover, “even within the category of physical objects there are differences in logic” (p. 76 n.), as in how “has legs” relates to “is a chair” differently than to “is a cushion.”

The principle “every sort of statement has its own sort of logic” implies that we cannot decide which among competing metaphysical statements is ultimately the winner (p. 62); there are no final proofs here. The inferences drawn in philosophy are no more than probable; they are true only in a “colloquial sense.” As Wisdom explains, we can say “none of these answers will do. There is a step [a decision], and we take it, but goodness knows how [… and this] is not an alternative answer, it is a repetition of the complaint” (ibid.).

d. Therapeutic Analysis

To the uncertainty expressed by the question “How do I know other minds?,” we can reply “By analogy.” This answer, however, as Wisdom points out, is as misleading as it is true; it seems true only initially. In fact, it is just another deceptive “smoother” in that it tranquillizes critical thought, albeit only momentarily. If we say, for instance, that the hippopotamus is a water horse, we must immediately add how this identification misleads.

Wisdom concludes from the foregoing the following thesis of therapeutic analysis:

The whole difficulty [in philosophy] arises like difficulty in a neurotic; the forces are conflicting but nearly equal. The philosopher remains in a state of confused tension unless he makes the [therapeutic] effort necessary to bring them all out by speaking of them and to make them fight it out by speaking of them together. It isn’t that people can’t resolve philosophical difficulties but that they won’t. In philosophy it is not a matter of making sure that one has got hold of the right theory but of making sure that one has got hold of them all. Like psychoanalysis it is not a matter of selecting from all our inclinations some which are right, but of bringing them all to light by mentioning them and in this process creating some which are right for this individual in these circumstances. (p. 124 n.)

e. On Certainty

An argument against the skeptical criticism of the claim “There are invisible leprechauns in the clock” is that we can imagine invisible leprechauns known only by the deity. Apparently, questions like “Are there leprechauns?” are not necessarily meaningless.

Even if we were to see the noumena, this would merely be a visual perception again; thus, as philosophers, we would need to be skeptical about them, too. It turns out that we cannot even imagine true noumena. Wisdom concludes that the skeptic’s statements do not participate in the discourse. In fact:

The sceptic refuses to back anything, saying that everything may lose except Logic which doesn’t. In saying this he appears to back something but he doesn’t. For his own statement can’t lose and doesn’t run. (1952a, p. 102 n.)

Some may claim that we can directly know other minds by telepathy. However, this again is only indirect knowledge—it is not a solution to the problem. To talk, for example, of John seeing literally everything that Smith sees is to speak of one person existing in two bodies. If somehow we all were to have a telepathic connection with Smith’s mind, then his private life would be common and the mind-processes in his head would be physical events.

The notion that we can have knowledge of someone else’s mind is, as Wisdom sees it, absurd. We encounter a logical impossibility here. To say “we can’t know other minds” is in the first instance to acknowledge that this is physiologically impossible. Once we understand that telepathy, too, cannot be a source of knowing other minds, however, we see that such knowledge is a logical impossibility.

6. What is Philosophy?

a. Epistemic Anxiety

The question “what is philosophy?” plays a central role in Wisdom’s works. In a review written in 1943, he maintains that:

… oscillation in deciding between philosophical doctrines goes hopelessly on until one gives up suppressing conflicting voices and lets them all speak their fill. Only then we can modify and reconcile them. (1943, p. 108)

All this provokes in us a feeling of uneasiness, since:

… we are very apt to be dissatisfied with our weighing[;] the weights too often and too much change every reweighing… It is that oscillation which finds expression in [the avowal] “I don’t know what I really want.” (p. 109)

This feeling of epistemic anxiety is most familiar from our experience with moral dilemmas, as on those occasions when we exclaim, “I shouldn’t have done that!” and then, a bit later, we temporize with a remark like, “Well, it isn’t that bad!” Wisdom finds a similar situation when trying to resolve a philosophical issue.

The worst thing, in Wisdom’s conclusion, that we can teach a child is blindly to be driven by a love or hatred that is unchangeable in principle. The pedagogical effort should teach the child to react cautiously and reflectively in different situations. The pupil should be taught to cultivate a broad spectrum of reasoning that he can bring to bear in examining every new development in his environment (compare to Ryle 1979, p. 121). Wisdom explains that the person who best accomplishes this increases the child’s:

… discrimination not so much of the objects to which he reacts as of his reaction to the objects… Not merely putting something into the child but bringing out the uneasiness which lurks in him. (1952a, p. 110)

b. No Proofs in Philosophy

Wisdom maintains that there cannot be proofs in philosophy, neither in a logical sense nor in an analytic sense. Philosophical proofs are invalid in principle. Indeed, a proof is only possible in complex cases, for example, in algebraic problems, where we have long chains of reasoning. In philosophy, however, the cases we are inclined to consider “proved” are simple. Exactly this is the source of the difficulty: the simpler the case, the more ambiguous are the words of the conclusion. This leads one to contemplate different alternatives and, in the process, to hesitate as to the conclusion. Proofs, however, are free from hesitation per definitionem. There are philosophical questions, not philosophical proofs.

Wisdom maintains that every philosophical question is a request for description of a class of “logical animals”—of a very familiar class of animals. “And because the animals are so familiar there is no question of the answers being wrong descriptions—but only of whether they are happy descriptions or not” (1944b, p. 112).

Entangled philosophical questions introduce new logic. Wisdom understands this to mean that they introduce new ways of seeing things that reveal what is already known in principle but is not before our eyes. Philosophical questions can be likened to the question of a person who is well aware of what a semaphore is but still asks what it is. Obviously this is not a question about facts. Wisdom construes it as a request for a new description, one motivated by the hope that it will eliminate some perplexity. In other words, philosophers exercise deductive reasoning that starts from things that everybody knows (compare to Russell 1914, p. 189ff.).

c. Philosophy Explores Puzzles

In marked disagreement with Wittgenstein, the later Wisdom maintains that “a purely linguistic treatment of philosophical conflicts is often inadequate” (1946a, p. 181). Philosophical puzzles commonly do not, he finds, possess a linguistic etiology (compare to §§ 2.2, 4.2), and they are not different in type from some other unsettling puzzles that confront us in life. The reasonableness employed in philosophical dispute is, says Wisdom, typically of the sort that a woman employs when she decides “which of the two men is the right one for her to marry,” or that a man uses when he must “decide which of two professions is the right one for him to take up” (p. 178).

In fact, the philosopher discusses his problems just as the businessman, the judge, or the army general does. However, he never approaches his discussions as a preparation for action. The philosopher, declares Wisdom, simply “desires the discussion never to end and dreads its ending.” He is like:

… the man who cannot be sure that he has turned off the tap or the light. He must go again to make sure, and then perhaps he must go again because though he knows the light’s turned off he yet can’t feel sure. (p. 172)

However, in contrast to the neurotic, the philosopher can never resolve his doubts. This is because he does not actually doubt but just pretends to doubt, and he does not pretend merely to others but to himself as well.

Philosophy also resembles logic and mathematics but fields no theories or theorems. Instead, it formulates puzzles, such as those captured in questions like “Can a man do what the other does?” Puzzles of this kind introduce new forms of logic, which the philosopher sifts for hidden characteristic marks of conventional logic. Philosophical puzzles are no less unreal than caricatures; neither do they assert facts. They arise partly from language and partly from our pre-predicative practices.

d. Philosophy Treats Paradoxes

Wisdom’s skeptic claims that we cannot be absolutely sure that, for example, this map represents London. This is true for all statements “about what is so.” When we see a fox’s head, we still cannot be sure that it is a fox’s head. This worry Wisdom dismisses as a product of the logical model of the “man behind the scene [which is…] inappropriate to his logical situation” (1950a, p. 250). What is to be realized when looking at such statements is “how each answer [to a sceptical claim] illuminates what others obscure and obscures what the others illuminate” (p. 254).

It is through a process of asking similar questions and developing answers to them that philosophical problems are resolved. Questions such as “whether the infinite numbers are numbers,” “whether the wild horses are horses,” and “whether a chess game without the queen is a chess game” are all questions of this sort, according to Wisdom, and are requests for judgment (compare to § 7.2). As such discourses reach their terminus, perplexity is replaced by new apprehension, a new “take” on the matter at hand.

Questions of the type “What is this?” are neither inductive nor deductive. Their point differs with different questioners and with different circumstances. Resolving them requires prolonged investigation, which may end in expressions of exasperation, such as “I won’t bother any more with it! I have already thought it over!” Such questions are paradoxical.

Likewise paradoxical, avers Wisdom, are the doctrines of metaphysics, when they are not platitudes. They are “truths which couldn’t but be true” (p. 264), similar to the infinite tautology of absolute skepticism. Usually, they are expressed as paradoxical questions that concern the character of foundations or of knowledge. Metaphysicians approach their questions in terms of general themes, such as things and persons, space and time, good and evil, and so on.

7. Philosophy of Religion

a. Epistemic Attitudes

Wisdom devotes considerable attention to discussing problems of philosophy of religion. His main claim here is that the religious believer and the atheist think about different worlds. “The theist,” he says, “[often] accuses the atheist of blindness and the atheist accuses the theist of seeing what isn’t there” (1944c, p. 158). This difference in attitude determines the difference in seeing different worlds (p. 160).

People with different attitudes see the same facts differently. For example, a married couple may enter a room, and one may sense that someone has been there, while the other adamantly denies that there is any clue to substantiate the spouse’s hunch. Most such occurrences are a question of feeling rather than of experience. Wisdom considers it inappropriate in such cases to ask who is right.

Such exercises in reasoning are typically explored in philosophy as well as in religion. However, Wisdom holds that they also have a place in some a priori domains of theoretical thinking, in philosophy of mathematics, for example, where two competing parties (say, logicists and constructivists) defend theses, each of them being “right” in their way.

Wisdom’s conclusion, clearly opposing the logic of Gottlob Frege and Russell, is that in such disciplines “the process of argument is not a chain of demonstrative reasoning” (p. 157). Of course, the growth of knowledge in these disciplines is, similarly to that in science, cumulative. However, it starts from several independent premises—not by mechanically iterating the transformation of a set of premises, as in Principia Mathematica.

Wisdom argues that we can find a solution to a cognitive problem not only by adding new illuminations but also “by talk.” Occasionally, in the process of trying to demonstrate that our opponent is wrong, we become aware that it is we who are mistaken. Often our opponent has unconscious reasons for his attitude, which we should try to make explicit. Such a methodology finds us “connecting and disconnecting” cases, thus “explaining a fallacy in reasoning” (p. 161).

b. The Logic of God

In a 1950 BBC presentation titled “The Logic of God,” Wisdom introduces the example of someone who tries on a new hat and gets the following reaction: “My dear, it’s the Taj Mahal” (1965a). Literally understood, the claim that the hat is a temple is clearly absurd. However, just as absurd is the statement that we can or cannot know other minds. Be this as it may, such claims are not pointless. They simply call, in Wisdom’s view, for a “dialectic process in which they are balanced” (p. 263). Thus, the paradox “We are all mad” should be balanced with its opposite: “We are all sane.” We then arrive at the (quasi-Hegelian) synthesis, “Some of us are mad, but others are not.” Wisdom recommends the same procedure when we address metaphysical problems. Otherwise, we are exposed, he believes, to the threat of the one-sided “road to Solipsism [where] there blows the same wind of loneliness which blows on the road to the house with walls of glass which no one can break” (p. 282).

Wisdom maintains that “sometimes it is worth saying what everybody knows” (1950b, p. 2), in particular, as doing so changes our apprehension of the facts. Such statements do not tell the truth. They reveal it. Indeed, “we sometimes use words neither to give information… nor to express and evoke feelings… but to give greater apprehension of what is before us” (p. 6).

Not all questions have an answer. Among the great unanswerable questions is whether God exists. Wisdom avers that we have only fragmentary evidence for such existence, not proofs. If we want a complete proof here, we should need per impossibile to adduce all of God’s characteristics. Similarly, the complete proof of the existence of the rainbow cannot be less complex than all its characteristics.

To substantiate this position, Wisdom refers to his theory of logical models, according to which different kinds of objects have their own logic. For example, the logic of God is much more alien to the logic of electricity than the logic of milk is to the logic of wine (p. 15). It is more eccentric. A typical characteristic of the logic of God, in contrast to the logic of electricity, is that we have no idea what to expect about its real essence.

There are similar “logics of ignorance.” Thus, the actor may not know exactly how he will act when he assumes the role of his character. He will see that he is getting it wrong only after a first misstep. Conversely, the actor understands that he is on the right track only when his work is complete. Something similar happens when we act in our own character. Euripides, St. Paul, and Sigmund Freud observed how sometimes the agent is not aware that it is not he who performs his deeds. He is governed by his Super-Ego, the logic of which is close (at least for St. Paul) to that of God.

That our knowledge is not only knowledge of facts is attested, Wisdom holds, by the circumstance that, as Freud put it, we do not know even ourselves. We see this in the difficulty we experience when we strive to transcend limited judgments in order to reach some final judgment, or a “divine” judgment, which Wisdom describes as “a judgment which takes everything into account and gives it its correct weight” (1965d, p. 32-3).

c. The Meaning of Life

Wisdom considers the Existentialist movement in philosophy, rather popular on the Continent in the 1950s and 1960s, an evasion, a diversion from the real difficulties of life. He praises it for concentrating on something that only a relatively few philosophers considered worthy of debate in the decades immediately following the Second World War. He charges, however, that the existentialists’ arguments were by and large merely ad rem. It is well known, declares Wisdom, that “one of the best ways of keeping concealed the most horrible is to emphasize the horror of the less horrible and to denigrate the good” (1965c, p. 37).

Against the existentialists, Wisdom insists that despite all the misery in the world, there are situations in which we find complete meaning. He further notes that we can ask “What holds all this up?” but not “What holds up all things?” To be more exact, one cannot answer the question “What is the meaning of all this?” in a single determinate thought or sentence. We find the meaning, on Wisdom’s conception, in many scattered moments of cheerfulness that do not attach to intellectual dishonor, stupidity, or evasion.

Apparently, “What is the meaning of all this?” is not a meaningless question, as the logical positivists maintained. There are many clearly meaningful cases in which one asks “what is the meaning of all this,” as when, for example, the critic tries to grasp the idea of a play. We cannot give only one answer to such questions, though, nor can we supply a fully complete list of the things we believe to be the answer. This, however, does not mean that the words cheat us, as it were, and that such questions cannot be addressed in principle, or that we cannot progress toward an answer. Indeed, opines Wisdom, “the historians, the scientists, the prophets, the dramatists and the poets assist us in our attempts to answer the question of life” (p. 42).

Wisdom concludes that religious issues are also issues of fact (compare to § 6.2). They require new apprehension of facts, in the same way as the court aims at illumination and new apprehension of the facts. To articulate religious propositions is not, according to Wisdom, simply to express an attitude toward life, as the emotivists believe. Nor are such propositions merely matters of intuition or of decision.

8. References and Further Reading

a. Primary Sources

i. Books

1931. Interpretation and Analysis in Relation to Bentham’s Theory of Definition, London: Kegan Paul.
1931-3. Logical Constructions, ed. by J. J. Thomson, New York: Random House, 1969.
1934a. The Problems of Mind and Matter, 2nd ed., Cambridge: Cambridge University Press.
1952a. Other Minds, 2nd ed., Oxford: Blackwell, 1965.
1953a. Philosophy and Psycho-Analysis, Oxford: Blackwell.
1965a. Paradox and Discovery, Oxford: Blackwell.
1991. Proof and Explanation: The Virginia Lectures, ed. by S. F. Barker, Lanham (Maryland): University of America Press.

ii. Papers

1933. “Ostentation,” in (1953a): 1-15.
1934b. “Is Analysis a Useful Method in Philosophy?” in (1953a): 16-35.
1936. “Philosophical Perplexity,” in (1953a): 36-50.
1938. “Metaphysics and Verification,” in (1953a): 51-101.
1943. “Critical Notice: C. H. Waddington, and others, Science and Ethics,” in (1953a): 102-111.
1944a. “Moore’s Technique,” in (1953a): 120-148.
1944b. “Philosophy, Anxiety and Novelty,” in (1953a): 112-119.
1944c. “Gods,” in (1953a): 149-168.
1946a. “Philosophy and Psycho-Analysis,” in (1953a): 169-181.
1946b. “Other Minds,” in (1952a): 206-229.
1947. “Bertrand Russell and Modern Philosophy,” in (1953a): 195-209.
1948a. “Note on the New Edition of Professor Ayer’s Language, Truth and Logic,” in (1953a): 229-247.
1948b. “Things and Persons,” in (1953a): 217-228.
1950a. “Metaphysics,” in (1952a): 245-65.
1950b. “The Logic of God,” in (1965a): 1-22.
1952b. “Ludwig Wittgenstein, 1934-37,” in (1965a): 87-9.
1953b. “Philosophy, Metaphysics and Psycho-Analysis,” in (1953a): 248-82.
1957. “Paradox and Discovery,” in (1965a): 114-38.
1959. “G. E. Moore,” in (1965a): 82-87.
1961a. “A Feature of Wittgenstein’s Technique,” in (1965a): 90-103.
1961b. “The Metamorphoses of Metaphysics,” in (1965a): 57-81.
1965b. “Religious Belief,” in (1965a): 43-56.
1965c. “Existentialism,” in (1965a): 34-37.
1965d. “Freewill,” in (1965a): 23-33.
1971. “Epistemological Enlightenment,” Proceedings of the American Philosophical Association, 44: 32-44.

b. Secondary Sources

Austin, J. L. 1946. “Other Minds,” Philosophical Papers, 2nd ed., Oxford: Oxford University Press, 1970, p. 76-116.
Ayers, Michael. 2004. “John Wisdom,” Oxford Dictionary of National Biography, vol. 59, Oxford: Oxford University Press, p. 827-8.
Broad, C. D. 1924. “Critical and Speculative Philosophy,” in: J. H. Muirhead (ed.), Contemporary British Philosophy, 1st ser., London: Allen & Unwin, p. 75-100.
Flew, Antony. 1978. A Rational Animal, Oxford: Clarendon Press.
Milkov, Nikolay, The Varieties of Understanding: English Philosophy Since 1898, New York: Peter Lang, p. 435-521.
Moore, G. E. 1917. “The Conception of Reality,” in (1922), p. 197-219.
Moore, G. E. 1922. Philosophical Studies, London: Allen & Unwin.
Moore, G. E. 1966. Lectures on Philosophy, ed. by C. Lewy, London: Allen & Unwin.
Passmore, John. 1966. A Hundred Years of English Philosophy, 2nd ed., Harmondsworth: Penguin (1st ed. 1957).
Price, H. H. 1953. Thinking and Experience, London: Hutchinson University Library.
Russell, Bertrand. 1914. Our Knowledge of the External World, London: Routledge, 1993.
Russell, Bertrand. 1918. “The Philosophy of Logical Atomism”; in: idem, Logic and Knowledge, ed. by R. C. Marsh, London: Allen & Unwin, p. 175-281.
Ryle, Gilbert. 1949. The Concept of Mind, Harmondsworth: Penguin (2nd ed.), 1973.
Ryle, Gilbert. 1979. On Thinking, Oxford: Blackwell.
Stebbing, Susan. 1932-3. “The Method of Analysis in Metaphysics,” Proceedings of the Aristotelian Society 33: 65-94.
Stebbing, Susan. 1933. “Logical Positivism and Analysis,” Proceedings of the British Academy 19: 53-87.
Stebbing, Susan. 1933-4. “Constructions,” Proceedings of the Aristotelian Society 34: 1-30.
Stout, G. F. 1931. Mind & Matter, Cambridge: Cambridge University Press.
Urmson, J. O. 1956. Philosophical Analysis: Its Development between the two World Wars, Oxford: Clarendon Press.
Wittgenstein, Ludwig. 1953. Philosophical Investigations, Oxford: Blackwell.
Wittgenstein, Ludwig. 2005. The Big Typescript, Oxford: Blackwell.

Author Information

Nikolay Milkov
Email: nikolay.milkov@upb.de
University of Paderborn
Germany

Plato: The Academy

Plato’s enormous impact on later philosophy, education, and culture can be traced to three interrelated aspects of his philosophical life: his written philosophical dialogues, the teaching and writings of his student Aristotle, and the educational organization he began, “the Academy.” Plato’s Academy took its name from the place where its members congregated, the Akadēmeia, an area outside of the Athens city walls that originally held a sacred grove and later contained a religious precinct and a public gymnasium.

In the fifth century B.C.E., the grounds of the Academy, like those of the Lyceum and the Cynosarges, the two other large gymnasia outside the Athens city walls, became a place for intellectual discussion as well as for exercise and religious activities. This addition to the gymnasia’s purpose was due to the changing currents in Athenian education, politics, and culture, as philosophers and sophists came from other cities to partake in the ferment and energy of Athens. Gymnasia became public places where philosophers could congregate for discussion and where sophists could offer samples of their wisdom to entice students to sign up for private instruction.

This fifth-century use of gymnasia by sophists and philosophers was a precursor to the “school movement” of the fourth century B.C.E., represented by Antisthenes teaching in the Cynosarges, Isocrates near the Lyceum, Plato in the Academy, Aristotle in the Lyceum, Zeno in the Stoa Poikile, and Epicurus in his private garden. Although these organizations contributed to the development of medieval, Renaissance, and contemporary schools, colleges, and universities, it is important to remember their closer kinship to the educational activities of the sophists, Socrates, and others.

Plato began leading and participating in discussions at the Academy’s grounds in the early decades of the fourth century B.C.E. Intellectuals with a variety of interests came to meet with Plato (who gave at least one public lecture) as well as to conduct their own research and participate in discussions on the public grounds of the Academy and in the garden of the property Plato owned nearby. By the mid-370s B.C.E., the Academy was able to attract Xenocrates from Chalcedon (Dillon 2003: 89), and in 367 Aristotle arrived at the Platonic Academy from relatively far-off Stagira.

While the Academy in Plato’s time was unified around Plato’s personality and a specific geographical location, it was different from other schools in that Plato encouraged doctrinal diversity and multiple perspectives within it. A scholarch, or ruler of the school, headed the Academy for several generations after Plato’s death in 347 B.C.E. and often powerfully influenced its character and direction. Though the Roman general Sulla’s destruction of the Academy’s grove and gymnasium in 86 B.C.E. marks the end of the particular institution begun by Plato, philosophers who identified as Platonists and Academics persisted in Athens until at least the sixth century C.E. This event also represents a transition point for the Academy from an educational institution tied to a particular place to an Academic school of thought stretching from Plato to fifth-century C.E. neo-Platonists.

The Academy Prior to Plato’s Academy: Sacred Grove, Religious Sanctuary, Gymnasium, Public Park
Athenian Education Prior to Plato’s Academy: Old Education, Sophists, Socrates and his Circle
The Academy in Plato’s Time
1. Location and Funding
2. Areas of Study, Students, Methods of Instruction
The Academy after Plato
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. The Academy Prior to Plato’s Academy: Sacred Grove, Religious Sanctuary, Gymnasium, Public Park

In early times, the area northwest of Athens near the river Cephissus was known as the Akadēmeia or Hekadēmeia and contained a sacred grove, possibly named after a hero called Akademos or Hekademos (Diogenes Laertius, Lives and Opinions of Eminent Philosophers III.7-8, cited hereafter as “Lives”). Plutarch mentions a mythical Akademos as a possible namesake for the Academy, but Plutarch also records that the Academy may have been named after a certain Echedemos (Theseus 32.3-4). While the Academy may have been named after an ancient hero, it is also possible that an ancient hero may have been created to account for the Academy’s name.

The Academy was bordered on the east by Hippios Kolonos and to the south by the Kerameikos district, which was famous for its pottery production. In the late sixth century B.C.E., the Peisistratid tyrant Hipparchus reportedly constructed a public gymnasium in the area known as the Academy (Suda, Hipparchou teichion). This building project, known for its expense, walled in part of the area known as the Academy. Hipparchus probably developed the gymnasium at the Academy to win favor with residents of the Kerameikos district. Like the other major gymnasia outside the city walls, the Lyceum and the Cynosarges, the Academy’s function as a gymnasium operated in tandem with its function as a religious sanctuary.

After Xerxes led the Persians to burn Athens in 480 B.C.E., Themistocles rebuilt the city wall in 478 B.C.E. (Thucydides 1.90), dividing the Kerameikos into an inner Kerameikos and outer Kerameikos. Some time afterwards, Cimon reportedly rebuilt the Academy as a public park and gymnasium by providing it with a water supply, running tracks, and shaded walks (Plutarch, Cimon 13.8). On the way to the Academy from Athens, one passed from the inner Kerameikos to the outer Kerameikos through the Dipylon gate in the city’s wall; continuing on the road to the Academy, one passed through a large cemetery. Referring to the area of the outer Kerameikos on the way to the Academy, Thucydides writes, “The dead are laid in the public sepulcher in the most beautiful suburb of the city, in which those who fall in war are always buried, with the exception of those slain at Marathon” (Thucydides 2.34.5, trans. Crawley). Pausanias, writing in the second century C.E., likewise describes the Academy as a district outside of Athens that has graves, sanctuaries, altars, and a gymnasium (Attica XXIX-XXX). In addition to the shrines, altars, and gymnasium mentioned by Thucydides and Pausanias, there were also gardens and suburban residences in the nearby area (Baltes 1993: 6).

Due to the improvements initiated by Hipparchus and Cimon, the Academy became a beautiful place to walk, exercise, and conduct religious observances. Aristophanes’ The Clouds, first produced in 423 B.C.E., contrasts the rustic beauty of the Academy and traditional education of the past with the chattering and sophistic values of the Agora. Describing the difference, Aristophanes’ “Better Argument” says,

But you’ll be spending your time in gymnasia, with a gleaming, blooming body, not in outlandish chatter on thorny subjects in the Agora like the present generation, nor in being dragged into court over some sticky, contentious, damnable little dispute; no, you will go down to the Academy, under the sacred olive-trees, wearing a chaplet of green reed, you will start a race together with a good decent companion of your own age, fragrant with green-brier and catkin-shedding poplar and freedom from cares, delighting in the season of spring, when the plane tree whispers to the elm. (1002-1008, trans. Sommerstein)

While The Clouds illustrates that the grounds of the Academy in the 420s had running tracks, a water source, sacred olive groves, and shady walks with poplar, plane, and elm trees, it is not clear whether the Academy was as free of sophistry as Aristophanes presents it, perhaps ironically, in his comedy. At any rate, the Academy was very soon to become a place for intellectual discussion, and its peaceful environment was also headed for disruption by the Spartan army’s occupation of its grounds during the siege of Athens in 405-4 B.C.E.

2. Athenian Education Prior to Plato’s Academy: Old Education, Sophists, Socrates and his Circle

The Greek word for education, paideia, covers both formal education and informal enculturation. Paideia was traditionally divided into two parts: cultural education (mousikē), which included the areas of the Muses, such as poetry, singing, and the playing of instruments, and physical education (gymnastikē), which included wrestling, athletics, and exercises that could be useful as training for battle. Instruction in cultural and physical education was not paid for by public expenditure in the archaic or classical period in Athens, so it was only available to those who could afford it. Education often took place in public places like gymnasia and palestras. During the classical period, writing and basic arithmetic became a standard part of elementary education as well. In addition to formal education, attendance at religious festivals, dramatic and poetic competitions, and political debates and discussions formed an important part of Athenians’ education. Broadly, an Athenian man educated in the “Old Education” championed by Aristophanes’ “Better Argument” would be familiar with the poetry of Homer and Hesiod, be able to read, write, and count well enough to manage his personal life and participate in the life of the polis, and be cultured enough to appreciate the city’s comic and tragic festivals.

In the fifth century B.C.E., philosophers and sophists came to Athens from elsewhere, drawn by the city’s growing wealth and climate of intellectual activity. Anaxagoras likely came to Athens sometime between 480 and 460 B.C.E. and associated with Pericles, the important statesman and general (Plato, Phaedrus 270a). Parmenides and Zeno came to Athens in the 450s, and sophist Protagoras from Abdera came to Athens in the 430s and also associated with Pericles. Gorgias the rhetorician from Leontini came to Athens in 427 B.C.E., and he taught rhetoric for a fee to Isocrates, Antisthenes, and many others.

Itinerant teachers like Protagoras and Gorgias both supplemented and destabilized the traditional education provided in Athens, as Aristophanes’ comedy The Clouds, the dialogues of Plato, and other sources document. In order to gain paying students, sophists, rhetoricians, and philosophers would often make presentations in public places like the Agora or in Athens’s three major gymnasia, the Academy, the Cynosarges, and the Lyceum. While the accounts of Xenophon and Plato contradict Aristophanes’ comic portrayal of Socrates as a teacher of rhetoric and natural science, the Platonic dialogues do show Socrates frequenting gymnasia and palestras in search of conversation. In the dialogue Euthyphro, Euthyphro associates Socrates with the Lyceum (2a); in the dialogue Lysis, Socrates narrates how he was walking from the Academy to the Lyceum when he was drawn into a conversation at a new wrestling school (203a-204a). Similarly, the Euthydemus presents a conversation between Socrates and two sophists in search of students in a gymnasium building on the grounds of the Lyceum (271a-272e). While Socrates, unlike the sophists, did not take payment or teach a particular doctrine, he did have a circle of individuals who regularly associated with him for intellectual discussion. While the establishment of philosophical schools by Athenian citizens in the major gymnasia of Athens seems to be a fourth-century phenomenon, the Platonic dialogues indicate that gymnasia were places of intellectual activity and discussion in the last decade of the fifth century B.C.E., if not before.

3. The Academy in Plato’s Time

As noted in the previous section, the Academy, the Lyceum, and the Cynosarges functioned as places for intellectual discussion as well as exercise and religious activity in the fifth century B.C.E. It is likely that the aristocratic Plato spent some of his youth at these gymnasia, both for exercise and to engage in conversation with Socrates and other philosophers. After Socrates’ death in 399 B.C.E., Plato is thought to have spent time with Cratylus the Heraclitean, Hermogenes the Parmenidean, and then to have gone to nearby Megara with Euclides and other Socratics (Lives III.6). Isocrates, student of Gorgias, began teaching in a private building near the Lyceum around 390 B.C.E., and Antisthenes, who also studied with Gorgias and was a member of Socrates’ circle, held discussions in the Cynosarges around that time as well (Lives VI.13). While the Platonic Academy is often seen as the prototype of a new kind of educational organization, it is important to note that it was just one of many such organizations established in fourth-century Athens.

It is likely that Isocrates and Antisthenes established schools of some sort before Plato. Contemporary scholars often assign a founding date for the Academy between the dates of 387 B.C.E. and 383 B.C.E., depending on these scholars’ assessment of when Plato returned from his first trip to Syracuse. Rather than assign a particular date at which the Academy was founded, as though ancient schools possessed formal articles or charters of incorporation (see Lynch 1972), it is more plausible to note that Plato began associating with a group of fellow philosophers in the Academy in the late 390s and that this group gradually gathered energy and reputation throughout the 380s and 370s up until Plato’s death in 347 B.C.E.

a. Location and Funding

Plato was himself from the deme of Collytus, a wealthy district southwest of the Acropolis and within the city walls built by Themistocles. Collytus was a few miles from the Academy, so Plato’s relocating near the Academy would have been an important step in establishing himself there. While some have emphasized the Academy’s remoteness from the Agora (Rihill 2003: 174), the six stades (three quarters of a mile) from the Dipylon gate and three more stades from the Agora would not have constituted much of a barrier to anyone interested in seeing the goings on of the Academy in Plato’s time.

In keeping with the Academy’s customary use as a place of intellectual exchange, Plato used its gymnasium, walks, and buildings as a place for education and inquiry; discussions held in these areas were semi-public and thus open to public engagement and heckling (Epicrates cited in Athenaeus, Sophists at Dinner II.59; Aelian, Historical Miscellany 3.19; Lives VI.40). While some scholars have thought that Plato somehow resided in the sacred precinct and gymnasium of the Academy or purchased property there, this is not possible, for religious sanctuaries and areas set aside for gymnasia were not places where citizens (or anyone else) could set up residency. Rather, as Lynch, Baltes, and Dillon have argued, Plato was able to purchase a property with its own garden near the sanctuaries and gymnasium of the Academy. While much of the Platonic Academy’s business was conducted on the public grounds of the Academy, it is natural that discussions and possibly shared meals would also occur at Plato’s nearby private residence and garden. Given the proximity of Plato’s private residence to the sanctuary and gymnasium of the Academy and the fact that his nearby property and school were both referred to as “the Academy” (Plutarch, On Exile 603b), there has been confusion about the particulars of the physical plant of the Platonic Academy.

Plato was of aristocratic stock and of at least moderate wealth, so he had the financial means to support his life of philosophical study. Following Socrates’ example and departing from the sophists and Isocrates, Plato did not charge tuition for individuals who associated with him at the Academy (Lives IV.2). Still, students at the Academy had to possess or come up with their own sustenance (Athenaeus, Sophists at Dinner IV.168). Diogenes Laertius records that Plato received funds from either Dion of Syracuse or Anniceris of Cyrene to purchase property near the Academy (Lives III.20), that Dion paid for Plato’s costs as choregus or chorus leader (a claim also made in Plutarch’s Dion XVII.2) and purchased Pythagorean philosophical texts for him, and that Dionysius of Syracuse gave him eighty talents (Lives III.3,9). Part of the purpose of Plato’s trips to Syracuse may have been to participate in political reform, but it is also possible that Plato was seeking patrons for the philosophical activity engaged in at the Academy.

While it is probable that Plato associated with other philosophers, including the Athenian mathematician Theaetetus, in the Academy as early as the late 390s (see Nails 2009: 5-6; Nails 2002: 277; Thesleff 2009: 509-518 with Proclus’s Commentary on the First Book of Euclid’s Elements, Book 2, Chapter IV for more details on Theaetetus’s involvement with the Academy), it is the purchase of the property near the Academy after his trip to see Dion in Syracuse that scholars often refer to when speaking of the founding of the Academy in either 387 B.C.E. or 383 B.C.E. While purchase of this property was important to the development of the Platonic Academy, it is important to remember, as Lynch has shown, that Plato’s Academy was not legally incorporated or a juridical entity. While the wills of Theophrastus (Lives V.52-53) and Epicurus (Lives X.16-17) make provisions for the continuation of their schools and the future control of school property, the will of Plato does not mention the Academy as such (Lives III.41-43). This indicates that while the Platonic Academy was thriving during Plato’s lifetime, it was not essentially linked to any private property possessed by Plato (compare Dillon 2003: 9; see further Nails 2002: 249-250).

b. Areas of Study, Students, Methods of Instruction

The structure of the Platonic Academy during Plato’s time was probably emergent and loosely organized. Scholars infer from the varied viewpoints of thinkers like Eudoxus, Speusippus, Xenocrates, Aristotle, and others present in the Academy during Plato’s lifetime that Plato encouraged a diversity of perspectives and discussion of alternative views, and that being a participant in the Academy did not require anything like adherence to Platonic orthodoxy. In this way, Plato reflected Socrates’ willingness to discuss and debate ideas rather than the sophists’ claim to teach students mastery of a particular subject matter. To get a sense of the topics discussed in the Academy, our primary sources are the Platonic dialogues and our knowledge of the persons present at the Academy.

While it is tempting to talk of teachers and students at the Academy, this language can lead to difficulties. While Plato was clearly the heart of the Academy, it is not clear how, if at all, formal status was accorded to members of the Academy. The Greek terms mathētēs (student, learner, or disciple), sunēthēs (associate or intimate), hetairos (companion), and philos (friend), as well as other terms, seem to have been variously used to describe the persons who attended the Academy (Baltes 1993: 10-11; Saunders 1986: 201).

While the precise function of the Platonic dialogues within the Academy cannot be settled, it is practically certain that they were studied and perhaps read aloud by the Academics in Plato’s time. It is also likely that the dialogues were circulated as a way to attract possible students (Themistius, Orations 23.295). As a cursory survey, dialogues like the Republic, Timaeus, and Theaetetus show Plato’s interest in mathematical speculation; the Republic, Statesman, and the Laws attest to Plato’s interest in political theory; the Cratylus, Gorgias, and Sophist show an interest in language, logic, and sophistry, and many dialogues, including the Parmenides, Sophist, and Republic show an interest in metaphysics and ontology. While Plato’s interests were varied and interconnected, the topics of the dialogues reflect topics that Academics were likely to be engaged with.

The array of topics examined in Plato’s dialogues does parallel some of what we know about the philosophical interests of the individuals at the Academy in Plato’s lifetime. Theaetetus of Athens and Eudoxus of Cnidus were mathematicians, and Philip of Opus was interested in astronomy and mathematics in addition to serving as Plato’s secretary and editor of the Laws. Aristotle, a wealthy citizen of Stagira, came to the Academy in 367 as a young man and stayed until Plato’s death in 347. Aristotle’s twenty-year-long participation in the Platonic Academy shows Plato’s openness in encouraging and supporting philosophers who criticized his views, attests to the Academy’s growing reputation and ability to attract students and researchers, and sheds some light on the organization of the Academy. Aristotle reportedly taught rhetoric at the Academy, and it is certain that he researched rhetorical and sophistical techniques there. It is very probable that Aristotle began writing many of the works of his that we possess today at the Academy (Klein 1985: 173), including possibly parts of the biological works, even though biological research based on empirical data is not a line of inquiry that Plato pursued himself. Aristotle’s multiple references to Platonic dialogues in his own works also suggest how the Platonic dialogues were used by students and researchers at the Academy. While most of the pupils at the Platonic Academy were male, Diogenes Laertius lists two female students, Lastheneia of Mantinea and Axiothea of Phlius, in his list of Plato’s students (Lives III.46-47).

While the Platonic Academy was a community of philosophers gathered to engage in research and discussion around a wide array of topics and questions, the Academy, or at least the individuals gathered there, had a political dimension. Plutarch’s Reply to Colotes claims that Plato’s companions from the Academy were involved in a wide variety of political activities, including revolution, legislation, and political consulting (1126c-d). The various Epistles ascribed to Plato support this view by attesting to Plato’s involvement in the politics of Syracuse, Atarneus, and Assos. While claims that the Academy was an “Organized School of Political Science” or the “RAND Corporation” of antiquity go too far in ascribing formal structure and organization to the Academy, Plato and the individuals associated with the Academy were involved in the political issues of their time as well as purely theoretical discussions about political philosophy.

As noted above, some of the discussions Plato held were on the public grounds of the Academy, while other discussions were held at his private residence. Aristoxenus records at least one poorly received public lecture by Plato on “the good” (Elements of Harmonics II.30), and a comic fragment from Epicrates records Plato, Speusippus, Menedemus, and several youths engaging in dialectical definition of a pumpkin (Athenaeus, Sophists at Dinner 2.59). While it is difficult to reconstruct how instruction occurred at the Academy, it seems that dialectical conversation, lecture, research, writing, and the reading of the Platonic dialogues were all used by individuals at the Academy as methods of philosophical inquiry and instruction.

Although the establishment of the Academy is an important part of Plato’s legacy, Plato himself is silent about his Academy in all of the dialogues and letters ascribed to him. The word “Academy” occurs only twice in the Platonic corpus, and in both cases it refers to the gymnasium rather than any educational organization. One occurrence, already mentioned, is from the Lysis, and it describes Socrates walking from the Academy to the Lyceum (203a). The other occurrence, in the spurious Axiochus, refers to ephebic and gymnastic training (367a) on the grounds of the Academy and does not refer to anything that has to do with Plato’s Academy.

Plato’s silence about the Academy adds to the difficulty of labeling his Academy with the English word “school.” Diogenes Laertius refers to Plato’s Academy as a “hairesis,” which can be translated as “school” or “sect” (Lives III.41). The noun “hairesis” comes from the verb “to choose,” and it thereby signifies “a choice of life” as much as “a place of instruction.” The head of the Academy after Plato was called the “scholarch,” but while scholē forms the root of our word “school” and was used to refer to Plato’s Academy (Lives IV.2), it originally had the meaning of “leisure.” The Greek word diatribē can also be translated as “school” from its connotation of spending time together, but no matter what Greek term is used, the activities occurring at the Academy during Plato’s lifetime do not neatly map on to any of our concepts of school, university, or college. Perhaps the clearest term to describe Plato’s Academy comes from Aristophanes’ Clouds, written at least three decades before the Academy was established: phrontistērion (94). This term can be translated as “think tank,” a term that may be as good as any other to conceptualize the Academy’s multiple and evolving activities during Plato’s lifetime.

4. The Academy after Plato

In 347 B.C.E. Plato died at approximately eighty years of age. According to Diogenes Laertius, Plato was buried in the Academy (Lives III.41). Unlike the claim that Plato purchased property in the sacred precinct of the Academy, this assertion is possible, for the grounds of the Academy were used for burial, shrines, and memorials. At any rate, Pausanias records that in his own time there was a memorial to Plato not far from the Academy (Attica XXX.3).

Although the entrenchment of the words “academy” and “academic” in contemporary discourse makes the persistence of the Platonic Academy seem inevitable, this is probably not how it appeared to Plato or to members of the Academy after his death (Watts 2007: 122). Rather, the Academy continued to develop its sense of identity and plans for persistence after Plato’s death.

One way to develop a partial picture of the Academy after Plato’s death is to review the succession of Academic scholarchs. The chronological succession of scholarchs after Plato, according to Diogenes Laertius, is as follows:

Speusippus of Athens, Plato’s nephew, was elected scholarch after Plato’s death, and he held that position until 339 B.C.E.
Xenocrates of Chalcedon was scholarch until 314 B.C.E.
Polemo of Athens was scholarch of the Academy until 276 B.C.E.
Crates of Athens, a pupil of Polemo, was the next scholarch.
Arcesilaus of Pitane was scholarch until approximately 241 B.C.E.
Lacydes of Cyrene was scholarch until approximately 216 B.C.E.
Telecles and Evander, both of Phocaea, succeeded Lacydes as dual scholarchs.
Hegesinus of Pergamon succeeded the dual scholarchs from Phocaea.
Carneades of Cyrene succeeded Hegesinus.
Clitomachus of Carthage succeeded Carneades in 129 B.C.E.

While Clitomachus is the last scholarch listed by Diogenes Laertius, Cicero provides us with information about Philo of Larissa, with whom he himself studied (De Natura Deorum I.6,17). Philo was a pupil of Clitomachus and was a head of the Academy (Academica II.17; Sextus Empiricus, Outlines of Pyrrhonism I.220). Antiochus of Ascalon, who also taught Cicero, is sometimes considered a head of the Academy (Sextus Empiricus, Outlines of Pyrrhonism I.220-221), but his philosophical position (I.235) and the fact that his school did not meet on the grounds of the Academy (Cicero, De Finibus V.1) make Antiochus’s school discontinuous with the Platonic Academy.

The terms “Old Academy,” “Middle Academy,” and “New Academy” are used in somewhat different ways by Cicero, Sextus Empiricus, and Diogenes Laertius to describe the changing viewpoints of the Platonic Academy from Speusippus to Philo of Larissa. What seems clear from the various accounts is that, with Arcesilaus, a skeptical edge entered into Academic thinking that persisted through Carneades and Philo of Larissa.

The Mithridatic War of 88 B.C.E. and Sulla’s destruction of the grounds of the Academy and Lyceum as part of the siege of Athens in 86 B.C.E. (Plutarch, Sulla XII.3) mark the rupture between the geographical precinct of the Academy and the lineage of philosophical instruction stemming from Plato that together constitute the Platonic Academy. The destruction of the gymnasium at the Lyceum also marks the end of Aristotle’s peripatetic school (Lynch 1972: 207).

While the Platonic Academy can be said to end with the siege led by Sulla, philosophers including Cicero, Plutarch of Chaeronea, and Proclus continued to identify themselves as Platonists or Academics. In 176 C.E., the Roman Emperor and Stoic philosopher Marcus Aurelius helped continue the influence of Platonic and Academic thought by establishing Imperial Chairs for the teaching of Platonism, Stoicism, Aristotelianism, and Epicureanism, but the holders of these chairs were not associated with the long-abandoned schools that once met on the grounds of the Lyceum or the Academy.

Sometime in the fourth century C.E., a Platonic school was reestablished in Athens by Plutarch of Athens, though this school did not meet on the grounds of the Academy. After Plutarch, the scholarchs of this Platonic school were Syrianus, Proclus, Marinus, Isidore, and Damascius, the last scholarch of this Academy. In 529 C.E. the Christian Roman Emperor Justinian forbade Pagans from publicly teaching, which, along with the Slavonic invasions of 580 C.E. (Lynch 1972: 167), marks an end of the flourishing of Neo-Platonism in Athens.

The Platonic Academy forms an important part of Plato’s intellectual legacy, and analyzing it can help us better understand Plato’s educational, political, and philosophical concerns. While studying the Academy sheds light on Plato’s thought, its history is also invaluable for studying the reception of Plato’s thought and for gaining insight into one of the crucial sources of today’s academic institutions. Indeed, the continued use of the words “academy” and “academic” to describe educational organizations and scholars through the twenty-first century shows the impact of Plato’s Academy on subsequent education.

Today, the area that contains the sacred precinct and gymnasium that housed Plato’s Academy lies within a neighborhood known as Akadimia Platonos. The ruins of the Academy are accessible by foot, and a small museum, Plato’s Academy Museum, helps to orient visitors to the site.

5. References and Further Reading

a. Primary Sources

Aelian, (Claudius Aelianus) (2nd-3rd cn. C.E.). Historical Miscellany. Trans. Nigel G. Wilson. Cambridge, MA: Loeb Classical Library, 1997.
- Chapter XIX of Book 3 of Aelian’s Historical Miscellany is titled “Of the dissention between Aristotle and Plato.” This chapter records a conflict between Plato and Aristotle that has been used to infer that Plato had a private home where he taught in addition to leading conversations on the grounds of the Academy.
Aristophanes (c.448-380 B.C.E.). Clouds. Trans. Alan Sommerstein. Warminster: Aris and Phillips, 1991.
- While written too early to shed light on Plato, this text is crucial for understanding Athenian education, the sophists, and Socrates. It also contains the passage cited above that describes the grounds of the Academy in the 420s.
Aristotle (384-322 B.C.E.).
- The writings of Aristotle are a valuable resource for learning more about the philosophies of some of the individuals who were part of the early Academy. See, for example, the references to Speusippus in Metaphysics Zeta, Chapter 2, Lambda, Chapter 7, and Mu, Chapter 7; see also the references to Eudoxus in Metaphysics Alpha, Chapter 8, Lambda, Chapter 8, and Nicomachean Ethics, Book 10, Chapter 2.
Aristoxenus of Tarentum (c.370-300 B.C.E.). The Harmonics of Aristoxenus. Ed. and trans. Henry S. Macran. Oxford: Clarendon Press, 1902.
- Aristoxenus was a student of Aristotle’s and he is an early source for Plato’s public lecture “On the Good.”
Athenaeus of Naucratis (2nd-3rd cn. C.E.). The Deipnosophists. In Seven Volumes. Trans. Charles Burton Gulick. Cambridge, MA: Loeb Classical Library, 1951.
- This lengthy work is a source of much information about antiquity. Scholars of the Academy are particularly drawn to the fragment from Epicrates preserved by Athenaeus that gives a comic presentation of Platonic dialectic.
Cicero, Marcus Tullius (106-43 B.C.E.).
- Cicero’s many writings, including Academica, De Natura Deorum, De Finibus, and Tusculan Disputations, contain information about the Academy.
Diogenes Laertius (2nd-3rd cn. C.E.). Lives and Opinions of Eminent Philosophers. Two Volumes. Trans. R. D. Hicks. Cambridge, MA: Loeb Classical Library, 1925.
- Diogenes is an invaluable resource for the lives of ancient philosophers, although he is writing five hundred or so years after the philosophers he describes.
Pausanias. (2nd cn. C.E.). Description of Greece. Four Volumes. Trans. W. H. S. Jones. Cambridge, MA: Loeb Classical Library, 1959.
- Book I of Pausanias’ work deals with Attica; Chapters XXI-XXX shed light on the history of the Academy and how it appeared to Pausanias several centuries later.
Philodemus. (c.110-c.30 B.C.E.). Index Academicorum.
- Philodemus was an Epicurean philosopher who wrote a work on the Platonic Academy. Some fragments of this work have been discovered. For more information, see Blank (2019), below.
Plato. Complete Works. Ed. John Cooper. Indianapolis: Hackett, 1997.
- While the dialogues and letters of Plato do not mention the Platonic Academy, they are an important resource in understanding Plato’s educational and political commitments and activities as well as the educational environment of Athens in the last few decades of the fifth century.
Plutarch of Chaeronea (c.45-125 C.E.). Parallel Lives and Moralia.
- Plutarch’s works are collected in the Loeb Classical Library under Lives (Eleven Volumes) and Moralia (Fifteen Volumes). Particularly valuable for the student of the Academy are Reply to Colotes and Life of Dion, but many of the works found in Plutarch’s corpus shed light on Plato, the Academy, and Platonism.
Proclus (412-485 C.E.). A Commentary on the First Book of Euclid’s Elements. Trans. Glenn R. Morrow. Princeton: Princeton University Press, 1970.
- Book 2, Chapter IV of Proclus’s commentary gives an account of the development of mathematics that includes helpful information about Plato and other members of the Academy. The “Foreword to the 1992 Edition” of Morrow’s translation by Ian Mueller is also helpful to students of Plato’s Academy.
Sextus Empiricus (2nd-3rd cn. C.E.). Outlines of Pyrrhonism. Four Volumes. Trans. R. G. Bury. Cambridge, MA: Loeb Classical Library, 1955.
- As part of his presentation of skepticism, Sextus articulates how skepticism and Academic philosophy differ in Book I, Chapter XXXIII.
Suda.
- The Suda is a tenth-century C.E. Byzantine Greek encyclopedia. The entries on “To Hipparchou teichion,” “Akademia,” and “Platon” were helpful for this article. An online version of the Suda can be accessed at http://www.stoa.org/sol/
Themistius (c.317-388 C.E.). The Private Orations of Themistius. Trans. Robert J. Penella. Berkeley: University of California Press, 2000.
- Themistius was a philosopher and senator in the fourth century C.E. who taught in Constantinople. In his 23rd Oration, “The Sophist,” he relates that a Corinthian farmer became Plato’s student after he read the Gorgias; Axiotheia had a similar experience reading the Republic, and Zeno of Citium came to Athens after reading the Apology of Socrates.
Thucydides (c.5th cn. B.C.E.). The Peloponnesian War. Ed. Robert B. Strassler. Trans. Richard Crawley. New York: Touchstone, 1998.
- While Thucydides’ work does not shed light on the Academy, he does describe its environs and other aspects of Athenian history that are important for understanding Plato.

b. Secondary Sources

Athanassiadi, Polymnia. Damascius. The Philosophical History. Athens: Apamea Cultural Association, 1999.
Baltes, Matthias. “Plato’s School, the Academy,” Hermathena, No. 155 (Winter 1993): 5-26.
- A very clear and well documented portrait of Plato’s Academy.
Blank, David, “Philodemus,” The Stanford Encyclopedia of Philosophy (Spring 2019 Edition), Edward N. Zalta (ed.), URL = .
Brunt, P. A. “Plato’s Academy and Politics” in Studies in Greek History and Thought. Oxford: Oxford University Press, 1993.
Cherniss, Harold. The Riddle of the Early Academy. Berkeley: University of California Press, 1945.
Chroust, Anton-Herman. “Plato’s Academy: The First Organizational School of Political Science in Antiquity,” The Review of Politics, Vol. 29, No. 1 (Jan., 1967): 25-40.
Dancy, R. M. Two Studies in the Early Academy. Albany: State University of New York Press, 1991.
Dillon, John. The Heirs of Plato: A Study of the Old Academy (347-274 BC). Oxford: Clarendon Press, 2003.
- A study of the Academy with special attention to the philosophies of Plato’s successors.
Dillon, John. The Middle Platonists: 80 B.C. to A.D. 220. Ithaca: Cornell University Press, 1996.
Glucker, John. Antiochus and the Late Academy. Göttingen: Hypomnemata 56, 1978.
Hadot, Pierre. What is Ancient Philosophy? Trans. Michael Chase. Cambridge, MA: Harvard University Press, 2002.
Hornblower, Simon and Anthony Spawforth. The Oxford Classical Dictionary. 3rd ed. Oxford: Oxford University Press, 2003.
Klein, Jacob. Lectures and Essays. Annapolis: St. John’s College Press, 1985.
Lynch, John Patrick. Aristotle’s School: A Study of a Greek Educational Institution. Berkeley: University of California Press, 1972.
- This work is essential to anyone investigating classical educational institutions.
Mintz, Avi. Plato: Images, Aims, and Practices of Education. Cham, Switzerland: Springer, 2018.
Nails, Debra. Agora, Academy, and the Conduct of Philosophy. Dordrecht: Kluwer Academic Publishers, 1995.
Nails, Debra. The People of Plato: A Prosopography of Plato and Other Socratics. Indianapolis: Hackett Publishing, 2002.
- This work provides historical context for all of the individuals mentioned in the Platonic dialogues.
Nails, Debra. “The Life of Plato of Athens” in A Companion to Plato, edited by Hugh Benson. Malden, MA: Wiley-Blackwell Publishing, 2009.
Natali, Carlo. Aristotle: His Life and School. Edited by D. S. Hutchinson. Princeton: Princeton University Press, 2013.
Press, Gerald A., ed. The Bloomsbury Companion to Plato. London: Bloomsbury Academic, 2015.
- A very valuable reference work on Plato. Chapter 1, “Plato’s Life—Historical and Intellectual Context” and Chapter 5, “Later Reception, Interpretation and Influence of Plato and the Dialogues” are particularly valuable for those interested in the history of the Academy.
Preus, Anthony. Historical Dictionary of Ancient Greek Philosophy. 2nd edition. Lanham: Rowman & Littlefield Publishers, 2015.
- This clear and reliable historical dictionary is useful for students of ancient Greek philosophy.
Rihill, T. E. “Teaching and Learning in Classical Athens,” Greece & Rome, Vol. 50, No.2 (Oct., 2003): 168-190.
Saunders, Trevor J. “‘The Rand Corporation of Antiquity’? Plato’s Academy and Greek Politics” in Studies in Honor of T. B. L. Webster, vol. I, eds. J. H. Betts et al. Bristol: Bristol Classical Press, 1986.
Thesleff, Holger. Platonic Patterns: A Collection of Studies. Las Vegas: Parmenides Publishing, 2009.
Wareh, Tarik. The Theory and Practice of Life: Isocrates and the Philosophers. Cambridge, MA: Center for Hellenic Studies, 2012.
Watts, Edward. “Creating the Academy: Historical Discourse and the Shape of Community in the Old Academy,” The Journal of Hellenic Studies, Vol. 127 (2007): 106-122.
- This article argues that the Old Academy developed in an unplanned fashion and that the Old Academy attempted to craft its identity based on life-style and character as much as doctrine.

Author Information

Lewis Trelawny-Cassity
Email: lcassity@antiochcollege.edu
Antioch College
U. S. A.

James Frederick Ferrier (1808—1864)

James Frederick Ferrier was a mid-nineteenth-century Scottish metaphysician who developed the first post-Hegelian system of idealism in Britain. Unlike the British Idealists in the latter half of the nineteenth century, he was neither a Kantian nor a Hegelian. Instead, he largely develops his idealist metaphysics via his defense of Berkeley and through his rejection of Thomas Reid’s philosophy of common sense. In this way, he is a transitional figure between the philosophy of Enlightenment Scotland and the development of British Idealism in the latter half of the nineteenth century. Ferrier was also the first philosopher in English to refer to the philosophy of knowledge as Epistemology.

The most fully realized version of his metaphysics appears in his Institutes of Metaphysic. For Ferrier, epistemology is primary and must be the starting point for philosophy. His metaphysics depends on the axiom that the minimum unit of cognition involves a synthesis of subject-with-object, which is the absolute in cognition. From here he develops an idealist ontology, which concludes that which really exists is the absolute: some self in union with some object. The central features of his philosophy include the importance of self-consciousness, a rejection of noumena or things-in-themselves, and his theory of ignorance.

Life and Works
Thought and Writings
Reception and Influence
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Life and Works

Ferrier was born in Edinburgh, Scotland, in 1808. His father, John Ferrier, was a lawyer known as a Writer to the Signet, and his mother was Margaret Wilson. His family was well connected; his uncle, John Wilson (also known as “Christopher North”), was an author and the Professor of Moral Philosophy at Edinburgh University, and his aunt was the novelist Susan Ferrier. Notable figures such as Sir Walter Scott, James Hogg, William Wordsworth, and Thomas De Quincey were acquainted with Ferrier and his family. He began his education in Ruthwell, Dumfriesshire, where he lived with the family of a Rev. Dr. Duncan. He then went to Edinburgh High School, followed by a period at another school in Greenwich. At the age of seventeen, he attended Edinburgh University for two academic sessions from 1825 to 1827. Then, in 1828, he moved to Oxford to study at Magdalen College for his B.A., which he received in 1831. His student life was unexceptional, and he did not show a particular aptitude for philosophy until later in his life.

He returned to Edinburgh after graduation and began a short-lived career in law. It was at this time that he developed his interest in philosophy. In the early 1830s he became friends with the philosopher Sir William Hamilton, and they remained in close contact until Hamilton’s death in 1856. Indicative of his growing interest in German thought, Ferrier traveled to Germany in 1834, where he spent several months in Heidelberg; his awareness of the German Idealists is apparent from the fact that he returned to Scotland with a photograph and a medallion of Hegel. In 1837 he married his cousin Margaret Wilson, who was the daughter of his famous uncle “Christopher North.” By all accounts, they had a happy marriage and went on to have five children.

In the late 1830s, Ferrier started to publish articles in philosophy, and this led to his subsequent academic career. In 1842 he gained his first academic chair, becoming the Professor of Civil History at Edinburgh. In 1844-1845 he acted as Hamilton’s substitute in the Chair of Logic and Metaphysics at Edinburgh during the older philosopher’s illness. Then, in 1845, Ferrier moved his family to St Andrews, where he became the Professor of Moral Philosophy and Political Economy. He applied unsuccessfully for two Edinburgh Chairs: Moral Philosophy in 1852 and Logic and Metaphysics in 1856. He was unsuccessful in the first case due to sectarian politics and in the latter instance because his metaphysics were considered to be too far from the Scottish philosophy of his predecessors. For this reason, he remained at St Andrews for the remainder of his career. He died in St Andrews in 1864, and he is buried in St Cuthbert’s Churchyard, which is in the city center of Edinburgh.

Ferrier published several articles on literature and philosophy during his lifetime, and many of these were published in Blackwood’s Magazine. Among his articles, there are a few that are particularly indicative of his philosophical interests and eloquent writing style. These are his seven-part series “An Introduction to a Philosophy of Consciousness” (1838-1839), “Berkeley and Idealism” (1842), and “Reid and the Philosophy of Common Sense” (1847). A selection of his collected works appears in three volumes (originally published by Blackwood and Sons in 1875 and republished by Thoemmes Press in 2001). The first volume contains his most significant work, the Institutes of Metaphysic, which was originally published in 1854; here, Ferrier presents a complete system of metaphysics. The contemporary reaction to this was mixed, and Ferrier believed that certain critics, in an attempt to stifle his self-designated “new Scottish philosophy” in favor of the more traditional, or “old Scottish philosophy,” of his predecessors, deliberately misinterpreted his Institutes. Therefore, he subsequently wrote a scathing defense of the Institutes called Scottish Philosophy: The Old and the New (1856) in which he reiterates his arguments in favor of idealism and attacks his critics. A selection from Scottish Philosophy appears as “Appendix” to “Institutes of Metaphysic” in the first volume of his complete works. The second volume contains his lectures on Greek Philosophy, which he worked on in the later years of his life and which were published posthumously. The final volume consists of a selection of his articles.

2. Thought and Writings

a. Self-consciousness

A topic that Ferrier concentrates on throughout his philosophical works is self-consciousness, which he generally refers to as “consciousness.” It is: “that notion of self, and that self-reference, which in man generally, though by no means invariably, accompanies his sensations, passions, emotions, play of reason, or states of mind whatsoever” (Ferrier 2001: vol. 3. 40). His focus on self-consciousness is central to his rejection of the Enlightenment goal to develop a “science of human nature.” Further, it forms the basis of his idealism.

He places utmost importance on self-consciousness because he believes that it is the peculiar and defining characteristic of humanity. He contends that things such as sensation and the capacity for reason are not only shared with other animals but are also given by nature; the human being who is subject to them is akin to “a spoke in an unresting wheel. Nothing connected with him is really his. His actions are not his own” (Ferrier 2001: vol. 3. 36). By contrast, consciousness is the act of will through which a thing becomes a person. One is not born conscious; consciousness must be asserted: “The notion of self … is absolutely genetic or creative. Thinking oneself ‘I’ makes oneself ‘I,’ and it is only by thinking himself ‘I’ that a man can make himself ‘I’; or, in other words, change an unconscious thing into that which is now a conscious self” (Ferrier 2001: vol. 3. 109). Prior to consciousness there is no self or personality; without it the human being is a creature of nature that lives for others. Yet, post-consciousness a person’s acts are her own. It follows that consciousness is the precondition for everything that involves a self. In this way, consciousness is required for freedom, responsibility, morality, religion, and conscience.

Moreover, Ferrier explains in “An Introduction to a Philosophy of Consciousness” that a person’s knowledge of the external world depends on an act of negation in which she distinguishes between the self and the not-self. Thus, one becomes aware of the not-self in conjunction with the self. He describes this principle of idealism as “the fundamental act of humanity” (Ferrier 2001: vol. 3. 177). The concomitance of self and other forms the basis of his metaphysics, and it is a topic that he returns to throughout his published works.

In “An Introduction to a Philosophy of Consciousness” he sets out his concerns with contemporary philosophy and calls for a change of focus. His primary target is the Enlightenment goal to develop a “science of human nature.” In his view, this project is impossible because humanity is essentially different from anything else in the world that can be studied. For instance, in astronomy there is a distinction between the subject and the object; the scientist (the subject) is removed from the celestial objects (the objects) that she studies. Yet, in a “science of human nature” the philosopher is at once both the subject and the object. Now, given that self-consciousness is the defining feature of humanity and thereby central to any account of humanity, a problem arises. If the mind is an object of research, the object is deprived of its characteristic feature, namely self-consciousness, which remains with the subject of the research, leaving nothing but “a wretched association machine” (Ferrier 2001: vol. 3. 195). But, if the mind is considered with self-consciousness, then it cannot be properly considered an object of research because the objectivity is lost in so far as the subject and the object are identical. This leads Ferrier to suggest a change of focus for philosophy; instead of the empirical endeavor of a “science of human nature,” he prefers a more metaphysical approach, which is the development of a “philosophy of consciousness.”

In suggesting a “philosophy of consciousness,” Ferrier conceives philosophy as an extension of what people already do. Philosophy and self-consciousness are different only in degree and not in kind. Philosophy is a systematic and elevated self-consciousness, whereas self-consciousness is unsystematic and informal philosophy. He describes it as follows: “Consciousness is philosophy nascent; philosophy is consciousness in full bloom and blow … thus all conscious men are to a certain extent philosophers, although they may not know it” (Ferrier 2001: vol. 3. 197).

b. Reappraisal of Berkeley

Later in the nineteenth century, the British Idealists such as T. H. Green, F. H. Bradley, and Edward Caird were influenced by Kant and the German Idealists. Ferrier was aware of the German philosophers, but his own idealism does not appear to be directly influenced by them. Nonetheless, he was the first Scottish philosopher to seriously consider them. Thomas De Quincey said that “he was introduced, as if suddenly stepping into an inheritance, to a German Philosophy refracted through an alien Scottish medium” (The Testimonials of J.F. Ferrier 1852, p.22). His friend and mentor, Hamilton, attempted to synthesize the commonsense philosophy deriving from Reid with the transcendental idealism of Kant. Ferrier separates himself from Kant (and by extension also from Hamilton) by rejecting the existence of noumena or things-in-themselves in the absence of percipient beings. He considers the German Idealists in a more favorable light, and he wrote biographical entries on both Schelling and Hegel for the Imperial Dictionary of Philosophy (see Ferrier 2001: vol. 3. 545-568). He also makes the occasional reference to Fichte, Schelling, and Hegel in his published works; in general, he views them positively, while depicting Hegel as an opaque genius. For instance, he says:

whatever truth there may be in Hegel, it is certain that his meaning cannot be wrung from him by any amount of mere reading, any more than the whisky which is in bread … can be extracted by squeezing a loaf into a tumbler. He requires to be distilled, as all philosophers do, more or less—but Hegel to an extent which is unparalleled. A much less intellectual effort would be required to find out the truth for oneself than to understand his exposition of it. (Ferrier 2001: vol. 1. 96)

Yet, the most important idealist influence for Ferrier was the Irish philosopher Berkeley: “we are disposed to regard [Berkeley] as the greatest metaphysician of his own country (we do not mean Ireland; but England, Scotland, and Ireland) at the very least” (Ferrier 2001: vol. 3. 458). Indeed, Ferrier, along with his contemporary Alexander Campbell Fraser, can be credited with reviving Berkeley’s philosophy in the nineteenth century. Ferrier refers to Berkeley on numerous occasions throughout his published works, and in “Berkeley and Idealism” he provides an argument for idealism that is developed out of his reaction to Berkeley. First, he defends Berkeley from the accusation that he denies the existence of the external world. Second, he expands on an idealist conception of non-existence, which is something that he believes that Berkeley has overlooked.

Berkeley shared Locke’s belief that ideas are the immediate objects of the mind. However, he rejected Locke’s view that ideas represent real things, and that real things are the indirect objects of the mind. Berkeley argued that ideas are the real things and that there is nothing beyond them. Thus, for Berkeley, the mind directly knows reality. His conclusion that ideas are real things led many to conclude that Berkeley denied the existence of material objects (for instance, see Leibniz, Samuel Johnson, and Reid). Yet, Ferrier strongly rejects the widespread belief that Berkeley denies the existence of matter. He argues that Berkeley readily accepts the existence of matter in the ordinary understanding of such; the external world consists of solid extended bodies that are perceived by the senses. However, he allows that Berkeley denies the existence of the world in itself, a world beyond perceivers. Ferrier emphasizes that what Berkeley wants to show is that reality is as it appears to perceivers; it is the immediate object of perceptions. He denies the existence of intermediate entities between the perceiver and reality and instead argues that that which is perceived is that which exists. In connection with this, Ferrier supports another aspect of Berkeley’s epistemology, specifically, his contention that primary and secondary qualities are akin in so far as each depends on perceivers and provides information about reality. Neither primary nor secondary qualities denote anything more objective about reality; reality is that which is perceived and both primary and secondary qualities are perceived.

Berkeley considered his own philosophy to be in line with common sense, and Ferrier agrees. According to Ferrier, it is Berkeley rather than Reid who is “the champion of common sense” (Ferrier 2001: vol. 3. 301). Berkeley’s idealism places the mind in direct contact with reality; there are no intermediate entities. This, Ferrier suggests, is in line with the experience of ordinary people who do not distinguish between the perceptions of objects and the objects themselves. It is the notion of things-in-themselves, or of a world that exists independently of perceivers, that is at odds with common sense. Berkeley’s idealism, by contrast, is in accordance with common sense.

On the one hand, Ferrier describes Berkeley as “the champion of common sense.” On the other hand, he says that the significance of Berkeley’s philosophy is that he provides the basis for absolute idealism. He says:

[Berkeley] was the first to stamp the indelible impress of his powerful understanding on those principles of our nature, which, since his time, have brightened into imperishable truths in the light of genuine speculation. His genius was the first to swell the current of that mighty stream of tendency towards which all modern meditation flows, the great gulf-stream of Absolute Idealism. (Ferrier 2001: vol. 3. 293)

For Ferrier, common sense and absolute idealism are complementary. According to Ferrier, when “genuine idealism” is “instructed by the unadulterated dictates of common sense” it is indistinguishable from “genuine unperverted realism” (Ferrier 2001: vol. 3. 309).

His admiration for Berkeley is clear, and he says: “Among all philosophers, ancient or modern, we are acquainted with none who presents fewer vulnerable points than Bishop Berkeley” (Ferrier 2001: vol. 3. 291). Nevertheless, he acknowledges that there is a weakness in Berkeley’s philosophy, namely, his failure to address non-existence. An objection often leveled against idealism is that it implies that things flit in and out of existence; for example, the tree exists only in so far as it is perceived, and when it is not perceived, it cannot exist. Ferrier recognizes that Berkeley’s account seems to suggest that the world exists only in so far as it is perceived. He believes that this makes Berkeley vulnerable to accusations of subjective idealism. To overcome this, Ferrier broadens Berkeley’s account to include non-existence.

There are two parts to his discussion of non-existence. First, he reiterates the Berkeleian argument that mind-independent objects cannot exist because it is impossible to conceive of them. He says that if a philosopher speaks of the world-as-it-is-in-itself (for instance, the world existing prior to and following the existence of percipient beings), they are obliged to posit an ideal percipient. For example, in order to think of the River Nile existing in a world where there are no percipient beings, one must think about it in terms of its perceivable qualities: size, color, boundaries and so forth. But, in thinking of such things, one is still thinking of the act of perception and not the thing-in-itself. Here, Ferrier returns to “the fundamental act of humanity.” He emphasizes that that which is perceived is inseparable from the act of perception; it is impossible to consider what is seen in isolation from the act of seeing, what is heard in isolation from the act of hearing, and so on.

Second, Ferrier asserts that this argument must be extended to include non-existence as well. Not only is the existence of the world inconceivable without a real or ideal perceiver, but also non-existence similarly requires such a perceiver. In order to conceive nothing, that is, silence, colorlessness, tastelessness, and so forth, the philosopher must refer to her perceptual framework. Ferrier develops Berkeley’s view that existence is percipi by insisting that non-existence is also percipi. Using Kantian language, he argues that “no phenomena, not even … the phenomenon of the absence of phenomena, are thus independent or irrespective” (Ferrier 2001: vol. 3. 315). Ferrier contends that it is not only matter that depends upon perceivers but also the non-existence of matter. He says:

[U]niversal colourlessness, universal silence, universal impalpability, universal tastelessness, and so forth, are just as much phenomena requiring, in thought, the presence of an ideal percipient endowed with sight and hearing and taste and touch, as their more positive opposites were phenomena requiring such a percipient. (Ferrier 2001: vol. 3. 311)

In this way, non-existence is just as much a known concept as existence. In order to conceive of either the existence or the non-existence of the world, a percipient being, whether real or ideal, is required. By supplementing Berkeley’s theory in this manner, he believes it becomes invulnerable to accusations of subjective idealism; one cannot say that the world will cease to exist in the absence of percipient beings because percipient beings are required to conceive of the world ceasing to exist.

c. Critique of Reid

Although he died more than a decade before Ferrier was born, Thomas Reid’s influence on Scottish philosophy remained strong during Ferrier’s youth and career. Hamilton is famous for his annotated edition of Reid’s works, and while Ferrier professes admiration for Hamilton’s scholarship, he wholeheartedly rejects the focus of his intellect. In Ferrier’s view, Reid produced a form of realism that not only failed to overcome the representative theory of perception but also resulted in its own form of representationism. Additionally, for Ferrier, Reid’s commonsense philosophy is inadequate and anti-philosophical. Instead, he calls for a new Scottish philosophy that is more systematic and rational; that is, an idealist metaphysics.

Reid was a Berkeleyan in his youth, but Hume’s skepticism led him to reassess his philosophical assumptions, which, in turn, led him to reject the theory of ideas. A version of the theory of ideas can be found in a range of philosophers from Descartes to Hume. In general, this theory posits that ideas are the immediate objects of one’s mind. This epistemological belief allows for a variety of metaphysical positions, including: Locke’s realism, Berkeley’s idealism, and Hume’s skepticism. Reid recognized that Hume’s astute reasoning was the logical development of the theory of ideas. At the same time, he could not accept Hume’s conclusions that we must be skeptical about things such as the continued existence of objects or the continuation of one’s personal identity. Thus, Reid examined the foundations of this theory: the existence of ideas. He realized that he had no experience of ideas and concluded that they are philosophical constructs, which are at odds with common sense. According to Reid, all persons share a priori commonsense principles upon which all reasoning depends. For instance, the belief in the existence of the external world, the principle of causality, and the belief that one is the same person she was yesterday and will be tomorrow, all count among Reid’s principles of common sense. The aspect of Reid’s theory that is most important for Ferrier is his philosophy of perception. Reid holds that we perceive objects directly and not via intermediate entities such as ideas. In his view, all persons have a commonsense belief in the existence of the external world that is irresistible and prior to reasoning. In this way, Reid was said to remove representationism from the theory of perception; the objects of knowledge are the things themselves rather than representative intermediaries such as ideas. Ferrier, however, argues that Reid failed to disprove representationism and that Reid’s theory of perception retains a form of representationism.

A discussion of the perception of matter is central to Ferrier’s philosophical writings, and it is this issue that he believes demonstrates the central difference between Berkeley and the commonsense school. One of his main talking points is representationism. On this topic, he dismissively says that “Berkeley thus accomplished the very task which, fifty or sixty years afterwards, Reid laboured at in vain” (Ferrier 2001: vol. 1. 490). Ferrier believes that Reid and others have misunderstood Berkeley by mistaking him for a representationist. Yet, Ferrier believes that idealism—both his own and Berkeley’s—is the only type of philosophy that can overcome representationism. He criticizes Reid’s theory of perception throughout his published works, and his argument against him is best expressed in his article “Reid and the Philosophy of Common Sense.” Here, he refutes Reid’s realist account of perception and develops his own idealist theory.

Ferrier divides philosophical accounts of perception into two schools: the metaphysical school and the psychological school. His idealist metaphysics is an example of the former and Reid’s commonsense philosophy is an example of the latter. Both schools accept that the perception of matter occurs, yet, they disagree about what this entails. Ferrier considers “the perception of matter” to be a whole, indivisible unit:

In the estimation of metaphysic, the perception of matter is the absolutely elementary in cognition, the ne plus ultra of thought. Reason cannot get beyond, or behind it. It has no pedigree. It admits of no analysis. It is not a relation constituted by the coalescence of an objective and a subjective element. It is not a state or a modification of the human mind. It is not an effect which can be distinguished from its cause. It is not brought about by the presence of antecedent realities. It is positively the FIRST, with no forerunner. The perception-of-matter is one mental word, of which the verbal words are mere syllables. (Ferrier 2001: vol. 3. 410, 411)

On the other hand, there is the psychological school’s approach to the perception of matter, which considers the relation between two component parts: the subjective perception and the objective matter. And, in Ferrier’s view, this approach leads to representationism.

Representationists make a distinction between an immediate and a remote object of the mind. For instance, Locke argues that we know things in the world via our ideas; things are the indirect objects of our minds, whereas ideas are the immediate object of our minds. What Ferrier believes is that Reid and other “psychologists” similarly set up a remote and an immediate object of the mind in their accounts of perception. He argues that the psychological school holds that there is the material world which exists regardless of whether it is perceived or not and that there are percipient beings who know the material world via their perceptions of it. It follows that in this account of the perception of matter there is both an objective aspect (the external world) and a subjective aspect (the subject’s perception of that world). He observes that this creates both an immediate and a remote object of knowledge; the subject knows her perception of the world immediately, whereas she knows the world remotely and only via her perception of it. He says:

When a philosopher divides, or imagines that he divides, the perception of matter into two things, perception and matter; holding the former to be a state of his own mind, and the latter to be no such state; he does, in that analysis, and without saying one other word, avow himself to be a thoroughgoing representationist. For his analysis declares that, in perception, the mind has an immediate or proximate, and a mediate or remote object. Its perception of matter is the proximate object, the object of its consciousness; matter itself, the material existence, is the remote object—the object of its belief. (Ferrier 2001: vol. 3. 415)

Therefore, Ferrier suggests that in avoiding representationism, Reid and others are paradoxically guilty of the very thing that they are attempting to dispel. In order to truly avoid representationism Ferrier insists on an idealist account of perception. Again he returns to “the fundamental act of humanity.” In his view, the “perception of matter” is a composite that cannot be broken down into its constituent parts; subjects and objects are always presented at once and can never be separated.

While Ferrier’s critique of Reid’s analysis of the perception of matter is astute, at other times he makes derogatory, ad hominem remarks about his predecessor. For instance, he says that when Reid is considered alongside philosophers such as Berkeley or Hume, he is akin to a “whale in a field of clover” (Ferrier 2001: vol. 1. 495). Remarks such as these have more to do with the dominance of commonsense philosophy during his lifetime and the ways in which it hampered his own career than with a thoughtful analysis of Reid’s ideas. Yet, despite his dismissal of Reid and the philosophy of common sense, Ferrier nevertheless wants to retain the language of “common sense.” Indeed, he believes that his own idealism is an example of an enlightened system of common sense.

d. Idealist Metaphysics

One of Ferrier’s criticisms of the philosophy of common sense is that it formalizes the inadequacies of ordinary thinking.

Common sense … is the problem of philosophy, and is plainly not to be solved by being set aside, but just as little is it to be solved by being taken for granted, or in other words, by being allowed to remain in the primary forms in which it is presented to our notice. (Ferrier 2001: vol. 3. 64)

By contrast, he thinks that philosophy should fulfill a corrective purpose; he says: “philosophy exists only to correct the inadvertencies of man’s ordinary thinking” (Ferrier 2001: vol. 1. 32). A rational consideration of the laws of thought is required to separate unrefined opinions from the “genuine principles of common sense.” This is exactly what he tries to achieve in his major work the Institutes of Metaphysic; here, he attempts to systematically reveal the laws of thought via reason.

The Institutes is arranged into three main books, which follow on from one another: the Epistemology, the Agnoiology or theory of ignorance, and finally the Ontology. Together, they comprise his idealist metaphysics. Unusually, for a philosophical work, the Institutes is written in a deductive style. Ferrier’s metaphysics are deduced from an axiomatic, self-evident principle. In the introduction to his Institutes he asserts that: “From this single proposition the whole system is deduced in a series of demonstrations, each of which professes to be as strict as any demonstration in Euclid, while the whole of them taken together constitute one great demonstration” (Ferrier 2001: vol. 1. 30). His “Epistemology” consists of twenty-two propositions, the “Agnoiology” has eight propositions, and he concludes with the eleven propositions that form his “Ontology.” Each proposition involves a demonstration and a subsequent discussion in which he posits a counter-proposition that he disproves.

While Ferrier’s own philosophy is largely unknown to contemporary epistemologists, it is noteworthy that he was the first philosopher in English to call the philosophy of knowledge “epistemology.” His own epistemology is central to his philosophy, as is evident from the fact that it forms the largest part of his metaphysics. It is also the common focus that appears in all of his published works. In his 1841 article “The Crisis of Modern Speculation,” he says: “Before we can be entitled to speak of what is, we must ascertain what we can think” (Ferrier 2001: vol. 3. 272). This is a principle that he follows in the Institutes by grounding his metaphysics in his epistemology. For Ferrier, it is important to secure the laws of thought before making any positive statements about reality. Thus, “Proposition I” or “the primary law or condition of all knowledge” is the axiom from which the rest of Ferrier’s system follows. It asserts that: “Along with whatever any intelligence knows, it must, as the ground or condition of knowledge, have some cognisance of itself” (Ferrier 2001: vol. 1. 79).

The first proposition asserts that self-consciousness is the necessary concomitant of all knowledge; in knowing anything (for example, “that Tuesday follows Monday,” or “that one is reading Ferrier’s metaphysics”), at the same time, a person knows herself. In this way, Ferrier’s Institutes are the natural development of his work on consciousness; self-consciousness, as the peculiar feature of humanity, shapes his entire metaphysics. From this starting point, the main deductive conclusion that follows is that the minimum unit of cognition requires some self in union with some object. This forms Ferrier’s conception of the absolute; for Ferrier, a synthesis of subject-with-object is the absolute in knowledge.

If that which can be known must be a synthesis of subject-with-object, then, this is a union, which cannot be broken down into its constituent parts. As such, there can be no mere objects or matter per se. He says:

Everything which I, or any intelligence, can apprehend, is steeped primordially in me … Whether the object be what we call a thing or what we call a thought, it is equally impossible for any effort of thinking to grasp it as an intelligible thing or as an intelligible thought, when placed out of all connection with the ego. This is a necessary truth of all reason—an inviolable law of all knowledge. (Ferrier 2001: vol. 1. 120)

Hence, in perception, there can be no objects as they are, independent of knowers (typically known as things-in-themselves or noumena). For Ferrier, things-in-themselves are not objects of knowledge; they are unthinkable and as such they are the contradictory and unknowable by any mind, including by a supreme knower. In rejecting things-in-themselves, he has in mind Reid but also Hamilton and Kant as well as any philosophers who hold that there is a noumenal world. In his idealist epistemology, the notion of a thing-in-itself contradicts the laws of thought; one cannot conceive of a thing-in-itself because the synthesis of subject-with-object is the minimum unit of cognition, which cannot be broken down. Similarly, subjects-in-themselves are unknowable by all minds, including that of a supreme knower. In this way, the ego or self in itself is unknowable. While the self is the constant concomitant of all knowledge, there must also be an object that it is conjoined with. Ferrier calls the self the universal in all knowledge and the object is the particular in all knowledge.

Once he has established what can be known, he wants to reveal what cannot be known. Thus, in his Agnoiology he considers what, if anything, is a possible object of ignorance. This is one of the most distinctive and interesting features of Ferrier’s philosophy because the philosophy of ignorance has been given limited attention in the history of philosophy. His definition of ignorance is: not knowing that which could be known. In his view, ignorance involves a deficit or a privation of knowledge; it is a failure by the knower to know something that could be known. In some cases, this might be a result of one’s limited constitution; for instance, a finite knower has more limited abilities for cognition than a supreme knower, and there are some things that a finite knower could never know but that are nevertheless the object of knowledge for some knower. In other cases, this might be a failure of will or effort; for instance, one might not know the time of day at a given moment, although that is something that could be rectified. By contrast, there are things that could never be known by any knower, including a supreme knower. This is what Ferrier designates the contradictory. For instance, no one, including a supreme knower, could know that 2 + 2 = 5 because this violates the laws of reason. For Ferrier, not knowing the contradictory is not ignorance but rather evidence of the strength of reason. Thus, “Proposition III” of his “Agnoiology” or “the law of all ignorance” asserts that: “We can only be ignorant of what can possibly be known; in other words, there can be an ignorance only of that of which there can be a knowledge” (Ferrier 2001: vol. 1. 412).

Given that in his “Epistemology” he has already concluded that the object of knowledge must be a synthesis of subject-with-object, the central conclusion of the “Agnoiology” is that that which we are ignorant of is a synthesis of subject-with-object, or in other words, the absolute in cognition. That which is the object of knowledge is some synthesis of subject-with-object. That which is the object of ignorance is some synthesis of subject-with-object. Thus, the possible objects of knowledge and ignorance are one and the same: the absolute in cognition. It follows that matter per se and the ego per se are neither the objects of knowledge nor ignorance. He returns to his contention that his idealism is in line with common sense when he says:

Novel, and somewhat startling, as this doctrine may seem, it will be found, on reflection, to be the only one that is consistent with the dictates of an enlightened common sense … If we are ignorant at all (and who will question our ignorance?) we must be ignorant of something; and this something is not nothing, nor is it the contradictory. (Ferrier 2001: vol. 1. 434)

Once Ferrier has established that the absolute must be the object of knowledge and ignorance, he moves to the question of being and considers what is. His “Ontology” directly follows from his “Epistemology” and the “Agnoiology.” In the opening proposition of this section he sets out the possibilities for that which is, which he refers to as “Absolute Existence.” It must be that which is (1) an object of knowledge, (2) that which is an object of ignorance, or (3) that which is neither an object of knowledge nor an object of ignorance. That which we can neither know nor be ignorant of is the contradictory and as such cannot be that which absolutely exists; Ferrier argues that this is a conclusion that even skeptics must allow for. He says:

No form of scepticism has ever questioned the fact that something absolutely exists, or has ever maintained that this something was the nonsensical. The sceptic, even when he carries his opinions to an extreme, merely doubts or denies our competency to find out and declare what absolutely exists. (Ferrier 2001: vol. 1. 466)

Therefore, that which exists must be the object of knowledge or ignorance, or, in other words, it is the absolute: a synthesis of subject-with-object.

The influence of Berkeley again becomes apparent in the development of his idealist ontology because he concludes the Institutes with the proposition that there is only one necessary absolute existence, namely, a supreme mind in synthesis with the universe. He says: “All absolute existences are contingent except one; in other words, there is One, but only one, Absolute Existence which is strictly necessary; and that existence is a supreme and infinite, and everlasting Mind in synthesis with all things” (Ferrier 2001: vol. 1. 522). Grounding Ferrier’s metaphysics is the notion that God is both the supreme knower and the only necessary knower. Every other knower is finite and contingent; therefore, the existence of reality cannot depend on them. Ferrier argues that reason dictates that there must be a supreme mind to prevent the universe from being contradictory. This is because objects per se are contradictory. Therefore, the universe, which constitutes the objective part of knowledge, must be in conjunction with some subject in order to provide it with existence.

3. Reception and Influence

Ferrier was arguably the best Scottish philosopher of his generation. However, his contemporaries did not uniformly welcome his idealist metaphysics, believing the Institutes to be too far removed from the philosophy of his predecessors. Commonsense philosophy was dominant in the Scottish universities in the decades following Reid’s death. Subsequent generations of philosophers from Dugald Stewart to Hamilton defended some version of commonsense philosophy, which led nineteenth-century writers such as Ferrier, Andrew Seth Pringle-Pattison, and James McCosh to speak of a tradition of “Scottish philosophy.” In the history of Scottish philosophy, the role of the universities was of considerable importance, and acquiring a key university Chair often signified the status of the philosopher at the time. Many important philosophers held such academic chairs; for instance, both Adam Smith and Thomas Reid held the Chair of Moral Philosophy at Glasgow, Dugald Stewart held the Chair of Moral Philosophy at Edinburgh, and Sir William Hamilton held the Chair of Logic and Metaphysics at Edinburgh. A notable exception to this list is David Hume, who unsuccessfully tried to acquire Chairs of philosophy at both Edinburgh and Glasgow. In many respects, Ferrier was the obvious candidate to succeed Hamilton in the esteemed Chair of Logic and Metaphysics at Edinburgh. Although Hamilton was best known for his editions of Reid’s works, he tried to combine Reid with Kant, while placing a greater emphasis on metaphysics than there had been before. Ferrier developed this tendency towards metaphysics even further with his idealism and his rejection of Reid’s commonsense philosophy. Additionally, Ferrier had taught in place of Hamilton during his mentor’s illness in the 1840s, and he was highly esteemed by Hamilton and others for his philosophical acuity. Nevertheless, Ferrier was unsuccessful in his attempt to acquire the Chair of Logic and Metaphysics in 1856, losing out to the lesser-known Alexander Campbell Fraser.

He reacted angrily to his defeat, and it led him to produce his polemical work Scottish Philosophy: The Old and the New, which is a defense of his philosophical system as well as a scathing attack on his opponents. Ferrier’s animosity is not directed at Fraser; instead, he targets those who campaigned against him as well as Edinburgh’s Town Council, which was responsible for appointing Hamilton’s successor. Here, he employs extraordinary rhetoric to argue that there is a distinction between old and new Scottish philosophy. In his analysis, his idealist metaphysics represents a “new Scottish philosophy,” whereas adherence to Reid and Hamilton is equivalent to perpetuating the “old Scottish philosophy.” In the campaign against Ferrier, his idealism was portrayed as being insufficiently Scottish. He replies that his philosophy is quintessentially Scottish even though it differs from that of Reid and Hamilton in certain respects. He says: “Philosophy is not traditional. As a mere inheritance it carries no benefit to either man or boy. The more it is a received dogmatic, the less it is a quickening process” (Ferrier 1856: 9). To discredit Ferrier, his opponents compared his philosophy to those of Hegel and Spinoza, with associations of pantheism and atheism mixed with nationalism and xenophobia. Ferrier denies the accusation that his philosophy is Hegelian and points out that claims to the contrary are simply propaganda. Moreover, he responds to suggestions that his philosophy is similar to Spinoza’s by wholeheartedly demonstrating his antipathy toward those who campaigned against him: “all the outcry which has been raised against Spinoza has its origin in nothing but ignorance, hypocrisy, and cant” (Ferrier 1856: 14). Ferrier was educated in the Scottish tradition, and the work he created was in direct reaction to it. The difference between Ferrier’s Institutes of Metaphysic and Reid’s philosophy of common sense is substantial. However, the difference between Ferrier’s thought and Hamilton’s is less dramatic.

Ironically, some decades later, the association with Hegel did not carry a negative connotation. Alexander Campbell Fraser went on to teach several of the British Idealists of the latter part of the nineteenth century, and Edward Caird, an avowed Hegelian, was the Professor of Moral Philosophy in Glasgow for several years. The idealist R. B. Haldane summed up this change in attitude when he said: “The Time-Spirit is fond of revenges” (Haldane 1899: 9). In retrospect, Ferrier’s idealism appeared a few decades too early to find a receptive audience.

4. References and Further Reading

a. Primary Sources

Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, 3 vols: i. Institutes of Metaphysic, ii. Lectures on Greek Philosophy, iii. Philosophical Remains, Bristol: Thoemmes Press, 2001.
Ferrier, James Frederick, Scottish Philosophy: The Old and the New, Edinburgh: Sutherland and Knox, 1856.

b. Secondary Sources

Boucher, David, “Introduction” in The Scottish Idealists: Selected Philosophical Writings, Exeter: Imprint Academic, 2004.
Broadie, Alexander, A History of Scottish Philosophy, Edinburgh: Edinburgh University Press, 2009.
Cairns, Revd. J, An Examination of Professor Ferrier’s “Theory of Knowing and Being,” Edinburgh: Thomas Constable and Co, 1856.
Davie, George, Ferrier and the Blackout of the Scottish Enlightenment. Edinburgh: Edinburgh Review, 2003.
Davie, George, The Democratic Intellect: Scotland and Her Universities in the Nineteenth Century. Edinburgh: Edinburgh University Press, 1961.
Davie, George, The Scotch Metaphysics: A Century of Enlightenment in Scotland. London: Routledge, 2001.
Ferreira, Phillip, “James Frederick Ferrier” in A. C. Grayling, Naomi Goulder, and Andrew Pyle (eds.), Continuum Encyclopedia of British Philosophy, London: Thoemmes Continuum, 2006, ii. 1085-1087.
Fraser, Alexander Campbell, “Ferrier’s Theory of Knowing and Being” in Essays in Philosophy. Edinburgh: W.P. Kennedy, 1856.
Graham, Gordon (ed.), Scottish Philosophy in the Nineteenth and Twentieth Centuries, Oxford: Oxford University Press, 2015.
Graham, Gordon, “The Nineteenth-Century Aftermath” in Broadie, Alexander (ed.), The Cambridge Companion to the Scottish Enlightenment, Cambridge: Cambridge University Press, 2003.
Haldane, E. S., James Frederick Ferrier. Edinburgh and London: Oliphant Anderson & Ferrier, 1899.
Haldane, John, “Introduction” in Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, Bristol: Thoemmes Press, i. Institutes of Metaphysic, 2001.
Jaffro, Laurent, “Reid said the business, but Berkeley did it.” Ferrier interprète de l’immatérialisme in Revue philosophique de la France et de l’étranger 135: 1, pp.135-149, 2010.
Keefe, Jenny, “James Ferrier and the Theory of Ignorance” in The Monist, Volume 90, No.2, pp.297-309, 2007.
Keefe, Jenny, “The Return to Berkeley” in British Journal for the History of Philosophy, Volume 15, Issue 1, pp.101-113, 2007.
Lushington, E. L., “Introductory Notice” in Ferrier, James Frederick, Philosophical Works of James Frederick Ferrier, Bristol: Thoemmes Press, ii. Lectures on Greek Philosophy, 2001.
Mander, W. J., British Idealism: A History, Oxford: Oxford University Press, 2011.
Mander, W. J. and Panagakou, S., British Idealism and the Concept of the Self, London: Palgrave Macmillan, 2016.
Mander, W. J. (ed.), The Oxford Handbook of British Philosophy in the Nineteenth Century, Oxford: Oxford University Press, 2014.
Mayo, Bernard, “The Moral and the Physical Order: A Reappraisal of James Frederick Ferrier,” Inaugural Lecture, University of St Andrews, 1969.
McCosh, James, The Scottish Philosophy, New York: Robert Carter and Brothers, 1875.
McDermid, Douglas, “Ferrier and the Myth of Scottish Common Sense Realism” in Journal of Scottish Philosophy, Volume 11, Issue 1, pp.87-107, 2013.
McDermid, Douglas, The Rise and Fall of Scottish Common Sense Realism, Oxford: Oxford University Press, 2018.
Muirhead, J. H., The Platonic Tradition in Anglo-Saxon Philosophy, London: George Allen & Unwin, 1931.
Segerstedt, Torgny T., The Problem of Knowledge in Scottish Philosophy (Reid-Stewart-Hamilton-Ferrier). Lund: Gleerup, 1931.
Seth, Andrew, Scottish Philosophy: A Comparison of the Scottish and German Answers to Hume, Edinburgh and London: William Blackwood and Sons, 1885.
Sorley, W. R., A History of English Philosophy, Cambridge: Cambridge University Press, 1920.
Thomson, Arthur, Ferrier of St Andrews: An Academic Tragedy, Edinburgh: Scottish Academic Press, 1985.
The Testimonials of J.F. Ferrier, Candidate for the Chair of Moral Philosophy in the University of Edinburgh, Second Series, 1852.

Author Information

Jenny Keefe
Email: keefe@uwp.edu
University of Wisconsin–Parkside
U. S. A.

Eduard Hanslick (1825–1904)

Eduard Hanslick was a Prague-born Austrian aesthetic theorist, music critic, and the first professor of aesthetics and history of music at the University of Vienna, who is commonly considered the founder of musical formalism in aesthetics. His seminal treatise Vom Musikalisch-Schönen (On the Musically Beautiful) of 1854 is one of the most significant contributions to musical aesthetics ever written, as is evident from the ten editions the book went through during Hanslick’s lifetime, with many editions to follow. Hanslick’s classic treatise was translated into English as early as 1891. On the Musically Beautiful, or OMB, posits an aesthetic approach to music, derived solely from its specific material features, that has helped to shape the fields of aesthetics and musicology up to our own day. Hanslick’s scientific and objectivist orientation, his critical attitude towards metaphysics, and his theory of emotion (strikingly reminiscent of modern cognitive concepts) guarantee his continued relevance for current debates.

OMB is notorious primarily for its ostensible repudiation of any pertinent connection between music and affect states. Hanslick’s concept of music, according to this view, is based solely on the formal aspects of pure music that does not arouse, express, represent, or allude to human emotion in any way relevant to its artistic essence: The content of music, Hanslick (in)famously proclaimed, consists entirely of “sonically moved forms.”

This article provides an introduction to Hanslick’s biography, his early music reviews, which differ considerably from the eventual opinions he is commonly associated with, and portrays the key arguments of Hanslick’s aesthetic approach as presented in OMB, including a reconstruction of the complex genesis of this book. The concluding paragraphs encompass an overview of several crucial sources of Hanslick’s viewpoint, seemingly oscillating between German idealism and Austrian positivism, as well as a concise history of Hanslick’s reception in analytical philosophy of music, which continues to struggle with the issues posed by Hanslick’s cognitive concept of emotion and has drafted numerous strategies to circumvent Hanslick’s skeptical outcome.

Biography
Early Works and Critical Writings
Vom Musikalisch-Schönen / On the Musically Beautiful
The Intellectual Background of Hanslick’s Aesthetics
The Reception of Hanslick’s Aesthetics and Its Relevance to Current Discourse
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Biography

Eduard Hanslick, who Germanized his surname by inserting a “c” upon his move to Vienna in 1846, was born in Prague on September 11, 1825 as the son of Josef Adolf (1785–1859) and Karoline Hanslik (1796–1843), daughter of the Jewish court factor Salomon Abraham Kisch (1768–1840). According to Hanslick’s memoirs, his father was responsible for his education and thus may have sparked his interest in aesthetics, as Josef Adolf edited the two volumes of Johann Heinrich Dambeck’s Vorlesungen über Ästhetik (Lectures on Aesthetics, 1822–23) and filled in as Dambeck’s substitute in 1816–17, teaching aesthetics at Prague’s Charles University. Hanslick, who also took lessons with the renowned composer Václav Tomášek (1774–1850), completed his philosophical elementary studies—a three-year course in general education mandatory for all prospective university attendees—between 1840 and 1843, enrolled in law at Prague, and attained his doctoral degree in Vienna in 1849 (on Hanslick’s early days, see Grey 2002, 828–29; Grey 2011, 360–61; Hanslick 2018, xv–xvi). Hanslick’s background in law had significant influence on his philosophical methodology as his standard for evidence and his emphasis on “proximate causes” (Hanslick 1986, 32)—which limit the chain of “admissible causes-in-fact” and enable Hanslick’s strong focus on “the music itself” instead of the listener, performer, or composer (Pryer 2013, 55)—are clearly derived from juridical training. After a short-lived employment as a fiscal civil servant in Klagenfurt (Carinthia) in 1850–52, during which Hanslick prepared for an academic profession (Wilfing 2018, 91n), he returned to Vienna to work at the ministry of finances and was subsequently transferred to the ministry of education in 1854.

This move proved crucial for Hanslick’s future career, as Count Thun-Hohenstein (1811–88), who led the education department from 1849 to 1860, had been charged with the overall reform of Austrian education following the 1848–49 revolution, and Hanslick thus came into direct contact with Thun’s agenda and the demands of the science policies of the Hapsburg Monarchy. The initial traces of the book he would become famous for also fall within this time frame, with OMB completed in 1854. In 1856, this book was acknowledged retroactively as a philosophical habilitation, thereby granting Hanslick an unsalaried professorship at the University of Vienna that turned into a salaried position in 1861, and ultimately a full post in 1870. Hanslick retained this post until he retired in 1895, and his successor Guido Adler (1855–1941) was appointed as professor of theory and history of music, a designation diverging markedly from Hanslick’s emphasis on aesthetics. Hanslick was firmly established in the cultural and musical scene of Vienna: he consulted in awarding public music grants and judged musical contests, was an official Austrian delegate at international conferences and world fairs, and served from 1893 to 1897 as the first chair of Denkmäler der Tonkunst in Österreich (Monuments of Musical Art in Austria), a society that to this day edits musical pieces of historic bearing on Austria. In addition to his academic activities, Hanslick enjoyed a highly successful career as a music critic (see the next section), which lasted until 1895, when Hanslick retired from his music editor post at Neue Freie Presse. Despite his retirement, Hanslick continued to publish criticism in this very journal until his death in 1904, with the last text to appear on April 7, two months before his passing—an event noted as far afield as the Musical Times and the New York Times (McColl 1995).

2. Early Works and Critical Writings

Except for his aesthetic treatise, Hanslick is renowned primarily for his activities as a music critic. As philosophical commentators usually concern themselves exclusively with OMB, the present section will briefly sketch Hanslick’s relevance in 19th-century musical discourse and will also indicate the diversity of his critical position. Today, Hanslick is known best for his skeptical attitude towards the New German School—a vague label for a loose group that is thought to comprise composers such as Hector Berlioz (1803–69), Franz Liszt (1811–86), and Richard Wagner (1813–83), but does also refer to influential journalists such as Franz Brendel (1811–68), editor-in-chief of Neue Zeitschrift für Musik. Hanslick’s career as a music critic started early on as an occasional contributor to Beiblätter zu Ost und West (Prague 1844) and—upon his move to Vienna in 1846—the Wiener Allgemeine Musik-Zeitung, ultimately transferring to the imperial Wiener Zeitung in 1848, prior to his music editor posts at Die Presse (1855–64) and its liberal offshoot Neue Freie Presse (1864–95). At that time, Hanslick proved to be an advocate of composers he would eventually disapprove of, such as Berlioz, who was called the “most magnificent phenomenon in… musical poetry,” and Wagner, who was proclaimed the “greatest dramatic talent among living composers” (Hanslick 1993, 40, 59; for the latter review, see Hanslick 1950, 33–45). Hanslick, who was acquainted personally with important composers of his era—he met Wagner as early as 1845 and acted as a local guide for Berlioz in 1846 (Payzant 1991 and 2002, 63–71)—at that time professed a romantic outlook (Yoshida 2001, 181–84) and deemed “pure” music a “language of the emotions” and the “revelation of the innermost world of ideas” (Hanslick 1993, 98, 115). 
For readers of an aesthetic theorist commonly associated with the “repudiation” of emotive musical meaning (Budd 1980) and the proponent of a classicist conception of music that does not refer to anything beyond itself, Hanslick’s 1848 essay on “Censorship and Art-Criticism” must seem particularly surprising. In this text, he condemns the “inadequate perspective that saw in music merely a symmetrical succession of pleasing tones.” Truly artful music, he continues, represents “more than music”; it is a “reflection of the philosophical, religious, and political world-views” of its time (Hanslick 1993, 157).

In the early 1850s, however, Hanslick’s outlook on music shifted considerably and eventually developed into a more “formalist” viewpoint that inverted his previously positive appraisal of Wagner’s operas. Although an exact date or a conclusive inducement for his “volte-face” (Payzant 1991, 107) is hard to determine, the classicist writings of the Prague music critic Bernhard Gutt (1812–49), from whom he adopted multiple quotations (Payzant 1989), the failed political upheaval of 1848–49, and the resulting execution of his cherished colleague Alfred Julius Becher (1803–48) seem to be crucial reasons for Hanslick’s change of opinion (Bonds 2014, 153–54; Landerer and Wilfing 2018, sec. 2). Whereas in 1848 Hanslick regarded “pure” music as an exhaustive repository for intellectual reflection that exerts tangible impact on the world of politics and religion, from this time on he developed a more formalistic conception of musical artworks that emphasizes their essentially autonomous nature. In making this move, Hanslick took part in the general erosion of Hegelian criticism, the political direction of which lost most of its appeal in the aftermath of 1848 (Pederson 1996), and entirely detached music and its aesthetic qualities from its involvement with worldly politics. Whereas the political activities of other critics ceased while they retained crucial elements of Hegelian aesthetics, such as emotivism or its focus on concrete content, Hanslick’s reversal was virtually complete. This turn is observable particularly with respect to the debate about external musical meaning, which Hanslick had declared the pivotal feature of art in 1848. A few years later, prior to the initial edition of OMB in 1854, he had reversed his attitude entirely by stating that “if an orchestral composition requires external means of conceptual understanding [that is, a literary program] in order to please… then its musical value already appears to be in question” (Hanslick 1994, 293).
Hanslick’s notion of music’s nature thus shifted from a romantic position emphasizing conceptual meaning to an appraisal of internal musical meaning oriented towards formal issues such as the inherent potential of the main theme or the clarity of melodic figures (Payzant 2002, 88–91, 96–98, 117–19).

Although Hanslick therefore adopted a critical attitude towards the New German School in later years and took issue with its poetization of “pure” music (Larkin 2013), certain matters have to be kept in mind that challenge the widespread assumption of Hanslick being a “stodgy, pedantic spokesperson for ‘conservative’ musical causes” (Gooley 2011, 289). Hanslick’s criticism of Wagner and his followers generally concerned the musical aspects of their works and deplored an absence of motivic-thematic manipulation or an overly rigorous devotion to a literary program that supposedly interfered with the “organic” unfolding of melody. His general valuation of these works, however, often proves to be astoundingly differentiated (on Hanslick’s appraisal of Wagner, see Grey 1995, 1–50; Pederson 2013, 176–77; Bonds 2014, 237–46). Although Hanslick assessed Der Ring des Nibelungen in 1876 to be “a distortion, a perversion of basic musical laws,” he was at the same time able to realize that Wagner’s tetralogy represents “a remarkable development in cultural history” (Hanslick 1950, 139, 129). It is beyond serious debate that Hanslick preferred Beethoven (1770–1827), Brahms (1833–97), and Mozart (1756–91) to Mahler (1860–1911), Strauss (1864–1949), or the Wagner “school.” Hanslick, however, did not panegyrize his preferred musicians, just as he did not condemn his “opponents” without reservation. Although Hanslick bemoaned Wagner’s musical system, his continuous modulations, and the dubious semantic qualities of the Leitmotiv—which he called “musical uniforms”—he nonetheless appreciated Wagner’s “genius for theatrical effect” (Hanslick 1950, 121, 151) and stressed the musical virtues of specific sections of Wagner’s operas. As he clarified in 1889: “Only a fool or dedicated factionist” would answer the question of Wagner’s qualities “with two words: ‘I idolize him!’ or ‘I abhor him!’” (Hanslick 1889, 56).
Furthermore, Hanslick critically (and sometimes financially) supported more modernistic composers such as Bedřich Smetana (1824–84) or Antonín Dvořák (1841–1904) as long as their general artistic principles conformed to his aesthetic approach to a certain degree (Brodbeck 2007 and 2009; Larkin 2013).

3. Vom Musikalisch-Schönen / On the Musically Beautiful

a. Genesis and Conceptual Organization of OMB

From July 1853 to March 1854, Hanslick pre-published several chapters of OMB as stand-alone articles that deal with the subjective impression and (physiological) perception of music, as well as with the complex relations between music and nature. His three-piece essay “On the Subjective Impression of Music and its Position in Aesthetics” (Hanslick 1853) was eventually transformed into chapters 4 and 5 of the finalized manuscript, whereas “Music in its Relations to Nature” (Hanslick 1854)—itself based on a public lecture of 1851—turned into chapter 6, with both texts running through hardly any significant alterations. Scholarship on the actual genesis of OMB is rather sparse, as Hanslick’s private records were lost during the Second World War (Wilfing 2018, sec. 1), and has not yet reached a consensus regarding the chronological development of Hanslick’s momentous monograph. Whereas Geoffrey Payzant surmised that Hanslick’s articles were taken from the final version of OMB (Payzant 1985, 180), recent research points to the logical order of Hanslick’s argument that runs counter to the familiar sequence of published chapters in OMB and assumes that these three chapters (4–6) were indeed written prior to the more famous chapters 1 to 3, therefore presenting the nucleus of OMB (Landerer and Wilfing 2018, sec. 4; Hanslick 2018, xvii–xix). According to this view, Hanslick first lays the foundation for his aesthetic approach by clarifying an idea of tone (chapter 6) and the way in which tones are received from the standpoint of physiology and psychology (chapters 4 and 5). This analysis is followed by Hanslick’s concept of emotion, how emotions are predicated upon these physiological and psychological responses, and what role emotions play in musical aesthetics (chapters 1–2). 
Finally, following Hanslick’s hypothesis that emotion does not form a substantial component of objectivist aesthetics, he presents his positive thesis (chapter 3) and closes his argument with concluding comments that summarize his key findings and widen the conceptual framework of OMB (chapter 7).

b. Purpose, Methods, and General Outlook of OMB

Hanslick did not write any other academic works apart from OMB and the Geschichte des Concertwesens in Wien (History of Concert Life in Vienna, 1869) and focused his literary output almost entirely on reviews. Why did he decide to publish an aesthetic treatise at the age of 29? The reason given by Hanslick himself is to provide a critique of the aesthetic emotivism that dominated mid-century discourse and to challenge the “advocates of the music of the future,” who supposedly endangered the “independent significance of music” (Hanslick 2018, lxxxv). By directly accusing Liszt and Wagner of belittling the inherent qualities of “pure” music, Hanslick contributed significantly to the view that OMB has to be read as a book directed against Wagner—a view that was conducive to the longevity of Hanslick’s treatise through the discussions surrounding the New German School. Even though there is some truth to this claim, scholars dispute that Wagner’s music can actually be regarded as the prime spark for the production of OMB (Grey 2003, 169; Brodbeck 2014, 50), not least because Wagner’s later works that Hanslick specifically disapproved of were not yet written and Wagner’s name rarely appears in the initial edition of Hanslick’s treatise (several quotes from Wagner’s theoretical writings are belatedly included in the sixth edition of 1881). Wagner’s music—even though it was a useful target in order to remain relevant—thus does not seem to be the crucial reason for writing OMB, as the conceptual framework of Hanslick’s argument would have been very much the same “had the figure of Wagner not been there” (Bujić 1988, 8). A more tangible motive seems to be Hanslick’s very early aspiration towards an academic profession in order to leave behind his rather tedious employment as a public servant.
We know from letters written around 1851 that Hanslick noticed the absence of musical aesthetics and musicology from the Viennese university curriculum and saw the opportunity to carve a niche for his unique talent. In light of Hanslick’s academic ambitions, it comes as no surprise that OMB does not start with a theoretical definition of art, music, or beauty. On the contrary, Hanslick’s examination commences with an exhaustive definition of musical aesthetics as a scientific discipline.

Whereas romantic aesthetic theorists had occupied themselves with music’s relation to affect states, feelings, and emotions, scientific aesthetics should focus on the object itself instead of its (historical) production or (arbitrary) reception. If musical aesthetics is to become scientific, Hanslick proclaims in a sentence that strikingly anticipates Edmund Husserl’s (1859–1938) phenomenology (Wilfing 2016, 24–25), it has to “approach the natural scientific method at least as far as trying to penetrate to the things themselves” (Hanslick 2018, 1). Furthermore, the specified aesthetics of music should detach itself from any theoretical dependency on a general concept of artistic beauty that is employed to categorize “pure” music ex post facto. German idealism typically contrived an aesthetic approach firmly rooted in an overarching philosophical framework. Art, regardless of the specific medium, thus must satisfy certain epistemic principles and ethical criteria derived from this general system in order to be classified as beautiful. Idealist aesthetics therefore typically identified universal conditions of artistic beauty that were binding equally for a poem, a tragedy, a painting, a sculpture, or a piece of music (Wilfing 2018, sec. 3.3). For Hanslick, this system-bound approach was completely misguided as he is concerned exclusively with musical beauty, the “musically-beautiful,” so that it is even hard to see how his notion of specific musical beauty is related to any general concept of beauty (Bonds 2014, 190). For him, the “laws of beauty of each art are inseparable from the characteristics of its material, of its technique” (Hanslick 2018, 2). 
For this reason alone, Payzant’s rendition of Vom Musikalisch-Schönen as On the Musically Beautiful captures Hanslick’s ideas much better than Cohen’s The Beautiful in Music that suggests an aesthetic approach contrary to Hanslick’s intentions: he did not propose an abstract principle of artistic beauty, administered retroactively to “pure” music, but was interested principally in beauty solely and explicitly manifest in the art of tones (Hamilton 2007, 81; Bonds 2014, 190).

c. Arousal, Expression, and the Cognitive Concept of Emotion

To this end, Hanslick develops two central theses: a positive one, explored in chapter 3, that attempts to show that musical beauty is dependent completely on the inherent qualities of music itself, and a negative one, defined in chapters 1–2, that challenges the familiar concept that music is supposed to represent feelings and that its emotive content forms the basis of aesthetic judgment. Both ideas share common ground in Hanslick’s objective approach: as the musical artwork and its material features represent the core of Hanslick’s aesthetics, the “subjective impression” of music, its emotive impact, is relegated to a secondary aftereffect of musical material. We must thus “stick to the rule that in aesthetic investigations primarily the beautiful object, and not the perceiving subject, is to be researched” (Hanslick 2018, 2–3). Hanslick specifically addresses two ways in which music is thought to be related to affect states: (1) The idea that music’s purpose is to arouse emotion and (2) that emotions represent the content of musical artworks (an assumption employed frequently to compensate for the lack of notional meaning in music alone). The first stance is countered by the classical argument of beauty having no purpose and “content of its own other than itself.” Beauty may very well arouse pleasant feelings in the perceiving individual, but to do so is not at all constitutive for the musically beautiful that exists apart from the listener’s cognition and remains beautiful “even if it is neither viewed nor contemplated. The beautiful is thus namely merely for the pleasure of the viewing subject, but not by means of the subject” (Hanslick 2018, 4). In an argument that anticipates Edmund Gurney’s (1847–88) renowned distinction between impressive music and expressive music (Gurney 1880, 314), Hanslick moreover maintains that music’s beauty and its emotive impact do not correlate inevitably. 
Thus, a beautiful composition may not arouse any specific feelings, whilst the strong emotive impact of another musical piece does not necessarily substantiate its aesthetic qualities (Hanslick 2018, 31–33; Robert Yanal 2006 dubs this idea the “third thesis” of OMB). In general, emotive arousal—for the most part depending on individual experience, musical edification, historical discourse, and so on—cannot provide a reasonable foundation for scientific aesthetics as it exhibits “neither the necessity nor the exclusivity nor the consistency” required to establish an aesthetic principle (Hanslick 2018, 9).

In chapter 2 of OMB, Hanslick presents his key argument against emotion forming the content of “pure” music by introducing his cognitive concept of emotion—a concept that brought his treatise to the forefront of analytical aesthetics. There was widespread consensus amongst idealist systems of art that art must have some sort of content. As “pure” music lacks tangible meaning, romantic theorists invoked the opposite of conceptual definiteness as the obvious candidate for music’s content: emotion (love, fear, anger, and the like). This claim, Hanslick maintains, represents the weak spot of musical emotivism. Emotion by no means forms the conceptless counterpart to literary meaning. On the contrary, emotions are “dependent on physiological and pathological conditions” and are invoked by “mental images, judgments, in short by the entire range of intelligible and rational thought” (Hanslick 2018, 15). The analytical philosopher Peter Kivy (1990, chap. 8) popularized this view with a practical example: If I assume that uncle Charlie is cheating during a card game, the anger I experience is contingent on the object of my emotion, Charlie. However, in order to be angry, a complex structure of cognitive parameters has to be in place. I must consider cheating an immoral or indecent behavior—a belief built upon some sort of ethical system—that is performed purposely by Charlie. As soon as I spot that Charlie is not deceitful wittingly and has played the wrong cards by accident, my anger is likely to evaporate, as its conceptual foundation disappears. Emotion, in short, needs an intentional object to be an emotion—an object that “pure” music is unable to provide. As music lacks the “cognitive mechanism” necessary to portray the objects of concrete emotions, the depiction of a specific feeling “does not at all lie within music’s own capabilities” (Hanslick 2018, 15–16). 
However, music alone can express the dynamic features of emotions via its own musical impetus and is thus able to portray “one aspect of feeling, not feeling itself” (Hanslick 2018, 18). Thus, even though music alone cannot express love, fear, or anger in a direct manner, its dynamic structure can reproduce the associated movement of concrete emotions or actual events (Hanslick 2018, 30), but not in ways that allow for definite meaning, as the dynamic character of love or anger could both be violent, desperate, or passionate in specific instances.

Hanslick’s exact stance on the relation of emotion and “pure” music represents a major point of contention in current research. Several scholars hold that Hanslick severed any relevant bonds between music and affect states, so that music itself “has nothing to do with emotion” (Zangwill 2004, 29) and emotions in turn have “nothing to do with musical beauty” (Lippman 1992, 299). Other scholars point to the preface of Hanslick’s treatise, in which he states that for him the value of beauty is based on “the direct evidence of feeling” and that his protest only pertains to the “mistaken intrusion of feelings in the domain of science” (Hanslick 2018, lxxxiv). In chapter 1, Hanslick makes the same move when it comes to musical arousal: he does not want to “underestimate” the “strong feelings that music awakens from their slumber,” but merely refutes the “unscientific assessment of these facts for aesthetic principles” (Hanslick 2018, 9). For Payzant, Hanslick accepts music’s capacity to arouse, express, or portray emotion; he only “says that to do so is not the defining purpose of music” (Hanslick 1986, xvi). Stephen Davies and Peter Kivy, who in 1980 concurrently established a concept of musical emotion based chiefly on the dynamic features of musical structure that readily suggest the outward features of expressive behavior (Trivedi 2011), regarded Hanslick as a historical precursor to their shared model of enhanced formalism. The crucial disparity between enhanced formalism and Hanslick’s aesthetics, both authors hold, is that they conceive of expressive properties as objective musical properties, whereas Hanslick was reluctant to take this step (Davies 1994, 204; Kivy 2009, 64). Based on numerous passages of OMB that suggest music’s ability to be “itself intellectually stimulating and soulful” and that show how music alone “absorbs” its creator’s feelings (Hanslick 2018, 45–46, 65), this view has been called into question. 
As Hanslick locates emotive meaning in music’s kinetic features that replicate the dynamic properties of affective conditions, his stance might come close to enhanced formalism (Cook 2001, 175). In view of Hanslick’s account of musical emotion as “silhouettes” (Hanslick 2018, 27) that open a certain variety of possible meaning whilst precluding capricious readings of music, he seems to regard musical elements as indefinitely expressive (Srećković 2014, 131)—an approach that anticipates Susanne K. Langer’s (1895–1985) theory of music as an “unconsummated symbol” (Wilfing 2016, 26–29).

d. The Musically Beautiful and Music’s Relation to History

Hanslick’s arguments regarding the complex relations between emotions and music, the indeterminate expressivity of musical gestures, as well as their debatable relevance for scientific aesthetics, however, merely apply to “pure” music. As vocal music forms an amalgam of music and poetry, the emotions aroused by it cannot be ascribed to any of its codependent components in arbitrary isolation. Thus, “pure” music—instrumental compositions without a literary program, title, or text—forms the basis of Hanslick’s aesthetics (Hanslick 2018, 23–26). This lopsided approach has led scholars to assume that Hanslick regarded vocal music as an impure blending of “absolute” art forms, whilst considering instrumental music to be the ideal form of music (Alperson 2004, 260; Gracyk, chap. 1). By contrast, other scholars stressed Hanslick’s statement that any leaning towards a specific subclass of music proves to be an “unscientific procedure” (Hanslick 2018, 24), and thus read Hanslick’s favoritism as a methodological consideration without normative implications (Bonds 2014, 12; Grey 2014, 44). For Hanslick, musical beauty is never based on the literary meaning or the emotive features of music but is rather found “solely in the tones and their artistic connection”: “The content of music,” as he famously proclaims, “is sonically moved forms” (Hanslick 2018, 40–41). The purport of Hanslick’s notorious sentence has evoked a wide array of possible readings. Although the “forms” he speaks about have been interpreted occasionally to refer to large-scale forms (concerto, sonata, rondo, and so on) and have thus been translated in the singular (Dahlhaus 1989, 130; Karnes 2008, 30), it seems likely that this term actually denotes musical elements and their structural conjunction (Wilfing 2018, sec. 3.3). In contrast, sonically or “tonally” (tönend), as Payzant renders this term (Hanslick 1986, 29), is an unclear concept that has been explained divisively. 
Whereas Payzant takes this term to refer to “tone” as part of the diatonic musical scale (2002, 44–46), Landerer and Rothfarb translate tönend as “sonically” and therefore emphasize its auditory features. Much of the question whether Hanslick perceived “pure” music to be captured entirely in the score itself (Subotnik 1991, 279; Alperson 2004, 266) or to require an auditory experience to be appreciated aesthetically (Bujić 1988, 10; Hamilton 2007, 82) hinges on the problematic translation of tönend.

Hanslick, however, willingly concedes that an assertive definition of the musically beautiful is virtually impossible to achieve because “pure” music cannot express concrete meaning. Any account of music’s content thereby amounts to “dry technical specifications” or “poetic fictions” (Hanslick 2018, 43). Music, in each case, must be understood musically and can be grasped only from within, as no verbal report can suffice. If we want to specify the content of a given theme for another person, “we have to play the theme itself for him” (Hanslick 2018, 113). Although Hanslick is unable to provide an exhaustive definition of musical beauty, he guards against potential fallacies: For him, the musically beautiful represents more than symmetry, regularity, proportion (Hanslick 2018, 57–59), or a pleasant sequence of tones, as these images neglect the crucial aspect of beauty: Geist (mind or intellect). The forms music consists of are “not empty but rather filled, not mere borders in a vacuum but rather intellect shaping itself from within” (Hanslick 2018, 43). Consequently, the act of composition is an “operation of the intellect in material of intellectual capacity” and the musically beautiful is produced primarily by the “intellectual power and individuality” of the composer’s imagination that has been absorbed by musical structure as a tonal idea that “pleases us in itself” (Hanslick 2018, 45–46). “Pure” music, Hanslick contends, has its own logic based on purely musical factors, the effect of which is governed by certain natural laws that have to be discovered, examined, and elucidated by aesthetic analysis (Hanslick 2018, 47–50). At this point, the tentative character of Hanslick’s approach becomes apparent, as he does not give any substantial indication as to how this goal could be realized beyond the idea that we must observe the efficacy of musical elements that are then reduced to general aesthetic categories that in turn lead to an ultimate principle. 
Although Hanslick cannot provide a conclusive treatment for scientific aesthetics, the pivotal insight of OMB seems clear: musical beauty depends on musical material and not on any concept or emotion. Thus, Hanslick wonders whether the divergent aesthetic qualities of musical artworks might hinge on the gradation or accuracy of emotional expression and answers in the negative: A piece shows more aesthetic qualities than another simply because it contains “more beautiful tone forms” (Hanslick 2018, 51).

Here, Hanslick mentions one of the few concrete examples of musical beauty by declaring creativity, originality, and spontaneity to be essential features of musical prowess. This view is notable because Hanslick’s notion of how musical beauty relates to history is one of the most divisive aspects of OMB. Hanslick’s emphasis on the intrinsic qualities of “pure” music, ruling out the various settings of creation, listening, or performance for aesthetic concerns, has led scholars to assume that Hanslick treats beauty ahistorically (Burford 2006, 172–73; Karnes 2008, 50–52; Bonds 2014, 176–77). This view is often based on Hanslick’s assurance that his concept of beauty applies to classicism as well as romanticism and thereby pertains to “every style in the same way, even in the most opposed ones” (Hanslick 2018, 55). Hanslick moreover advocates a categorical separation between historical reasoning and aesthetic judgment: whereas the historian’s exploration of the broader context of a given piece is undeniably warranted, aesthetic inquiry hears “only what the artwork itself articulates.” In regard to this hierarchy between the aesthetic relevance of artwork and context, Hanslick somewhat anticipates the New Criticism of 20th-century literary studies principally associated with Monroe C. Beardsley and William K. Wimsatt (Appelqvist 2010–11, 77–78). However, this idea is undermined immediately by Hanslick’s remarks on the indisputable connection of artworks to “the ideas and events of the time that produced them.” As music is created by an intellect, it stands in inextricable interrelation with concurrent productions of art and the “poetic, social, scientific conditions” of its time and place (Hanslick 2018, 55–56). 
For Hanslick, the aesthetic qualities of musical elements (particular cadences, intervallic progressions, modulations, and so on) are subject to historic decline and “wear out in fifty, even thirty years.” Eternal musical beauty is “little more than a nice turn of phrase” and we may say of compositions that “rank high above the norm of their time that they were once beautiful” (Hanslick 2018, 51, 58n). This theoretical contradiction prompted scholars to distinguish between Hanslick’s principle of scientific aesthetics, which is established ahistorically, and his concept of music itself and particular instances of the musically beautiful, which are subject to change (Landerer and Zangwill 2016, 490–92; Wilfing 2016, 17–18).

e. Listening, Music’s Relation to Nature, and Music’s Content

Although Hanslick openly rejects the listener’s relevance for the constitution of the musically beautiful that exists apart from the listener’s perception, the subjective impression of music forms the topic of chapters 4 and 5 of OMB. Hanslick is not at all interested in establishing a purely intellectual apprehension of musical structure. Beauty is rooted in (physical) sensation and engages the faculty of imagination as an intermediary between sensation, intellect, and feeling: listening to music in a purely rational fashion, Hanslick contends, is as far removed from aesthetic appraisal as mere affective arousal. The musical artwork acts as an “effective median between two animated forces,” the composer and the listener. The aesthetic exaltation of the composer’s imagination yields a theme shaped by the composer’s individuality, which is subsequently elaborated according to the artistic talents of its creator (Hanslick 2018, 63–64). The composer’s personality molds music’s “infinite capacity for expression” through his “consistent preference for certain keys, rhythms, [and] transitions” that transform the composer’s sensibility into a part of objective musical structure, which in turn is open to the listener’s perception (Hanslick 2018, 65). The listener’s judgment about the concrete meaning of a given piece is therefore affected heavily by performance, which allows the artist to release directly the emotion apparently perceived in music (Hanslick 2018, 67–69). For Hanslick, the genuine affective reaction of the listener, especially powerful in the case of music, is beyond dispute, but the ways in which it is constituted vary considerably. If the listener’s approach to “pure” music involves the attentive tracking of compositional development and therefore transcends emotional indulgence, the approach is aesthetical (Hanslick 2018, 88–90).
If the emotive impact of music is received passively, however, the listener’s attitude is regarded as “pathological”—a term that carries medical connotations but derives chiefly from the Greek notion of “pathos,” thereby denoting purely passive experience (Hanslick 2018, 81–88). For Hanslick, this mode of listening originates from the physical aspects of sound and its direct effect on the human nervous system and thus lacks the necessary component of Geist to be considered aesthetical. It actually belongs to physiological, psychological, or medical research and is not subject to aesthetic inquiry (Hanslick 2018, 71–80).

Hanslick’s analysis of the complex interplay between composer, artwork, and listener is followed by an investigation of music’s relation to nature, arguably the oldest chapter of OMB. In general, artworks present a twofold relation to nature: first, through their physical material (sound, paint, stone); second, through the content nature affords to art. In the case of “pure” music, considered a cultural artefact, the physical material provided by nature merely amounts to “material for material” (wood, hide, hair) that is used to create actual musical material (tones, intervals, scales), already a product of culture (Hanslick 2018, 95). Nature thus merely offers physical material for acoustic material that in turn provides material for the creative activity of the individual composer, which builds upon the collective repository of music history. As musical content consists entirely of musical features, the origins of which are not natural, Hanslick moreover postulates that nature cannot provide content for “pure” music and thus does not have any relation to musical artworks. Whereas sculptors, painters, and writers are able to draw inspiration from human actions or nature itself, music finds no preceding prototype beyond the history of “artificial” musical material and is thus only akin to architecture. In blatant contrast to mimetic concepts of art, Hanslick thus holds that “the composer cannot transform anything, he has to newly create everything” (Hanslick 2018, 103). At this point, Hanslick once more illustrates the historical evolution of musical material, emerging gradually as a creation of intellect, by noting how certain modern intervals “had to be achieved individually” over multiple centuries. 
Music itself, in each of its various aspects, is created entirely by intellectual ingenuity and represents a “consequence of the endlessly disseminated musical culture.” Hanslick therefore explicitly warns readers to “beware of the confusion as though this (present) tone system itself necessarily lies in nature” (Hanslick 2018, 95–97). As Hanslick’s concept of scientific aesthetics is based on material features of musical structure, this view has significant implications for his entire stance: since musical material will constantly undergo extension, any alteration pertaining to crucial aspects of musical technique will also affect the basics of aesthetic research (Hanslick 2018, 98–99).

Finally, Hanslick revisits the question of musical content in order to differentiate meticulously between distinct concepts of content usually lumped together indiscriminately. Content is defined as “what something contains, holds within itself.” In the case of music, “content” denotes the tones and forms a piece of music is made of. This term is not to be confused with “subject matter” that typically indicates abstract literary content of which music has none: “music speaks not merely through tones, it speaks only tones” (Hanslick 2018, 108–109). In music, the concepts of content and form—musical material and its artistic design—mutually determine each other and are ultimately inseparable: “With music, there is no content opposed to form, because it has no form outside of the content” (Hanslick 2018, 111–12). A separation between musical content and its form pertains only to cases in which form is applied to large-scale structures, which is not the standard meaning of this term in OMB. Only then can the theme be called content, whereas the overall structure, the “architectonic of the joined individual components and groups of which the piece of music consists,” acts as form. The theme, which “develops in an organically, clearly organized, gradual manner, like luxuriant blossoms from a single bud,” constitutes the irreducible aesthetic “essence” of a piece of music. As everything in a specific musical structure is a “spontaneous consequence” of the initial theme, the multitude of prospects in which a theme could be developed determines its aesthetic substance or Gehalt: “whatever does not reside in the theme (overtly or covertly) cannot subsequently be organically developed” (Hanslick 2018, 113–14).
Even though music thus does not present subject matter along the lines of literary meaning, “pure” music, animated by “thoughts and feelings,” does clearly exhibit intellectual “substance.” Generally speaking, “pure” music has content: purely musical content manifest in the distinct musical features of the theme, which Hanslick describes poetically as a “spark of divine fire.” Musical content, Hanslick emphasizes in conclusion, derives purely from the “definite beautiful tone configuration” of a given piece as the “spontaneous creation of the intellect out of material of intellectual capacity” (Hanslick 2018, 114–16).

f. Conclusion: The Curious Nature of Hanslick’s Formalism

Hanslick’s aesthetics is frequently considered the “classical definition of formalistic aesthetics in music” (Yoshida 2001, 179) and the “inaugural text in the founding of musical formalism as a position in the philosophy of art” (Kivy 2009, 53). What is meant by musical formalism and which exact version of musical formalism Hanslick is supposed to represent, however, are among the divisive questions of Hanslick scholarship and of the philosophy of music at large. The conceptual significance of the term ‘form’ and its relevance for Hanslick’s theory seem to be overrated in principle. Philosophical commentators typically overlook that Hanslick’s definition of beauty in music—the focal point of OMB—does not rely upon any idea of form and that this term is indeed absent from Hanslick’s description of music’s artistic quality: by specific musical beauty, Hanslick designates a “beauty that is independent and not in need of an external content, something that resides solely in the tones and their artistic connection” (Hanslick 2018, 40). Furthermore, Hanslick’s infamous statement of “sonically moved forms” does not correspond to music itself, as is surmised regularly, but much more narrowly to music’s content, which is thereby equated with form, and vice versa. Even the more pointed version in the second edition of OMB, which states that forms are “solely and exclusively the content and subject of music,” does not identify form with music itself but rather claims the identity of the content and forms of music (Hanslick 2018, 41). Thus, these forms are not without content or thought of as empty but rather are imbued by intellect (Geist) “shaping itself from within” (Hanslick 2018, 43), thereby linking beauty to mental activity (Bowman 1991, 47; Paddison 2002, 335; Burford 2006, 179).
Hanslick therefore opposes one of the central claims of formalist aesthetics that usually stresses the primacy of formal features over some kind of content (Fisher 1993, 250; Kivy 2002, 67; Beard and Gloag 2005, 65). In music, he states, “we see content and form, material and design, image and idea fused in an obscure, indivisible unity,” which means that “there is no content opposed to form” as music “has no form outside of the content” (Hanslick 2018, 111–12). By stating that form and content are one, Hanslick is “almost alone among formalists” (Payzant 2002, 83) and OMB thus even “reads more like a traditional criticism of formalism” (Hamilton 2007, 88).

Whether Hanslick’s aesthetics is to be regarded as formalist, however, depends entirely on the definition of formalism espoused by scholars. The special variety of Hanslick’s approach is clarified by one of the customary definitions of formalist aesthetics, the conception of formalism as common denominator argument (Carroll 1999, chap. 3 and 2001). In this case, formalism is understood as a universal definition of art, such as in Clive Bell’s (1881–1964) formalist manifesto Art, which posits a circular concept (Gardner 1996, 238; Carroll 2001, 95; Stecker 2003, 141) of “aesthetic emotion” elicited by “significant form” that “distinguishes works of art from all other classes of objects” and thereby defines the fine arts as such (Bell 1914, 13). Formalists, as Dziemidok (1993, 192) states, “strive to determine general criteria of valuation universally applicable to all art forms” and thus miss the “values unique” to each artistic medium by commencing with “universalistic assumptions.” As we have seen in sec. 3.b, this definition of formalism contradicts Hanslick’s insistence on the idea that the criteria of the musically beautiful apply solely to music itself and not to the other art forms. Further concepts of general aesthetic formalism prove to be similarly debatable: Small (1998, 135), for example, describes formalist theories as denying that “emotions have anything to do with the proper appreciation of music” (form versus emotion/content), while Mothersill (1984, 222) emphasizes formalism’s conviction that “elements which suggest or establish a link between the artwork and the world should be disregarded” (form versus context). 
In view of OMB, both ideas seem somewhat applicable but at the same time miss something important about Hanslick’s viewpoint: whereas aesthetic analysis—conceived as an objectivist scientific approach—is indeed distinct from historical concerns and the stimulation, expression, or portrayal of definite emotion, music itself affects emotion and is connected intimately to concurrent productions of art and the “poetic, social, scientific conditions” of its time and place (Hanslick 2018, 9, 55; cf. Wilfing 2016, 15–18). In general, any detailed appraisal of Hanslick’s formalism does hinge upon the individual definition of aesthetic formalism and ‘form’ itself—a term that is as ambiguous as it is persistent (Tatarkiewicz 1973, 216), which might be of limited efficacy in describing Hanslick’s argument and must thus be employed carefully (Nattiez 1990, 109; Bowman 1991, 53; Payzant 2002, 58).

4. The Intellectual Background of Hanslick’s Aesthetics

a. Hanslick and German Idealism

Historical research on OMB is dominated primarily by questions of intellectual dependency: Who influenced Hanslick’s aesthetic approach and which philosophical movement stimulated its main ideas (Landerer and Wilfing 2018)? Numerous candidates have been invoked as precursors to Hanslick’s “formalism,” ranging from idealist theorists—Kant (1724–1804), Herder (1744–1803), Hegel (1770–1831), Schelling (1775–1854), Vischer (1807–87)—and German poetry—Lessing (1729–81), Goethe (1749–1832), Schiller (1759–1805), or the German literary romantics—to the Austrian context of Hanslick’s aesthetics and “minor” figures such as Michaelis (1770–1834), Novalis (1772–1801), or Nägeli (1773–1836). Generally speaking, current scholarship situates Hanslick’s argument in the (ultimately antithetic) traditions of German idealism and Austrian realism. The most prominent contender as the crucial source of OMB, emphasized particularly in analytical philosophy (Gracyk, chap. 1; Appelqvist 2010–11, 76; Davies 2011b, 297), is Kant’s Kritik der Urteilskraft (Critique of the Power of Judgment, 1790). As OMB is typically regarded as the classical definition of formalistic aesthetics in music and Kant’s Kritik is widely thought to be the origin of general aesthetic formalism, this link appears entirely natural (Ginsborg 2011, 334). Their respective definitions of aesthetic intuition as disinterested contemplation, standing apart from rational thought and affect states, as well as their general concept of beauty, which is not subject to an external purpose or definite concepts, establish Hanslick’s awareness of Kant’s theory. Whether Hanslick, who did not receive any formal training in philosophy, ever read Kant or whether he adopted certain notions from post-Kantian aesthetic discourse (Dambeck, Michaelis, Nägeli, and so forth) is open to debate.
Although Hanslick’s reliance on Kant’s theory is frequently accepted as fact, this view is complicated by at least three issues: (1) Kant’s notion of music as a servant of poetry and as a language of affect states was criticized vigorously by Hanslick. (2) Hanslick’s concept of specific musical beauty directly opposes Kant’s idealist attitude, which stipulates an abstract principle of beauty, administered retroactively to each art form. (3) The objectivist approach of Hanslick’s aesthetics contradicts Kant’s transcendental methodology, the crucial premise of his entire system (Bonds 2014, 188–89; Wilfing 2018, sec. 3.3).

While Kant is mentioned only once in OMB as one of those “eminent people” who did reject any literary content when it came to music (Hanslick 2018, 107), a different contender as the pivotal source of Hanslick’s aesthetics is referred to on multiple occasions: Hegel. Although a large share of Hanslick’s comments on Hegel are intended as criticism—he accuses Hegelian theories of an “underevaluation” of sensuousness in favor of ideas, for example (Hanslick 2018, 42)—various quotes and his early music reviews confirm that Hanslick was familiar with Hegel’s aesthetic positions. The theoretical importance of Hegel’s Vorlesungen über die Ästhetik (Lectures on Aesthetics, 1835–38) for the basic tenets of Hanslick’s approach has been investigated particularly by Carl Dahlhaus, who supported his viewpoint by drawing attention to Hanslick’s persistent utilization of the term Geist, which also permeates Hegel’s philosophy. Dahlhaus, however, did not regard Hanslick’s treatise as an uncritical extension of Hegel’s theory of art as the corporeal incarnation of the idea, in which music itself is only form, whereas thoughts and feelings are the content (Dahlhaus 1989, 110). For him, Hanslick’s theory inverts Hegel’s system by making the idea purely musical and thereby turning “form” into a concept of the interior, not the exterior (Burford 2006, 170; Bonds 2012, 8). Although Hanslick’s definition of composing as “intellect shaping itself from within” is probably situated in a general setting of Hegelian reasoning, the whole extent of Hanslick’s awareness of Hegel’s writings is unknown, as no related records survive. The situation is different, however, if we turn to Hegelian aesthetic theorists: We know that he read parts of Vischer’s Aesthetik oder Wissenschaft des Schönen (Aesthetics or Science of Beauty, 1846–57), for example, which might have been the most likely source for his Hegelian leanings (Titus 2008).
Hanslick candidly criticized Hegelian aesthetics for its historical orientation, which seemingly confused historical research with aesthetic analysis, but he nonetheless emphasized the historical evolution of musical material and the arbitrary appraisal of specific artworks. The idea that artistic material does not merely consist of physical elements (sound, paint, stone), but moreover comprises the entire historical evolution of each art form—the historical interplay between material and mind—was a central concept of Vischer’s theory, linking Hanslick’s approach to Hegelian aesthetics.

b. Hanslick and Austrian Realism

Because Hanslick was an Austrian theorist raised in Prague who spent most of his career in Vienna, the relevance of German idealism for the basic tenets of OMB outlined above has to be supplemented by an analysis of his Austrian contexts. In the 19th century, Austrian science policies were strongly opposed to philosophical “speculation” that was held responsible for the societal upheaval in the wake of 1789 and 1848. These events caused several reforms of the Austrian school system, the primary purpose of which was to foster the restoration endeavors of the Habsburg leadership by confining education to propaedeutic instructions compatible with Catholic dogmas and state norms. This political strategy resulted in the preservation of Leibnizian philosophy, the flourishing of positivistic scholarship, and the inhibition of German idealism in favor of methods perceived as decidedly scientific. One intellectual who consciously modernized the Leibnizian framework engrained in the academic landscape of Austria was the Prague priest and philosopher Bernard Bolzano (1781–1848). Although Bolzano was forced to resign owing to an unfounded accusation of Kantianism in 1819, the general precepts of his writings prospered in Habsburg territories by way of his scientific successor and Hanslick’s close friend Robert Zimmermann (1824–98), who attained a tenured position at the University of Vienna in 1861. Bolzano published his aesthetic doctrines in Über den Begriff des Schönen (On the Notion of Beauty, 1843) and Über die Eintheilung der schönen Künste (On the Classification of the Fine Arts, 1849). In similar fashion to Hanslick, he defined aesthetic perception as disinterested contemplation, construed musical listening as an intentional monitoring of compositional development, and dismissed emotivist models whilst insisting on particular aesthetics for each art form.
Bolzano’s most significant contribution to Hanslick’s aesthetics, however, was his drastically objectivist approach isolated entirely from psychological explanations that might derive from Bolzano’s theory of science. Here, Bolzano outlines his Platonic concept of a “truth as such,” which states something as is, no matter whether this fact has been or ever will be uttered or thought by anyone. The radically objective condition of Hanslick’s concept of musical beauty, which remains beauty “even if it is neither viewed nor contemplated,” matches Bolzano’s Platonic mindset (Bonds 2014, 162; Wilfing 2018, sec. 2).

Another important precursor to Hanslick’s aesthetics, who is significant particularly due to his influence on Austrian science policies in general, is Johann Friedrich Herbart (1776–1841). As Herbart declared natural science the operational benchmark for philosophy and demanded a separation between philosophy, religion, and politics, his approach blended perfectly with the positivistic endeavors of Habsburg authorities and thereby became the semi-official philosophy of Austria. This gradual process was completed by the school reform of 1849, the leading figures of which closely adhered to Herbartian teachings (Landerer and Wilfing 2018, sec. 4), including Zimmermann, Hanslick’s former teacher Franz Exner (1802–53) and his old associate Joseph von Helfert (1820–1910). Hanslick, who attained a position at the ministry of education in 1854, recognized the importance of employing Herbartian principles in OMB, which should set the stage for his academic profession (Payzant 2002, 131). It thus comes as no surprise that Hanslick declared himself a follower of Herbart in his successful habilitation petition of 1856. As recent studies have demonstrated convincingly, however, this personal testimony is probably nothing more than an allusion provoked by careerist concerns (Karnes 2008, 31–34; Bonds 2014, 159; Landerer and Zangwill 2016, 90–91). Direct references to Herbart are entirely absent from earlier editions of OMB; he is belatedly included only in the third edition of 1865 and the sixth edition of 1881 (Hanslick 1986, 77, 85). In spite of this lack of quotes and in view of Herbart’s bearing on Austrian science policies, it is difficult to imagine that Hanslick was completely unfamiliar with Herbart’s ideas prior to the initial edition of 1854.
In regard to Hanslick’s argument, Herbartian teachings seem to be important specifically for his formalist approach, for his theory of autonomous instrumental music, for his refutation of emotivist aesthetics, for his emphasis on elemental components of “pure” music and their mutual relations, and for his appreciation of technical musical analysis (Bujić 1988, 7–8; Bonds 2014, 158–62; Wilfing 2018, sec. 2). Generally speaking, the writings of Bolzano and Herbart were similar in various respects—a fact that led to the frequent blending of their work in post-1848 Austria. Specific features of OMB, however, are decidedly Herbartian, such as Hanslick’s concept of emotion deriving from Herbart’s cognitivist reductionism that regards feelings as a subclass of Vorstellungen or presentations (Landerer and Wilfing 2018, 49n).

c. Editorial Problems and Eclectic Origins of OMB

The Austrian contexts of Hanslick’s aesthetics were supremely important for the contentual alterations following the initial edition of OMB (Landerer and Wilfing 2018, sec. 4). The most striking example of these severe changes, owing to the scientific landscape of contemporary Austria, is the removed final paragraph of Hanslick’s classic treatise. OMB originally concluded in idealist fashion, linking the musically beautiful with “all other great and beautiful ideas.” As “pure” music ultimately represents a sounding portrayal of the motions of the cosmos, it eventually transcends its conceptual limitations, “allowing us to feel… the infinite in works of human talent.” The vital traits of musical structure (harmony, rhythm, sound), Hanslick proclaims, permeate the universe so that one can “find anew in music the entire universe” (Bonds 2012, 4; cf. Hanslick 2018, 120). This original ending of OMB evidently betrayed remnants of German idealism and therefore countered Austrian science policies. This discrepancy was pointed out to Hanslick by the foremost Herbartian philosopher of his time and place: Zimmermann. In an extensive review, published in 1854, he commended the positivistic orientation of Hanslick’s argument that apparently conformed to Herbartian aesthetics, but at the same time criticized the idealist notions present in OMB. According to Zimmermann, the idea that the musically beautiful is completely autonomous epitomized the crucial insight of Hanslick’s argument. For him, this advantage of Hanslick’s aesthetics was compromised by his concession to an aesthetics dependent on speculative metaphysics (Bonds 2012, 5–6). As this public review outlined the Herbartian sentiments of Habsburg authorities responsible for his future career, Hanslick deleted the closing remarks as well as additional passages evocative of his former idealist stances (Landerer and Zangwill 2016; Sousa 2017). 
It is for this reason that the historical reception of OMB in anglophone scholarship was impacted markedly by Hanslick’s alterations: whereas German-language discourse is based mostly on the initial edition of OMB, its translations utilized editions 7 (Cohen), 8 (Payzant), and 10 (Rothfarb and Landerer) that read more formalistic and positivistic than earlier versions. As the deleted ending of OMB was translated for the first time as late as 1988 (Bujić 1988, 39) and was not discussed seriously by anglophone academics prior to Bonds’s studies, one can get the impression that scholarship in German and English addresses quite different books (Payzant 2002, 44).

A relevant outcome of current research into Hanslick’s intellectual background, however, is the emerging realization that Hanslick’s aesthetics draws upon a wide array of assorted aesthetic discourses integrated into OMB. It is no contradiction that Hanslick’s emphasis on structural relations between musical elements is derived from Herbartian aesthetics, whilst his concurrent refutation of psychological considerations—supremely important for Herbartian aesthetics—appears to be closer to Bolzano. The same applies to Hanslick’s Vischerian concept of historical evolution, overtly opposing the ahistorical orientation of Herbartian aesthetics, and his anti-Hegelian insistence on a categorial distinction between the methods of historical and aesthetic research derived from Herbartian philosophy (Edgar 1999, 443–44; Landerer and Zangwill 2017, 93–94). Hanslick’s textual strategy frequently resembles a virtual collage, as in a passage reworded for the second edition of 1858: Hanslick maintains that beauty remains beauty “even when it arouses no emotions, indeed when it is neither perceived nor contemplated. Beauty is thus only for the pleasure of a perceiving subject, not generated through that subject” (Bonds 2014, 189; cf. Hanslick 2018, 4). The first part of Hanslick’s quotation is adopted directly from Zimmermann’s review and might even have an immediate antecedent in Bolzano, the former teacher of Zimmermann. Bolzano makes a similar objectivistic statement in On the Notion of Beauty by stating that beauty would remain beauty “even if there existed only one human being in the entire world or no one at all.” The first part of the second sentence, however, alludes to Vischer’s Aesthetics and his concept of Anschauung (perception), thereby directly linking the opposing approaches of Herbartianism and Hegelianism.
Hanslick purposely disregards Zimmermann’s ensuing assertion that beauty is based on constant relations between aesthetic properties and thus does not change over time as he acknowledged the historical condition of music and beauty (Landerer and Wilfing 2018, sec. 3). Generally speaking, Hanslick’s argument comprises a multitude of diverse sources—which at times are blatantly antithetic—and his intellectual background is therefore difficult to reconstruct thoroughly. His “eclectic” approach, however, ensured the remarkable durability of Hanslick’s aesthetics, which was not bound by the rise and fall of isolated academic traditions (Bujić 1988, 8).

5. The Reception of Hanslick’s Aesthetics and Its Relevance to Current Discourse

a. A General Outline of Hanslick’s Reception by Austro-German Discourse

The historical reception of Hanslick’s aesthetics, stretching from Viennese Modernism, the beginnings of musicology, and numerous composers to significant philosophers such as Friedrich Nietzsche (1844–1900), Theodor W. Adorno (1903–69), Langer, and analytical aesthetics in general, for the most part represents “terra incognita” (Deaville 2013, 25). Scholarship on Hanslick’s reception is typically restricted to incidental references to conceptual similarities between Hanslick and certain later authors. OMB is mentioned by Karl Popper (1902–94), for example, and probably affected his objective aesthetic approach, his wariness regarding psychological argumentation, and his rejection of emotivism. Ludwig Wittgenstein’s (1889–1951) late work is similarly evocative of Hanslick’s approach, as he declares musical meaning to be purely musical and repudiates the idea that “pure” music could be translated adequately into other modes of expression (Ahonen 2005, 520–23; Szabados 2006, 651–53). Adorno’s adoption of Hanslick’s dynamism (Goehr 2008, 20; Paddison 2010, 131–34) and his distinction between different attitudes towards musical listening betray Hanslick’s impact as much as Adorno’s concept of the historical evolution of musical material (Edgar 1999, 441–44; Paddison 2002, 336), firmly rooted in Hegelian aesthetics. Hanslick’s influence on Nietzsche is particularly remarkable as it spans from his earliest writings to his late work. His vigorous criticism of Wagner in Der Fall Wagner (The Case of Wagner, 1888) and Nietzsche contra Wagner (1889) is inspired evidently by Hanslick’s writings, replicated virtually verbatim on numerous occasions. OMB similarly influenced young Nietzsche, who studied Hanslick’s treatise as early as 1865 and employed Hanslick’s argument in fragment 12[1] of 1871 on the relation between language and “pure” music.
Here, Nietzsche verbalizes doctrines that are far more indicative of his eventual refutation of Wagner’s oeuvre than his Geburt der Tragödie (Birth of Tragedy, 1872), written at the same time, might suggest. Scholars have thus assumed a rather brief period of unwavering enthusiasm for the Bayreuth composer (Prange 2011). No philosophical movement, however, has addressed Hanslick’s aesthetics as fruitfully as analytical philosophy, particularly so due to its strong focus on the expressive capabilities of “pure” music.

b. Hanslick’s Reception by Analytical Aesthetics and the Direct Impact of OMB

The crucial feature of analytical philosophy is its methodic scientism as the foundation for all philosophy and all knowledge acquisition in general. Current research into the key attributes of analytical aesthetics regularly highlights its tendency to detach the targets of analysis from various contexts in order to establish the possibility of objective observation (Roholt 2017, 50–51). Hanslick’s positivist approach targeted towards scientific objectivity, his strong appeal to natural science as a guideline for objective aesthetics, and his procedural dissociation of musical artworks from external contexts that are not relevant for aesthetic purposes concur with this provisional description of analytical philosophy of music. Historically, Hanslick’s aesthetics was perceived as an important corrective to the “fantastic nonsense” and “sentimental speculations” of idealist theories (Lang 1941, 978; Epperson 1967, 109–10) and therefore contributed to the anti-idealist movement of analytical philosophy aimed against Hegelians such as Francis Bradley (1846–1924), Bernard Bosanquet (1848–1923), or John McTaggart (1866–1925). Early analytical aesthetics of the 1950s and 1960s, which initially needed to cast off its widespread reputation of conducting unscientific guesswork, was concerned principally with abstract problems and attempted to determine an exhaustive definition of art, the quality and quantity of aesthetic properties, and the peculiarity of aesthetic perception (Goehr 1993; Lamarque 2000). Even though this focal point of anglophone philosophy left no room for OMB and its emphasis on musical artworks, Hanslick’s treatise gained traction the moment aesthetics redirected its inquiry towards more concrete subjects.
Works on issues related to music, increasing strikingly in the 1980s (Lamarque 2000, 14; Davies 2003, 489), proceeded from influential publications by Budd, Davies, and Kivy (all 1980) that featured Hanslick’s aesthetics markedly and set the scene for ensuing decades of anglophone philosophy of music (Davies 2011b, 294). Each of their texts focuses on problems of musical expression and draws on Hanslick’s cognitive concept of emotion, resembling the approach developed by Stanley Schachter and Jerome Singer in the 1960s. Thus, the development of aesthetics concerned with specific objects and the establishment of cognitivist psychology coincide with and form the basis of Hanslick’s fruitful reception by analytical aesthetics.

Hanslick’s theories, the impact of which has even been compared to David Hume’s (1711–76) historic critique of speculative philosophy (Hanslick 1957, vii), shaped the general position on musical meaning in anglophone philosophy. Even though hardly any current approach concurs entirely with Hanslick’s aesthetics (Zangwill 2004 is a prominent exception), his momentous formulation of certain issues continues to dominate aesthetic discourse (Maus 1992, 273; Davies 2003, 492; Hamilton 2007, 82). This fact is exemplified particularly by authors who discard OMB and its cognitivist orientation, but nonetheless acknowledge that his views are permeating anglophone philosophy (Madell 2002, 1–9). His cognitive hypothesis, however, was not the only argument espoused by analytical academics, who also drew from more specific aspects of OMB. Hanslick’s rejection of basic forms of musical expression, treating affective features as a direct result of the composer’s emotional condition (Hanslick 2018, 63–65), for example, is basically accepted by modern research (Kivy 1980, 14–15; Davies 1986, 148; Naar, chap. 3b). Hanslick justifies this view with the theoretical redundancy of an aesthetic approach that traces the cause of emotional expression to a source located outside of art. Musical expression is successful principally in virtue of the expressive properties of music chosen to indicate a specific feeling and cannot be explained by reference to the artist’s affect states, already absorbed by his creation (Kivy 2009, 250; Davies 2011a, 23; Gracyk 2013, 78–79). Another argument aimed against arousal theories that has been discussed frequently by anglophone philosophers, and that was coined mainly by Budd (1985, 125), is the “heresy of the separable experience” (Ridley 1995, 38–49; Scruton 1997, 145–46; Madell 2002, 32, 57, 99). 
If musical expression is dependent on the response of the listener, music might become nothing more than a random medium of transference, which could be replaced by objects causing an identical response, and loses sight of the individuality of the composition (Hanslick 2018, 91–92). Hanslick proposes that causal theories cannot explain the unique quality of musical artworks as they tend to regard music as a device for affective arousal that could just as well be realized by a warm bath, a cigar, chloroform (Hanslick 2018, 83), or by a drug causing feelings (Kivy 1989, 218, 222, 242; Matravers 1998, 169–85; Robinson 2005, 351, 393, 397).

c. Bypassing Hanslick’s Cognitivist Arguments: Kivy, Davies, and Moods

As we have seen, important objections directed against current theories of musical arousal and expression propounded by anglophone philosophers stem from Hanslick’s aesthetics and extend beyond the cognitivist hypothesis of OMB. His cognitivism is nevertheless frequently considered the strongest argument that emotivist aesthetics has substantial weaknesses (Kivy 1989, 157; Davies 1994, 209). Hanslick’s (implicit) concept of indeterminate expressivity (Wilfing 2016, 26–29) suggests that emotion is an inherent property of musical structure, an idea that laid the ground for the enhanced formalism of Davies and Kivy, which is based on the similarity perceived between musical motions and the outward features of human emotion. Enhanced formalism does not hold that music refers beyond itself to occurrent emotions but considers expression an objective property of musical structure: music itself is the owner of the emotion it expresses (Davies 1980, 68; Kivy 1980, 64–66). Hanslick, however, had good reasons to abandon enhanced formalism as the theoretical foundation of scientific aesthetics, reasons that paved the way for another argument crucial to analytical aesthetics: the argument from disagreement (Gardner 1996, 245–46; Sharpe 2004, 19–20). While Davies (1994, 213–15) and Kivy (1990, 175–77) fully agree that “pure” music cannot express Platonic attitudes (emotions such as pride or shame that involve complex concepts), they hold that it is able to portray definite emotional properties of a lower order. Hanslick’s attitude is even more skeptical: As the dynamic character of affect states is only one moment of emotion, not emotion itself, music can merely allude to a certain variety of affect states, not to any sentiment in particular, and any survey among an audience regarding the emotion ascribed to a piece would thus yield varied results (Hanslick 2018, 23).
As enhanced formalism is based on the semblance perceived between musical motion and emotive behavior, Davies and Kivy needed to dismiss Hanslick’s claim about considerable disagreement by gradually retreating to more and more general emotions, which serve as umbrella concepts for specific emotions (Kivy 1980, 46–48; Davies 1994, 246–52). Other scholars pointed to Hanslick’s metaphor of expressive silhouettes and construed his argument in terms of indeterminate expressivity along the lines of Rorschach’s inkblot testing, thereby updating Hanslick’s argument for modern debates (Ahonen 2007, 93).

Generally speaking, OMB introduced numerous important arguments to analytical aesthetics that remain the subjects of current research, such as the famous paradox of negative emotion, which Hanslick directed against theories of musical arousal. If every death march or every somber adagio, Hanslick declares, had the power to elicit grief in the listener, nobody would bother with such works (Hanslick 2018, 90–91). Solutions to Hanslick’s question vary from the rejection of emotive arousal (Kivy 1989, 234–59) and accounts of the way negative emotions have beneficial pedagogic effects (Levinson 1982; Davies 1994, 307–20; Ridley 1995, chap. 7) to revised arousal theories that hold that emotional reactions to music rarely mirror the feeling depicted by a given piece (thus, a somber adagio could arouse compassion instead of sorrow; Matravers 1991 and 1998, chap. 8). Finally, Hanslick’s cognitivist formalism has contributed to a noticeable reframing of the general approach to emotive musical meaning. Matravers, for example, asserted that a piece of music would depict a specific emotion if it arouses a feeling, the physiological components of which would correspond to the emotion depicted (Matravers 1998, 149). As music cannot portray the cognitive elements of genuine emotions, Hanslick’s argument is bypassed by an appeal to feeling as the somatic feature of emotion, which music is able to prompt directly (Matravers 1991, 328). Ridley, who endorses Hanslick’s cognitive objection to common arousal theories, shares this idea by considering “objectless passions” as feelings, the gestural character of which is evoked by the dynamic qualities of music (Ridley 1995). Thus, OMB and its cognitivist orientation occasioned a shift from issues of emotional expression to issues of music’s relation to non-cognitive affect states—a shift also made clear by an increased discussion on music and moods (Radford 1991; Carroll 2003; Sizer 2007). 
Although OMB has thus come under attack in anglophone philosophy, the constant rebuttal of Hanslick’s aesthetics at the same time illustrates the degree to which his approach is ingrained in analytical philosophy in regard to questions of musical meaning. The lion’s share of theorists continue to consider Hanslick’s cognitive argument to be accurate in principle and adjust their models of expressivity accordingly. Hanslick’s influence on current debates thus goes beyond the assenting reception of OMB and remains equally present in modern theories that intentionally sidestep the key argument of Hanslick’s approach.

6. References and Further Reading

a. Primary Sources

Hanslick, Eduard. 1950. Music Criticisms, 1846–1899. Translated by Henry Pleasants. Harmondsworth: Penguin Books.
Hanslick, Eduard. 1957. The Beautiful in Music: A Contribution to the Revisal of Musical Aesthetics. Edited by Morris Weitz. Translated by Gustav Cohen. Indianapolis: Bobbs-Merrill.
Hanslick, Eduard. 1986. On the Musically Beautiful: A Contribution Towards the Revision of the Aesthetics of Music. Translated by Geoffrey Payzant. Indianapolis: Hackett.
Hanslick, Eduard. 1993. Sämtliche Schriften: Historisch-kritische Ausgabe. Vol. 1, Aufsätze und Rezensionen 1844–1848. Edited by Dietmar Strauß. Vienna: Böhlau.
Hanslick, Eduard. 1994. Sämtliche Schriften: Historisch-kritische Ausgabe. Vol. 2, Aufsätze und Rezensionen 1849–1854. Edited by Dietmar Strauß. Vienna: Böhlau.
Hanslick, Eduard. 2018. On the Musically Beautiful: A New Translation. Translated by Lee Rothfarb and Christoph Landerer. Oxford: Oxford University Press.

b. Secondary Sources

Ahonen, Hanne. 2005. “Wittgenstein and the Conditions of Musical Communication.” Philosophy 80: 513–29.
Ahonen, Hanne. 2007. “Wittgenstein and the Conditions of Musical Communication.” PhD diss., Columbia University.
Alperson, Philip. 1984. “On Musical Improvisation.” Journal of Aesthetics and Art Criticism 43, no. 1: 17–29.
Alperson, Philip. 2004. “The Philosophy of Music: Formalism and Beyond.” In The Blackwell Guide to Aesthetics, edited by Peter Kivy, 254–75. Malden: Blackwell.
Beard, David, and Kenneth Gloag. 2005. Musicology: The Key Concepts. London: Routledge.
Bell, Clive. 1914. Art. London: Chatto & Windus.
Bonds, Mark Evan. 2012. “Aesthetic Amputations: Absolute Music and the Deleted Endings of Hanslick’s Vom Musikalisch-Schönen.” 19th-Century Music 36, no. 1: 3–23.
Bonds, Mark Evan. 2014. Absolute Music: The History of an Idea. Oxford: Oxford University Press.
Bowman, Wayne D. 1991. “The Values of Musical ‘Formalism’.” Journal of Aesthetic Education 25, no. 3: 41–59.
Brodbeck, David. 2007. “Dvořák’s Reception in Liberal Vienna: Language Ordinances, National Property, and the Rhetoric of ‘Deutschtum’.” Journal of the American Musicological Society 60, no. 1: 71–132.
Brodbeck, David. 2009. “Hanslick’s Smetana and Hanslick’s Prague.” Journal of the Royal Musical Association 134, no. 1: 1–36.
Brodbeck, David. 2014. Defining ‘Deutschtum’: Political Ideology, German Identity, and Music-Critical Discourse in Liberal Vienna. Oxford: Oxford University Press.
Budd, Malcolm. 1980. “The Repudiation of Emotion: Hanslick on Music.” British Journal of Aesthetics 20, no. 1: 29–43.
Budd, Malcolm. 1985. Music and the Emotions: The Philosophical Theories. London: Routledge.
Bujić, Bojan. 1988. Music in European Thought, 1851–1912. Cambridge: Cambridge University Press.
Burford, Mark. 2006. “Hanslick’s Idealist Materialism.” 19th-Century Music 30, no. 2: 166–81.
Carroll, Noël. 1999. Philosophy of Art: A Contemporary Introduction. London: Routledge.
Carroll, Noël. 2001. “Formalism.” In The Routledge Companion to Aesthetics, edited by Berys Gaut and Dominic McIver Lopes, 87–96. London: Routledge.
Carroll, Noël. 2003. “Art and Mood: Preliminary Notes and Conjectures.” The Monist 86, no. 4: 521–55.
Cook, Nicholas. 2001. “Theorizing Musical Meaning.” Music Theory Spectrum 23, no. 2: 170–95.
Dahlhaus, Carl. 1989. The Idea of Absolute Music. Translated by Roger Lustig. Chicago: University of Chicago Press.
Davies, Stephen. 1980. “The Expression of Emotion in Music.” Mind 89: 67–86.
Davies, Stephen. 1986. “The Expression Theory Again.” Theoria 52, no. 3: 146–67.
Davies, Stephen. 1994. Musical Meaning and Expression. Ithaca: Cornell University Press.
Davies, Stephen. 2003. “Music.” In Levinson 2003, 489–515.
Davies, Stephen. 2011a. Musical Understandings and Other Essays on the Philosophy of Music. Oxford: Oxford University Press.
Davies, Stephen. 2011b. “Analytic Philosophy and Music.” In Gracyk and Kania 2011, 294–304.
Deaville, James. 2013. “Negotiating the ‘Absolute’: Hanslick’s Path Through Musical History.” In Grimes, Donovan, and Marx 2013, 15–37.
Downes, Stephen, ed. 2014. Aesthetics of Music: Musicological Perspectives. New York: Routledge.
Dziemidok, Bohdan. 1993. “Artistic Formalism: Its Achievements and Weaknesses.” Journal of Aesthetics and Art Criticism 51, no. 2: 185–93.
Edgar, Andrew. 1999. “Adorno and Musical Analysis.” Journal of Aesthetics and Art Criticism 57, no. 4: 439–49.
Epperson, Gordon. 1967. The Musical Symbol: An Exploration in Aesthetics. Ames: Iowa State University Press.
Fisher, John Andrew. 1993. Reflecting on Art. Mountain View: Mayfield.
Gardner, Sebastian. 1996. “Aesthetics.” In The Blackwell Companion to Philosophy, edited by Nicholas Bunnin and E. P. Tsui-James, 229–56. Oxford: Blackwell.
Ginsborg, Hannah. 2011. “Kant.” In Gracyk and Kania 2011, 328–38.
Goehr, Lydia. 1993. “The Institutionalization of a Discipline: A Retrospective of the Journal of Aesthetics and Art Criticism and the American Society of Aesthetics, 1939–1992.” Journal of Aesthetics and Art Criticism 51, no. 2: 99–121.
Goehr, Lydia. 2008. Elective Affinities: Musical Essays on the History of Aesthetic Theory. New York: Columbia University Press.
Gooley, Dana. 2011. “Hanslick and the Institution of Criticism.” Journal of Musicology 28, no. 3: 289–324.
Gracyk, Theodore. 2013. On Music. London: Routledge.
Gracyk, Theodore, and Andrew Kania, eds. 2011. The Routledge Companion to Philosophy and Music. London: Routledge.
Grey, Thomas S. 1995. Wagner’s Musical Prose: Texts and Contexts. Cambridge: Cambridge University Press.
Grey, Thomas S. 2002. “Hanslick, Eduard.” In The New Grove Dictionary of Music and Musicians, edited by Stanley Sadie, 10:827–33. London: Macmillan.
Grey, Thomas S. 2003. “Masters and Their Critics: Wagner, Hanslick, Beckmesser, and Die Meistersinger.” In Wagner’s “Meistersinger”: Performance, History, Representation, edited by Nicholas Vazsonyi, 165–89. Rochester: University of Rochester Press.
Grey, Thomas S. 2011. “Hanslick.” In Gracyk and Kania 2011, 360–70.
Grey, Thomas S. 2014. “Absolute Music.” In Downes 2014, 42–61.
Grimes, Nicole, Siobhán Donovan, and Wolfgang Marx, eds. 2013. Rethinking Hanslick: Music, Formalism, and Expression. Rochester: University of Rochester Press.
Hamilton, Andy. 2007. Aesthetics and Music. London: Continuum Books.
Karnes, Kevin. 2008. Music, Criticism, and the Challenge of History: Shaping Modern Musical Thought in Late Nineteenth-Century Vienna. Oxford: Oxford University Press.
Kivy, Peter. 1980. The Corded Shell: Reflections on Musical Expression. Princeton: Princeton University Press.
Kivy, Peter. 1989. Sound Sentiment: An Essay on the Musical Emotions. Philadelphia: Temple University Press.
Kivy, Peter. 1990. Music Alone: Philosophical Reflections on the Purely Musical Experience. Ithaca: Cornell University Press.
Kivy, Peter. 2002. Introduction to a Philosophy of Music. Oxford: Clarendon Press.
Kivy, Peter. 2009. Antithetical Arts: On the Ancient Quarrel Between Literature and Music. Oxford: Clarendon Press.
Lamarque, Peter. 2000. “The British Journal of Aesthetics: Forty Years On.” British Journal of Aesthetics 40, no. 1: 1–20.
Landerer, Christoph, and Nick Zangwill. 2016. “Contemplating Musical Essence.” Journal of the Royal Musical Association 141, no. 2: 483–94.
Landerer, Christoph, and Nick Zangwill. 2017. “Hanslick’s Deleted Ending.” British Journal of Aesthetics 57, no. 1: 85–95.
Lang, Paul Henry. 1941. Music in Western Civilization. New York: Dent & Sons.
Larkin, David. 2013. “Battle Rejoined: Hanslick and the Symphonic Poem in the 1890s.” In Grimes, Donovan, and Marx 2013, 289–310.
Levinson, Jerrold. 1982. “Music and Negative Emotion.” Pacific Philosophical Quarterly 63: 327–46.
Levinson, Jerrold, ed. 2003. The Oxford Handbook of Aesthetics. Oxford: Oxford University Press.
Lippman, Edward A. 1992. A History of Western Musical Aesthetics. Lincoln: University of Nebraska Press.
Madell, Geoffrey. 2002. Philosophy, Music, and Emotion. Edinburgh: Edinburgh University Press.
Matravers, Derek. 1991. “Art and the Feelings and Emotions.” British Journal of Aesthetics 31, no. 4: 322–31.
Matravers, Derek. 1998. Art and Emotion. Oxford: Clarendon Press.
Maus, Fred Everett. 1992. “Hanslick’s Animism.” Journal of Musicology 10, no. 3: 273–92.
McColl, Sandra. 1995. “To Bury Hanslick or to Praise Him? The Obituaries of August 1904.” Musicology Australia 18, no. 1: 39–51.
Mothersill, Mary. 1984. Beauty Restored. Oxford: Clarendon Press.
Nattiez, Jean-Jacques. 1990. Music and Discourse: Toward a Semiology of Music. Translated by Carolyn Abbate. Princeton: Princeton University Press.
Paddison, Max. 2002. “Music as Ideal: The Aesthetics of Autonomy.” In The Cambridge History of Nineteenth-Century Music, edited by Jim Samson, 318–42. Cambridge: Cambridge University Press.
Paddison, Max. 2010. “Mimesis and the Aesthetics of Musical Expression.” Music Analysis 29, no. 1–3: 126–48.
Payzant, Geoffrey. 1985. “Eduard Hanslick’s Vom Musikalisch-Schönen: A pre-publication excerpt.” The Music Review 46: 179–85.
Payzant, Geoffrey. 1989. “Eduard Hanslick and Bernhard Gutt.” The Music Review 50: 124–33.
Payzant, Geoffrey. 1991. Eduard Hanslick and Ritter Berlioz in Prague: A Documentary Narrative. Calgary: University of Calgary Press.
Payzant, Geoffrey. 2002. Hanslick on the Musically Beautiful: Sixteen Lectures on the Musical Aesthetics of Eduard Hanslick. Christchurch: Cybereditions.
Pederson, Sanna. 1996. “Romantic Music Under Siege in 1848.” In Music Theory in the Age of Romanticism, edited by Ian Bent, 57–74. Cambridge: Cambridge University Press.
Pederson, Sanna. 2014. “Romanticism/Anti-Romanticism.” In Downes 2014, 170–87.
Prange, Martine. 2011. “Was Nietzsche Ever a True Wagnerian? Nietzsche’s Late Turn to and Early Doubt About Richard Wagner.” Nietzsche Studien 40: 43–71.
Pryer, Anthony. 2013. “Hanslick, Legal Processes, and Scientific Methodologies: How Not to Construct an Ontology of Music.” In Grimes, Donovan, and Marx 2013, 52–69.
Radford, Colin. 1991. “Muddy Waters.” Journal of Aesthetics and Art Criticism 49, no. 3: 247–52.
Ridley, Aaron. 1995. Music, Value, and the Passions. Ithaca: Cornell University Press.
Robinson, Jenefer. 2005. Deeper Than Reason: Emotion and Its Role in Literature, Music, and Art. Oxford: Clarendon Press.
Roholt, Tiger C. 2017. “On the Divide: Analytic and Continental Philosophy of Music.” Journal of Aesthetics and Art Criticism 75, no. 1: 49–58.
Rosengard Subotnik, Rose. 1991. Developing Variations: Style and Ideology in Western Music. Minneapolis: University of Minnesota Press.
Scruton, Roger. 1997. The Aesthetics of Music. Oxford: Clarendon Press.
Sharpe, R. A. 2004. Philosophy of Music: An Introduction. Chesham: Acumen.
Sizer, Laura. 2007. “Moods in the Music and the Man: A Response to Kivy and Carroll.” Journal of Aesthetics and Art Criticism 65, no. 3: 307–12.
Small, Christopher. 1998. Musicking: The Meanings of Performing and Listening. Hanover, NH: Wesleyan University Press.
Sousa, Tiago. 2017. “Was Hanslick a Closet Schopenhauerian?” British Journal of Aesthetics 57, no. 2: 211–29.
Stecker, Robert. 2003. “Definition of Art.” In Levinson 2003, 136–54.
Szabados, Béla. 2006. “Wittgenstein and Musical Formalism.” Philosophy 81: 649–58.
Tatarkiewicz, Władysław. 1973. “Form in the History of Aesthetics.” Dictionary of the History of Ideas 2: 216–25.
Titus, Barbara. 2008. “The Quest for Spiritualized Form: (Re)positioning Eduard Hanslick.” Acta Musicologica 80, no. 1: 67–98.
Trivedi, Saam. 2011. “Resemblance Theories.” In Gracyk and Kania 2011, 223–32.
Yanal, Robert J. 2006. “Hanslick’s Third Thesis.” British Journal of Aesthetics 46, no. 3: 259–66.
Yoshida, Hiroshi. 2001. “Eduard Hanslick and the Idea of ‘Public’ in Musical Culture: Towards a Socio-Political Context of Formalistic Aesthetics.” International Review of the Aesthetics and Sociology of Music 32, no. 2: 179–99.
Zangwill, Nick. 2004. “Against Emotion: Hanslick Was Right About Music.” British Journal of Aesthetics 44, no. 1: 29–43.

Research for this article was supported financially by the Austrian Science Fund (FWF, project number P30554-G30).

Author Information

Alexander Wilfing
Email: alexander.wilfing@oeaw.ac.at
Austrian Academy of Sciences
Austria

and

Christoph Landerer
Email: chlanderer@gmail.com
Austria

The Semantic Theory of Truth

The semantic theory of truth (STT, hereafter) was developed by Alfred Tarski in the 1930s. The theory has two separate, although interconnected, aspects. First, it is a formal mathematical theory of truth as a central concept of model theory, one of the most important branches of mathematical logic. Second, it is also a philosophical doctrine which elaborates the notion of truth investigated by philosophers since antiquity. In this respect, STT is one of the most influential ideas in contemporary analytic philosophy. This article discusses both aspects.

The STT is designed to define truth without circularity and to satisfy certain minimal conditions that must be met by any adequate theory of truth.

STT as a formal construction is explicated via set theory and the concept of satisfaction. The prevailing philosophical interpretation of STT considers it to be a version of the correspondence theory of truth that goes back to Aristotle. This theory is presented here in its modern shape, that is, as associated with first-order logic. Tarski’s original account used the elementary theory of classes (a theory similar to the simple theory of types).

One of Tarski’s most important results was to show that a theory of truth for set theory cannot be given within set theory itself, and that any truth definition for a formal language L must be given in a language which is essentially stronger than L.

Historical Introduction
Outline of STT
Informal Presentation of STT
Formal Presentation of STT
Philosophical Comments
Final Remarks
References and Further Reading

1. Historical Introduction

Alfred Tarski (1901–1983) was a Polish mathematician, logician and philosopher. He lived in the U.S.A. from 1939 onward and became an American citizen in 1945. He was a member of the Polish Mathematical School, the Warsaw School of Logic and the Lvov-Warsaw Philosophical School. These schools flourished in the interwar period (1918–1939).

While investigating problems associated with the definability of real numbers, Tarski came to the conclusion that the concept of satisfaction informally used in mathematics can help in defining the concept of truth. In 1930, he delivered two lectures (one in Warsaw, the second in Lvov) devoted to the concept of truth. In 1931, he began to work on a monograph on this topic. It was published in 1933 (see Tarski 1933) as Pojęcie prawdy w językach nauk dedukcyjnych (The Concept of Truth in Languages of Deductive Sciences). This book was well-received in Poland.

Due to Tarski’s contacts with the Vienna Circle, his semantic ideas became known abroad. The German translation (Der Wahrheitsbegriff in den formalisierten Sprachen) of Tarski’s Polish book appeared in 1935 (see Tarski 1935). In the same year, Tarski lectured at the Paris Congress for Scientific Philosophy; his lectures on the foundations of semantics and on the concept of logical consequence were applauded (see Tarski 1936 and Tarski 1936a). His popular paper on the concept of truth appeared in Philosophy and Phenomenological Research in 1944 (see Tarski 1944). The English translation, based on the German version of the book on truth (see Tarski 1956a), was included in Tarski’s famous collection Logic, Semantics, Metamathematics (1956). Tarski’s last essay on truth (more popular than formal), namely “Truth and Proof”, was published in 1969 (see Tarski 1969). Since all of Tarski’s writings on truth present essentially the same ideas, this article refers to his particular works only in a few places.

2. Outline of STT

The semantic theory of truth (STT) has many ingredients. The most important are as follows:

(A) Truth as a property of sentences;
(B) Relations between truth and meaning;
(C) Diagnosis of semantic paradoxes;
(D) Resolution of semantic paradoxes;
(E) Relativization to languages;
(F) T-scheme (A is true if and only if A);
(G) The principle BI of bivalence;
(H) Material and formal adequacy of a truth-definition;
(I) Conditions imposed on a metalanguage in order to obtain a proper truth-definition;
(J) The relation between language and metalanguage;
(K) The truth-definition itself;
(L) Maximality of the set of truths in a given language;
(M) The undefinability theorem.

These points are gradually elaborated in the next remarks, with capital letters referring back to the above list.

(A)–(B). For Tarski, sentences are truth-bearers. However, sentences are always equipped with meanings. Tarski avoided explaining what the meaning of an expression is. On the other hand, he explicitly said that the problem of defining truth is meaningless for purely formal languages, that is, languages to whose expressions no meaning is attached. Roughly speaking, the semantic truth-definition (SDT, for brevity) is formulated for formalized languages.

(C)–(D). The Liar Paradox is a serious problem for any truth-definition. The ancient version attributed to Epimenides runs as follows. A Cretan says “I am lying now”. If he is actually lying, his sentence is true, but if he is not lying, the sentence in question is false. Contradiction! For the modern version, consider the sentence

(S) The sentence denoted by (S) is false.

Observe that (S) = ((S) is false). Since (S) and ‘(S) is true’ are equivalent, we obtain a contradiction expressed by

(LP) (S) is true if and only if (S) is false.
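
The contradiction expressed by (LP) can be checked mechanically. The following is a minimal sketch, assuming classical two-valued logic; the function name `liar_solutions` is a hypothetical label introduced only for this illustration. It searches for a truth value that (S) could consistently take and finds none.

```python
def liar_solutions():
    """Return every classical truth value v such that: v if and only if not v."""
    # (LP) says: (S) is true if and only if (S) is false.
    # Under bivalence, "(S) is false" amounts to "not v".
    return [v for v in (True, False) if v == (not v)]


if __name__ == "__main__":
    # An empty list means that no classical truth value can be assigned to (S).
    print(liar_solutions())
```

The search space has only two candidates because bivalence is assumed; dropping that assumption changes the search space, which is one way to picture the third strategy mentioned in the diagnosis of the paradox.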

What are the sources of the Liar Paradox (LP)? First, it employs the sentence (S), which asserts its own falsity. Such a situation is called a self-referential use of a semantic concept; the semantic concept in this case is falsehood. Second, the Paradox uses the rule that a sentence, let us say A, is true if and only if A (which Tarski called the T-scheme). Third, we apply classical logic, in particular the law of bivalence, that is, (BI).

This diagnosis, which was proposed by Stanisław Leśniewski (Tarski’s teacher in Warsaw) and adopted by Tarski, offers three ways out of the Paradox. First, one could eliminate self-referentiality from the language. Second, one could reject the T-scheme. Third, one could change the logic, in particular by rejecting (BI). The third strategy is popular in the twenty-first century, and it uses the techniques of many-valued logic, logic with truth-value gaps, or paraconsistent logic. These solutions will not be commented upon in this article. In any case, Tarski considered them to be too complex and too narrow because they require the rejection of what should be retained. The T-scheme, according to him, is so intuitive that it cannot be rejected. Thus, the proper solution, he said, is to eliminate self-referentiality.

(E)–(F). How can self-referentiality be eliminated? The main idea is that the concept of truth should be relativized to a language. More specifically, we deal with the context ‘the sentence A is true in a language L’. However, this move is still insufficient, because if self-referentiality is to be banished, the adjective ‘true’ must belong to another language. This new language is called the metalanguage and is abbreviated by the symbol ML (we assume that L is the corresponding language). The simplest and the most popular situation is that L is an object-language (used to speak about the world) and ML forms its metalanguage, suitable for speaking about L. Here is an example. Assume that German is our object-language and English serves as the associated metalanguage. We write in L ‘Schnee ist weiss’, but in ML we write ‘The German sentence “Schnee ist weiss” means that snow is white’. We see that ML must contain resources for speaking about expressions belonging to L. In order to indicate that we are speaking about L-expressions, we use quotation marks, but many other devices can be employed. For instance, we can use italics and write that the sentence Schnee ist weiss means that snow is white. The most important observation is that expressions like ‘Schnee ist weiss’ and Schnee ist weiss are (metalinguistic) names in ML of the corresponding German sentence that is in L. The standard way of capturing the reported distinction is to say that expressions are used in L, but mentioned in ML.

The above conventions function as a part of STT. A simple example is

(1) ‘Schnee ist weiss’ in German is true if and only if snow is white.

The interaction of two languages in (1) consists in the fact that the name of the German sentence is on the left, and its English translation is on the right. If the same language functions as both L and ML, one should speak about self-translation. According to the foregoing explanations we can generalize (1) into

(TS) ‘A’ is true in L if and only if A*,

where the symbol A* refers to a translation of the sentence denoted by the name ‘A’. This is the general form of the T-scheme. (For additional discussion of the T-scheme, see the Liar Paradox.) Note that we cannot replace (TS) by

(2) For any A, ‘A’ is true if and only if A,

because the letter A is not free in the expression ‘A’. Quotes can be regarded as a name-forming operator. Anyway, concrete biconditionals (T-sentences, T-equivalences) arising from (TS) play the crucial role in STT. Roughly speaking, they capture the following intuition: a sentence saying so and so is true if so and so.
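
The way concrete T-sentences arise from (TS) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the table `TRANSLATIONS`, its second entry, and the helper names `quote` and `t_sentence` are hypothetical and do not come from Tarski. Here `quote` plays the role of the name-forming operator, and the table supplies the translation A* in the metalanguage.

```python
# Hypothetical toy data: object-language (German) sentences paired with
# their metalanguage (English) translations, i.e. the A* of (TS).
TRANSLATIONS = {
    "Schnee ist weiss": "snow is white",
    "Gras ist gruen": "grass is green",
}


def quote(sentence):
    """Form a metalinguistic name of an object-language sentence."""
    return "'" + sentence + "'"


def t_sentence(sentence, language="German"):
    """Build the instance of (TS) for one object-language sentence."""
    translation = TRANSLATIONS[sentence]  # A*: the sentence as used in ML
    return f"{quote(sentence)} is true in {language} if and only if {translation}."


if __name__ == "__main__":
    for s in TRANSLATIONS:
        print(t_sentence(s))
```

The quoted name is built from the sentence as a string, whereas the right-hand side uses its translation. A string literal such as "A" is likewise unaffected by whatever value a variable called A happens to have, which loosely mirrors the remark that the letter A is not free inside ‘A’.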

All explanations given above are formulated in ordinary English. It is easy to see several inconveniences of this approach. For instance, we have to multiply quotes when we pass from using to mentioning, writing ‘‘A’’ when ‘A’ is mentioned. To simplify the issue, we replace some occurrences of quotes by such expressions as ‘name’, ‘sentence’, and so forth. Also, the concept of translation as applied to ordinary languages is not precise. The most important thing is that ordinary languages contain their own metalanguages, that is, they are (to use Tarski’s way of speaking) semantically closed. This circumstance causes semantic paradoxes; the Liar is only one of them, but we will not consider others.

Tarski was very sceptical about the possibility of successfully providing a coherent truth-definition for ordinary language. Hence, he worked with a formal language. Such a language must have a well-defined alphabet (the set of elementary expressions), a well-defined set of formulas and a logical basis. If L is a formalized language, its ML is only partially formal, usually a part of ordinary mathematics. The following example illustrates the issue. Let ‘P(a)’ be the formula under consideration. It is an atomic formula of a first-order language and says that a is P (the object a has a property P). The truth conditions of this sentence should be formulated by

(3) ‘P(a)’ is true if and only if a is a member of the set P,

where the non-italics letter P refers to the set that is denoted by the italicized predicate letter P. When (3) is expressed more formally in set theory, the binary relation “is a member of” is usually represented by the Greek letter epsilon, namely $\in$. In this example, the language of set theory serves as the metalanguage ML. To finish this part, note that Tarski liberalized his early negative attitude to ordinary speech. In his later works, he introduced the concept of languages having specified structure (see Tarski 1944). They are not semantically closed formalized languages, but are well-described by specification of their units, complex expressions and the underlying logic.
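
The truth condition in (3) reduces to a membership test once an interpretation is fixed. The following sketch assumes a hypothetical interpretation in which the constant a denotes the number 2 and the predicate letter P denotes a small set of even numbers; the dictionary and function names are likewise introduced only for illustration.

```python
# Hypothetical interpretation (not part of the original example):
# constants denote objects, predicate letters denote sets of objects.
DENOTATION_OF_CONSTANT = {"a": 2, "b": 3}
EXTENSION_OF_PREDICATE = {"P": {0, 2, 4}}


def atomic_is_true(predicate, constant):
    """'P(a)' is true if and only if the object a is a member of the set P."""
    obj = DENOTATION_OF_CONSTANT[constant]
    return obj in EXTENSION_OF_PREDICATE[predicate]


if __name__ == "__main__":
    print(atomic_is_true("P", "a"))  # 2 is a member of {0, 2, 4}
    print(atomic_is_true("P", "b"))  # 3 is not
```

Python’s `in` operator stands in for the relation $\in$, so the code plays the part of the set-theoretic metalanguage while the strings "P" and "a" play the part of the mentioned object-language symbols.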

3. Informal Presentation of STT

As noted earlier, Tarski considered the concept of satisfaction (more precisely, the satisfaction relation) as basic for defining truth. In particular, truth is to be defined as a special case of satisfaction. Assume that L is given – it is a first-order formal language. Open formulas are defined as containing free variables. By contrast, closed formulas have no free variables – for instance, $P(a)$ or $\exists x P(x)$. Open formulas are satisfied or not, depending upon how the free variables are interpreted in a given domain D, but sentences are true or false. Take the formula ‘x is a city’. Let D consist of cities and rivers. Our formula is satisfied by London, but not by Thames (we assume that the name ‘Thames’ refers to the river Thames). Furthermore, the sentence ‘London is a city’ is true in D, but the sentence ‘Thames is a city’ is false in D. Roughly speaking, satisfaction converts open formulas into true sentences, but non-satisfaction into false ones. Moreover, these considerations show that an instance of the T-scheme, namely the equivalence ‘the sentence ‘London is a city’ is true if and only if London is a city’ correctly displays the main ordinary intuition associated with the predicate ‘is true’.
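
The city example can be made concrete with a small finite domain. The sketch below assumes a hypothetical domain of two cities and two rivers (Paris and the Seine are added only for illustration) and hypothetical helper names. It checks which objects satisfy the open formula ‘x is a city’.

```python
# Hypothetical domain D consisting of cities and rivers.
CITIES = {"London", "Paris"}
RIVERS = {"Thames", "Seine"}
D = CITIES | RIVERS


def satisfies_is_a_city(obj):
    """True if obj satisfies the open formula 'x is a city'."""
    return obj in CITIES


def satisfiers(domain):
    """Objects of the domain that satisfy 'x is a city'."""
    return {obj for obj in domain if satisfies_is_a_city(obj)}


if __name__ == "__main__":
    print(satisfies_is_a_city("London"))  # London satisfies the formula
    print(satisfies_is_a_city("Thames"))  # the Thames does not
    print(sorted(satisfiers(D)))
```

Substituting a name for the variable turns each satisfaction check into the truth value of a sentence: ‘London is a city’ comes out true in D and ‘Thames is a city’ false.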

The above explanations do not provide a definition of truth. Consider now two collections of ideas:

(A) (General case): open formulas, satisfaction by some objects from D; non-satisfaction by some objects from D;

(Special case): closed formulas (sentences), satisfaction by ?; non-satisfaction by ?

Inspecting the formulas ‘x is a city’ and ‘London is a city’ leads to the conclusion that although satisfaction depends on valuation (valuation given by a valuation function consists in attributing denotations from D to expressions of L) of free variables, truth and falsehood do not. The reason is very simple and even trivial, namely that sentences have no free variables. Consequently, truth and falsehood should (even must) be independent of how the valuation function acts with respect to terms that are free variables. On the other hand, logical values are determined by valuations of constants (individual names, such as ‘London’) and predicates (such as ‘is a city’) as well as by the understanding of logical constants (propositional connectives, quantifiers and identity).

The last observation motivates the following formulation of SDT assuming that the domain of interpretation D is fixed:

(3) (a) ‘A’ is true if and only if ‘A’ is satisfied by any object in D;

(b) ‘A’ is false if and only if ‘A’ is satisfied by no object in D.

Using ‘London is a city’ as an example we have that this sentence is true if and only if it is satisfied by any object from D (this formulation will be commented upon below). Now, (A) can be corrected by dropping question-marks as

(B) Open formulas: satisfaction by some objects from D, but not others;

sentences: satisfaction by all objects from D (truth);

open formulas: non-satisfaction by some objects from D;

sentences: satisfaction by no objects from D (falsity).

The formal version of (B) is formulated in the next section.
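The contrast drawn in (3) and (B) can be illustrated with a minimal sketch. It assumes a toy domain D of two cities and one river, and it represents formulas as Python functions of the object assigned to the free variable; the names `CITIES`, `satisfiers`, `is_true` and `is_false` are hypothetical helpers introduced only for this illustration, not part of Tarski's construction.

```python
# Toy domain D: two cities and one river (an assumption for illustration).
D = {"London", "Manchester", "Thames"}
CITIES = {"London", "Manchester"}  # assumed extension of 'is a city' in D


def open_formula(x):
    """'x is a city': the result depends on the object assigned to x."""
    return x in CITIES


def sentence_london(_x):
    """'London is a city': no free variable, so the argument is ignored."""
    return "London" in CITIES


def sentence_thames(_x):
    """'Thames is a city': likewise independent of the argument."""
    return "Thames" in CITIES


def satisfiers(formula):
    """Return the objects of D that satisfy the formula."""
    return {obj for obj in D if formula(obj)}


def is_true(sentence):
    """(3a): true if and only if satisfied by every object in D."""
    return satisfiers(sentence) == D


def is_false(sentence):
    """(3b): false if and only if satisfied by no object in D."""
    return satisfiers(sentence) == set()
```

In this sketch the open formula is satisfied by some objects of D and not by others, so it is neither true nor false in the sense of (3), whereas each of the two sentences is satisfied either by every object or by none.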

The definition of sentences as open formulas without free variables looks at first sight like an artificial mathematical trick, but such constructions frequently occur in mathematical practice as useful simplifications. For example, the straight line can be considered as a special case of a curve, or Euclidean space as a special instance of Riemannian space, and so forth. Even so, (B) can be charged with being a result of a purely formal game, completely alien to ordinary and philosophical intuitions. Tarski did not conceal that his explanations pertaining to truth employ mathematical concepts and techniques that are perhaps fairly obvious for practising mathematicians, but that are not convincing as tools of a reasonable philosophical analysis. This article does not do that. However, one can also try to argue that this definition fulfills some intuitive constraints. For instance, it entails that no sentence is true and false at the same time (the metalogical principle of contradiction). On the other hand, if A is an open formula, it is not the case that either A is satisfied or $\neg$A is satisfied. The formulas P(x) and $\neg$P(x) can serve as an example – both can be satisfied, for instance, ‘x is a city’ and ‘x is not a city’ can be satisfied though not by the same object. This example shows that, generally speaking, satisfaction of open formulas has some other properties than truth attributed to sentences, although both concepts are related in many ways. By definition, every sentence is satisfied by all objects or by no object. Assume that the formula $\forall$xP(x) is true and, thereby, satisfied by every object. Its negation, the formula $\exists$x$\neg$P(x), is satisfied by no object. This assertion implies the metalogical principle of the excluded middle. Thus, we reach (BI) (the principle of bivalence).

Let us try to come up with a philosophical paraphrase of the statement that if truth and falsehood are independent of valuations of free variables, then having logical values by sentences depends on how things are in considered universes, in our example, in D. It is time to introduce (informally, but it suffices) the concept of a model. Models are algebraic structures consisting of a universe U (that is, a set of objects; some items can be distinguished and named by special names – individual constants) and relations defined on U (other elements of a model are omitted). If X is a set of sentences and M is its model, then all sentences belonging to X are true in M. Perhaps we could say that if truth and falsehood are indeed free of such valuations, then whether sentences have definite logical values depends on how things are in a relevant model.

Two additional remarks are in order. First, satisfaction by all objects cannot be regarded as equivalent to being a logical tautology. Satisfaction is always relative to a chosen (fixed) universe. In particular, all conclusions made in this section assume that the stock of predicates – such as ‘is a city’ – is established in advance and its elements have a definite meaning that stems from a specific interpretation. If A is a logical tautology, this means that A is true (now in the outlined sense) in all models. Second, SDT relativizes truth (and falsehood) not only to L, but also to M. To sum up, SDT considers truth as relativized to an interpretation of L via M. In fact, SDT defines the set of true sentences in a given L. This literally means that the definition in question is extensional, that is, determines the scope of the predicate ‘is true’. However, taking into account that every definition of a given set X as a reference of a predicate P, directly or indirectly, deals with the content of P, SDT offers an understanding of the property expressed by P.

To be satisfactory, SDT must conform to so-called conditions of adequacy. More specifically, this definition must be (a) formally correct, and (b) materially correct. Condition (a) means that the definition does not lead to paradoxes and it is not circular. These requirements involve the interplay of L and ML functioning as insurance against semantic inconsistencies. Moreover, SDT does not appeal to the concept of truth for ML. Condition (b) is formulated as the Convention T (CT, for brevity) stating that (a) a formally correct truth-definition should logically entail all instances of the T-scheme available in L; (b) Tr $\subseteq$ L (the set of true sentences of L is a subset of the entire L). CT shows that the T-scheme is not a required T-definition. On the other hand, Tarski underlined that every particular T-sentence provides a partial definition of truth for a given sentence. One could possibly form the conjunction of all T-equivalences as the definition, but this formula would be infinite in length (thus, this maneuver is limited to finite languages). Moreover, the T-scheme does not imply (BI).

A standard objection against STT points out that it stratifies the concept of truth. This is because we have an entire hierarchy of languages L₀ (the object language), L₁ (= ML₀), L₂ (= ML₁), L₃ (= ML₂), …. Denote this hierarchy by the symbol HL. It is infinite and, moreover, there is no universal metalanguage allowing a truth-definition for the entire HL. Such a language would be semantically closed and, thereby, inconsistent. STT generates the hierarchy ‘truth in L₀’, ‘truth in L₁’, ‘truth in L₂’, …, contrary to the ordinary use of ‘is true’, which is not stratified. Thus, SDT must be separately performed at every level of HL. Two observations are in order in this context. Firstly, we have that Tr(L_n) $\subset$ Tr(L_{n+1}), for every n, due to the fact that every L_n is translated into its metalanguage L_{n+1}. Consequently, HL is cumulative, that is, Tr(L_{n+1}) includes all truths of L_n. Secondly, taking first-order logic as the foundation and the Hilbert thesis (every theory can be formalized in the first-order language), we define ‘true in the first order L’ in ML. This second language is partially informal. In fact, SDT for first-order languages requires tools from weak second-order logic (but this is too technical an issue to be explained in this survey). Thus, the stratification objection (originally formulated for Tarski’s construction via a simple theory of types) can be easily discarded and we can stay with one concept of truth. The price is that the concept of truth cannot be used for sentences formulated in ML.

4. Formal Presentation of STT

The earlier explanations concerned the simplest case, namely satisfaction of monadic open formulas, that is, of the form P(x). What about the formula (a) ‘x is a larger city than y’, which expresses the relation of being a larger city? We can say that the sequence <London, Manchester> satisfies (a), but not the sequence <Manchester, London>. (This article assumes the reader knows logical notations and elementary set-theoretical concepts, particularly the concept of sequence.) Since formulas can have arbitrary length, we need a generalization of this procedure in order to have a uniform way of dealing with all cases. This was Tarski’s motivation for introducing the concept of satisfaction by means of infinite sequences of objects. Since formulas are of arbitrary but always finite length, infinite sequences have a sufficient number of members to cover the satisfaction of all possible cases of particular formulas. This intuition is articulated by

(4) A is satisfied by an infinite sequence s = <s₁, s₂, s₃, …>, where s_n (n $\geq$ 1) refers to the nth term of s.

The definition of satisfaction (SAT; the symbol I refers to an interpretation) is as follows (This article simplifies indexing, and it restricts terms to individual variables and individual constants; the knowledge of this logical notation is assumed):

(5) (a) ‘P_j(t₁, …, t_k)’ $\in$ SAT(s, I) ⇔ <I(‘t₁’), …, I(‘t_k’)> $\in$ R_j (= I(‘P_j’));

(b) ‘$\neg$A’ $\in$ SAT(s, I) ⇔ ‘A’ $\not \in$ SAT(s, I);

(c) ‘A $\wedge$ B’ $\in$ SAT(s, I) ⇔ ‘A’ $\in$ SAT(s, I) and ‘B’ $\in$ SAT(s, I);

(d) ‘A $\vee$ B’ $\in$ SAT(s, I) ⇔ ‘A’ $\in$ SAT(s, I) or ‘B’ $\in$ SAT(s, I);

(e) ‘A ⇒ B’ $\in$ SAT(s, I) ⇔ ‘$\neg$A’ $\in$ SAT(s, I) or ‘B’ $\in$ SAT(s, I);

(f) ‘A ⇔ B’ $\in$ SAT(s, I) ⇔ ‘A ⇒ B’ $\in$ SAT(s, I) and ‘B ⇒ A’ $\in$ SAT(s, I);

(g) ‘$\forall$x_iA(x_i)’ $\in$ SAT(s, I) ⇔ ‘A(x_i)’ $\in$ SAT(s’, I), for every sequence s’, which differs from the sequence s at most at the i^th place;

(h) ‘$\exists$x_iA(x_i)’ $\in$ SAT(s, I) ⇔ ‘A(x_i)’ $\in$ SAT(s’, I), for some sequence s’, which differs from the sequence s at most at the i^th place.

The first clause establishes the satisfaction-conditions for atomic formulas that refer to relations (sets can be considered as one-place relations). Conditions (b)–(f) repeat the semantic definitions of propositional connectives; (g) and (h) concern quantifiers and say that an (open) universal formula is satisfied by every sequence, but an existential formula by some sequence (‘differs at most at the i^th place’ is a technical phrase to capture the intended meaning). The reference to an interpretation I indicates its role in the correlation of expressions and their references, for instance predicates and relations. Since I is always associated with a model M, the expression ‘A’ $\in$ SAT(s, I) can be replaced by the phrase ‘A’ $\in$ SAT(s, M) (a formula A is satisfied by a sequence s in a model M). If s is an infinite sequence and A has n free variables, only n terms of s are relevant to A’s being satisfied or not. Another formal possibility to define the satisfaction relation consists in introducing sequences of a sufficient finite length.
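The recursive clauses of (5) can be sketched as a small evaluator. This is only an illustration under stated assumptions: the domain is finite, sequences have a fixed finite length (the alternative mentioned at the end of the paragraph above) in place of infinite sequences, variables are indexed from 0, and formulas are encoded as nested tuples. The encoding and the names `denote`, `sat`, `true_in` and the sample interpretation `I` are hypothetical and introduced only here.

```python
from itertools import product


def denote(term, s, I):
    """Denotation of a term: a variable looks into s, a constant into I."""
    kind, value = term
    return s[value] if kind == "var" else I["consts"][value]


def sat(A, s, I):
    """Does the finite sequence s (a tuple) satisfy formula A under I?"""
    op = A[0]
    if op == "P":                       # atomic clause
        _, name, terms = A
        return tuple(denote(t, s, I) for t in terms) in I["preds"][name]
    if op == "not":                     # negation
        return not sat(A[1], s, I)
    if op == "and":                     # conjunction
        return sat(A[1], s, I) and sat(A[2], s, I)
    if op == "or":                      # disjunction
        return sat(A[1], s, I) or sat(A[2], s, I)
    if op == "imp":                     # implication
        return (not sat(A[1], s, I)) or sat(A[2], s, I)
    if op == "iff":                     # equivalence as two implications
        return (sat(("imp", A[1], A[2]), s, I)
                and sat(("imp", A[2], A[1]), s, I))
    if op in ("all", "ex"):             # quantifier clauses
        _, i, body = A
        # All sequences that differ from s at most at the i-th place.
        variants = (s[:i] + (d,) + s[i + 1:] for d in I["domain"])
        results = (sat(body, v, I) for v in variants)
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown operator: {op}")


def true_in(A, I, n_vars=2):
    """A sentence counts as true if every sequence of length n_vars satisfies it."""
    return all(sat(A, s, I) for s in product(I["domain"], repeat=n_vars))


# Sample interpretation (an assumption for illustration only).
I = {
    "domain": ["London", "Manchester", "Thames"],
    "consts": {"London": "London", "Manchester": "Manchester"},
    "preds": {
        "City": {("London",), ("Manchester",)},
        "Larger": {("London", "Manchester")},
    },
}
x1, x2 = ("var", 0), ("var", 1)
larger_open = ("P", "Larger", (x1, x2))
larger_closed = ("P", "Larger", (("const", "London"), ("const", "Manchester")))
```

With this encoding, `sat(larger_open, ("London", "Manchester"), I)` and `sat(larger_open, ("Manchester", "London"), I)` give different results, while `larger_closed`, having no free variables, receives the same value for every sequence. The choice of tuples for sequences keeps the "differs at most at the i-th place" condition a one-line slice operation.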

What about sentences? Consider the example with London and Manchester. The formula (*) ‘x₁ is a larger city than x₂’ is satisfied by every ordered pair <s₁, s₂> such that s₁ = I(x₁) and s₂ = I(x₂) are cities, and s₁ is larger than s₂. In particular, the pair <London, Manchester> satisfies (*). Note that the sequence <s₁, s₂> can be enlarged by adding an arbitrary number of terms in order to have an infinite sequence <s₁, s₂, s₃, …, s_k, …>, but this operation is irrelevant to satisfaction or lack thereof. Informally speaking, if a sequence <s₁, s₂> satisfies (or not) the formula (*), the same applies to the sequence <s₁, s₂, s₃, …, s_k, …>, because the terms s₁, s₂ are the only ones that are significant for the satisfaction in question. Now substitute Manchester for x₂. That gives (**) ‘x₁ is a larger city than Manchester’. This formula is satisfied by every sequence <s₁> such that s₁ = I(x₁) is a city and s₁ is larger than Manchester, in particular by the sequence <London>. Enlarging the sequence <London> by adding an arbitrary number of terms does not change the situation. Every sequence of the form <London, s₂, s₃, …, s_k, …> satisfies the formula (**). Finally, consider (***) ‘London is a larger city than Manchester’, which is just a sentence, not an open formula. Since it has no free variables, its satisfaction does not depend on valuations of free variables. Hence, every infinite sequence of the form <s₁, s₂, s₃, …, s_k, …> satisfies (***). In other words, we can replace s_k by an arbitrary object and this step has no relevance for the satisfaction of (***). It is satisfied, because London is a larger city than Manchester. Another way to the same result consists in using a theorem of first-order logic ‘if A is a sentence, $\forall$x_i A ⇔ A’. Assume that a sequence s satisfies a sentence A, for instance (***). By clause (5g), A is also satisfied by every sequence s’ which differs from s at most at the i^th place. Since A has no free variables, the i^th place can be arbitrarily chosen from terms of s’. This means that every sequence satisfies A. This reasoning implies that if a sentence A is satisfied by at least one sequence, it is also satisfied by any other sequence. Conversely, if a sentence is not satisfied by at least one infinite sequence, it is also not satisfied by any other infinite sequence.

Accordingly, the following statements are obtained

(6) A sentence is satisfied by all sequences if and only if it is satisfied by at least one sequence.

(7) A sentence is not satisfied by all sequences if and only if it is satisfied by no sequence.

Both assertions lead to

(8) If A is a sentence it is satisfied by all sequences or is satisfied by no sequence.

(6) and (7) lead to the following definition:

(SDT) (a) ‘A’ is true in M if and only if ‘A’ is satisfied by every infinite sequence of objects from M (equivalently: by at least one such sequence);

(b) ‘A’ is false in M if and only if ‘A’ is not satisfied by some infinite sequence of objects from M (or by no sequence).

However, we can also prove that if a sentence is satisfied by any infinite sequence of objects (or by one such sequence), it is also satisfied by the empty sequence of objects. Thus, SDT can also be formulated by saying that the sentence A is true if and only if it is satisfied by the empty sequence of objects (the notion of the empty sequence is a generalization of the usual definition of sequence). This definition is model-theoretic and explicitly appeared in (Tarski, Vaught 1957). Tarski’s original treatment assumed that satisfaction and truth refer to the one domain in which expressions are interpreted. One can eventually say that the concept of model was implicitly involved in Tarski 1933.

Let us look at the consequences of SDT in the above formulation. Since it assumes resources to meet (LP) and similar paradoxes, its consistency against semantic antinomies is guaranteed. Since SDT does not use the concept of truth, it is not circular. On the other hand, we must suppose that our metatheory (weak second-order arithmetic) is correct in an intuitive sense. According to Tarski, SDT is formulated in the morphology (syntax) of ML. Due to the understanding of logic around 1930, it covered set theory or the theory of logical types. Thus, Tarski was justified in his view that the correctness of metatheory is reduced to that of pure logic.

Today, the situation is more complicated. One can say that SDT proceeds as a typical mathematical construction based on a portion of set theory. Although some philosophers – for instance, Husserl and his followers – will probably be dissatisfied by this situation vis-a-vis their claim that philosophical constructions have to be free of presuppositions, the defenders of SDT (and similar constructions) can reply that (a) conformity to mathematical practice is more important than established a priori metaphilosophical postulates, and that (b) an informal understanding of ML is inevitable for logical constructions pertaining to L. Since ML exceeds L in expressive means, we also have a good articulation of the claim that ML must be richer than L in order for truth for the latter to be defined in the former. SDT satisfies CT and implies (BI).

The set Tr(L) has various metamathematical properties. It is consistent, forms a deductive system, which is maximal (no sentence can be added without losing consistency), compact (Tr(L) is consistent if and only if its every finite subset is consistent) and syntactically complete (for any A, A $\in$ Tr(L) or $\neg$A $\in$ Tr(L)). On the other hand, sets of truths are not always finitarily axiomatizable. In other words, it is not so that for any Tr(L), there exists a finite set X $\subset$ Tr(L), such that Cn(X) = Tr(L) (the symbol Cn refers to the consequence operation). SDT leads to a very elegant account of logical consequence (see Tarski 1936a). We say that the sentence A belongs to the set of consequences of the set X if and only if every model of X is also a model of A. In symbols, A $\in$ Cn(X) if and only if, for every M, if M is a model of X (every sentence from X is true in M), then A is true in M.
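The model-theoretic account of consequence stated above can be illustrated by a minimal sketch. It assumes a purely propositional language, so that models are just valuations of finitely many atoms and can be enumerated exhaustively; for first-order languages no such exhaustive check is available in general. The tuple encoding of formulas and the names `models`, `holds` and `is_consequence` are hypothetical.

```python
from itertools import product


def models(atoms):
    """Enumerate every valuation (model) of the given propositional atoms."""
    for values in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, values))


def holds(formula, M):
    """Truth of a formula in model M; a formula is an atom (str) or a tuple."""
    if isinstance(formula, str):
        return M[formula]
    op = formula[0]
    if op == "not":
        return not holds(formula[1], M)
    if op == "and":
        return holds(formula[1], M) and holds(formula[2], M)
    if op == "or":
        return holds(formula[1], M) or holds(formula[2], M)
    if op == "imp":
        return (not holds(formula[1], M)) or holds(formula[2], M)
    raise ValueError(f"unknown operator: {op}")


def is_consequence(A, X, atoms):
    """A is in Cn(X): every model of X is also a model of A."""
    return all(
        holds(A, M)
        for M in models(atoms)
        if all(holds(B, M) for B in X)  # keep only the models of X
    )
```

For instance, `is_consequence("q", ["p", ("imp", "p", "q")], ["p", "q"])` checks modus ponens semantically, whereas swapping the roles of `"p"` and `"q"` in the conclusion and the second premise tests the invalid converse inference.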

STT, claiming that ‘is true in L’ is defined in ML, raises the question whether we can define truth inside L. The Tarski Undefinability Theorem (TUT) says that if a consistent theory T contains the arithmetic of natural numbers, the set of T-truths is not definable in T. In other words, the truth-predicate is not definable in languages sufficiently rich for expressing the arithmetic of natural numbers. So, TUT is a limitative theorem. Gödel’s first incompleteness theorem (GFT) is perhaps the most famous example of a limitative theorem. It states that if AR (the formal arithmetic of natural numbers) is consistent, it is also incomplete, that is, there are arithmetical sentences A and $\neg$A such that neither is provable in AR.

The informal proof of GFT proceeds in the following way. Consider the sentence (i) ‘the sentence (i) is not provable’. If (i) is true, it is unprovable, but if it is false, it is unprovable as well, because logic cannot lead to false consequences (we tacitly assume that axioms of AR are true). Using the law of excluded middle, we obtain that (i) is unprovable in either case and, since this is exactly what (i) says, that there exists a true but unprovable sentence.

The above reasoning is semantic. The formal proof of GFT is purely syntactic and uses arithmetization, that is, translation of metamathematical concepts and theorems into the language of AR.

Assume that STT^L is a correct (consistent) truth-theory for L formulated in this language and that a formula A $\in$ L mentions itself and says ‘A does not define truth’. If A $\in$ Tr(L), truth is undefinable by A. Now, A is not a theorem of STT^L, that is, $\neg$(STT^L ├ A) (or A $\not \in$ Cn(STT^L)). This assertion is justified by a reductio argument. Assume that STT^L ├ A. Hence, $\neg$A $\not \in$ Cn(STT^L). Hence, $\neg$A can be either false or independent of STT^L. The first case is impossible, because it would mean that STT^L defines truth for L. Thus, the second possibility remains, namely that STT^L does not define truth for L. Assume that A is false. This means that STT^L defines truth for L. However, this is impossible, because A would be a false theorem of STT^L, but we assumed that this theory is materially correct and so contains no falsehoods. Thus, we proved that STT^L does not define the truth-predicate for L (the informal version of Tarski’s undefinability theorem (TUT)). A more technical version of this theorem says that there is no formula Tr(x) $\in$ L_AR such that for any A $\in$ L_AR, AR ├ A ⇔ Tr(‘A’). The proof of TUT in this formulation uses the fixed-point lemma (FPL), which says that if A(x) $\in$ L_AR and A(x) has one free variable, then $\exists$B $\in$ L_AR (AR ├ B ⇔ A(‘B’)). The proof is remarkably brief. Assume that there is a formula Tr(x) of the kind mentioned in the first part of (TUT). By (FPL), applied to $\neg$Tr(x), there exists a sentence A such that AR ├ A ⇔ $\neg$Tr(‘A’). By our assumption, we obtain AR ├ Tr(‘A’) ⇔ $\neg$Tr(‘A’), but this conflicts with the consistency of AR.
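The brief formal argument can be laid out step by step. The following is a sketch in LaTeX, assuming the `amsmath` and `amssymb` packages; the corner quotes $\ulcorner A \urcorner$ are an assumed notation for the name (code) of A, written ‘A’ in the text, and Tr(x) stands for the hypothetical truth formula whose existence is to be refuted.

```latex
\begin{align*}
&\text{(1) Hypothesis: for every sentence } A \in L_{AR}: \quad
  AR \vdash A \Leftrightarrow Tr(\ulcorner A \urcorner) \\
&\text{(2) FPL applied to } \neg Tr(x): \text{ there is a sentence } A \text{ with} \quad
  AR \vdash A \Leftrightarrow \neg Tr(\ulcorner A \urcorner) \\
&\text{(3) Instance of (1) for this } A: \quad
  AR \vdash A \Leftrightarrow Tr(\ulcorner A \urcorner) \\
&\text{(4) From (2) and (3):} \quad
  AR \vdash Tr(\ulcorner A \urcorner) \Leftrightarrow \neg Tr(\ulcorner A \urcorner) \\
&\text{(5) Step (4) contradicts the consistency of } AR,
  \text{ so no such formula } Tr(x) \text{ exists.}
\end{align*}
```

Step (4) only uses the transitivity of provable equivalence, so the whole weight of the argument rests on the fixed-point lemma in step (2).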

Formulations and proofs of GFT and TUT essentially appeal to self-referentiality. However, the former theorem does not demonstrate that the sentence ‘I am not provable’ is paradoxical, but only that it is independent of AR. The situation in the context of TUT is radically different. In particular, the second part of the informal proof of this theorem shows that adding the formula A (in the indicated meaning) results in a contradiction. The formal proof of TUT via FPL confirms this assertion. In fact, FPL can be considered as a metalogical (metamathematical) indication of what is wrong with the Liar Paradox. This outcome is important because it shows that paradoxes related to self-reference are not curiosities but that they have deep connections with general mathematical results. Finally, one should see a fundamental difference between GFT and TUT. Although both have similar informal formulations appealing to the concept of truth, the former can be replaced by its syntactic version, the latter cannot. In the language of recursion theory, the set of provable sentences of AR is not recursive (a set is recursive if and only if it is computable; this implies that the complement of a recursive set is recursive as well), but recursively enumerable (a set is recursively enumerable provided that it can be enumerated by natural numbers; this does not imply that its complement can be enumerated as well), but the set of arithmetical truths does not fulfil the condition of recursive enumerability. Thus, semantics cannot be reduced to syntax. This fact is particularly important in metamathematics, because doing formal semantics for theories sufficient for expressing AR requires infinitistic methods, but the syntax of such systems is finitary.

5. Philosophical Comments

Tarski explicitly asserted that he considered STT as an answer to one of the central problems of epistemology. This claim motivates several philosophical comments about the truth-theory. However, we enter here risky territory, because philosophy is full of conflicts and polemics. Limiting attention to analytic philosophy, STT has (had) radical critics such as Otto Neurath and Hilary Putnam, radical defenders such as Rudolf Carnap and Karl Popper, sceptics maintaining that it is philosophically sterile, and an army of more or less followers trying to improve or reinterpret it, such as Donald Davidson, Hartry Field, Paul Horwich and Saul Kripke. At least three important contemporary philosophers radically changed their views under Tarski’s influence, namely Kazimierz Ajdukiewicz (who rejected radical conventionalism), Carnap (who changed his early view that logical syntax is the core of philosophy and defended semantics as the foundation of philosophical analysis) and Popper (who adopted scientific realism as the most proper philosophy of science).

The above brief survey focused on positive as well as negative influences of Tarski’s ideas. Both indicate that STT is a contemporary philosophical tool, at least in the camp of analytic philosophy. (Continental philosophy is ignored here, although a longer treatment should also refer to this tradition.)

Without pretence to completeness, here are the problems which should be touched upon by any philosophically reasonable truth-theory. (Being philosophically reasonable does not mean being correct, but rather deserving attention in the world of philosophy.)

(1) What are the bearers of truth?
(2) What are the initial intuitions associated with a given truth-definition?
(3) How to define truth, and what about the consequences of SDT?
(4) Is the division of truth-bearers stable, that is, do at least some truth-bearers sometimes change their truth-values (briefly: is truth relative or absolute)?
(5) What is a truth-criterion and what is the relationship between truth-criteria and truth-definition?
(6) What is the relation of a particular truth-theory to its rivals?
(7) How can a given truth-theory be defended against various objections?
(8) What is the relation of truth to other philosophical problems?

So, there is much for a theory of truth to accomplish. This article tries to show how STT is related to these questions, or at least to some of them.

(1) STT assumes that truth-bearers are sentences in the syntactic sense. Yet there are several more concrete possibilities. Sentences? Propositions? Statements? Judgments? These entities can be either linguistic units or objects expressed by linguistic utterances. By contrast, concepts are not truth-bearers, contrary to what Hegelians say. To have a convenient label, we can say that, according to STT, entities qualified as true or false are of the propositional syntactic category. This way of speaking has nothing to do with the question of the ontological nature of propositions, for instance, as abstract objects. Tarski himself chose meaningful sentences as entities on which truth is predicated.

(2) Tarski always stressed that his definition follows the intuitions of Aristotle. Tarski was influenced by the Stagirite himself as well as his Polish teachers, particularly Tadeusz Kotarbiński. Tarski, like most Polish philosophers, uses the label ‘classical truth definition’ as referring to Aristotelian ideas. At the beginning, Tarski identified the classical and correspondence theory of truth, but later he expressed greater reservations with respect to explanations via expressions, such as “agreement” or “correspondence” than to Aristotle’s original formulation. It is not controversial that a T-equivalence says of a true sentence that it states how things are.

What about SDT? We have two options: first, the view, which has some justification in Tarski’s explanations, that satisfaction by all sequences of objects is a mathematical trick; and, second, the view that the official definition corresponds to some ordinary intuitions. The second option is based on some facts, for instance, that SDT entails T-sentences and BI. Anyway, SDT suggests that truth depends on the domain (model) and how it is. This definition does not appeal to terms such as ‘agreement’ (of a truth-bearer and the world, fact, state of affairs, and so forth), ‘picturing of the world by minds, thought, and so forth’, ‘structural similarity’, and so forth. One can propose to distinguish the strong correspondence theory, as in the famous formulation veritas est adequatio rei et intellectus, and the weak (semantic) correspondence. Presumably STT might be interpreted as a weak correspondence theory.

(3) Tarski decided to define truth by a single formula (the definition of satisfaction is recursive). He considered introducing truth by axioms, but he rejected this possibility for philosophical reasons. More specifically, he was afraid of being criticized by philosophers from the Vienna Circle for advocating physicalism (see Tarski 1936). This motivation is presently completely historical. Today, the axiomatization of the concept of truth is commonly applied.

TUT has some intriguing consequences for philosophy. Assume what is natural and philosophically tempting, namely that the collection TRUTH of all truths is infinite. By TUT, TRUTH is not definable by resources conceptually available within it. The only admissible way out within set theory consists in considering TRUTH to be too big a set (Zermelo-Fraenkel system), a class as distinct from sets (Bernays-Gödel-von Neumann) or a category. All these outcomes are formally correct, but lead to not quite pleasant consequences, at least for philosophers who like to say something about the set of all truths. However, set theory and TUT seriously limit such theoretical ambitions. On the other hand, this fact gives a precise meaning for the assertion that truth is transcendental in the sense of the medieval theory of transcendentalia (verum omnia genera transcendit).

(4) The classical concept of truth is commonly considered as absolute, that is, if A is true then it is true eternally (for ever) and sempiternally (since ever). On the other hand, SDT indexes truth by L and M. Does this deprive truth of its absolute character? This question is connected with such issues as bivalence, logical determinism and many-valued logic. Without entering into details concerning this fairly complex stock of ideas, it might be suggested that one can model-theoretically prove that truth is eternal if and only if it is sempiternal. Thus, the classical theory of truth in the semantic setting can be considered as associated with the absolute concept of truth. Even if this conclusion encounters reservations, the possibility of analysing the absolutism/relativism controversy within the philosophical theory of truth via SDT is a remarkable fact.

(5) Clearly, SDT is a-criterial. This means that the definition in question does not generate any truth-criterion, although it says what truth is. If mathematics is taken into account, proof can be regarded as a measure of truth. However, there arises a problem. Let the symbol Pr denote the provability operator. By the Löb theorem, we have PrA ⇒ A, a theorem very similar to TrA ⇒ A. But, due to the first incompleteness theorem, the formula A ⇒ PrA cannot be consistently added to the provability logic. Hence, there is no counterpart of the T-scheme with Pr instead of Tr, that is, the scheme PrA ⇔ A. So, we must conclude that proof is not a complete truth-criterion even in mathematics. This fact can motivate various ways out, for instance, modifying the concept of proof (every true mathematical assertion can be proved in a formal system; this assertion does not contradict the incompleteness theorem) or replacing truth by proof, eventually with additional constraints, for instance, that proofs must be constructive. However, such proposals are restricted to mathematics. Another suggestion is that truth-criteria consist of procedures which justify satisfaction of open formulas by some objects.

(6) Tarski grew up in the tradition of division of truth-theories into the classical theory and so-called non-classical theories, namely the evidence theory (A is true if A is evident), the coherence theory (A is true if it can be embedded in a coherent system without destroying its coherence), the common agreement theory (A is true if specialists agree about its correctness) and the utilitarian theory (A is true if A is useful). The non-classical theories are criterial, because they appeal to procedures assuring that something is true. Tarski himself mentioned the last definition and the coherence account. In general, he considered non-classical theories as lacking precision and he did not discuss them as serious alternatives for STT.

Another issue involving the relation between various truth-theories concerns substantial and minimalist accounts. The latter approach (the redundancy theory, the deflationary theory, and so forth) reduces the truth-definition to the T-scheme. Under this view, STT is a minimalist theory. Tarski himself discussed this question. His counterexample was the sentence ‘All consequences of true sentences are true.’ It is not justified by the T-scheme; that is, the T-scheme alone does not justify asserting that all consequences of true sentences are true. There are much more complicated examples, for instance, the sentence ‘There exist true but not provable sentences’, which does not seem to be subject to a minimalist translation. If so, STT is essentially richer than any minimalist theory of truth.

(7) Consider three objections stated by Franz Brentano against the classical theory, and let us try to show that STT meets them successfully. First, the concept of correspondence is obscure and cannot be satisfactorily explained. More precisely, in order to establish what a truth-bearer corresponds to in reality, one must compare the former with the latter. But this is impossible, given the relata of such a comparison. However, this objection applies to the strong notion of correspondence, not to its weak form. The second objection is more serious. Assume that we define truth by a definition D. Yet D is a sentence. In order to be a good definition, D must be true. Now, the definition is either circular (if it uses itself) or falls into a regressus ad infinitum, because in order to formulate D, we must appeal to D’ related to D, and so forth. Third, the concept of correspondence does not explain the truth of negative sentences. The answers to these objections depend on the relation of L to ML. These relations do not entail that SDT is circular or leads to an infinite regress. The problem of negative sentences has a simple solution in STT because they are true (or false) under the same definition as positive ones.

(8) Tarski underlined that one can accept STT without being committed to strong ontological or epistemological views such as idealism or realism. In other words, STT is independent of such philosophical assumptions or consequences. Independently of Tarski’s intentions, it is easy to give an example of a philosophical problem closely related to STT, namely the semantic realism / semantic anti-realism debate. Generally speaking, (semantic) realists, such as Donald Davidson, use STT; but (semantic) anti-realists (such as Michael Dummett) reject this account of truth. This controversy concerns the mutual relation of the condition of truth and condition of assertibility. The realist says that the meaning of a sentence (MS) is given by its truth-conditions (TC), but the anti-realist says the meaning is given by assertibility-conditions (AC). Thus, we have two equalities:

(i) MS = TC (realism);

(ii) MS = AC (anti-realism).

However, (i) and (ii) are still too vague. In fact, (i) and (ii) should be transformed into

(iii) MS = TC $\wedge$ TC $\Rightarrow$ AC;

(iv) MS = AC $\wedge$ TC = AC.

The realist says that truth-conditions exceed assertibility-conditions, but the anti-realist identifies truth-conditions with assertibility-conditions. How does STT work here? It justifies (iii), but it refutes (iv). If, as many anti-realists claim, the conditions of assertibility are governed by intuitionistic logic, that logic does not generate necessary and sufficient conditions for asserting any mathematical sentence. The point is that the incompleteness theorem holds constructively for Heyting arithmetic (Peano arithmetic based on intuitionistic logic). If so, the anti-realist cannot say that there are true but unprovable sentences, whereas the realist can, by appealing to STT. As far as the issue concerns more general (that is, ontological and/or epistemological) forms of realism and anti-realism, some insights are provided by results about the full expressibility of semantics in syntax. The general philosophical problem concerns the relation between the knowing subject and the object of knowledge. Following a modernized version of Ajdukiewicz’s proposal, the former is represented by syntax (that is, the subject is defined inside language), whereas the latter can be identified with a model of this language. Since, due to TUT, models transcend languages or cannot be defined within them, the realists’ view on knowledge and reality has some justification.

6. Final Remarks

STT employs logical tools throughout. Yet this theory is not a logical calculus in the sense in which propositional or predicate logic is. STT is metamathematical, and possibly axiomatic, if that approach is chosen. The status of T-equivalences provides a good illustration in this respect. They are neither logical tautologies nor material biconditionals. As consequences of SDT they have the status of mathematical theorems provable from axioms. This remark does not end the discussion about the character of T-equivalences, but at least it outlines the direction which seems correct. In any case, STT belongs to logic in a broad sense.

The philosophical content of STT plays an important role in philosophy of language, logic and mathematics, at least in clarifying some issues. On the other hand, the belief that STT can ultimately solve various problems of these parts of philosophy would be exaggerated. This applies even more to epistemology and ontology. Still, as this article documents, although philosophical uses of the semantic theory of truth are problematic, Tarski’s semantic ideas are not philosophically sterile.

7. References and Further Reading

The readings below include only general books on Tarski and his basic writings. Further bibliographical references are available in the books mentioned.

Beeh, V., 2003, Die halbe Wahrheit. Tarskis Definition & Tarski’s Theorem, Paderborn, Mentis.
Butler, M. K., 2017, Deflationism and Semantic Theories of Truth, Manchester, Pendlebury Press.
Casari, E., 2006, La matematica della verità. Strumenti matematici della semantica logica, Torino, Bollati.
Cieśliński, C., 2017, The Epistemic Lightness of Truth. Deflationism and its Logic, Cambridge, Cambridge University Press.
David, M., 1994, Correspondence and Disquotation. An Essay on the Nature of Truth, Oxford, Oxford University Press.
De Florio, C., 2013, La forma della verità. Logica e filosofia nell’opera di Alfred Tarski, Milano, Mimesis.
Glanzberg, M., 2018, ed., The Oxford Handbook of Truth, Oxford, Oxford University Press.
Gruber, M., 2016, Alfred Tarski and the “Concept of Truth in Formalized Languages”. A Running Commentary with Consideration of the Polish Original and the German Translation, Dordrecht, Springer.
Halbach, V., 2011, Axiomatic Truth Theories, Cambridge, Cambridge University Press.
Horsten, L., 2011, The Tarskian Turn. Deflationism and Axiomatic Truth, Cambridge, Mass., The MIT Press.
Kirkham, R. L., 1992, Theories of Truth. A Critical Introduction, Cambridge, Mass., The MIT Press.
Künne, W., 2005, Conceptions of Truth, Oxford, Oxford University Press.
Martin, R. L., 1984, ed., Recent Essays on Truth and the Liar Paradox, Oxford, Clarendon Press.
Moreno, L. F., 1992, Wahrheit und Korrespondenz bei Tarski. Eine Untersuchung der Wahrheitstheorie Tarskis als Korrespondenztheorie der Wahrheit, Würzburg, Königshausen & Neumann.
Pantsar, M., 2009, Truth, Proof and Gödelian Arguments. A Defence of Tarskian Truth in Mathematics, Helsinki, University of Helsinki.
Patterson, D., 2012, Alfred Tarski: Philosophy of Language and Logic, Hampshire, Palgrave Macmillan.
Patterson, D., 2008, ed., New Essays on Tarski and Philosophy, Oxford, Oxford University Press.
Puntel, L. B., 1990, Grundlagen einer Theorie der Wahrheit, Berlin, de Gruyter.
Rojszczak, A., 2005, From the Act of Judging to the Sentence. The Problem of Truth Bearers from Bolzano to Tarski, Dordrecht, Springer.
Simons, P., 1992, Philosophy and Logic in Central Europe from Bolzano to Tarski. Selected Essays, The Hague, M. Nijhoff.
Stegmüller, W., 1957, Das Wahrheitsproblem und die Idee der Semantik, Wien, Springer.
Tarski, A., 1933, Pojęcie prawdy w językach nauk dedukcyjnych, Warszawa, Towarzystwo Naukowe Warszawskie; Germ. tr. (with additions), Tarski 1935, Eng. tr. Tarski 1956a.
Tarski, A., 1935, Der Wahrheitsbegriff in den formalisierten Sprachen, Studia Philosophica I (1935), pp. 53–198 [German tr. of Tarski 1933].
Tarski, A., 1936, Grundlegung der wissenschaftlichen Semantik. In Actes du Congrès international de philosophie scientifique, Paris 1935, fasc. 3: Sémantique, Paris, Hermann, pp. 1–14; Eng. tr. in Tarski 1956, pp. 401–408.
Tarski, A., 1936a, Über den Begriff der logischen Folgerung. In Actes du Congrès international de philosophie scientifique, Paris 1935, fasc. 7: Logique, Paris, Hermann, pp. 1–11; Eng. tr. in Tarski 1956, pp. 409–420.
Tarski, A., 1944, The Semantic Conception of Truth and the Foundations of Semantics, Philosophy and Phenomenological Research 4, pp. 341–395; reprinted in Tarski 1986, v. 2, pp. 665–699.
Tarski, A., 1956, Logic, Semantics, Metamathematics. Papers of 1923 to 1938, Oxford, Clarendon Press; 2nd ed., Indianapolis, Hackett Publishing Company.
Tarski, A., 1956a, The Concept of Truth in Formalized Languages. In Tarski 1956, pp. 152–278 [Eng. tr. of Tarski 1935].
Tarski, A., 1969, Truth and Proof, L’âge de la science 1, pp. 279–301; reprinted in Tarski 1986, v. 4, pp. 399–422.
Tarski, A., 1986, Collected Papers, v. 1–4, Basel, Birkhäuser.
Tarski, A., Vaught, R., 1957, Arithmetical Extensions of Relational Systems, Compositio Mathematica 13, pp. 81–102; reprinted in Tarski 1986, v. 4, pp. 651–682.
Woleński, J., Köhler, E., 1999, eds., Alfred Tarski and the Vienna Circle. Austro-Polish Connections in Logical Empiricism, Dordrecht, Kluwer.

Author Information

Jan Woleński
Email: jan.wolenski@uj.edu.pl
University of Information Technology and Management
Poland

Paradigm Case Arguments

From time to time philosophers and scientists have made sensational, provocative claims that certain things which, in everyday life, we unquestioningly take for granted as existing or happening do not exist or never happen. These claims have included denying the existence of matter, space, time, the self, free will, and other sturdy and basic elements of our common-sense or naïve world-view. Around the middle of the twentieth century an argument based on linguistic considerations was developed that can be used to challenge many such skeptical claims. It came to be known as the Paradigm Case Argument (henceforth, the PCA).

Consider, for instance, the following argument from a skeptic who denies that there are cases of seeing people. First, it cannot be said that we see the people who walk our streets, since they are mostly covered with clothes. All that we see, strictly speaking, are their faces and hands. But to see any such people stripped naked would be little better, since we then would be seeing only their facing surfaces while only imagining or anticipating, not seeing, their rear sides. With well-placed mirrors we might be able to see all their sides at once, but we are still seeing only their exterior, which does not constitute the whole person. No, to see these people proper we would need to have them opened up, with all their interior parts displayed for us too. But then we would no longer have a person, but a corpse or a display of people-parts. Hence there are no cases of seeing people.

A philosopher using the PCA could then counter this by pointing out that it is in fact a perfectly natural and proper use of the word ‘see’ to say that you see a person in ordinary cases where you are looking at a fully intact person with his or her clothes on. She might then, if necessary, describe situations where we do or would say this. She might point out that we teach or train children and also adults who are learning English how to use the expression ‘see a person’ with reference to everyday cases when one sees them clothed. (Teacher: ‘What do you see on page seven?’ Learner: ‘A person.’ Teacher: ‘That’s correct.’) These are paradigm cases of seeing people, exemplars that we use when teaching and explaining the meaning of that expression. That being so, there is no logical room for a philosophical argument showing that these are not cases of seeing people. Trying to argue that they are not would be like trying to argue that the paintings of Picasso that the term ‘cubism’ was coined to denote are not cubist (because they do not depict geometrically exact cubes, say).

This article shows the PCA being applied to the more controversial topic of free will skepticism, examines its logical structure, and looks at some common objections to it. The appraisal of the PCA leads to issues of some depth and importance.

History and Significance of the Argument
Paradigm Cases
The PCA as Part of a Wider Response to the Skeptic
Malcolm’s Version of the PCA
Flew’s Version of the PCA
Critical Responses to Flew’s PCA
“Ordinary Language is Correct Language”
Ordinary Usage as Practices
Conclusion
References and Further Reading

1. History and Significance of the Argument

The PCA is closely associated with the linguistic philosophy movement that peaked in the mid-twentieth century, when many philosophers were urging that philosophical questions and problems should be approached by paying careful attention to the language that we use for expressing them. More specifically, it was associated with the ordinary language philosophy approach within that broader movement, where the emphasis was on examining the ordinary use of terms. Both advocates and critics of the PCA have claimed that it is foundational to those philosophical outlooks and key to understanding them (for example, Flew 1966, p. 261; Gellner 1959, pp. 30–32; Parker-Ryan 2010, p. 123).

The first explicit presentation of the PCA was in a classic paper of the ordinary language philosophy tradition by Norman Malcolm, originally published in 1942, called ‘Moore and Ordinary Language’ (also see Malcolm 1963). Malcolm studied under and was influenced by G. E. Moore and Ludwig Wittgenstein at Cambridge. He then returned to the USA and became a leading exponent of Wittgenstein’s philosophy there. He believed that the PCA was inchoate in Moore’s famous ‘proof’ (1939) of an external world, and he also stated (1963, p. 183) that grasping it was essential for understanding some of Wittgenstein’s most distinctive remarks on the nature of philosophy, such as, ‘Philosophy must not interfere in any way with the actual use of language, so it can in the end only describe it. For it cannot justify it either. It leaves everything as it is’ (Wittgenstein 2009/1953, §124). Antony Flew was another prominent early exponent of the PCA, who applied and defended it in a series of articles beginning in the 1950s.

The argument was employed by Malcolm, Flew, and others to defend the existence of a variety of things from skeptical attack, such as cases of acting freely (Black 1958; Danto 1959; Flew 1954 & 1955a; Hanfling 1990; Hardie 1957), causation (Black 1958), solidity (Stebbing 1937; Urmson 1953), space and time (Malcolm 1992/1942), material things and perceptions of material things (Malcolm 1992/1942; 1963), and certain knowledge of empirical propositions (Malcolm 1992/1942). For convenience, in what follows people who argue against the existence of such things are called ‘skeptics’, and people who use the PCA to counter such arguments are called ‘defenders’.

2. Paradigm Cases

The PCA exploits the idea of a paradigm case. Minimally, a paradigm case of something is a case that is supposed to come within the denotation or extension of the relevant word. But what is more, it is supposed to centrally come within its denotation; it is supposed to be a model example or exemplar, something about which we are inclined to say, ‘That’s an X if anything is’ or ‘If that’s not an X, I don’t know what is’. It is the kind of case that psychologists who study concepts would call a ‘prototypical category member’ and which has been found to be associated with various psychological phenomena, such as tending to first spring to mind when people are told to think of examples of an X, or being more rapidly categorized as an X compared to other category members in categorization tasks. This exemplar status makes it especially fit for the purpose of explaining the meaning of the relevant word in ostensive definitions (and its being used for that purpose reinforces its exemplar status in turn).

A particularly striking example of a paradigm case in this sense (an exemplar of an exemplar, if you will) might be the International Prototype of the Kilogram, a cylinder of platinum-iridium alloy kept near Paris that was used to define what a kilogram is, such that anything else was a kilogram in weight if and only if it was the same weight as this object. The cases that the defender refers to as paradigm Xs are thought of as playing a similar meaning-setting role in relation to the relevant term ‘X’ (though this comparison has its limits; for example, the cases might not have come to play that role through explicit stipulation or formal decision). The problem, then, that the defender has with the skeptic is that in denying that there are any Xs, the skeptic seems to be denying that what apparently are paradigm cases of Xs are Xs, which would be analogous to denying that the International Prototype of the Kilogram is a kilogram in weight.

3. The PCA as Part of a Wider Response to the Skeptic

Of course, when the skeptic denies that there are any Xs, he does so on the basis of reasons or arguments. The PCA, however, does not directly engage with the arguments that the skeptic gives or the significant complexities they can give rise to. This is because, from the defender’s perspective, the skeptic’s claims can ‘be seen to be false in advance of an examination of the arguments adduced in support of them’ (Malcolm 1963, p. 181; also see Malcolm 1992/1942, p. 114), since the PCA is supposed to show that the skeptical claim must be wrong. In other words, for the defender, the skeptical argument (assuming it is logically valid) should be regarded as a reductio ad absurdum of a premise in the argument, since it leads to an absurd or impossible conclusion.

It is this apparently brusque way of treating the skeptic’s arguments that provoked suspicion and even hostility towards the PCA on the part of some critics. Thus some have sarcastically referred to it as a ‘remarkably economical device for resolving complex philosophical disputes’ (Beattie 1981, p. 78), or as ‘a very simple way of disposing of immense quantities of metaphysical and other argument, without the smallest trouble or exertion’ (Heath 1952, p. 1). For others it seems to take the fascination and wonder out of philosophy by its summary rejection of intriguing claims and arguments (Watkins 1957a, p. 26). Why the defender feels entitled to treat the skeptic’s arguments in this way is explained in section eight.

Defenders do not give the skeptic’s arguments quite the short shrift that these remarks suggest, however, since they see the PCA as being only a part of an adequate philosophical response to the skeptic. Accordingly, both Malcolm and Flew stated that to truly free us from the skeptic’s position, reminding us of ordinary linguistic usage is not enough. We also need to reconstruct and examine the reasoning (Malcolm 1951, p. 340; 1992/1942, p. 123) or to identify the ‘intellectual sources’ (Flew 1966, p. 264) that drew us towards the skeptical conclusion. (The importance of this is especially evident in the free will debate, where even philosophers who sympathize with the PCA defense of free will can still feel troubled by the skeptical arguments.) This part of the response to skepticism involves examining the skeptical arguments, and it can also involve unearthing any unstated presuppositions, comparisons, or pictures that might be informing those arguments. Sometimes these sources get their intellectual power over us precisely from the fact that we are not explicitly conscious of them, and they can lose this power when we become conscious of them (Wittgensteinians sometimes call this the ‘therapeutic’ part of the investigation). For instance, regarding the argument that we never see people—a sort of argument that is not unprecedented (see Campbell 1944–45, pp. 14–18; Descartes 2008/1641, p. 23)—the implicit assumption might be that in order to truly see something you must see all its parts or aspects, or the implicit comparison might be with cases of seeing a movie or a play, which one has not properly done unless one has seen it from beginning to end (if we miss a bit, we qualify our statement: ‘I saw most of it’). In sum, defenders believe that ‘the application of a PCA is only a begin-all and not a be-all and end-all of the satisfactory treatment’ of the skeptic’s challenge (Flew 1982, p. 117; 1966 pp. 264-265).

It is also recognized by some defenders that identifying the paradigm cases of something is a far cry from giving an account or theory of it. If something is a paradigm case of an X it is so because of certain features that it has and does not have, and philosophers often want to know what these features are, though they cannot simply be ‘read off’ some paradigm cases. Identifying paradigm cases can then be only a ‘jumping-off point for establishing the relevant rules and conventions’ (Black 1973, p. 271) governing the term, and a preliminary to developing an alternative account of the phenomenon to the one implicit in the skeptic’s argument.

4. Malcolm’s Version of the PCA

A close reading of the literature on the PCA reveals that there are not one but two different kinds of argument that go by the name ‘paradigm case argument’. The first is especially evident in Malcolm’s 1942 paper and is of more limited application. Distinguishing between these versions is important, as failing to do so can lead to confusion in the critical appraisal of these sorts of arguments.

The key feature of what we may call ‘Malcolm’s version’ is that it exploits the idea that there are certain expressions ‘the meanings of which must be shown and cannot be explained’ (Malcolm, 1992/1942, p. 120). Color terms are often mentioned to illustrate this; to make someone fully understand what ‘yellow’ means you must go beyond verbal explanations and produce a sample. Consider, for instance, a philosopher who claims that space and time do not exist. Malcolm first uses Moore’s method of ‘translating into the concrete’ (Moore 1918, p. 112), where an abstract statement is considered in terms of its specific implications. Thus he understands this as amounting to the denial that anything is ever to the left of anything else, that anything is ever above anything else, that anything ever happens earlier or later than anything else, and so on. It is the denial that such states of affairs ever exist. Furthermore, for a philosopher to actually make such a denial (as opposed to just parroting words), she must understand the meanings of the expressions contained therein. She must understand what it means to say that one thing is under another, that one event occurred after another, and so forth.

But how, Malcolm asks, could one ever have come to understand the meaning of such expressions as ‘after’, ‘to the left of’, ‘above’, and ‘under’? Only, he maintains, by our being shown or being acquainted with actual instances (or ‘paradigms’) of things being to the left of other things, of things being above other things, and so on (1992/1942, p. 120). Therefore, for Malcolm, spatial and temporal relations must exist for us to understand the meanings of such expressions and thus, ironically, the existence of space and time is a precondition for the possibility of denying their existence. Or at least the skeptic owes us an explanation of how he can understand spatial and temporal vocabulary on the assumption that spatial and temporal relations do not exist (Soames 2003, p. 166).

The skeptic could respond, however, by simply denying that he understands spatial and temporal vocabulary. That is, the skeptic’s claim might be that such vocabulary has no intelligible meaning, a claim which he perhaps misleadingly expressed by saying ‘Space and time don’t exist’ (as misleading as it would be to say ‘Square circles don’t exist’, as if to imply that there is an intelligible description there that nothing happens to satisfy). And Malcolm does suggest something of this sort in saying that the skeptic’s real point is that these ideas are subtly self-contradictory. However, Malcolm claims that no expression that has a descriptive use is self-contradictory, and he maintains that these expressions do have descriptive uses.

Taking their cue from Malcolm, some commentators have interpreted the PCA as applying only to expressions whose meanings are so fundamental or irreducible that they can be conveyed only ostensively (for example, Alexander 1958, p. 119). Certain defenders were then reproached for attempting paradigm case arguments with expressions apparently not of this type (Passmore 1961, p. 115; Watkins 1957a, p. 29). For instance, the most intense discussion of the PCA was in relation to the expression ‘free will’, which should probably not be regarded as this kind of expression. It was noted that the meanings of certain expressions can be formed and learned by our associating them with an abstract specification or definition. In other cases, our understanding can be derived from examples, but examples that are fictional, as when we learn what miracles are by reading about miraculous events in myths and stories (Watkins 1957a, p. 27). In both cases it remains an open question whether the expression denotes anything real. Given that ‘free will’ could be an expression of those types, no inference can be made from the fact that ‘free will’ has a meaning or is understood by us to the conclusion that there is free will.

However, a different version of the PCA exists that does not rely on the idea that the meaning of the relevant expression ‘must be shown and cannot be explained’. To see this, we will look in some detail at how the PCA works in relation to the controversial topic of free will skepticism.

5. Flew’s Version of the PCA

Next we will examine a particular application of the PCA, Antony Flew’s use of it to rebut skepticism about actions done of one’s own free will, which we may call ‘free actions’ for short. By focusing on a particular application, and the one that has generated the most discussion, we can examine the argument’s logical features in some depth. The following quotations, then, are Flew’s presentation of it from his earlier papers on the topic. Though these were the most frequently quoted and discussed presentations of the PCA, we will see that they were problematic and that he reached a more mature understanding of it in his later work. These problems largely stem from clinging to Malcolm’s model of the PCA with a concept for which it is not appropriate.

Crudely: if there is any word the meaning of which can be taught by reference to paradigm cases, then no argument whatever could ever prove that there are no cases whatsoever of whatever it is. Thus, since the meaning of ‘of his own freewill’ can be taught by reference to such paradigm cases as that in which a man, under no social pressure, marries the girl he wants to marry (how else could it be taught?): it cannot be right, on any grounds whatsoever, to say that no one ever acts of his own freewill. For cases such as the paradigm, which must occur if the word is ever to be thus explained (and which certainly do in fact occur), are not in that case specimens which might have been wrongly identified: to the extent that the meaning of the expression is given in terms of them they are, by definition, what ‘acting of one’s own freewill’ is. (Flew 1955a, p. 35)

Here is another more concise statement of the argument:

As the meaning of expressions such as ‘of his own free will’ is and must ultimately be given by indicating cases of the sort to which it is pre-eminently and by ostensive definition applicable, and not in terms of some description (which might conceivably be found as a matter of fact not to apply to anything which ever occurs); it is out of the question that anyone ever could now discover that there are not and never have been any cases to which these expressions may correctly be applied. (Flew 1954, p. 54)

There are at least two errors here. First, Flew claims in places that the meaning of ‘free will’ must be given by referring to paradigm cases. But this is not right. As suggested above, it seems possible that its meaning could be given with a definition (‘A free action is an action that . . .’). It would then be an open question whether there is anything satisfying the definition. Flew came to think that this ‘must’ claim was unnecessarily strong, and that for his argument to work it is enough that the meaning of ‘free action’ can be given by referring to paradigm cases (1957, p. 37).

But secondly, even if the meaning of ‘free action’ can be given by referring to paradigm cases, that would not entail that there must be cases of free action (that is, Flew is wrong in saying that the paradigm cases ‘must occur if the word is ever to be thus explained’). For cases can be real or hypothetical, and it is not necessary that the paradigm cases occur for it to be possible to explain the meaning of a term by describing them (Chisholm 1951, pp. 327–328; Hallett 2008, p. 86). Indeed, even Flew himself, in the first passage, seems to describe a hypothetical case of a man who under no social pressure marries the woman he wants to marry to explain the meaning of ‘free will’ (at least he does not tell us that he is referring to some actual case he is familiar with). We all know that such cases occur of course, but it is a contingent fact that they do (our world might have been one where all marriages were arranged and obligatory) and that fact has no bearing on the pedagogical usefulness of the case.

Thus it would not be the mere fact that the meaning of ‘free action’ is or can be explained in terms of paradigm cases that guarantees that there are free actions. It would, rather, be the fact that the meaning of ‘free action’ can be explained in terms of certain paradigm cases, plus the fact that such paradigm cases actually occur, which would guarantee that there are free actions. This two-step structure of the PCA is noted by Marconi when he says, ‘it is not enough, to refute skepticism about miracles, that the turning of water into wine would be ordinarily described as a miracle, for it is far from uncontroversial that such an event ever took place’ (2009, pp. 118–119).

Flew elucidates the structure of the argument along these lines, and achieves a more mature understanding of the PCA, in a later paper. There he says that the ‘logical form of this argument type consists in two steps: The first is an insistence upon (what is taken to be) a plain matter of fact [that is, that certain cases exist or happen] . . . The second step consists in the assertion that examples such as those presented just are paradigm cases of whatever it is which it is being so paradoxically denied’ (1982, p. 116; also see Donnellan 1967, p. 108). Thus Flew’s paradigm case argument for free actions consists of two premises.

P1: As ‘a plain matter of fact’, cases exist where a man marries the woman he loves and wants to marry without threats, pressure, or compulsion.

P2: Such cases are paradigm cases of free actions.

Conclusion: Free actions exist.

Here we can see that one of the premises is an existential statement, with the other saying that the thing quantified over is a paradigm case of whatever the skeptic is denying. In other words, one premise says that there exist cases matching a particular description, while the other says that anything matching such a description is a paradigm case of an X (where ‘X’ refers to what the skeptic claimed not to exist). Together they yield the conclusion that there are Xs.
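The two-premise structure just described can be set out schematically. The following is only a sketch: the predicate letters $D$ (the case matches the defender’s description), $P$ (the case is a paradigm case of an X) and $X$ (the case is an X) are labels introduced here for illustration and are not Flew’s own notation. The bridging premise (B), that paradigm cases of Xs are Xs, is not stated separately in the argument above and is made explicit here as an assumption.

$$
\begin{aligned}
&\text{(P1)}\quad \exists x\, D(x) \\
&\text{(P2)}\quad \forall x\,\bigl(D(x) \rightarrow P(x)\bigr) \\
&\text{(B)}\quad\ \ \forall x\,\bigl(P(x) \rightarrow X(x)\bigr) \\
&\therefore\quad\ \ \ \exists x\, X(x)
\end{aligned}
$$

On this rendering the step from premises to conclusion is valid in first-order logic, so any disagreement has to concern the truth of (P1), (P2), or (B) rather than the inference itself.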

But that is not all, since the PCA is known to draw on linguistic considerations somehow. This is not evident in the above argument schema, so where do they enter into it? They enter into it, it seems, in justifying the second premise. Thus the defender will say that those cases are paradigms of free actions because the meaning of ‘free action’ is taught or explained with reference to such cases, or because we ordinarily say of such cases that the agent ‘acted of his own free will’.

The justificatory significance of ordinary linguistic usage is discussed below. But now that we have identified the basic structure of Flew’s argument, let us first look at the various avenues of criticism available to the skeptic.

6. Critical Responses to Flew’s PCA

a. Challenging the First Premise

Critics of Flew’s PCA have tended to grant premise 1 as just being an uncontroversial empirical truth. Yet perhaps premise 1 could be resisted if we insist on understanding ‘compulsion’ or ‘being forced/constrained’ in a particular way, such that any kind of deterministic cause ‘compels’ its effect or ‘forces’ the effect to happen, so that nobody could act without compulsion in a deterministic universe (see Beebee 2013, p. 110; Hardie 1957, p. 21). Here the analytic effort would move to the ideas of compulsion or of being forced, which would need to be clarified. So although the premise here is supposed to be a statement of plain empirical fact, it could be challenged through the development of a conceptual point.

b. Challenging the Second Premise

But the main focus of attention has been on premise 2. Are such marriages indeed paradigm cases of acting freely? Or if we tend to judge that they are, is this only because of certain assumptions we are making about those cases that were unmentioned in Flew’s description, assumptions that might be open to challenge?

Some critics have argued that advocates of the PCA err by assuming a sharp distinction between teaching the meaning of a word by presenting cases and by giving criteria. For mixtures of these can also occur when we explain the meaning of a word with reference to cases, but cases that are interpreted as satisfying certain criteria (Ayer 1963, pp. 17–18; Gellner 1959, p. 34; Passmore 1961, pp. 115–116). Consider, for instance, a superstitious society where people believe in miracles. There, when explaining what a miracle is, people might refer to cases such as when the leader suddenly and inexplicably recovered from a grave illness, and others involving a sharp turnaround in fortune, but it is being assumed that these turnarounds satisfy the description of being caused by the intervention of a spiritual being. Notice that here the meaning of ‘miracle’ is being explained with reference to real cases, but this does not prove that there are miracles. For the cases are being interpreted in a certain way and the interpretation could be wrong. Could it be the same with the marriage cases? Do we think they are cases of acting freely only because of some contentious background features that we assume to apply to them?

This thinking is evident in David Papineau’s criticism of the PCA when he says, ‘Maybe ordinary people are happy to apply the term “free will” to such actions as drinking a cup of coffee or buying a new car. But this is only because they are implicitly assuming that these actions are not determined by past causes. But in fact they are wrong in this assumption. All human actions are determined by past causes’ (1998, p. 133). Similarly, John Passmore grants that it is natural for us to describe grooms as acting freely in the circumstances described by Flew, but he adds that ‘we have also learned criteria: we have been told that a person acts of his own free will only when his action proceeds from an act of will . . . [with] the metaphysical peculiarity of being uncaused’ (1961, p. 118; also see Ayer 1963, p. 18; Lucas 1970, p. 12). Passmore’s implication is that in saying that the groom acted freely, we are implicitly assuming that he satisfied this criterion.

Note that these philosophers are making claims about what ordinary speakers mean when they talk of free actions, and thus about the ordinary or ‘folk’ concept of free action, saying that it involves the idea of an uncaused or undetermined act. They are, in that respect, engaging in ‘ordinary language philosophy’ with Flew, and disputing his (more implied than stated) characterization of the ordinary concept. However, it is not enough for them to simply claim that this is a feature of the ordinary concept of a free action. There is an onus on them to support that claim with methods or evidence appropriate for this task.

But what support could they provide? An old-school ordinary language philosopher like Flew would appeal to ordinary linguistic usage to support the idea that free action is, roughly, doing what you want to do without pressure or duress, pointing out that this explains the fact that we say of a groom who marries the woman he loves and wants to marry that he marries of his own free will, but not of the groom in an arranged marriage or shotgun marriage. As an old-schooler, moreover, he would be confident that he knows well what the ordinary use of ‘free will’ is just by being fluent in English. Others who think that philosophy should be more ‘scientific’ in its methods would think it necessary to gather some empirical data on ordinary speakers’ judgments through surveys. (Interestingly, one such study yielded ideas similar to Flew’s; see Monroe and Malle 2010.) However, Papineau’s and Passmore’s criterion—that a free action is one not determined by past causes—does not seem to explain this usage at all. For we might not doubt that in both happy marriages and ones involving coercion the groom’s saying ‘I do’ can be causally explained—crudely, by love in the former and fear in the latter—and that neither sort of explanation is any less deterministic than the other. We would not speak of these cases differently if this was our criterion of free action, and it is not clear what practical usefulness the expression would have on that understanding.

Another kind of support for claims about what speakers mean or are implicitly assuming is the speakers’ own admissions or acknowledgments. When someone describes an event as a miracle, for instance, we can elicit his acknowledgment that in doing so he was thinking that a deity intervened. But will we be able to elicit from an ordinary speaker the acknowledgment that when he said that Debora married of her own free will, he meant that her marrying was not determined by past causes? Can we regard something as part of what a person meant in saying something if he does not acknowledge it as part of what he meant? Papineau and Passmore would need to allay the suspicion that their characterization of the ordinary meaning of ‘free action’ is an imposition from philosophical theory. It is not clear, for instance, where exactly we have ‘been told’ the criteria for free action that Passmore says we have been told, besides in the philosophy classroom.

Of course, these critics’ assumption that a free act is uncaused or undetermined must have come from somewhere, and Flew and Malcolm insisted that a thorough investigation of the ‘intellectual sources’ of the skeptic’s claim must be carried out, to identify the comparisons, pictures, analogies, and so forth that lure us towards it. Any PCA will seem shallow without this concomitant.

To sum up, these ways of challenging the paradigm case argument involve contesting the defender’s claim about what the relevant expression ordinarily means. But this requires that the skeptic play and beat the ordinary language philosophers (in the wide sense of those who work on elucidating the meanings of ordinary expressions, which could include certain experimental philosophers) at their own game. Skeptics who dispute a defender’s claim about what ordinary speakers identify as the paradigm cases of something, or about what exactly ordinary speakers are assuming in making such identifications, must supply evidence appropriate for determining the character of ordinary concepts, a burden which, of course, also applies to the defenders.

Another philosopher who questioned whether Flew’s description identifies a paradigm case of free action is MacIntyre (1957). Suppose we are told that the groom’s falling in love with the bride was due to a hypnotic suggestion (assuming such things are authentic). MacIntyre maintains that in that case, he would not have married of his own free will (though it could be autonomy that is lacking here, rather than free will; on this distinction, see Christman 2015, section 1.1; Piper 2010, section 2c). The defender would reply that though such an etiology was not explicitly ruled out by Flew’s description of the case, we were supposed to imagine that this was an ordinary case and thus that no such extraordinary things happened. But to this MacIntyre says that there ‘is no relevant difference in the logical status between explanations in terms of endocrine glands [or whatever the explanation is in ordinary cases] and those which refer us to hypnotic suggestion’ (1957, p. 31).

This kind of move—claiming that there is no important difference between putative paradigm cases of free action and of unfree action—is a familiar one from free will skeptics, and it is independent of the particulars of the paradigm case argument. It also leads to stalemate, since given that sameness and difference are symmetrical relations we can argue the other way around just as cogently: we can take our intuitions about the free action case for granted and say that because the unfree action case is no different in its essentials, it is, despite initial appearances, a case of free action (see Beebee 2013, p. 85).

c. The Charge of Irrelevance

Other critics have taken a different, more concessionary approach to dealing with the PCA over the free will issue. Rather than contesting Flew’s characterization of the ordinary meaning of ‘free will’, they agree with it, but maintain that this is just not the concept of free will that is relevant to the philosophical debates. For instance, Danto agrees with Flew that ‘when, in ordinary contexts, we say that Smith married of his own free-will, we mean only that there was no shotgun being pointed at him by an angry father (or something like this). We do not deny that marriages are predictable, or even that this marriage was’ (1959, p. 124). We just mean that he was not made to do it against his will, pressured or strong-armed into doing something he did not want to do (Ibid., p. 123). However, ‘ordinary language so construed is simply irrelevant to the celebrated problem of the freedom of the will’ (p. 121), which is a ‘metaphysical problem’ that can be solved only with a ‘metaphysical solution’ (p. 124). Similarly, some philosophers have been explicit in saying that the free will that philosophers are curious about is not the free will that we speak of in daily life (Hardie 1957, p. 30; van Inwagen 2008, p. 329, note 1). Relatedly, others try to distinguish freedom of action from freedom of will and shift the debate towards the latter idea (see McKenna and Pereboom 2016, p. 10). The former idea roughly corresponds to what Flew was talking about, while the latter is supposedly something quite different and concerns choice or decision rather than action, and is less in common currency.

Though the sharp disparity between the views of the defender and the skeptic would be well explained by this idea that they are ‘talking past each other’, operating with different notions, there is a problem with it. There is an unwritten rule (or a ‘conversational maxim’, to use a Gricean expression) that we must tell our readers that we are using some expression in an unusual sense if we are doing so. This is to prevent misunderstanding and confusion, since we naturally interpret a person’s words to have their ordinary signification unless told to do otherwise. However, most philosophers, not to mention psychologists and neuroscientists, do not say that they are using ‘free will’ or ‘free action’ in some special or unusual sense in their written works on this topic. Thus, if they are doing this, then many of them are being irresponsible by not being upfront about it. This omission would be excusable if it were common knowledge that ‘free will’ is being used in some non-standard sense in the literature, but this is hardly true, especially considering that some philosophers have said the exact opposite: that in the free will debate we are investigating whether free will exists as ordinarily conceived (see, for example, Jackson 1998, p. 31).

In light of these conflicting indications, it is simply not clear whether in the debates about the existence of free action it is free action in the ordinary sense that is being discussed. One way to find clarity on this, however, might be through reflection on the related phenomenon of moral responsibility. Most philosophers have not been interested in free will just for its own sake but because of its importance for moral responsibility, believing that whether we can be held morally accountable for our actions, and can be deserving of praise and blame, turns on whether we can act freely. Thus, to the question ‘What sense of free will are you talking about?’, some might reply, ‘The one that matters for moral responsibility’. However, this might not be of great help because even if there is some ‘metaphysical’ notion of free will that is critical for moral responsibility, the ordinary notion of free will is also important for it. For ordinarily if we are told that someone did something terrible, but are then told that he did not do it of his own free will, we will (if we believe this) infer that he is less responsible for having done it.

7. “Ordinary Language is Correct Language”

Let us look again at premise 2 of Flew’s PCA. This stated that cases matching a certain description are paradigm cases of free action. But how does a defender support such a claim? By referring to linguistic considerations. By saying that these are the kinds of cases that we ordinarily or standardly call ‘free actions’, or that these are the kinds of cases that we would refer to when teaching or explaining the meaning of ‘free action’. Furthermore, we can take the former to be the most fundamental consideration because the meaning of a term can be taught or explained correctly or incorrectly, depending on whether the instruction reflects the ordinary use, and besides, much of our native language is not learned from explicit instruction.

But can we safely infer from the fact that a certain sort of case or thing is ordinarily called ‘X’ that it is in fact an X? It seems easy to find reasons to dismiss this principle. After all, didn’t people in superstitious societies ordinarily refer to certain events as miracles, or to the Sun as a deity, while being incorrect in saying those things?

The idea that if something is ordinarily called ‘an X’ then it is an X was expressed by Malcolm in his statement that ‘ordinary language is correct language’ (Malcolm 1992/1942, p. 118, p. 120), which came to be regarded as a central slogan of ordinary language philosophy. As a slogan, however, this needs deciphering. Malcolm explained what he meant by distinguishing between two kinds of mistakes that can be made when making a statement: being mistaken about the facts, and using incorrect language (1992/1942, p. 117). The distinction can be illustrated with a case adapted from Malcolm. Suppose that Jones and Smith see an animal in some bushes at a distance, and Jones claims it is a wolf while Smith claims it is a fox. After it emerges from the bushes, Jones clearly sees that it has the characteristics of a fox and that he was mistaken. This was a factual mistake. But imagine another case where they both see the animal clearly and are in full agreement on what its characteristics are, though Jones claims it is a wolf while Smith claims it is a fox. Though the form of their disagreement is the same as before, we now have a linguistic rather than a factual disagreement: they disagree about what a thing of this sort is called. At least one of them is mistaken about the meaning of these words. (Though Malcolm contrasts ‘factual’ with ‘linguistic’ disagreement here, he would not deny that a linguistic mistake is based on a factual error (see Malcolm 1940). That a word has the particular meaning that it has is, of course, a kind of fact. This contrast might therefore be better described as one between linguistic and non-linguistic facts, and one might want to press Malcolm to clarify it further.)

But then Malcolm asks us to imagine the second disagreement again, though with Jones acknowledging that an animal of this sort is ordinarily called ‘a fox’ while maintaining that it is nevertheless incorrect to call it that and correct to call it ‘a wolf’. According to Malcolm, this would be absurd. It is absurd, he says, because ordinary language is correct language. To refute Jones’ claim here it suffices to say, ‘But that’s not what people call it.’

In his discussion of the paradigm case argument, Diego Marconi criticizes this view. He agrees that if some things are correctly called ‘Xs’ then they are Xs (2009, p. 116). But he disagrees that if some things are ordinarily called ‘Xs’ then they are correctly called ‘Xs’. For people might only be calling them ‘Xs’ because they appear to be Xs when in fact they are not Xs (p. 119). This seems right as far as it goes. However, if people are always calling some things ‘Xs’ because they appear to be Xs while not being Xs, then they are like Jones who called a fox ‘a wolf’ because it appeared to be a wolf to him: they are factually mistaken. Malcolm’s idea was that if some things are ordinarily called ‘Xs’ and if no factual mistakes are being made about them, then they are Xs. That is, Malcolm’s slogan represented an attempt to characterize a notion of linguistic correctness, saying that, assuming no factual mistakes are being made about it, the correct thing to call something is what everyone calls it (but for a hard case, see Watkins 1957a, p. 28). The factual/linguistic error distinction is indispensable for understanding the slogan.

8. Ordinary Usage as Practices

It is possible to gain a deeper understanding of why the defender puts so much weight on ordinary usage. But first let us return to an earlier point. We saw earlier that according to the defender, the PCA allows us to reject the skeptical position that there are no Xs without having to examine the skeptical argument. What is the source of this supposed imperviousness to skeptical argument? Can such an apparently dogmatic attitude be tolerated in philosophy? Consider again the skeptic who argued that there are no cases of seeing people. The defender responded by making the simple point that we ordinarily say that we see people in cases where we look at them clothed, cases that were deemed not to be cases of seeing people by the skeptical argument. But why exactly does the fact that we ordinarily say that make it correct to say that? And why should that ordinary usage be unassailable?

The reason is that the defender thinks she is describing what could be called a linguistic practice, custom, convention, or rule. She is trying to point out that it is our practice or custom, or a rule of our language, to call cases of this sort cases of seeing people. Now such things as practices, customs, or rules are open to criticism in various ways. For instance, a rule of a game can be criticized for making the game too long, too complicated, too inconvenient, too dangerous, or less exciting, and rules are sometimes changed to improve games along these lines. But it cannot be criticized for being incorrect, since practices, customs, or rules cannot be correct or incorrect.

Consider the rule in chess that the bishops can move only diagonally, for instance. What sense can there be in saying that this rule is correct? It is, indeed, one of the rules of chess. It is correct to say that this is a rule of chess. The statement that this is a rule of chess is correct. A move may be correct by being in conformity with it. But the rule itself is not correct; it is simply followed, and its being followed makes it one of the rules of chess (though something can also be a rule in virtue of being decreed by a relevant authority, even if people ignore it). Admittedly, we might sometimes speak loosely of a ‘correct rule’. But ‘correct’ here is redundant; ‘These are the correct rules of chess’ is just an emphatic way of saying, ‘These are the rules of chess’. For we have no understanding of what an incorrect rule of chess would be. Would moving the bishop vertically and horizontally be an example? No, since we can reprimand someone doing that by saying, ‘That’s not the rule for the bishop’. (It would confuse him to say ‘That is indeed a rule for the bishop, but an incorrect one’.)

So when a defender says, ‘We (ordinarily) call cases of this sort cases of seeing a person’, she is trying to say, ‘It is our practice/custom/rule to call cases of this sort cases of seeing a person’, and as such it is not the kind of thing that could be refuted by an argument. It is not something that could be proven by any argument either, just as a rule of chess can be neither proven nor refuted (though statements as to what are the rules of chess can be proven or refuted). Wittgenstein called this ‘bedrock’, where ‘I am inclined to say: “This is simply what I [or better, what we] do”’ (2009/1953, §217; also see §654). As practices or rules of our ‘language-game’ they are self-standing; they are things that philosophers ‘cannot justify’ in an evidential sense and must ‘leave as they are’.

But if a linguistic practice cannot be correct or incorrect, how does this help the defender? For didn’t the defender want to claim that it is correct to say that such-and-such a case is a case of seeing a person? Indeed, but note what she is claiming here: that it is correct to say that such-and-such a case is a case of seeing a person. The statement is what is correct here, not the practice, and it is correct by being in conformity with the practice. The point here is that though practices cannot be correct or incorrect, they are determiners of correctness. Thus a move in chess can be correct by being in conformity with the rules of chess, or a man’s manner of addressing the Queen can be correct by being in conformity with the accepted customs for addressing the Queen. Similarly, certain kinds of statements can be correct (not just grammatically correct, but true) by being in conformity with the rules of English. Thus the statement that some case, C, is a case of an X can be a correct and true statement by being in conformity with the practice of calling Cs ‘X’. (To take a simple example, ‘This color is orange’ can be true and correct by being in line with our practice of calling that color ‘orange’.) And this can be a practice just because it is followed, because the relevant people ordinarily do it.

Thus the paradigm case argument works in part by reminding us of what our linguistic practices are, practices that determine what it is to play the ‘game’ of speaking the relevant language, practices that the skeptic too, in unguarded moments or as a layperson, can be seen to participate in. This, however, is not to say that we should never break the linguistic rules that we currently follow. No prohibition is being urged here on creativity or novelty in the use of language; we are not being urged to never stray from the bounds of conventional and correct speech. The defender only wishes to maintain, against the skeptic, that calling certain things cases of seeing people, calling certain other ones cases of acting freely, and so forth, is not incorrect speech, insofar as it is in conformity with our linguistic customs to do so. Nor is it to deny that those linguistic practices can be criticized as problematic for reasons unrelated to correctness or truth, such as for pragmatic, moral, or political reasons.

9. Conclusion

So, does the paradigm case argument work? There does not seem to be anything intrinsically fallacious about it at least, but this general sort of question is not a good one to ask. First, we have seen that it is problematic to speak of the paradigm case argument, since two versions of it can be distinguished. But more importantly, it may be a bad question to ask because every topic to which it is applied may have its own peculiarities, such that a PCA may work in one application but not in another. For instance, we have seen that with free will skepticism there is a possibility that ‘free will’ is being used in a technical or unusual sense, which would make a PCA type of argument inapplicable to that topic, though nothing similar might be going on with some other topics. Applications of the PCA thus should be judged on a case-by-case basis.

Assessing the influence of the PCA on the analytic philosophical tradition is less easy than it would seem. By one measure, that of observing philosophers explicitly using or referring to the argument and accepting its conclusions, we would have to say that its influence has not been great. However, it is unclear just how much weight we should put on that measure since, as Gilbert Harman said, a ‘philosopher’s acceptance of the paradigm case argument need not be revealed in any explicit statement of the argument, since this acceptance may show itself in the philosopher’s attitude towards skepticism’ (1990, p. 7; also see Gellner 1959, p. 32).

For instance, this acceptance might be manifested in a philosopher’s tendency to treat things commonly or ‘intuitively’ identified as paradigm cases of an X as a datum for the purpose of developing a theory of X (by, for instance, trying to extract necessary or sufficient conditions from the cases), despite the existence of skeptical traditions that deny the existence of Xs. It is not uncommon to see philosophers proceeding in this way (sometimes called ‘the method of cases’) in positive theory development. If pushed to justify this procedure, the philosopher could (but might not) resort to something like the PCA. Skeptics might insist that this philosopher has no right to assume that those ‘paradigm cases’ are genuine paradigms without refuting their skeptical arguments. But defenders can attempt to turn the tables on the skeptics by requesting that they answer the following questions. Any skeptical argument against the existence of any X must be based on some conception or analysis, implicit though it may be, of what X is. But how can we know that we have the right conception or analysis of X? Is there a better alternative to using the method of cases? And if not, might depending on the method of cases commit us to non-skepticism about X?

10. References and Further Reading

Alexander, H. G. (1958). More about the paradigm-case argument. Analysis, 18(5), pp. 117–120.
Ayer, A. J. (1963). Philosophy and language. In The Concept of a Person and Other Essays. London; Basingstoke: Macmillan, pp. 1–35.
Beattie, C. (1981). The paradigm case argument: its use and abuse in education. Journal of Philosophy of Education, 15(1), pp. 77–86.
Beebee, H. (2013). Free Will: An Introduction. Basingstoke: Palgrave Macmillan.
Black, M. (1973). Paradigm cases and evaluative words. Dialectica, 27(1), pp. 261–272.
Black, M. (1958). Making something happen. In Determinism and Freedom in the Age of Modern Science (S. Hook, ed.). New York: New York University Press, pp. 31–45.
Blanshard, B. (1962). Reason and Analysis. London: George Allen and Unwin Ltd. See chapter 7.
Butchvarov, P. (1964). Knowledge of meanings and knowledge of the world. Philosophy, 39(148), pp. 145–160.
Campbell, C. A. (1944–45). Common-sense propositions and philosophical paradoxes. Proceedings of the Aristotelian Society, 45, pp. 1–25.
Chappell, V. C. (1961). Malcolm on Moore. Mind, 70(279), pp. 417–425.
Chisholm, R. (1951). Philosophers and ordinary language. The Philosophical Review, 60(3), pp. 317–328.
Christman, J. (2015). Autonomy in moral and political philosophy. Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/autonomy-moral/.
Danto, A. (1959). The paradigm case argument and the free-will problem. Ethics, 69(2), pp. 120–124.
Descartes, R. (2008/1641). Meditations on First Philosophy. Trans. M. Moriarty. Oxford: Oxford University Press.
Donnellan, K. (1967). Paradigm-case argument. In The Encyclopedia of Philosophy (P. Edwards, ed.). New York: Macmillan, pp. 106–113.
Eveling, H. S. & Leith, G. O. M. (1958). When to use the paradigm-case argument. Analysis, 18(6), pp. 150–152.
Flew, A. G. N. (1982). The paradigm case argument: abusing and not using the PCA. Journal of Philosophy of Education, 16(1), pp. 115–121.
Flew, A. G. N. (1966). Again the paradigm. In Mind, Matter, and Method: Essays in Philosophy and Science in Honor of Herbert Feigl (P. K. Feyerabend & G. Maxwell, eds.). Minneapolis: University of Minnesota Press, pp. 261–272.
Flew, A. G. N. (1957). ‘Farewell to the paradigm-case argument’: a comment. Analysis, 18(2), pp. 34–40.
Flew, A. G. N. (1955a). Philosophy and language. The Philosophical Quarterly, 5(18), pp. 21–36.
Flew, A. G. N. (1955b). Divine omnipotence and human freedom. In New Essays in Philosophical Theology (A. Flew & A. MacIntyre, eds.). London: SCM, pp. 144–169.
Flew, A. G. N. (1954). Crime or disease. The British Journal of Sociology, 5(1), pp. 49–62.
Gellner, E. (1959). Words and Things: A Critical Account of Linguistic Philosophy and a Study in Ideology. Great Britain: Victor Gollancz.
Hallett, G. L. (2008). Linguistic Philosophy: The Central Story. Albany, N. Y.: State University of New York Press. See chapter 10.
Hanfling, O. (1990). What is wrong with the paradigm case argument? Proceedings of the Aristotelian Society, 91, pp. 21–38.
Hardie, W. F. R. (1957). My own free will. Philosophy, 32(120), pp. 21–38.
Harman, G. (1990). Skepticism and the Definition of Knowledge. London; New York: Routledge. See chapter 1.
Harre, R. (1958). Tautologies and the paradigm-case argument. Analysis, 18(4), pp. 94–96.
Heath, P. L. (1952). The appeal to ordinary language. The Philosophical Quarterly, 2(6), pp. 1–12.
Houlgate, L. D. (1962). The paradigm-case argument and ‘possible doubt’. Inquiry, 5(1–4), pp. 318–324.
Jackson, F. (1998). From Metaphysics to Ethics: A Defence of Conceptual Analysis. Oxford: Clarendon Press.
King-Farlow, J. & Rothstein, J. M. (1964). Paradigm cases and the injustice to Thrasymachus. The Philosophical Quarterly, 14(54), pp. 15–22.
Lucas, J. R. (1970). The Freedom of the Will. Oxford: Oxford University Press.
MacIntyre, A. C. (1957). Determinism. Mind, 66(261), pp. 28–41.
McKenna, M. & Pereboom, D. (2016). Free Will: A Contemporary Introduction. New York; London: Routledge.
Malcolm, N. (1963). George Edward Moore. In Knowledge and Certainty: Essays and Lectures by Norman Malcolm. Englewood Cliffs, N. J.: Prentice-Hall, Inc, pp. 163–183.
Malcolm, N. (1951). Philosophy for philosophers. The Philosophical Review, 60(3), pp. 329–340.
Malcolm, N. (1992/1942). Moore and ordinary language. In The Linguistic Turn (R. Rorty, ed.). Chicago; London: The University of Chicago Press, pp. 111–124. Originally published in The Philosophy of G. E. Moore (P. A. Schilpp, ed.). Evanston; Chicago: Northwestern University Press, 1942, pp. 345–368.
Malcolm, N. (1940). Are necessary propositions really verbal? Mind, 49(194), pp. 189–203.
Marconi, D. (2009). Being and being called: paradigm case arguments and natural kind words. The Journal of Philosophy, 106(3), pp. 113–136.
Monroe, A. E. & Malle, B. F. (2010). From uncaused will to conscious choice: the need to study, not speculate about people’s folk concept of free will. Review of Philosophy and Psychology, 1(2), pp. 211–224.
Moore, G. E. (1939). Proof of an external world. Proceedings of the British Academy, 25, pp. 273–300.
Moore, G. E. (1918). The conception of reality. Proceedings of the Aristotelian Society, 18(1), pp. 101–120.
Papineau, D. (1998). Methodology: the elements of the philosophy of science. In Philosophy 1: A Guide Through the Subject (A. C. Grayling, ed.). Oxford: Oxford University Press, pp. 123–180.
Passmore, J. (1961). Philosophical Reasoning. London: Duckworth. See chapter 6.
Parker-Ryan, S. (2010). Reconsidering ordinary language philosophy: Malcolm’s (Moore’s) ordinary language argument. Essays in Philosophy, 11(2), pp. 123–149.
Piper, M. (2010). Autonomy: normative. Internet Encyclopedia of Philosophy.
Richman, R. J. (1962). Still more on the argument of the paradigm case. Australasian Journal of Philosophy, 40(2), pp. 204–207.
Richman, R. J. (1961). On the argument of the paradigm case. Australasian Journal of Philosophy, 39(1), pp. 75–81.
Soames, S. (2003). Philosophical Analysis in the Twentieth Century, vol. 2. Princeton, New Jersey: Princeton University Press. See chapter 7.
Stebbing, S. (1937). Philosophy and the Physicists. London: Methuen.
Stroud, B. (1984). The Significance of Philosophical Scepticism. Oxford; New York: Oxford University Press. See chapter 2.
Urmson, J. O. (1953). Some questions concerning validity. Revue Internationale de Philosophie, 7(25), pp. 217–229.
van Inwagen, P. (2008). How to think about the problem of free will. The Journal of Ethics, 12(3/4), pp. 327–341.
van Inwagen, P. (1983). An Essay on Free Will. Oxford: Clarendon Press. See chapter 4.
Watkins, J. W. N. (1957a). Farewell to the paradigm-case argument. Analysis, 18(2), pp. 25–33.
Watkins, J. W. N. (1957b). A reply to professor Flew’s comment. Analysis, 18(2), pp. 41–42.
Williams, C. J. F. (1961). More on the argument of the paradigm case. Australasian Journal of Philosophy, 39(3), pp. 276–278.
Wittgenstein, L. (2009/1953). Philosophical Investigations. Trans. G. E. M. Anscombe, P. M. S. Hacker & Joachim Schulte. Chichester: Wiley-Blackwell.

Author Information

Kevin Lynch
Email: kevinlynch405@eircom.net
Huaqiao University
China

Duality in Logic and Language

Duality phenomena occur in nearly all mathematically formalized disciplines, such as algebra, geometry, logic and natural language semantics. However, many of these disciplines use the term ‘duality’ in vastly different senses, and while some of these senses are intimately connected to each other, others seem to be entirely unrelated. Consequently, if the term ‘duality’ is used in two different senses in one and the same work, the authors often explicitly warn about the potential confusion.

This article focuses exclusively on duality phenomena involving the interaction between an ‘external’ and an ‘internal’ negation of some kind, which arise primarily in logic and linguistics. A well-known example from logic is the duality between conjunction and disjunction in classical propositional logic: $\varphi \wedge \psi$ is logically equivalent to $\neg (\neg \varphi \vee \neg \psi)$, and hence $\neg ( \varphi \wedge \psi)$ is logically equivalent to $\neg \varphi \vee \neg \psi$. A well-known example from linguistics concerns the duality between the aspectual particles already and still in natural language: already outside means the same as not still inside, and hence, not already outside means the same as still inside (where inside is taken to be synonymous with not outside). Examples such as these show that dualities based on external/internal negation show up for a wide variety of logical and linguistic operators.
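
The conjunction/disjunction example can be checked mechanically by running through all truth-value assignments. The following is only a minimal sketch: the helper names `dual`, `conj`, `disj` and `equivalent` are introduced here for illustration and do not come from the article. The dual of a binary connective is obtained by applying an internal negation to both arguments and an external negation to the result.

```python
from itertools import product


def dual(op):
    """Return the dual of a binary truth function.

    Internal negation: negate both arguments.
    External negation: negate the result.
    """
    return lambda p, q: not op(not p, not q)


def conj(p, q):
    """Classical conjunction."""
    return p and q


def disj(p, q):
    """Classical disjunction."""
    return p or q


def equivalent(f, g):
    """True if f and g agree on every assignment of truth values."""
    return all(f(p, q) == g(p, q) for p, q in product([False, True], repeat=2))


# phi AND psi is equivalent to NOT(NOT phi OR NOT psi): conj is the dual of disj.
conj_is_dual_of_disj = equivalent(conj, dual(disj))

# Duality is symmetric: disj is likewise the dual of conj.
disj_is_dual_of_conj = equivalent(disj, dual(conj))

# Consequence stated in the text: NOT(phi AND psi) is equivalent to NOT phi OR NOT psi.
negated_conj_law = equivalent(
    lambda p, q: not conj(p, q),
    lambda p, q: disj(not p, not q),
)
```

Under classical two-valued semantics all three checks come out true. Since only four assignments exist for two propositional variables, an exhaustive check of this kind suffices to establish the equivalences.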

Duality phenomena of this kind are highly important. First of all, since they occur in formal as well as natural languages, they provide an interesting perspective on the interface between logic and linguistics. Furthermore, because of their ubiquity across natural languages, it has been suggested that duality is a semantic universal, which can be of great heuristic value. Finally, duality principles play a central role in Freudenthal’s famous proposal for a language for cosmic communication.

Many authors employ the notion of duality as a means to describe the specific details of a particular formal or natural language, without going into any systematic theorizing about this notion itself. Next to such auxiliary uses, however, there also exist more abstract, theoretical accounts that focus on the notion of duality itself. For example, these theoretical perspectives address the group-theoretical aspects of duality, or its interplay with the so-called Aristotelian relations. This article examines a wide variety of dualities in formal and natural languages, and it discusses some of the more theoretical perspectives on duality.

The article is organized as follows. Sections 1 and 2 provide an extensive overview of the most important concrete examples of duality in logic and natural language. Section 3 describes a detailed framework (based on the notion of a Boolean algebra) that allows for a systematic analysis of these dualities. Section 4 presents a group-theoretical approach to duality phenomena, and Section 5 draws an extensive comparison between duality relations and another type of logical relations, namely those that characterize the Aristotelian square of opposition.

As to the technical prerequisites for this article, Sections 1 and 2 should be accessible to everyone with a basic understanding of philosophical logic. In Sections 3, 4 and 5, the use of some other mathematical tools and techniques is unavoidable; these sections require a basic understanding of discrete mathematics (in particular, Boolean algebra and elementary group theory).

Duality in Logic
Duality in Natural Language
Theoretical Framework
A Group-Theoretical Approach to Duality
Duality Relations and Aristotelian Relations
References and Further Reading

1. Duality in Logic

Conjunction and disjunction. The most widely known example of duality in logic is undoubtedly that between conjunction and disjunction in classical propositional logic ($\mathsf{CPL}$). Because of their semantics, i.e. the way they are standardly interpreted in $\mathsf{CPL}$, these connectives can be defined in terms of each other, and consequently, only one of them needs to be taken as primitive. For example, if conjunction ($\wedge$) and negation ($\neg$) are taken as primitives, then disjunction ($\vee$) can be defined as follows:

\begin{equation}\label{eq1}\varphi\vee\psi :\equiv \neg(\neg\varphi\wedge\neg\psi).\end{equation}

Alternatively, if disjunction is taken as primitive, then conjunction can be defined as follows:

\begin{equation}\label{eq2}\varphi\wedge\psi :\equiv \neg(\neg\varphi\vee\neg\psi).\end{equation}

Furthermore, each of these equivalences can be derived from the other one; for example, if (\ref{eq1}) is taken as primitive, then we obtain (\ref{eq2}) as follows:

\begin{equation} \label{eq3}
\neg(\neg\varphi\vee\neg\psi) \equiv \neg\neg(\neg\neg\varphi\wedge\neg\neg\psi) \equiv \varphi\wedge\psi.
\end{equation}

Finally, in both cases we obtain the well-known laws of De Morgan. For example, if conjunction is taken as primitive, then (\ref{eq4}) follows immediately from (\ref{eq1}), while (\ref{eq5}) follows from (\ref{eq1}) via (\ref{eq3}):

\begin{align}
\label{eq4}\neg(\varphi\vee\psi) & \equiv \neg\varphi\wedge\neg\psi,\\
\label{eq5}\neg(\varphi\wedge\psi) & \equiv \neg\varphi\vee\neg\psi.
\end{align}

Equivalences such as (\ref{eq1}-\ref{eq5}) exhibit the duality between conjunction and disjunction. They clearly show the interaction between an internal negation (which attaches to each of the individual formulas $\varphi$ and $\psi$, and thus occurs inside the scope of the conjunction/disjunction connective) and an external negation (which occurs outside the scope of the connectives). Equivalences (\ref{eq1}-\ref{eq2}) show that applying both internal and external negation to a disjunction yields the corresponding conjunction, and vice versa. Similarly, (\ref{eq4}-\ref{eq5}) show that the internal negation of a disjunction is logically equivalent to the external negation of the corresponding conjunction, and vice versa. All these equivalences are manifestations of the underlying semantics of the conjunction and disjunction connectives in $\mathsf{CPL}$.
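Because $\mathsf{CPL}$ is truth-functional, the equivalences discussed above can be checked mechanically by enumerating all truth-value assignments. The following Python sketch illustrates this; the function and dictionary names are purely illustrative and are not part of the article's formal apparatus.

```python
from itertools import product

def disj_from_conj(p, q):
    # Disjunction defined from conjunction and negation: not(not p and not q).
    return not ((not p) and (not q))

def conj_from_disj(p, q):
    # Conjunction defined from disjunction and negation: not(not p or not q).
    return not ((not p) or (not q))

def holds_for_all(equivalence):
    # True iff the given two-place test succeeds on all four assignments.
    return all(equivalence(p, q) for p, q in product([False, True], repeat=2))

checks = {
    "disjunction defined via conjunction":
        holds_for_all(lambda p, q: (p or q) == disj_from_conj(p, q)),
    "conjunction defined via disjunction":
        holds_for_all(lambda p, q: (p and q) == conj_from_disj(p, q)),
    "De Morgan, negated disjunction":
        holds_for_all(lambda p, q: (not (p or q)) == ((not p) and (not q))),
    "De Morgan, negated conjunction":
        holds_for_all(lambda p, q: (not (p and q)) == ((not p) or (not q))),
}

for name, result in checks.items():
    print(name, "->", result)
```

Each check compares the two sides of an equivalence on every assignment, so a value of `True` means that the two formulas have identical truth tables.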

Universal and existential quantifiers. Another well-known case of duality concerns the universal and existential quantifiers in classical first-order logic ($\mathsf{FOL}$). The situation here is largely analogous to that of conjunction and disjunction. Because of their semantics, i.e. the way they are standardly interpreted in $\mathsf{FOL}$, these quantifiers can be defined in terms of each other, and consequently, only one of them needs to be taken as primitive. For example, if the universal quantifier ($\forall$) is taken as primitive, the existential quantifier ($\exists$) can be defined as follows:

\begin{equation}\label{eq6}\exists x\varphi :\equiv \neg\forall x\neg\varphi.\end{equation}

Conversely, if the existential quantifier is taken as primitive, then the universal quantifier can be defined as follows:

\begin{equation}\label{eq7}\forall x\varphi :\equiv \neg\exists x\neg\varphi.\end{equation}

Again, each of these equivalences can be derived from the other one; for example, if (\ref{eq6}) is taken as primitive, then we obtain (\ref{eq7}) as follows:

\begin{equation} \label{eq8}
\neg\exists x\neg\varphi \equiv \neg\neg\forall x\neg\neg\varphi\equiv \forall x\varphi.
\end{equation}

Finally, in both cases we obtain the well-known quantifier laws. For example, if the universal quantifier is taken as primitive, then (\ref{eq9}) follows immediately from (\ref{eq6}), while (\ref{eq10}) follows from (\ref{eq6}) via (\ref{eq8}):

\begin{align}
\label{eq9}\neg\exists x \varphi & \equiv \forall x\neg\varphi,\\
\label{eq10}\neg\forall x\varphi & \equiv \exists x\neg\varphi.
\end{align}

Equivalences such as (\ref{eq6}-\ref{eq10}) exhibit the duality between the universal and the existential quantifier. Again, they show the interaction between an internal negation (which occurs inside the scope of the quantifier) and an external negation (which occurs outside the scope of the quantifier). Equivalences (\ref{eq6}-\ref{eq7}) show that applying both internal and external negation to an existential quantifier yields the corresponding universal quantifier, and vice versa. Similarly, (\ref{eq9}-\ref{eq10}) show that the internal negation of an existential quantifier is logically equivalent to the external negation of the corresponding universal quantifier, and vice versa. All these equivalences are manifestations of the underlying semantics of the universal and existential quantifiers in $\mathsf{FOL}$.
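On a finite domain, the quantifier dualities can be illustrated by brute force. The sketch below is only an illustration under that finiteness assumption (it is not a proof for $\mathsf{FOL}$ in general); the three-element domain and all function names are hypothetical choices made for the example.

```python
from itertools import product

def forall(domain, pred):
    # Universal quantification over a finite domain.
    return all(pred(x) for x in domain)

def exists(domain, pred):
    # Existential quantification over a finite domain.
    return any(pred(x) for x in domain)

def exists_via_forall(domain, pred):
    # Existential quantifier defined as not-forall-not.
    return not forall(domain, lambda x: not pred(x))

def forall_via_exists(domain, pred):
    # Universal quantifier defined as not-exists-not.
    return not exists(domain, lambda x: not pred(x))

domain = [0, 1, 2]
# Every unary predicate on the domain, represented as a dict from elements to truth values.
predicates = [dict(zip(domain, values))
              for values in product([False, True], repeat=len(domain))]

dual_definitions_agree = all(
    exists(domain, ext.get) == exists_via_forall(domain, ext.get)
    and forall(domain, ext.get) == forall_via_exists(domain, ext.get)
    for ext in predicates
)

quantifier_laws_hold = all(
    (not exists(domain, ext.get)) == forall(domain, lambda x: not ext[x])
    and (not forall(domain, ext.get)) == exists(domain, lambda x: not ext[x])
    for ext in predicates
)

print(dual_definitions_agree, quantifier_laws_hold)
```

The check ranges over all eight predicates on the three-element domain, which mirrors the informal reading of the quantifiers as a finite conjunction and a finite disjunction.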

Modal operators. Another rich source of dualities is the broad family of modal logics. For example, in alethic modal logic, necessity ($\Box$) and possibility ($\Diamond$) are dual to each other (\ref{eq11}-\ref{eq12}), while in deontic logic, obligation ($O$) and permission ($P$) are usually taken as duals (\ref{eq13}-\ref{eq14}):

\begin{align}
\label{eq11}\Box\varphi &\equiv \neg\Diamond\neg\varphi, & \neg\Box\varphi &\equiv \Diamond\neg\varphi,\\
\label{eq12}\Diamond\varphi &\equiv \neg\Box\neg\varphi, & \neg\Diamond\varphi &\equiv \Box\neg\varphi,\\
\label{eq13}O\varphi &\equiv \neg P \neg\varphi, & \neg O \varphi &\equiv P \neg\varphi,\\
\label{eq14}P \varphi &\equiv \neg O \neg\varphi, & \neg P \varphi &\equiv O \neg\varphi.
\end{align}

Blackburn et al. (2001) provide many other modal examples from concrete application domains, such as temporal logic, propositional dynamic logic and hybrid logic, and more mathematically motivated examples, such as the dualities involving the difference modality and the universal modality. In general, an $n$-ary modal operator is called a triangle ($\Delta$), and its dual a nabla ($\nabla$):

\begin{align}
\label{eqnew01}\Delta(\varphi_1,\dots,\varphi_n) &\equiv \neg\nabla(\neg\varphi_1,\dots,\neg\varphi_n),\\
\label{eqnew02}\nabla(\varphi_1,\dots,\varphi_n) &\equiv \neg\Delta(\neg\varphi_1,\dots,\neg\varphi_n).
\end{align}

The equivalences (\ref{eqnew01}-\ref{eqnew02}) again clearly illustrate the interaction between internal and external negation. Note, furthermore, that the internal negation is applied to all formulas ($\varphi_1,\dots,\varphi_n$). This was also the case with conjunction/disjunction (\ref{eq1}-\ref{eq2}) and with the universal/existential quantifiers (\ref{eq6}-\ref{eq7}) (although the latter case is trivial, since in equivalences (\ref{eq6}-\ref{eq7}) there is only a single formula ($\varphi$) to which the internal negation can be applied).

Interconnections. Many of the examples given above are systematically related to each other, and might thus be viewed as manifestations of the same underlying duality. First of all, it is well-known that the propositional connectives of conjunction and disjunction are related to the universal and existential quantifiers, respectively. For example, the formulas $\forall x Px$ and $\exists x Px$ can informally be viewed as expressing the conjunction $Pa \wedge Pb \wedge Pc \wedge \dots$ and the disjunction $Pa \vee Pb \vee Pc \vee \dots$, respectively. This reveals a structural similarity between equivalences (\ref{eq1}-\ref{eq2}) and (\ref{eq6}-\ref{eq7}). Secondly, in Kripke semantics the modal operators are interpreted as quantifying over possible worlds. For example, the formulas $\Box p$ and $\Diamond p$ can be interpreted as stating that $p$ is true in all possible worlds and that $p$ is true in at least one possible world, respectively. This reveals a structural similarity between equivalences (\ref{eq6}-\ref{eq7}) and (\ref{eq11}-\ref{eq12}).
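The reading of the modal operators as quantifiers over possible worlds can be made concrete with a small Kripke model. The sketch below is a minimal illustration: the three worlds, the accessibility relation and the valuation of $p$ are all hypothetical, and a formula is represented simply by the set of worlds in which it is true.

```python
# A hypothetical Kripke model: worlds, accessibility relation, and the truth set of p.
worlds = {"w1", "w2", "w3"}
access = {"w1": {"w2", "w3"}, "w2": {"w2"}, "w3": set()}
p = {"w2"}  # the worlds in which p is true

def box(truth_set):
    # Necessity: worlds all of whose accessible worlds lie in truth_set.
    return {w for w in worlds if access[w] <= truth_set}

def diamond(truth_set):
    # Possibility: worlds with at least one accessible world in truth_set.
    return {w for w in worlds if access[w] & truth_set}

def neg(truth_set):
    # Negation: complement relative to the set of all worlds.
    return worlds - truth_set

print(sorted(box(p)), sorted(neg(diamond(neg(p)))))
print(sorted(diamond(p)), sorted(neg(box(neg(p)))))
```

In this representation, `box` corresponds to universal quantification and `diamond` to existential quantification over accessible worlds, so the modal duality reduces to the quantifier duality discussed earlier.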

2. Duality in Natural Language

Quantifiers and modalities in natural language. The most obvious class of natural language expressions that give rise to duality behavior are the immediate counterparts of the logical operators discussed in Section 1. For example, the determiners all and some combine with a noun to yield noun phrases such as all books and some books, and seem to correspond directly to the quantifiers $\forall$ and $\exists$. This correspondence is not entirely unproblematic, since it ignores linguistically relevant distinctions, such as the difference between every and all vis-à-vis collective and distributive predicates (Dowty 1987; Brisson 2003), and the distinction between quantificational and non-quantificational uses of some (Löbner 1987). Setting such considerations aside, however, one can say that the natural language determiners all and some are each other’s duals, just as the first-order quantifiers $\forall$ and $\exists$ are each other’s duals. Similarly, the duality relation between $\Box$ and $\Diamond$ in modal logic also shows up for a whole range of natural language expressions for necessity and possibility. In logic, $\Box$ and $\Diamond$ are almost invariably operators taking propositions as their arguments. In natural language, however, the modal notions are expressed in a variety of linguistic categories, such as modal adjectives (necessary vs. possible), modal adverbs (necessarily vs. possibly) or modal auxiliary verbs (must/should vs. can/may).

Conjunction and disjunction in natural language. The most prototypical duality in logic, namely that between the propositional connectives of conjunction and disjunction, only plays a minor role, if any, in the linguistic realm. The main reason is the ambiguity of natural language and and or, which is often explained pragmatically in terms of conversational implicatures (Horn 2004). For example, natural language conjunction very often conveys additional aspects of causality ($\varphi$ and $\psi$ $\equiv$ $\varphi$ and therefore $\psi$) or sequentiality ($\varphi$ and $\psi$ $\equiv$ $\varphi$ and afterwards $\psi$), whereas disjunction is notoriously ambiguous between an inclusive interpretation ($\varphi$ or $\psi$ $\equiv$ $\varphi$ or $\psi$, and perhaps both) and an exclusive interpretation ($\varphi$ or $\psi$ $\equiv$ $\varphi$ or $\psi$, but not both). These asymmetrical ambiguities of natural language conjunction and disjunction render the notion of duality less suitable for their linguistic and philosophical analysis, as observed by Humberstone (2011, p. 772):

for many logical purposes $[\ldots]$ conjunction and disjunction are attractively treated in a symmetrical fashion. Inherent asymmetries in the informal conceptual apparatus we bring to bear on logic often make duality an inappropriate consideration to bring in for philosophical purposes, however.

Testing for duality. In logic, duality is a matter of definition or convention; in modal logic, for example, the duality between $\Box$ and $\Diamond$ follows from the way in which the semantics of these operators is defined. By contrast, in linguistics, duality is a much more empirical matter. In other words, duality relations between natural language expressions have to be argued for or demonstrated and may thus be refuted on empirical grounds. For that purpose, duality tests have been devised (Löbner 2011, p. 492ff.), which crucially rely on the relation of lexical inversion holding between predicates such as be on/off, be inside/outside or be here/gone. Testing for internal negation evaluates the equivalence between (i) a proposition $O(P)$ with operator $O$ and predicate $P$ and (ii) a proposition $O'(P')$, with operator $O' = \mathrm{INEG}(O)$ being the internal negation of $O$, and predicate $P' = \mathrm{LEXINV}(P)$ being the lexical inverse of $P$; see (\ref{eq19}). The examples in (\ref{eq20}-\ref{eq21}) illustrate the internal negations of the quantifiers:

\begin{align}
\label{eq19}O(P) & \equiv \mathrm{INEG}(O)(\mathrm{LEXINV}(P))\\
\label{eq20}\textbf{Some}~ lights~ are~ \textbf{on}. & \equiv \textbf{Not all}~ lights~ are~ \textbf{off.}\\
\label{eq21}\textbf{No}~ children~ are~ \textbf{inside.} & \equiv \textbf{All}~ children~ are~ \textbf{outside.}
\end{align}

Testing for duality evaluates the equivalence between (i) a proposition which gives a negative answer to a polarity question of the form $O(P)$ and (ii) a proposition $O'(P')$, with operator $O' = \mathrm{DUAL}(O)$ being the dual of $O$, and predicate $P' = \mathrm{LEXINV}(P)$ again being the lexical inverse of $P$; see (\ref{eq22}). The examples in (\ref{eq23}-\ref{eq24}) illustrate the dialogue patterns establishing the duality of the universal and existential quantifiers:

\begin{align}
\label{eq22}\neg O(P) & \equiv \mathrm{DUAL}(O)(\mathrm{LEXINV}(P))\\
\label{eq23}Are~ \textbf{some}~ lights~ \textbf{on}? – No, & \equiv \textbf{all}~ lights~ are~ \textbf{off}.\\
\label{eq24}Are~ \textbf{all}~ children~ \textbf{inside}? – No, & \equiv \textbf{some}~ children~ are~ \textbf{outside}.
\end{align}

The main reason for applying lexical inversion to the predicates in these tests, rather than straightforward grammatical negation by means of the negative particle not, is that the latter may yield scope ambiguities, depending on whether it is taken to express internal or external negation (Löbner 2011, p. 492ff.). For example, the negative particle not in the lefthand side of (\ref{eq25}-\ref{eq26}) may get the internal negation reading (\ref{eq25}) as well as the external negation reading (\ref{eq26}). Similarly, the modal auxiliary may in the lefthand side of (\ref{eq27}-\ref{eq28}) interacts differently with the negative particle not depending on the type of modality involved: in its epistemic use, it gets the internal negation reading (\ref{eq27}), whereas in its deontic use, it gets the external negation reading (\ref{eq28}).

\begin{align}
\label{eq25}\textbf{All}~ children~ are~ \textbf{not}~ inside. & \stackrel{1}{\equiv} \textbf{All}~ children~ are~ \textbf{outside}.\\
\label{eq26}& \stackrel{2}{\equiv} \textbf{Not all}~ children~ are~ \textbf{inside}.\\
\label{eq27}She~ \textbf{may not}~ stay. & \stackrel{1}{\equiv} She~ \textbf{may}~ leave.\\
\label{eq28}& \stackrel{2}{\equiv} She~ \textbf{must}~ leave.
\end{align}

The negative particle not and the quantifier all in the lefthand side of (\ref{eq25}-\ref{eq26}) can take scope over each other: in (\ref{eq25}), not occurs inside the scope of all (thereby transforming the predicate inside into its lexical inverse outside), while in (\ref{eq26}), all occurs inside the scope of not. Such scope ambiguities also arise for other operators besides negation. For example, the quantifier all and the modal adverb necessarily in (\ref{neweq1}-\ref{neweq2}) can take scope over each other, thus giving rise to the de dicto reading (\ref{neweq1}) and the de re reading (\ref{neweq2}). However, scope distinctions cannot be fully reduced to the de dicto/de re distinction. After all, the latter is a binary distinction, whereas operators that take scope over each other can give rise to more than two distinct interpretations (Kripke 1977).

\begin{align}
\label{neweq1}\textbf{Everything}~ is~ \textbf{necessarily}~ self\textrm{-}identical. & \stackrel{1}{\equiv} \Box\forall x(x = x),\\
\label{neweq2}& \stackrel{2}{\equiv} \forall x\Box(x=x).
\end{align}

Another complication arising from negation concerns the cognitive difficulty that people have with processing sentences that contain multiple negations. Because of these cognitive difficulties, some of the tests described above are less easily applicable to determine whether a certain relation holds between two expressions. For example, we not only have a duality between the positive quantifiers all and some, but also one between the negative quantifiers no and some not. The former duality is empirically confirmed by the dialogue patterns in (\ref{eq23}-\ref{eq24}). In contrast, the corresponding dialogue patterns for the latter duality in (\ref{eq29}-\ref{eq30}) contain three grammatical negations (no, no and not) and one lexical inversion (off), and therefore sound much less natural (even though they are logically impeccable).

\begin{align}
\label{eq29}Are~ \textbf{no}~ lights~ \textbf{on}? – No, &\equiv \textbf{not all}~ lights~ are~ \textbf{off}.\\
\label{eq30}Are~ \textbf{not all}~ lights~ \textbf{on}? – No, &\equiv \textbf{no}~ lights~ are~ \textbf{off}.
\end{align}

Pronouns and adverbs of quantification. The universal and existential quantifiers are not only related to the determiners all and some, but also to a number of other linguistic categories. For example, when quantifying over people or objects, the determiners are morphologically integrated with the nouns body and thing into indefinite pronouns. Similarly, when quantifying over places, the determiners are morphologically integrated with the adverb where into compound adverbs. By contrast, adverbs that quantify over time and manner exhibit more idiosyncratic lexicalization patterns. Irrespective of such morphological details, all of the categories in the table below inherit the same basic duality pattern from the determiners, and thus, ultimately, from the logical quantifiers $\forall$ and $\exists$.

$\forall$	$\neg\forall$	$\forall\neg$	$\neg\forall\neg$
$\neg\exists\neg$	$\exists\neg$	$\neg\exists$	$\exists$
every	not every	no	some
everybody	not everybody	nobody	somebody
everything	not everything	nothing	something
everywhere	not everywhere	nowhere	somewhere
always	not always	never	sometimes
anyhow	not anyhow	no way	somehow

Generalized quantifiers. Contemporary generalized quantifier theory (GQT) is able to deal with a considerably larger range of natural language quantifiers than the usual universal and existential ones (Barwise and Cooper 1981; Peters and Westerståhl 2006). These include quantifiers that cannot be expressed in first-order languages, such as most. Additionally, GQT allows for a more compositional treatment of quantification. Consider, for example, the sentences John runs and everybody runs, which have by and large the same syntactic structure (namely: noun phrase + verb phrase). While the first-order representations of the semantics of these sentences are vastly different ($\textit{run}(j)$ vs. $\forall x\colon \textit{run}(x)$), their GQT representations are much more similar: $\textit{John}(\textit{run})$ vs. $\textit{everybody}(\textit{run})$.

GQT offers two (mathematically equivalent) perspectives on quantification: a functional and a relational perspective. Focusing on the former, a quantifier expression $Q$ is taken to denote a set of subsets of the universe $U$ of people, and for any unary predicate expression $B$, the formula $Q(B)$ is true iff $[\![B]\!] \in [\![Q]\!]$. For example, since $$[\![\textit{everybody}]\!] = \{X \subseteq U \mid U = X\}$$ and $$[\![\textit{somebody}]\!] = \{X \subseteq U \mid X \neq \emptyset\}$$ it is easy to see that $\textit{everybody}(\textit{run})$ is true iff $U = [\![\textit{run}]\!]$ and that $\textit{somebody}(\textit{run})$ is true iff $[\![\textit{run}]\!] \neq \emptyset$. As expected, the external negation, internal negation and dual of the formula $Q(B)$ are defined as $\neg Q(B)$, $Q(\neg B)$ and $\neg Q(\neg B)$, respectively (with the convention that $[\![\neg B]\!] = U - [\![B]\!] = \{x \in U \mid x \notin [\![B]\!]\}$). For example, the dual of $\textit{everybody}(\textit{run})$ is $\neg \textit{everybody} (\neg \textit{run})$, which is true iff $U \neq U - [\![\textit{run}]\!]$, i.e. iff $[\![\textit{run}]\!]\neq \emptyset$. This shows that in GQT, too, the dual of $\textit{everybody}(\textit{run})$ is $\textit{somebody}(\textit{run})$. Finally, if the proper name John names the individual $j \in U$, then GQT defines the generalized quantifier $$[\![\textit{John}]\!] = \{X \subseteq U \mid j \in X\}$$ and thus we find that $\textit{John}(\textit{run})$ is true iff $[\![\textit{run}]\!] \in [\![\textit{John}]\!]$, iff $j \in [\![\textit{run}]\!]$. Note that the dual of $\textit{John}(\textit{run})$ is $\neg \textit{John}(\neg \textit{run})$, which is true iff $j \notin U - [\![\textit{run}]\!]$, iff $j\in [\![\textit{run}]\!]$. This shows that $\textit{John}(\textit{run})$ is dual to itself, which illustrates the fact that in GQT, proper names are self-dual (Gamut 1991, p. 238).
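The functional perspective lends itself to a direct computational illustration. In the sketch below, a quantifier is modeled as a Boolean-valued function on subsets of a small universe, which is equivalent to a set of subsets; the three-person universe and all function names are assumptions made for the example only.

```python
from itertools import combinations

# A hypothetical universe of people.
U = frozenset({"john", "mary", "sue"})

def subsets(universe):
    # All subsets of the universe, as frozensets.
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def everybody(X):
    # X belongs to [[everybody]] iff X is the whole universe.
    return X == U

def somebody(X):
    # X belongs to [[somebody]] iff X is non-empty.
    return len(X) > 0

def john(X):
    # X belongs to [[John]] iff the individual named by John is in X.
    return "john" in X

def dual(Q):
    # Dual of Q: external negation combined with complementation of the argument.
    return lambda X: not Q(U - X)

def same_quantifier(Q1, Q2):
    # Two quantifiers coincide iff they agree on every subset of U.
    return all(Q1(X) == Q2(X) for X in subsets(U))

print(same_quantifier(dual(everybody), somebody))
print(same_quantifier(dual(john), john))
```

Both comparisons succeed on this universe: the dual of everybody coincides with somebody, and the quantifier denoted by the proper name coincides with its own dual.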

We now turn to the alternative, relational perspective in GQT. This perspective focuses on sentences of the form $Q(A,B)$, where $Q$ is a quantifier expression and $A$ and $B$ are unary predicate expressions. The formula $Q(A,B)$ is true iff $([\![A]\!],[\![B]\!]) \in [\![Q]\!]$. Here are some well-known examples (with $\wp(U)$ denoting the powerset of $U$, i.e. $\wp(U) = \{X \mid X \subseteq U\}$):

\begin{align}
[\![\textit{all}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid X \subseteq Y\} \\
[\![\textit{some}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid X \cap Y\neq \emptyset\} \\
[\![\textit{most}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid |X \cap Y| > |X - Y|\} \\
[\![\textit{some but not all}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid X \cap Y \neq \emptyset \text{ and } X - Y \neq \emptyset\} \\
[\![\textit{exactly half of the}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid |X \cap Y| = \frac{1}{2}|X|\} \\
[\![\textit{the}_{\text{sing}}]\!] & = \{(X,Y) \in \wp(U)\times \wp(U) \mid |X| = 1 \text{ and } X \subseteq Y\}
\end{align}

The external negation, internal negation and dual of the formula $Q(A,B)$ are defined as $\neg Q(A,B)$, $Q(A,\neg B)$ and $\neg Q(A,\neg B)$, respectively. Note that, in contrast to the examples from logic discussed in Section 1, internal negation is not applied to all predicate expressions, but only to the second one. Here, too, generalized quantifiers can be their own dual or internal negation. For example, the internal negation of $\textit{some but not all}(\textit{man}, \textit{run})$ is $\textit{some but not all}(\textit{man}, \neg\textit{run})$, which is true iff $$[\![\textit{man}]\!] \cap (U - [\![\textit{run}]\!]) \neq \emptyset \text{ and } [\![\textit{man}]\!] - (U - [\![\textit{run}]\!]) \neq \emptyset$$ iff $$[\![\textit{man}]\!] - [\![\textit{run}]\!] \neq \emptyset \text{ and } [\![\textit{man}]\!] \cap [\![\textit{run}]\!] \neq \emptyset$$ iff $\textit{some but not all}(\textit{man}, \textit{run})$ is true. This shows that some but not all is its own internal negation. Similarly, the proportional quantifier exactly half of the can be shown to be its own internal negation; for example, exactly half of the men are awake is equivalent to exactly half of the men are not awake.
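The relational perspective can be illustrated in the same computational style. In the following sketch, a quantifier is a Boolean-valued function on pairs of subsets of a small universe, and internal negation complements only the second argument; the four-element universe and the function names are hypothetical choices for the example.

```python
from itertools import combinations

# A hypothetical four-element universe.
U = frozenset(range(4))

def subsets(universe):
    # All subsets of the universe, as frozensets.
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def all_(X, Y):
    return X <= Y

def some(X, Y):
    return bool(X & Y)

def some_but_not_all(X, Y):
    return bool(X & Y) and bool(X - Y)

def exactly_half_of_the(X, Y):
    # |X ∩ Y| = |X| / 2, stated without division to stay within the integers.
    return 2 * len(X & Y) == len(X)

def internal_negation(Q):
    # Complement only the second argument.
    return lambda X, Y: Q(X, U - Y)

def dual(Q):
    # External negation combined with internal negation.
    return lambda X, Y: not Q(X, U - Y)

def same_quantifier(Q1, Q2):
    # Agreement on every pair of subsets of U.
    return all(Q1(X, Y) == Q2(X, Y) for X in subsets(U) for Y in subsets(U))

print(same_quantifier(internal_negation(some_but_not_all), some_but_not_all))
print(same_quantifier(internal_negation(exactly_half_of_the), exactly_half_of_the))
print(same_quantifier(dual(all_), some))
```

The first two comparisons confirm, on this universe, that some but not all and exactly half of the are their own internal negations; the third shows that the dual of all is some.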

The duality patterns of quantifiers such as most and many have been a matter of contention. Peterson (1979) proposed an analysis from which it follows that most and many are dual to each other. However, as pointed out by Horn (2006, p. 36), it seems unlikely that $\textit{most}(A, B)$ is in general equivalent to $\neg \textit{many}(A,\neg B)$. Consider, for example:

\begin{equation}\label{eq31} \textit{Most Italians like pizza.}\end{equation}
\begin{equation}\label{eq32} \textit{Not many Italians do not like pizza.}\end{equation}
\begin{equation}\label{eq33} \textit{Many Italians do not like pizza.}\end{equation}

If most and many were indeed dual, then (\ref{eq31}) and (\ref{eq32}) should be equivalent, while (\ref{eq31}) and (\ref{eq33}) should be contradictory. However, (\ref{eq31}) is true, but, since there are indeed many Italians that do not like pizza, (\ref{eq32}) is false and (\ref{eq33}) is true. This shows that (\ref{eq31}) and (\ref{eq32}) are not equivalent, and that (\ref{eq31}) and (\ref{eq33}) are not contradictory either.

Other linguistic expressions. Duality patterns also arise among natural language expressions that do not directly correspond to logical operators or quantifiers. For example, König (1991) has suggested that the causative conjunction because and the concessive conjunction although are duals, based on dialogue tests for duality such as (\ref{eq34}).

\begin{equation}\label{eq34} p~ \textbf{because}~ q? – No, \equiv p~ \textbf{although}~ \neg q.\end{equation}

However, based on other linguistic evidence and more general, methodological considerations, this proposal has been criticized by Iten (1998, 2005). Working in the framework of relevance theory, Iten argues that causative conjunctions make a significant contribution to the truth conditions of sentences in which they occur: p because q is true iff q is true, p is true, and q’s being true is the cause of p’s being true. By contrast, concessive conjunctions do not contribute to the truth conditions of sentences in which they occur: p although q is true iff q is true and p is true. Because of this discrepancy, Iten claims that sentences such as $\neg$(p because q) and p although $\neg$q do not have the same truth conditions, and consequently, because and although are not dual to each other.

The most widely studied example of linguistic duality, however, is that between the aspectual adverbs already and still (Löbner 1989, 1990, 1999; van der Auwera 1993; Mittwoch 1993; Michaelis 1996; Smessaert and ter Meulen 2004). The dialogue tests for duality in (\ref{eq35}-\ref{eq36}) suggest that already and still are indeed each other’s duals.

\begin{align}
\label{eq35}Is~ Bob~ \textbf{already}~ \textbf{outside}? – No, & \equiv he~ is~ \textbf{still}~ \textbf{inside}.\\
\label{eq36}Is~ Bob~ \textbf{still}~ \textbf{outside}? – No, & \equiv he~ is~ \textbf{already}~ \textbf{inside}.
\end{align}

Similarly, using the equivalence tests for internal negation in (\ref{eq37}-\ref{eq38}), we find that the internal negation of already is no longer and that of still is not yet. Finally, the equivalences in (\ref{eq39}-\ref{eq40}) show that the external negation of already is not yet and that of still is no longer.

\begin{align}
\label{eq37}Bob~ is~ \textbf{already}~ \textbf{outside}. & \equiv Bob~ is~ \textbf{no longer}~ \textbf{inside}.\\
\label{eq38}Bob~ is~ \textbf{still}~ \textbf{outside}. & \equiv Bob~ is~ \textbf{not yet}~ \textbf{inside}.\\
\label{eq39}It’s~ not~ the~ case~ that~ Bob~ is~ \textbf{already}~ \textbf{outside}. & \equiv Bob~ is~ \textbf{not yet}~ \textbf{outside}.\\
\label{eq40}It’s~ not~ the~ case~ that~ Bob~ is~ \textbf{still}~ \textbf{outside}. & \equiv Bob~ is~ \textbf{no longer}~ \textbf{outside}.
\end{align}

The two negative adverbs no longer and not yet are also dual to each other, as illustrated by the dialogues in (\ref{eq41}-\ref{eq42}). However, because of the multiple negative elements, these dialogues sound less natural than the ones in (\ref{eq35}-\ref{eq36}), even though all of them are equally logically correct (compare with the dialogues in (\ref{eq23}-\ref{eq24}) and (\ref{eq29}-\ref{eq30}) for the dualities between the standard quantifiers).

\begin{align}
\label{eq41}Is~ Bob~ \textbf{not yet}~ \textbf{outside}? – No, & \equiv he~ is~ \textbf{no longer}~ \textbf{inside}.\\
\label{eq42}Is~ Bob~ \textbf{no longer}~ \textbf{outside}? – No, & \equiv he~ is~ \textbf{not yet}~ \textbf{inside}.
\end{align}

Phase quantification. In order to account for the duality patterns of the aspectual adverbs described in (\ref{eq35}-\ref{eq42}), Löbner (1989; 1990; 2011) has developed the theory of phase quantification. He considers a (linear) temporal scale, a reference time $t$ on that scale, and a proposition $p$ (which is either true or false at any timepoint of the scale). The semantics of aspectual adverbs crucially concerns single polarity transitions on this temporal scale. There are two types of such transitions: the truth value of $p$ can change from false into true, or alternatively, from true into false. Furthermore, the reference time $t$ can either be situated in the positive ($p$) phase or in the negative ($\neg p$) phase of such a transition. In total, there are thus four cases to be distinguished:

- $t$ is in the positive phase of a polarity transition from falsity to truth

As illustrated in Figure 1(a), this corresponds to sentences such as Bob was already reading the paper at noon. The reference time (at noon) is situated in the positive phase (in which Bob was reading the paper), and thus occurs after the (actual) transition of starting to read (i.e. the transition from not reading to reading) has taken place.

Figure 1: Löbner’s Four Phase Diagrams

- $t$ is in the positive phase of a polarity transition from truth to falsity

As illustrated in Figure 1(b), this corresponds to sentences such as Bob was still reading the paper at noon. The reference time (at noon) is situated in the positive phase (in which Bob was reading the paper), and thus occurs before the (potential) transition of ceasing to read (i.e. the transition from reading to not reading) has taken place.

- $t$ is in the negative phase of a polarity transition from falsity to truth

As illustrated in Figure 1(c), this corresponds to sentences such as Bob was not yet reading the paper at noon. The reference time (at noon) is situated in the negative phase (in which Bob was not reading the paper), and thus occurs before the (potential) transition of starting to read (i.e. the transition from not reading to reading) has taken place.

- t is in the negative phase of a polarity transition from truth to falsity

As illustrated in Figure 1(d), this corresponds to sentences such as Bob was no longer reading the paper at noon. The reference time (at noon) is situated in the negative phase (in which Bob was not reading the paper), and thus occurs after the (actual) transition of stopping reading (i.e. the transition from reading to not reading) has taken place.

In the case of duality (already/still and not yet/no longer), the actual polarity of $p$ thus remains unchanged, but the direction of the polarity transition gets reversed. By contrast, in the case of external negation (not yet/already and still/no longer) the actual polarity of $p$ is switched, but the polarity transition remains unchanged. Finally, in the case of internal negation (not yet/still and already/no longer), both the actual polarity of $p$ and the direction of the polarity transition are reversed. This shows that in the phase quantification analysis, internal negation is viewed as the combination of duality and external negation. Löbner has also used this analysis to account for asymmetries in lexicalization patterns: already and still are less marked than not yet, which in turn is less marked than no longer (also see Section 5). Finally, it should also be emphasized that this analysis has been generalized to other lexical domains besides the aspectual adverbs, such as scalar predicates and (the procedural interpretation of) the first-order quantifiers.
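The classification just described can be sketched in a few lines of Python. This is only an illustration: the dictionary `ADVERBS`, the string labels and the function name `relation` are assumptions made for this sketch, not part of Löbner's own formalism. Each adverb is encoded by the phase that contains the reference time and by the direction of the polarity transition, and the duality relation between two adverbs is read off from which of these two features differ.

```python
# Each adverb is encoded as a pair:
# (phase that contains the reference time t, direction of the transition).
ADVERBS = {
    "already":   ("positive", "false-to-true"),
    "still":     ("positive", "true-to-false"),
    "not yet":   ("negative", "false-to-true"),
    "no longer": ("negative", "true-to-false"),
}

def relation(adverb1, adverb2):
    """Classify the duality relation between two aspectual adverbs."""
    phase1, direction1 = ADVERBS[adverb1]
    phase2, direction2 = ADVERBS[adverb2]
    phase_switched = phase1 != phase2
    direction_reversed = direction1 != direction2
    if phase_switched and direction_reversed:
        return "INEG"   # both the polarity and the direction change
    if phase_switched:
        return "ENEG"   # only the polarity of p at t changes
    if direction_reversed:
        return "DUAL"   # only the direction of the transition changes
    return "ID"

for pair in [("already", "still"), ("not yet", "already"), ("not yet", "still")]:
    print(pair, relation(*pair))
```

Under this encoding, internal negation is exactly the case in which both features change, which mirrors the observation that it combines duality and external negation.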

Language universals and universal languages. The overview presented in this section shows that duality phenomena are not only ubiquitous in formal logical languages, but also in natural languages. It has therefore been suggested that duality is a semantic universal, which can be of great heuristic value in comparative linguistic research (van Benthem 1991). Furthermore, duality also plays a central role in artificial languages, which can be viewed as occupying an intermediate position between formal and natural languages. For example, Lincos, which was developed by Freudenthal (1960) for the purpose of cosmic communication, contains duality principles for conjunction/disjunction (1.36.8), universal/existential quantification (1.36.9), necessity/possibility (3.25.1) and obligation/permission (3.32.3).

3. Theoretical Framework

General definition. We will now present a general theoretical framework in which duality phenomena can be described and analyzed. Consider Boolean algebras $$\mathbb{A} = \langle A, \wedge_\mathbb{A}, \vee_\mathbb{A}, \neg_\mathbb{A}, \top_\mathbb{A}, \bot_\mathbb{A}\rangle$$ and $$\mathbb{B} = \langle B, \wedge_\mathbb{B}, \vee_\mathbb{B}, \neg_\mathbb{B}, \top_\mathbb{B}, \bot_\mathbb{B}\rangle$$ (Givant and Halmos 2009), and consider $n$-ary operators $O_1, O_2\colon\mathbb{A}^n \to \mathbb{B}$. The duality relations are defined as follows: $O_1$ and $O_2$ are

identical – abbreviated as $\Tiny{ID}\small{(O_1, O_2)}$ – iff

$\forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = O_2(a_1,\dots,a_n)$,

each other’s external negation – abbreviated as $\Tiny{ENEG}\small{(O_1, O_2)}$ – iff

$\forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = \neg_\mathbb{B}O_2(a_1,\dots,a_n)$,

each other’s internal negation – abbreviated as $\Tiny{INEG}\small{(O_1, O_2)}$ – iff

$\forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = O_2(\neg_\mathbb{A}a_1,\dots,\neg_\mathbb{A}a_n)$,

each other’s dual – abbreviated as $\Tiny{DUAL}\small{(O_1, O_2)}$ – iff

$\forall a_1,\dots,a_n \!\in\! A\!: O_1(a_1,\dots,a_n) = \neg_\mathbb{B}O_2(\neg_\mathbb{A}a_1,\dots,\neg_\mathbb{A}a_n)$.
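The four definitions can be checked mechanically on the simplest concrete instance. The following Python sketch assumes that both $\mathbb{A}$ and $\mathbb{B}$ are the two-element Boolean algebra (Python's `bool`); the names `ident`, `eneg`, `ineg`, `dual` and `same_operator` are illustrative choices for this sketch rather than established notation.

```python
from itertools import product

def ident(op):
    """ID: the operator itself."""
    return lambda *args: op(*args)

def eneg(op):
    """ENEG: negate the output."""
    return lambda *args: not op(*args)

def ineg(op):
    """INEG: negate every argument."""
    return lambda *args: op(*(not a for a in args))

def dual(op):
    """DUAL: negate every argument and the output."""
    return lambda *args: not op(*(not a for a in args))

def same_operator(op1, op2, arity):
    """Extensional equality: op1 and op2 agree on all argument tuples."""
    return all(op1(*args) == op2(*args)
               for args in product([False, True], repeat=arity))

conj = lambda a, b: a and b
disj = lambda a, b: a or b

# Conjunction and disjunction are each other's dual.
print(same_operator(dual(conj), disj, 2))
print(same_operator(dual(disj), conj, 2))
```

Since the quantification over all argument tuples is finite here, the relation can be decided by exhaustive comparison; for infinite Boolean algebras such as Lindenbaum-Tarski algebras, this brute-force check is of course not available.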

Special cases. The definition provided above is fully abstract and general, but by plugging in concrete Boolean algebras for $\mathbb{A}$ and $\mathbb{B}$, we can recover the usual dualities as special cases. For example, in the language $\mathcal{L}_\mathsf{CPL}$ of classical propositional logic ($\mathsf{CPL}$), we can define equivalence classes $$[\varphi] := \{\psi \in\mathcal{L}_\mathsf{CPL} \mid \varphi \equiv \psi\}$$ and consider the Lindenbaum-Tarski algebra $$\mathbb{B}_\mathsf{CPL} := \{[\varphi] \mid \varphi \in \mathcal{L}_\mathsf{CPL}\}$$ It is well-known that $\mathbb{B}_\mathsf{CPL}$ is a Boolean algebra, and can thus be plugged in for $\mathbb{A}$ and/or $\mathbb{B}$ in the aforementioned definition. For example, if we consider conjunction and disjunction as binary operators $$\wedge,\vee\colon\mathbb{B}_\mathsf{CPL}\times\mathbb{B}_\mathsf{CPL} \to \mathbb{B}_\mathsf{CPL}$$ (defined by $[\varphi]\wedge[\psi]:=[\varphi\wedge\psi]$ and $[\varphi]\vee[\psi]:=[\varphi\vee\psi]$), this definition states that $\Tiny{DUAL}\small{(\wedge,\vee)}$ iff

for all $[\varphi], [\psi] \in \mathbb{B}_\mathsf{CPL}: [\varphi] \wedge [\psi] = \neg(\neg [\varphi] \vee \neg[\psi])$,

which is equivalent to the formulation (\ref{eq2}) that was given above

for all $\varphi, \psi \in \mathcal{L}_\mathsf{CPL}: \varphi \wedge \psi \equiv \neg(\neg \varphi \vee \neg\psi)$.

(Note that identity between elements in the Lindenbaum-Tarski algebra boils down to logical equivalence between the formulas themselves.) Similarly, the first-order quantifiers can be seen as unary operators $$\forall,\exists\colon\mathbb{B}_\mathsf{FOL}\to\mathbb{B}_\mathsf{FOL}$$ where $\mathbb{B}_\mathsf{FOL}$ is the Lindenbaum-Tarski algebra of first-order logic ($\mathsf{FOL}$), which is a cylindric algebra (Henkin et al. 1971), and thus a fortiori a Boolean algebra. Finally, by taking $\mathbb{A}$ and/or $\mathbb{B}$ to be other, more exotic Boolean algebras, the aforementioned definition also allows us to study duality relations in other, less well-known applications (Demey and Smessaert 2016).

Relations vs. functions. All the duality relations have a number of special properties. For any relation $R \in \{\Tiny{ID}\small{,}\Tiny{INEG}\small{,}\Tiny{ENEG}\small{,}\Tiny{DUAL}\small{\}}$, one can show that

$R$ is deterministic:
for all $O_1,O_2,O_3\colon\mathbb{A}^n\to\mathbb{B}$: if $R(O_1,O_2)$ and $R(O_1,O_3)$, then $O_2 = O_3$,

$R$ is serial:
for all $O_1\colon\mathbb{A}^n\to\mathbb{B}$, there exists an $O_2\colon\mathbb{A}^n\to\mathbb{B}$ such that $R(O_1,O_2)$,

$R$ is symmetric:
for all $O_1,O_2\colon\mathbb{A}^n\to\mathbb{B}: R(O_1,O_2)$ iff $R(O_2,O_1)$.

The first two properties jointly state that for each $O_1$, there is exactly one $O_2$ such that $R(O_1,O_2)$. This means that the relation $R$ is essentially a function, and switching from relational to functional notation, we can thus write $O_2 = R(O_1)$.

For example, since $\Tiny{DUAL}\small{(\wedge,\vee)}$, we can write $\vee = \Tiny{DUAL}\small{(\wedge)}$, and say that $\vee$ is the (unique) dual of $\wedge$. However, since $\wedge$ and $\vee$ are seen as binary operators on the Lindenbaum-Tarski algebra $\mathbb{B}_{\mathsf{CPL}}$, it should be kept in mind that this uniqueness claim ultimately boils down to a logical equivalence claim (see above). For example, consider the operator $$O\colon\mathbb{B}_{\mathsf{CPL}}\times\mathbb{B}_{\mathsf{CPL}}\to\mathbb{B}_{\mathsf{CPL}}$$ defined by $$O([\varphi],[\psi]) := \neg(\neg[\varphi] \wedge \neg[\psi])$$ It then holds that $\Tiny{DUAL}\small{(\wedge,\vee)}$ and $\Tiny{DUAL}\small{(\wedge,O)}$, which together entail that $\vee = O$. The latter is an identity of functions, and thus means that for all $[\varphi],[\psi]\in\mathbb{B}_{\mathsf{CPL}}$, we have $$[\varphi] \vee [\psi] = O([\varphi],[\psi]) = \neg(\neg[\varphi] \wedge \neg[\psi])$$ in other words: for all $$\varphi,\psi\in\mathcal{L}_\mathsf{CPL}$$ it holds that $$\varphi \vee \psi \equiv \neg(\neg\varphi \wedge \neg\psi)$$

Since each $R \in \{\Tiny{ID}\small{,}\Tiny{INEG}\small{,}\Tiny{ENEG}\small{,}\Tiny{DUAL}\small{\}}$ can be viewed as a function, the symmetry of the relation $R$ can equivalently be expressed as follows: $O_2 = R(O_1)$ iff $O_1 = R(O_2)$, which is itself equivalent to the property that $R(R(O)) = O$ for all operators $O\colon\mathbb{A}^n\to\mathbb{B}$. This means that the function $R$ is an involution.

Obviously, the definitions of the duality relations/functions can harmlessly be transposed from operators $O\colon\mathbb{A}^n\to\mathbb{B}$ to the outputs of those operators. For example, if the operator $O_2\colon\mathbb{A}^n\to\mathbb{B}$ is the dual of the operator $O_1\colon\mathbb{A}^n\to\mathbb{B}$, then for all $a_1,\dots,a_n\in\mathbb{A}$, the element $O_2(a_1,\dots,a_n) \in\mathbb{B}$ can be said to be the dual of the element $O_1(a_1,\dots,a_n) \in\mathbb{B}$. In this way, we can say not only that $\vee$ is the dual of $\wedge$, but also that $[\varphi]\vee[\psi]$ is the dual of $[\varphi]\wedge[\psi]$, for all $[\varphi],[\psi]\in\mathbb{B}_{\mathsf{CPL}}$ – or more informally, that $\varphi\vee\psi$ is ‘the’ dual (up to logical equivalence) of $\varphi\wedge\psi$, for all $\varphi,\psi\in\mathcal{L}_\mathsf{CPL}$.

Duality squares. For every operator $O\colon\mathbb{A}^n\to\mathbb{B}$, one can define the set of four operators $$\delta(O) := \{\Tiny{ID}\small{(O)}, \Tiny{ENEG}\small{(O)},\Tiny{INEG}\small{(O)},\Tiny{DUAL}\small{(O)}\}$$ It is natural to view the set $\delta(O)$ as ‘generated’ by the operator $O$; however, it should be emphasized that $\delta(O)$ can be seen as generated by any of its elements. For example, if we consider $\Tiny{DUAL}\small{(O)}$, we find that $$\delta(\Tiny{DUAL}\small{(O))} =$$ $$\{\Tiny{ID}\small{(}\Tiny{DUAL}\small{(O))}, \Tiny{ENEG}\small{(}\Tiny{DUAL}\small{(O))},$$ $$\Tiny{INEG}\small{(}\Tiny{DUAL}\small{(O))},\Tiny{DUAL}\small{(}\Tiny{DUAL}\small{(O))}\} =$$ $$\{\Tiny{DUAL}\small{(O)},\Tiny{INEG}\small{(O)},\Tiny{ENEG}\small{(O)},\Tiny{ID}\small{(O)}\} =$$ $\delta(O)$. In general, for any $O' \in \delta(O)$, it holds that $\delta(O') = \delta(O)$ (Peters and Westerståhl 2006, p. 134; Westerståhl 2012, p. 205).
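The claim that $\delta(O)$ is generated by any of its elements can be illustrated with a small Python sketch. It assumes the two-element Boolean algebra and represents each binary operator by its truth table, so that operators can be collected in sets; the helper names `table`, `from_table` and `delta` are hypothetical. Material implication serves as the example.

```python
from itertools import product

ARITY = 2
INPUTS = list(product([False, True], repeat=ARITY))

def table(op):
    """Truth table of a binary operator, used as a hashable stand-in for it."""
    return tuple(op(*args) for args in INPUTS)

def from_table(tbl):
    """Turn a truth table back into a callable operator."""
    lookup = dict(zip(INPUTS, tbl))
    return lambda *args: lookup[args]

def eneg(tbl):
    return tuple(not value for value in tbl)

def ineg(tbl):
    op = from_table(tbl)
    return tuple(op(*(not a for a in args)) for args in INPUTS)

def dual(tbl):
    return eneg(ineg(tbl))

def delta(tbl):
    """The set {ID(O), ENEG(O), INEG(O), DUAL(O)} as a set of truth tables."""
    return frozenset({tbl, eneg(tbl), ineg(tbl), dual(tbl)})

implies = table(lambda a, b: (not a) or b)
square = delta(implies)

print(len(square))                                        # number of distinct operators
print(all(delta(member) == square for member in square))  # same set from every member
```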

The argument above is based on the fact that $\delta(O)$ is ‘closed under duality’, in the sense that applying any of the $\Tiny{ID}$-, $\Tiny{ENEG}$-, $\Tiny{INEG}$- or $\Tiny{DUAL}$-functions to its elements only yields operators that already belong to $\delta(O)$. This observation is the starting point for the group-theoretical perspective on duality that will be developed in Section 4. The operators in $\delta(O)$ thus constitute natural families (van Benthem 1991, p. 31; Peters and Westerståhl 2006, p. 26), which are often visualized by means of square diagrams. The diagram’s vertices represent the four operators (or formulas), and its edges and diagonals represent the various relations between those operators. Figure 2(a) shows the graphical convention that will be used in this article to visualize these relations.

Visually speaking, duality squares can be presented in a number of different ways, depending on which aspects the author wishes to emphasize. The most widely used presentation can be found in Figure 2(b), in which the $\Tiny{ENEG}$-, $\Tiny{INEG}$- and $\Tiny{DUAL}$-relations occupy the square’s diagonals, horizontal and vertical edges, respectively. This presentation thus emphasizes the analogy between the duality square and the well-known Aristotelian square, in which the contradiction, (sub)contrariety and subalternation relations also occupy the diagonals, horizontal and vertical edges, respectively (van Benthem 1991, p. 31; Jaspers 2005, p. 148; Peters and Westerståhl 2006, p. 25; Westerståhl 2012, p. 202); also see Section 5. Figure 2(c) shows an alternative layout, in which the $\Tiny{DUAL}$-relations occupy the diagonals, thereby graphically reflecting the fact that $\Tiny{DUAL}$ is the combination of $\Tiny{ENEG}$ (which constitutes the vertical edges) and $\Tiny{INEG}$ (which constitutes the horizontal edges) (Löbner 1990, p. 69ff.; König 1991, p. 201); also see Section 4. Thirdly, Löbner (1999, p. 57; 2011, p. 488) has argued, on the basis of his phase quantification approach to duality (see Section 2), that $\Tiny{INEG}$ should be seen as the combination of $\Tiny{ENEG}$ and $\Tiny{DUAL}$, and thus uses squares as in Figure 2(d), in which the former occupies the diagonals. Finally, it should be emphasized that the $\Tiny{ID}$-relations are not visualized explicitly in any of these three ways of presenting duality squares, since they would simply constitute loops on all vertices of the squares.

Figures 3 and 4 show duality squares for some concrete dualities from logic and language (all these squares follow the presentation of Figure 2(b), and thus have $\Tiny{ENEG}$-diagonals). The first three squares in Figure 3 correspond to the first three examples of duality in logic that were discussed in Section 1: (a) the propositional connectives of conjunction and disjunction, (b) the universal and existential quantifiers, and (c) the modal operators of necessity and possibility. Furthermore, it should be emphasized that the general perspective on duality in terms of external and internal negation also allows us to draw less standardized duality squares; for example, Figure 3(d) shows the less widely known duality square that is generated by the propositional connective of material implication ($\to$). Finally, the squares in Figure 4 correspond to two examples of duality in natural language that were discussed in Section 2, namely (a) the quantification adverbs everywhere/somewhere, and (b) the aspectual adverbs already/still.

Figure 2: (a) Graphical representations of the duality relations; presentations of duality squares with (b) ENEG-diagonals, (c) DUAL-diagonals and (d) INEG-diagonals.

Figure 3: Duality squares from logic: (a) conjunction-disjunction, (b) universal-existential, (c) necessity-possibility, (d) implication.

Figure 4: Duality squares from linguistics: (a) everywhere-somewhere, (b) already-still.

Degenerate duality patterns. For some operators $O\colon\mathbb{A}^n\to\mathbb{B}$, it might happen that $\Tiny{DUAL}\small{(O)} = O = \Tiny{ID}\small{(O)}$, i.e. $O$ is self-dual. In this case, one can also show that $\Tiny{INEG}\small{(O)} = \Tiny{ENEG}\small{(O)}$, i.e. $O$’s internal and external negation coincide with each other. For example, as was already shown in Section 2, proper names are self-dual in generalized quantifier theory. For another example, consider the identity operator $I_\mathbb{A}\colon\mathbb{A}\to\mathbb{A}$ (for any Boolean algebra $\mathbb{A}$), which is defined by $I_\mathbb{A}(a) := a$. For any element $a \in A$, it holds that $$\Tiny{DUAL}\small{(I_\mathbb{A})(a)} = \neg_\mathbb{A} I_\mathbb{A}(\neg_\mathbb{A} a) = \neg_\mathbb{A}\neg_\mathbb{A} a = a = I_\mathbb{A}(a)$$ and thus $\Tiny{DUAL}\small{(I_\mathbb{A})} = I_\mathbb{A}$, i.e. $I_\mathbb{A}$ is self-dual. Similarly, for any element $a\in A$ it holds that $$\Tiny{INEG}\small{(I_\mathbb{A})(a)} = I_\mathbb{A}(\neg_\mathbb{A} a) = \neg_\mathbb{A} a = \neg_\mathbb{A} I_\mathbb{A}(a) = \Tiny{ENEG}\small{(I_\mathbb{A})(a)}$$ and thus $\Tiny{INEG}\small{(I_\mathbb{A})} = \Tiny{ENEG}\small{(I_\mathbb{A})}$.

Completely analogously, for some operators $O\colon\mathbb{A}^n\to \mathbb{B}$, it can happen that $\Tiny{INEG}\small{(O)} = O = \Tiny{ID}\small{(O)}$, i.e. $O$ is its own internal negation. In this case, one can also show that $\Tiny{DUAL}\small{(O)} = \Tiny{ENEG}\small{(O)}$, i.e. $O$’s external negation and dual coincide with each other. Consider, for example, the contingency operator $C\colon\mathbb{B}_\mathsf{S5}\to\mathbb{B}_\mathsf{S5}$, which is defined by $$C([\varphi]) := \Diamond[\varphi]\wedge\Diamond\neg[\varphi] = [\Diamond\varphi\wedge\Diamond\neg\varphi]$$ (recall that $\mathbb{B}_\mathsf{S5}$ is the Lindenbaum-Tarski algebra of the modal logic $\mathsf{S5}$, which is a modal algebra (Blackburn et al. 2001), and thus a fortiori a Boolean algebra). For any $[\varphi]\in\mathbb{B}_\mathsf{S5}$, it holds that $$\Tiny{INEG}\small{(C)([\varphi])} = C(\neg[\varphi]) = \Diamond\neg[\varphi]\wedge\Diamond\neg\neg[\varphi] = \Diamond[\varphi] \wedge\Diamond\neg[\varphi]= C([\varphi])$$ and thus $\Tiny{INEG}\small{(C)} = C$. Similarly, it holds that $$\Tiny{DUAL}\small{(C)([\varphi])} = \neg C(\neg[\varphi]) = \neg(\Diamond\neg[\varphi]\wedge\Diamond\neg\neg[\varphi]) = \neg(\Diamond[\varphi]\wedge\Diamond\neg[\varphi]) = \neg C([\varphi]) =$$ $$\Tiny{ENEG}\small{(C)([\varphi])}$$ and thus $\Tiny{DUAL}\small{(C)} = \Tiny{ENEG}\small{(C)}$.

We have now discussed the possibility of an operator coinciding with its dual, or with its internal negation. This naturally leads to the question whether there are also operators that coincide with their external negation. It is easy to see, however, that there exist no non-trivial operators with this property. After all, if $O\colon\mathbb{A}^n\to\mathbb{B}$ is its own external negation, then for all $n$-tuples $\overline{a} \in A^n$, it holds that $$O(\overline{a}) = \neg_\mathbb{B} O(\overline{a})$$ and hence, $$\top_\mathbb{B} = O(\overline{a}) \vee_\mathbb{B}\neg_\mathbb{B}O(\overline{a})=O(\overline{a}) \vee_\mathbb{B}O(\overline{a})=O(\overline{a})$$ and also $$\bot_\mathbb{B} = O(\overline{a}) \wedge_\mathbb{B}\neg_\mathbb{B}O(\overline{a})=O(\overline{a}) \wedge_\mathbb{B}O(\overline{a})=O(\overline{a})$$ which means that $\mathbb{B}$ is the trivial Boolean algebra in which $\bot_\mathbb{B}= \top_\mathbb{B}$ (in logical terms: $\mathbb{B}$ is the Lindenbaum-Tarski algebra of a logical system that is inconsistent).

Whenever an operator $O$ is its own dual or internal negation, the set $\delta(O)$ does not contain four, but only two distinct operators (Peters and Westerståhl 2006, p. 134; Westerståhl 2012, p. 205), and thus cannot be visualized using an ordinary duality square. Recall the standard presentation of the duality square (with horizontal $\Tiny{INEG}$- and vertical $\Tiny{DUAL}$-edges) in Figure 2(b), which is repeated here as Figure 5(a). If $O = \Tiny{DUAL}\small{(O)}$, then $\delta(O) = \{\Tiny{ID}\small{(O)},\Tiny{INEG}\small{(O)}\}$, and thus, the duality square in Figure 5(a) degenerates into the binary horizontal duality diagram in Figure 5(b). Analogously, if $O = \Tiny{INEG}\small{(O)}$, then $\delta(O) = \{\Tiny{ID}\small{(O)},\Tiny{DUAL}\small{(O)}\}$, and thus, the duality square in Figure 5(a) degenerates into the binary vertical duality diagram in Figure 5(c).
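The degenerate patterns can be reproduced in a small Python sketch over the two-element Boolean algebra, with operators again represented by truth tables. Two assumptions should be kept in mind: the helper names are hypothetical, and exclusive disjunction is used as a stand-in for an operator that is its own internal negation, since the contingency operator discussed above lives in a modal algebra rather than in the two-element algebra.

```python
from itertools import product

def eneg(tbl):
    return tuple(not value for value in tbl)

def ineg(tbl):
    # Rows are listed in lexicographic order of the arguments, so negating
    # all arguments amounts to reading the truth table in reverse order.
    return tuple(reversed(tbl))

def dual(tbl):
    return eneg(ineg(tbl))

def delta(tbl):
    return {tbl, eneg(tbl), ineg(tbl), dual(tbl)}

identity = (False, True)             # unary identity: rows a = F, T
xor = (False, True, True, False)     # rows (F,F), (F,T), (T,F), (T,T)

print(dual(identity) == identity, ineg(identity) == eneg(identity))
print(ineg(xor) == xor, dual(xor) == eneg(xor))
print(len(delta(identity)), len(delta(xor)))

# No binary operator on the two-element algebra is its own external negation.
all_binary = list(product([False, True], repeat=4))
print(any(eneg(tbl) == tbl for tbl in all_binary))
```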

Figure 5: (a) Ordinary duality square, (b) degenerate duality pattern for an operator that is its own dual, (c) degenerate duality pattern for an operator that is its own internal negation.

Beyond external and internal negation. In the introduction, it was emphasized that this article mainly focuses on duality phenomena that arise in logical and natural languages. As was illustrated in Sections 1 and 2, these dualities can informally be characterized in terms of internal and external negation. In this section, this informal characterization was made mathematically precise, by appealing to operators $O\colon\mathbb{A}^n\to\mathbb{B}$ and viewing the internal and external negation as the negations $\neg_\mathbb{A}$ and $\neg_\mathbb{B}$ of the source and target Boolean algebras $\mathbb{A}$ and $\mathbb{B}$, respectively. However, it should be emphasized that in the broader mathematical perspective on duality (Gowers 2008; Kabakov et al. 2014), internal/external negation plays a less central role. For example, in category-theoretic terms, conjunction and disjunction are characterized as follows (Mac Lane 1998; Davey and Priestley 2002):

$\varphi\wedge\psi$ is the unique formula $\pi$ such that:
– $\pi$ entails $\varphi$
– $\pi$ entails $\psi$
– for all $\alpha$: if $\alpha$ entails $\varphi$ and $\psi$, then $\alpha$ entails $\pi$

$\varphi\vee\psi$ is the unique formula $\pi$ such that:
– $\varphi$ entails $\pi$
– $\psi$ entails $\pi$
– for all $\alpha$: if $\varphi$ and $\psi$ entail $\alpha$, then $\pi$ entails $\alpha$

From this perspective, the duality of conjunction and disjunction is thus not characterized in terms of internal and external negation, but rather in terms of systematically ‘reversing’ the direction of entailment (a similar connection between duality and ‘reversing’ the direction of polarity transitions shows up in Löbner’s phase quantification theory, as discussed in Section 2). This difference should not be exaggerated, however, as can already be seen from the law of contraposition, in which the ideas of negation and reversal are brought together: $\varphi\to\psi \equiv \neg\psi\to\neg\varphi$.

4. A Group-Theoretical Approach to Duality

The Klein four group. When $\Tiny{ID}$, $\Tiny{ENEG}$, $\Tiny{INEG}$ and $\Tiny{DUAL}$ are viewed as functions, they map each operator $O\colon\mathbb{A}^n\to\mathbb{B}$ onto the operators $$\Tiny{ID}\small{(O)},\Tiny{ENEG}\small{(O),}$$ $$\Tiny{INEG}\small{(O)},\Tiny{DUAL}\small{(O)}\colon\mathbb{A}^n\to\mathbb{B}$$ Since the input and output of the functions $\Tiny{ID}$, $\Tiny{ENEG}$, $\Tiny{INEG}$ and $\Tiny{DUAL}$ are of the same type (namely: operators $\mathbb{A}^n\to\mathbb{B}$), they can be applied repeatedly. For example, starting with an operator $O\colon\mathbb{A}^n\to\mathbb{B}$, we can apply $\Tiny{INEG}$ to it to obtain the operator $\Tiny{INEG}\small{(O)}\colon\mathbb{A}^n\to\mathbb{B}$; by applying $\Tiny{ENEG}$ to the latter we obtain the operator $\Tiny{ENEG}\small{(}\Tiny{INEG}\small{(O))}\colon\mathbb{A}^n\to\mathbb{B}$. It follows immediately from the definitions of the duality relations/functions that $\Tiny{ENEG}\small{(}\Tiny{INEG}\small{(O))} = \Tiny{DUAL}\small{(O)}$. Since this holds independently of the concrete operator $O$, we can write $\Tiny{ENEG} \small{\circ} \Tiny{INEG} \small{=} \Tiny{DUAL}$, which means that applying $\Tiny{INEG}$ and then $\Tiny{ENEG}$ (to some operator) yields the same result as applying $\Tiny{DUAL}$ (to that same operator). In a similar vein, since for all operators $O\colon\mathbb{A}^n\to\mathbb{B}$ it holds that $\Tiny{INEG}\small{(}\Tiny{INEG}\small{(O))} = O = \Tiny{ID}\small{(O)}$, we can write $\Tiny{INEG}$ $\circ$ $\Tiny{INEG}$ $=$ $\Tiny{ID}$. In this way, we obtain a large number of functional identities that describe the behavior of the duality and internal/external negation functions:

\begin{alignat}{4}
\small{ID} & \circ \small{ID} & = & \; \ \small{ID} & = & \ \small{DUAL} & \circ \small{DUAL} \notag\\
\small{ENEG} & \circ \small{ENEG} & = & \; \ \small{ID} & = & \ \small{INEG} & \circ \small{INEG} \notag\\
\small{INEG} & \circ \small{ENEG} & = & \ \small{DUAL} & = & \ \small{ENEG} & \circ \small{INEG} \notag\\
\small{INEG} & \circ \small{DUAL} & = & \ \small{ENEG} & = & \ \small{DUAL} & \circ \small{INEG} \notag\\
\small{DUAL} & \circ \small{ENEG} & = & \ \small{INEG} & = & \ \small{ENEG} & \circ \small{DUAL} \notag
\end{alignat}

These identities can be summarized by stating that the functions $\Tiny{ID}$, $\Tiny{ENEG}$, $\Tiny{INEG}$ and $\Tiny{DUAL}$ jointly form a group that is isomorphic to the Klein four group $V_4$ (German: Kleinsche Vierergruppe). Its Cayley table looks as follows:

$\begin{array}{ c|c c c c }
\circ & \small{ID} & \small{ENEG} & \small{INEG} & \small{DUAL} \\
\hline
\small{ID} & \small{ID} & \small{ENEG} & \small{INEG} & \small{DUAL} \\
\small{ENEG} & \small{ENEG} & \small{ID} & \small{DUAL} & \small{INEG} \\
\small{INEG} & \small{INEG} & \small{DUAL} & \small{ID} & \small{ENEG} \\
\small{DUAL} & \small{DUAL} & \small{INEG} & \small{ENEG} & \small{ID}
\end{array}$
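The Cayley table can also be recomputed rather than read off. The following Python sketch assumes the two-element Boolean algebra and lets the four functions act on the truth tables of all 16 binary operators; the dictionary `FUNCS` and the helper `compose_name` are hypothetical names introduced for this illustration.

```python
from itertools import product

TABLES = list(product([False, True], repeat=4))  # all 16 binary truth tables

# Each function maps a truth table to a truth table. Reversing a table
# corresponds to negating both arguments (rows are in lexicographic order).
FUNCS = {
    "ID":   lambda t: t,
    "ENEG": lambda t: tuple(not v for v in t),
    "INEG": lambda t: tuple(reversed(t)),
    "DUAL": lambda t: tuple(not v for v in reversed(t)),
}

def compose_name(f, g):
    """Name of the function that equals 'apply g, then f' on every table."""
    for name, h in FUNCS.items():
        if all(FUNCS[f](FUNCS[g](t)) == h(t) for t in TABLES):
            return name
    raise ValueError("composition is not one of the four functions")

cayley = {(f, g): compose_name(f, g) for f in FUNCS for g in FUNCS}

for f in FUNCS:
    print(f.ljust(5), [cayley[f, g] for g in FUNCS])
```

That `compose_name` never raises an error reflects closure under composition, and the symmetry of the resulting table reflects that the group is abelian.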

The fact that duality behavior can be described by means of $V_4$ was already noted by authors such as Piaget (1949), Gottschalk (1953), Löbner (1990), van Benthem (1991) and Peters and Westerståhl (2006). However, many of them used slightly differing labels for the group elements; here is an overview table:

	Piaget	Gottschalk	Löbner	Peters & Westerståhl
$\small{ID}$	identité ($\small{I}$)	identity ($\small{E}$)	identity
$\small{ENEG}$	inversion ($\small{N}$)	negational ($\small{N}$)	negation	outer negation
$\small{INEG}$	réciprocation ($\small{R}$)	contradual ($\small{C}$)	subnegation	inner negation
$\small{DUAL}$	corrélation ($\small{C}$)	dual ($\small{E}$)	dual	dual

This group-theoretical perspective also allows us to describe the degenerate cases of operators that are their own duals or their own internal negations. Note that these cases are characterized by the identities $\Tiny{DUAL}$ $=$ $\Tiny{ID}$ and $\Tiny{INEG}$ $=$ $\Tiny{ID}$, respectively. Note that if $\Tiny{DUAL}$ $=$ $\Tiny{ID}$, then also $\Tiny{ENEG}$ $=$ $\Tiny{INEG}$, and thus $V_4$ collapses into a group that is isomorphic to $\mathbb{Z}_2$; see the left and middle Cayley tables below and also recall Figure 5(b). Similarly, if $\Tiny{INEG}$ $=$ $\Tiny{ID}$, then also $\Tiny{ENEG}$ $=$ $\Tiny{DUAL}$, and thus $V_4$ again collapses into a group that is isomorphic to $\mathbb{Z}_2$; see the right and middle Cayley tables below and also recall Figure 5(c).

$\begin{array}{c|c c}
\circ & \small{ID} & \small{INEG} \\ \hline
\small{ID} & \small{ID} & \small{INEG} \\
\small{INEG} & \small{INEG} & \small{ID}
\end{array}$

$\begin{array}{c|c c}
\circ & \small{0} & \small{1} \\ \hline
\small{0} & \small{0} & \small{1} \\
\small{1} & \small{1} & \small{0}
\end{array}$

$\begin{array}{c|c c}
\circ & \small{ID} & \small{DUAL} \\ \hline
\small{ID} & \small{ID} & \small{DUAL} \\
\small{DUAL} & \small{DUAL} & \small{ID}
\end{array}$

Finally, it should be noted that the Klein four group $V_4$ is isomorphic to the direct product of $\mathbb{Z}_2$ with itself, i.e. $V_4$ $\cong$ $\mathbb{Z}_2$ $\times$ $\mathbb{Z}_2$ = $\mathbb{Z}_2^2$. Although this fact is well-known in group theory, its logico-linguistic significance has only recently begun to be explored. The Cayley table for $\mathbb{Z}_2$ × $\mathbb{Z}_2$ looks as follows:

$\begin{array}{ c|c c c c }
\circ & (0, 0) & (1, 0) & (0, 1) & (1, 1) \\
\hline
(0, 0) & (0, 0) & (1, 0) & (0, 1) & (1, 1) \\
(1, 0) & (1, 0) & (0, 0) & (1, 1) & (0, 1) \\
(0, 1) & (0, 1) & (1, 1) & (0, 0) & (1, 0) \\
(1, 1) & (1, 1) & (0, 1) & (1, 0) & (0, 0)
\end{array}$

Comparing the Cayley tables for $\mathbb{Z}_2$ × $\mathbb{Z}_2$ and the Klein four group $V_4$, we see that the concrete isomorphism looks as follows:
\begin{equation}\label{eq45}
\small{ID} \leftrightarrow (0, 0),\;\> \small{ENEG} \leftrightarrow (1, 0),\;\> \small{INEG} \leftrightarrow (0, 1),\;\> \small{DUAL} \leftrightarrow (1, 1).
\end{equation}
This group-theoretical isomorphism turns out to be very informative: $0$ and $1$ represent the number of times negation is being applied in a given Boolean algebra, and the left and right coordinates stand for the target and source Boolean algebra (i.e. external and internal negation), respectively. For example, $\Tiny{ENEG}$ corresponds to $(1, 0)$, which represents $1$ external negation and $0$ internal negations. Similarly, $\Tiny{INEG}$ corresponds to $(0, 1)$, which represents $0$ external negations and $1$ internal negation (keeping in mind that internal negation applies to all arguments). Using the conventions that $\neg \ _\mathbb{A}^0 a := a$ and $\neg \ _\mathbb{A}^1 a := \neg \ _\mathbb{A}a$ for all $a \in \mathbb{A}$, we thus find for any operator $\small{O}: \mathbb{A}^n \rightarrow \mathbb{B}$ and $i, \ k \in \{0, 1\}$:
\begin{equation}\label{eq46}
(i, \ k)(\small{O})(a_1, \dots, a_n) = \neg \ ^i_\mathbb{B}\small{O}(\neg \ ^k_\mathbb{A} a_1, \dots, \neg \ ^k_\mathbb{A} a_n).
\end{equation}
Representing $V_4$ as $\mathbb{Z}_2 \times \mathbb{Z}_2$ thus gives us a firm syntactic handle on duality: it shows how duality behavior arises out of the interplay of the independent behaviors ($0$ or $1$) of an external and an internal negation (resp. left and right coordinate).
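The coordinate representation in (\ref{eq46}) translates directly into code. The sketch below assumes the two-element Boolean algebra for both $\mathbb{A}$ and $\mathbb{B}$; the names `neg_pow`, `act` and `same` are hypothetical. It checks that $(1, 1)$ sends conjunction to disjunction, and that composing two actions corresponds to adding their coordinates modulo 2.

```python
from itertools import product

def neg_pow(power, value):
    """Apply Boolean negation `power` times (power is 0 or 1)."""
    return (not value) if power else value

def act(i, k, op):
    """The operator (i, k)(O): i external and k internal negations."""
    return lambda *args: neg_pow(i, op(*(neg_pow(k, a) for a in args)))

def same(op1, op2, arity=2):
    """Extensional equality of two operators of the given arity."""
    return all(op1(*args) == op2(*args)
               for args in product([False, True], repeat=arity))

conj = lambda a, b: a and b
disj = lambda a, b: a or b

# Composition of actions corresponds to coordinatewise addition modulo 2.
pairs = list(product([0, 1], repeat=2))
additive = all(
    same(act(i1, k1, act(i2, k2, conj)),
         act((i1 + i2) % 2, (k1 + k2) % 2, conj))
    for (i1, k1), (i2, k2) in product(pairs, repeat=2)
)

print(same(act(1, 1, conj), disj), additive)
```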

Composed operators. The group-theoretical account of duality can be extended in a number of different ways. For example, Demey (2012a) has used it to study the duality behavior of composed operators. Given operators $\small{O}_1: \mathbb{A}^n \rightarrow \mathbb{B}$ and $\small{O}_2: \mathbb{B} \rightarrow \mathbb{C}$, we will write $\small{O}_2 \circ \small{O}_1: \mathbb{A}^n \rightarrow \mathbb{C}$ for the composed operator that first applies $\small{O}_1$ to the arguments, and then $\small{O}_2$. For simplicity, we will assume that $\small{O}_2$ is unary, but this assumption is not essential. In this article, we will focus on the basic example $\forall \circ \square$ from modal syllogistics (Buridan 2001; Read 2012). A more linguistically motivated example, viz. possessives with multiple quantifiers, such as three athletes of each country, is discussed in Westerståhl (2012).

Each of $\small{O}_1$ and $\small{O}_2$ has its own internal and external negation, but it is easy to see that in the composed operator $\small{O}_2$ $\circ$ $\small{O}_1$, the external negation of $\small{O}_1$ coincides with the internal negation of $\small{O}_2$. As a consequence, the composed operator $\small{O}_2$ $\circ$ $\small{O}_1$ has three negations, namely external, intermediate, and internal (formally: $\neg _\mathbb{C}$, $\neg _\mathbb{B}$, and $\neg _\mathbb{A}$, respectively). Since each of these 3 negations may or may not be applied, $\small{O}_2$ $\circ$ $\small{O}_1$ gives rise to $2^3 = 8$ operators. As an example, consider the case of $\forall \circ \square$ in (\ref{eq47}):
\begin{equation}\label{eq47}
\begin{array}{c|c|c|c|c}
\phantom{\neg} \small{O}_2 \phantom{\neg} \small{O}_1 \phantom{\neg} &\phantom{\neg} \forall x \phantom{\neg} \square \phantom{\neg} \small{P}(x) & & \neg \small{O}_2 \neg \small{O}_1 \neg & \neg \forall x \neg \square \neg \small{P}(x) \\
\phantom{\neg} \small{O}_2 \phantom{\neg} \small{O}_1 \neg &\phantom{\neg} \forall x \phantom{\neg} \square \neg \small{P}(x) & & \neg \small{O}_2 \neg \small{O}_1 \phantom{\neg} & \neg \forall x \neg \square \phantom{\neg} \small{P}(x) \\
\phantom{\neg} \small{O}_2 \neg \small{O}_1 \phantom{\neg} &\phantom{\neg} \forall x \neg \square \phantom{\neg} \small{P}(x) & & \neg \small{O}_2 \phantom{\neg} \small{O}_1 \neg & \neg \forall x \phantom{\neg} \square \neg \small{P}(x) \\
\neg \small{O}_2 \phantom{\neg} \small{O}_1 \phantom{\neg} &\neg \forall x \phantom{\neg} \square \phantom{\neg} \small{P}(x) & & \phantom{\neg} \small{O}_2 \neg \small{O}_1 \neg & \phantom{\neg} \forall x \neg \square \neg \small{P}(x) \\
\end{array}
\end{equation}
In comparison to single operators, we see that composed operators have one additional negation, and hence, it should not be surprising that their duality behavior is not governed by $\mathbb{Z}_2 \times \mathbb{Z}_2$, but rather by $\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$. Next to $\Tiny{INEG}$ and $\Tiny{ENEG}$, there is also the intermediate negation function $\Tiny{MNEG}$, and the isomorphism given in (\ref{eq45}) is generalized to the one defined by (\ref{eq48}):
\begin{equation}\label{eq48}
\small{ID} \leftrightarrow (0, 0, 0),\;\> \small{ENEG} \leftrightarrow (1, 0, 0),\;\> \small{MNEG} \leftrightarrow (0, 1, 0),\;\> \small{INEG} \leftrightarrow (0, 0, 1).
\end{equation}
In analogy to (\ref{eq46}), it is now again possible to succinctly describe the effects of these operations:
\begin{equation}\label{eq49}
(i,j,k)(\small{O}_2 \circ \small{O}_1)(a_1, \dots, a_n) = \neg \ ^i_\mathbb{C}\small{O}_2\neg \ ^j_\mathbb{B}\small{O}_1(\neg \ ^k_\mathbb{A} a_1, \dots, \neg \ ^k_\mathbb{A} a_n).
\end{equation}
We also see that composed operators give rise to a much richer duality behavior than single operators. Recall that in the case of single operators, duality can be seen as the combination of the external and internal negations ($\Tiny{DUAL}$ $=$ $\Tiny{ENEG} \circ \Tiny{INEG}$). In the case of composed operators, however, we have three negations, and thus three pairwise combinations: $\Tiny{ENEG} \circ \Tiny{INEG}$, $\Tiny{ENEG} \circ \Tiny{MNEG}$, and $\Tiny{MNEG} \circ \Tiny{INEG}$. Although the first of these seems to be closest to what is classically called ‘duality’, the other two can plausibly be seen as (non-standard) duality operations too. Finally, there is also the operation $\Tiny{ENEG} \circ \Tiny{MNEG} \circ \Tiny{INEG}$, which operates on all negations simultaneously.
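To see that the eight operators generated by $\forall \circ \square$ really are distinct, one can evaluate them on a small model. The following Python sketch rests on several simplifying assumptions that are not taken from the literature: there are two worlds and two individuals, $\square$ quantifies over all worlds (as in $\mathsf{S5}$ with a constant domain), the algebra $\mathbb{A}$ is the power set of world-individual pairs, $\mathbb{B}$ is the power set of the domain, and $\mathbb{C}$ is the two-element algebra. All names in the code are hypothetical.

```python
from itertools import chain, combinations, product

WORLDS = (0, 1)
DOMAIN = ("a", "b")
PAIRS = frozenset(product(WORLDS, DOMAIN))   # top element of the algebra A

def powerset(items):
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def box(ext):
    """O_1: the individuals that satisfy P in every world."""
    return frozenset(d for d in DOMAIN if all((w, d) in ext for w in WORLDS))

def forall(individuals):
    """O_2: true iff every individual of the domain is included."""
    return individuals == frozenset(DOMAIN)

def act(i, j, k, ext):
    """(i, j, k)(forall o box) applied to the extension `ext` of P."""
    inner = PAIRS - ext if k else ext                      # negation in A
    middle = box(inner)
    middle = frozenset(DOMAIN) - middle if j else middle   # negation in B
    outer = forall(middle)
    return (not outer) if i else outer                     # negation in C

# One truth-value signature per triple, over all 16 extensions of P.
signatures = {
    ijk: tuple(act(*ijk, ext) for ext in powerset(PAIRS))
    for ijk in product([0, 1], repeat=3)
}

print(len(set(signatures.values())))
```

Pairwise distinct signatures show that, in this model, none of the three negations is redundant.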

Visualizing these duality patterns cannot be done by means of a square, but rather requires a duality cube. For example, Figure 6 shows a duality cube for the composed operator $\forall \circ \square$; analogously, Westerståhl (2012) draws a duality cube for possessives with multiple quantifiers. Demey (2012a) makes use of the group-theoretical perspective to study the internal structure of this cube. It is a well-known group-theoretical fact that the group $\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$ has exactly 7 subgroups that are isomorphic to $V_4$. These can naturally be partitioned into three families, based on their number of ‘basic’ operations (i.e. operations governing a single negation: $\Tiny{ENEG}$, $\Tiny{MNEG}$ and $\Tiny{INEG}$): (a) the first family consists of three groups that contain two basic operations, (b) the second family consists of three groups that contain one basic operation, and (c) the third family consists of a single group that does not contain any basic operations. Examples of groups from each of these families are given in (\ref{eq50}a–c), respectively.
\begin{equation}\label{eq50}
\begin{array}{l}
(a) \; \; \{ \small{ID}, \small{ENEG}, \small{INEG}, \small{ENEG} \circ \small{INEG} \} \\
(b) \; \; \{ \small{ID}, \small{ENEG}, \small{MNEG} \circ \small{INEG}, \small{ENEG} \circ \small{MNEG} \circ \small{INEG}\} \\
(c) \; \; \{ \small{ID}, \small{ENEG} \circ \small{INEG}, \small{MNEG} \circ \small{INEG}, \small{ENEG} \circ \small{MNEG}\} \\
\end{array}
\end{equation}
Each of these groups defines two complementary ‘duality squares’, and we thus find a total number of $7 \times 2 = 14$ ‘duality squares’ inside the duality cube. (We are using the term ‘duality square’ inside scare quotes here, because some of these squares visualize non-standard duality operations that involve $\Tiny{MNEG}$; see above.) Note that, in contrast to the groups of families (a) and (b), the non-$\Tiny{ID}$ elements of the group in family (c) pairwise share a basic operation. Demey (2012a) argues that this difference in group-theoretical structure correlates with a difference in geometric embedding of the squares inside the cube.
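The subgroup count and the three families can be checked by brute force. The following is a minimal sketch, not taken from Demey (2012a): it encodes the elements of $\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$ as bit triples in the order of (\ref{eq48}), and the function names `compose`, `klein_subgroups` and `family_sizes` are hypothetical helpers introduced only for illustration. It relies on the fact that every non-identity element has order 2, so any two distinct non-identity elements generate a four-element subgroup isomorphic to $V_4$.

```python
from itertools import combinations, product

# Elements of Z_2 x Z_2 x Z_2 as bit triples (ENEG, MNEG, INEG), following (48).
ELEMENTS = list(product((0, 1), repeat=3))
IDENTITY = (0, 0, 0)
BASIC = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}  # ENEG, MNEG, INEG


def compose(x, y):
    """Group operation: componentwise addition modulo 2."""
    return tuple((a + b) % 2 for a, b in zip(x, y))


def klein_subgroups():
    """Return all four-element subgroups (each is isomorphic to V_4)."""
    found = []
    non_identity = [e for e in ELEMENTS if e != IDENTITY]
    for x, y in combinations(non_identity, 2):
        # Two distinct involutions x, y generate {ID, x, y, x + y}.
        group = frozenset({IDENTITY, x, y, compose(x, y)})
        if group not in found:
            found.append(group)
    return found


def family_sizes():
    """Count subgroups by the number of basic operations they contain."""
    counts = {0: 0, 1: 0, 2: 0}
    for group in klein_subgroups():
        counts[len(group & BASIC)] += 1
    return counts
```

Calling `family_sizes()` groups the subgroups by how many of $\Tiny{ENEG}$, $\Tiny{MNEG}$ and $\Tiny{INEG}$ they contain, which corresponds to families (a), (b) and (c) above.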

Generalized Post duality. The group-theoretical account described above conforms to the basic requirement that internal negation be applied to all arguments of a given operator; see the k-superscripts in (\ref{eq46}) and (\ref{eq49}). Although the most canonical examples of duality indeed obey this requirement (recall the example of conjunction/disjunction from Section 1), there are also operators whose duality behavior seems to violate it. For example, it was shown in Section 2 that in the relational perspective on generalized quantifiers, internal negation is applied only to the second argument, so that the internal negation of $\small{Q}(\small{A}, \small{B})$ is $\small{Q}(\small{A}, \neg \small{B})$, rather than $\small{Q}(\neg \small{A}, \neg \small{B})$. Similarly, in syllogistics one can independently study the effects of predicate negation, as in $\small{Q}(\small{A}, \neg \small{B})$, and of subject negation, as in $\small{Q}(\neg \small{A}, \small{B})$ (Keynes 1884; Johnson 1921; Reichenbach 1952; Hacker 1975). Finally, in public announcement logic, the dual of $[ \ !\varphi \ ] \ \psi$ is defined as $\neg [ \ !\varphi \ ] \ \neg \psi$, so the internal negation of the binary $[ \ ! \ \cdot \ ] \ \cdot$ operator is applied only to its second argument ($\psi$) (Demey 2012b).

Figure 6: Duality cube for the composed operator $\forall \circ \square$

If we drop the requirement that internal negation be applied to all arguments, the behavior that arises is called generalized Post duality (Humberstone 2011, p. 410ff.; Urquhart 2008). Consider an $\small{n}$-ary operator $\small{O}: \mathbb{A}^n \rightarrow \mathbb{B}$. This operator has $1$ external and $\small{n}$ independent internal negations. Since each of these $\small{n} + 1$ negations may or may not be applied, $\small{O}$ gives rise to $2^{n+1}$ operators. As an example, consider the binary operator of conjunction:
\begin{equation}\label{eq51}
\begin{array}{c|c|c|c|c}
\phantom{\neg} \small{O}( \phantom{\neg} , \phantom{\neg} )& \phantom{\neg} (\phantom{\neg} p \wedge \phantom{\neg} q )\ & & \neg \small{O}( \neg, \neg) & \neg ( \neg p \wedge \neg q) \\
\phantom{\neg} \small{O}( \phantom{\neg} , \neg )&\phantom{\neg} (\phantom{\neg} p \wedge \neg q )\ & & \neg \small{O}( \neg, \phantom{\neg}) & \neg ( \neg p \wedge \phantom{\neg} q) \\
\phantom{\neg} \small{O}( \neg , \phantom{\neg} )&\phantom{\neg} (\neg p \wedge \phantom{\neg} q )\ & & \neg \small{O}( \phantom{\neg}, \neg) & \neg ( \phantom{\neg} p \wedge \neg q) \\
\neg \small{O}( \phantom{\neg} , \phantom{\neg} )& \neg (\phantom{\neg} p \wedge \phantom{\neg} q )\ & & \phantom{\neg} \small{O}( \neg, \neg) & \phantom{\neg} ( \neg p \wedge \neg q) \\
\end{array}
\end{equation}
In comparison to the ordinary duality behavior of an $\small{n}$-ary operator, we thus have $\small{n}+1$ rather than $2$ independent negations, and generalized Post duality behavior is governed by the group $\mathbb{Z}_2^{n+1}$ rather than $\mathbb{Z}_2^2$ (Libert 2012). Next to $\Tiny{ENEG}$, the operation of $\Tiny{INEG}$ is split into $\Tiny{INEG}_1, \ldots , \Tiny{INEG}_n$, with $\Tiny{INEG}_i$ operating on the operator’s $\small{i}^{th}$ argument, for $1 \ \leq \ \small{i} \ \leq \ \small{n}$. Furthermore, the isomorphism given in (\ref{eq45}) can be generalized to the one defined by (\ref{eq52}):

\begin{alignat}{3}
\small{ID} & \; \leftrightarrow \; (0, 0, 0, \ldots , 0, 0) & & \small{ENEG} & \; \leftrightarrow \; (1, 0, 0, \ldots , 0, 0) \notag \\
\small{INEG}_1 & \; \leftrightarrow \; (0, 1, 0, \ldots , 0, 0) & \; \cdot \cdot \cdot \; \; & \small{INEG}_n & \; \leftrightarrow \; (0, 0, 0, \ldots , 0, 1) &. \label{eq52}
\end{alignat}

In analogy to (\ref{eq46}), the effects of these operations can be described succinctly by means of (\ref{eq53}). Note that (\ref{eq46}) can be seen as a special case of (\ref{eq53}), by requiring that $\small{k}_1 = \small{k}_2 = … = \small{k}_n$.

\begin{equation}\label{eq53}
(i,k_1,…,k_n)(\small{O})(a_1, …,a_n) = \neg \ ^i_\mathbb{B}\small{O}(\neg \ ^{k_1}_\mathbb{A} a_1, … , \neg ^{k_n}_\mathbb{A} a_n).
\end{equation}
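The effect of (\ref{eq53}) can be made concrete for Boolean operators. The sketch below is an illustration under stated assumptions rather than part of the cited accounts: it takes both $\mathbb{A}$ and $\mathbb{B}$ to be the two-element Boolean algebra, and the names `negate_if`, `transform`, `truth_table` and `all_variants` are hypothetical helpers. The bit tuple $(i, k_1, \ldots, k_n)$ is read exactly as in (\ref{eq52}): the first bit controls the external negation, and each later bit controls the internal negation of one argument.

```python
from itertools import product


def negate_if(flag, value):
    """Apply Boolean negation exactly when flag is 1."""
    return (not value) if flag else value


def transform(i, ks, op):
    """Return the operator (i, k_1, ..., k_n)(op) described by (53)."""
    def new_op(*args):
        inner = [negate_if(k, a) for k, a in zip(ks, args)]
        return negate_if(i, op(*inner))
    return new_op


def truth_table(op, arity):
    """Tabulate op on all Boolean inputs, in a fixed order."""
    return tuple(bool(op(*row)) for row in product((False, True), repeat=arity))


def conj(p, q):
    return p and q


def disj(p, q):
    return p or q


def all_variants(op, arity):
    """Map each bit tuple in Z_2^(n+1) to the truth table it produces."""
    return {
        bits: truth_table(transform(bits[0], bits[1:], op), arity)
        for bits in product((0, 1), repeat=arity + 1)
    }
```

For conjunction, `all_variants(conj, 2)` yields the eight operators listed in (\ref{eq51}), and the tuple `(1, 1, 1)` (external negation plus both internal negations) produces the truth table of disjunction, which is the ‘ordinary’ dual.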

As was the case with the duality behavior of a composed operator, we see that the generalized duality behavior of an $n$-ary operator is much richer than its ‘ordinary’ duality behavior. Consider again the binary operator of conjunction. If both arguments can be negated independently, there are three combinations of external and internal negation ($\Tiny{ENEG} \circ \Tiny{INEG}_1$, $\Tiny{ENEG} \circ \Tiny{INEG}_2$ and $\Tiny{ENEG} \circ \Tiny{INEG}_1 \circ \Tiny{INEG}_2$), all of which can plausibly be called duality operations. (The last one of these involves negating all arguments, and thus coincides with ‘ordinary’ duality.) As a consequence, visualizing the generalized duality behavior of conjunction requires a duality cube, as in Figure 7. Note that the diagonal plane that spans the front left and back right vertical edges of this cube corresponds to the ‘ordinary’ duality square for conjunction (see Figures 2(c) and 3(a)).

Finally, it should be noted that the duality cubes in Figures 6 and 7 are highly similar, which is due, of course, to the fact that they are two distinct manifestations of the group $\mathbb{Z}_2^3$ (and can thus serve as two distinct concrete interpretations of the abstract cube in Moretti (2012, p. 88)). This illustrates the strong connection between the ‘ordinary’ duality behavior of composed operators on the one hand and the generalized duality behavior of single (binary) operators on the other. Both cases involve creating an additional negation: the former achieves this by ‘splitting’ the operator, while the latter achieves it by ‘splitting’ the argument positions.

Figure 7: ‘Generalized Post duality’ cube for the binary operator $\wedge$.

5. Duality Relations and Aristotelian Relations

The Aristotelian relations. Next to the duality relations, there is another widely known set of logical relations, namely the Aristotelian relations, which were originally defined in the logical works of Aristotle (Ackrill 1961). These are defined relative to some background logical system $\textsf{S}$, which is assumed to have connectives expressing Boolean negation ($\neg$), conjunction ($\wedge$) and implication ($\rightarrow$), and a model-theoretic semantics ($\models$). Formally, the Aristotelian relations are defined as follows: the formulas $\varphi$ and $\psi$ are said to be
\begin{equation}\label{eq54}
\begin{array}{l l l l l}
\textsf{S}\textrm{-}contradictory & \mathrm{iff} & \textsf{S} \> \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \models \neg (\neg \varphi \wedge \neg \psi ), \\
\textsf{S}\textrm{-}contrary & \mathrm{iff} & \textsf{S} \> \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \not \models \neg (\neg \varphi \wedge \neg \psi ), \\
\textsf{S}\textrm{-}subcontrary & \mathrm{iff} & \textsf{S} \> \not \models \neg (\varphi \wedge \psi ) & \mathrm{and} & \textsf{S} \> \models \neg (\neg \varphi \wedge \neg \psi ), \\
\mathrm{in} \ \textsf{S}\textrm{-}subalternation & \mathrm{iff} & \textsf{S} \> \models \varphi \rightarrow \psi & \mathrm{and} & \textsf{S} \> \not \models \psi \rightarrow \varphi . \\
\end{array}
\end{equation}
When the system $\textsf{S}$ is clear from the context, it is often left implicit (Smessaert and Demey 2014). Informally, two formulas are contradictory iff they cannot be true together and cannot be false together; they are contrary iff they cannot be true together but may be false together; they are subcontrary iff they cannot be false together but may be true together; they are in subalternation iff the first one entails the second one but not vice versa. Finally, it should be noted that this definition of the Aristotelian relations can be generalized to arbitrary Boolean algebras, just like the definition of the duality relations provided in Section 3 (Demey and Smessaert 2016). However, since this generalization is less relevant for our current concerns, it will not be discussed here.
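For a concrete instance of these definitions, one can take $\textsf{S}$ to be classical propositional logic and decide $\models$ by truth tables. The following is a minimal sketch under that assumption; the representation of formulas as Python functions on valuations and the names `valid` and `aristotelian_relation` are hypothetical choices made only for illustration. Because the four definitions are not mutually exclusive in degenerate cases (for example, when one formula is a contradiction), the function simply returns the first relation that applies, in the order in which the relations are listed above.

```python
from itertools import product


def valid(formula, atoms):
    """Check that formula holds under every valuation of the atoms (CPL)."""
    return all(
        formula(dict(zip(atoms, values)))
        for values in product((False, True), repeat=len(atoms))
    )


def aristotelian_relation(phi, psi, atoms):
    """Classify the ordered pair (phi, psi) following the definitions above."""
    # S |= not (phi and psi): the formulas cannot be true together.
    not_both_true = valid(lambda v: not (phi(v) and psi(v)), atoms)
    # S |= not (not phi and not psi): the formulas cannot be false together.
    not_both_false = valid(lambda v: not (not phi(v) and not psi(v)), atoms)
    if not_both_true and not_both_false:
        return "contradictory"
    if not_both_true:
        return "contrary"
    if not_both_false:
        return "subcontrary"
    forward = valid(lambda v: (not phi(v)) or psi(v), atoms)
    backward = valid(lambda v: (not psi(v)) or phi(v), atoms)
    if forward and not backward:
        return "subalternation"
    return None  # no Aristotelian relation from phi to psi
```

With atoms `("p", "q")`, passing the functions for $p \wedge q$ and $p \vee q$ classifies the pair as a subalternation, while reversing the order of the arguments returns `None`, which reflects the asymmetry of subalternation.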

The Aristotelian relations holding between a given set of formulas are often visualized by means of Aristotelian diagrams (based on graphical conventions such as the one shown in Figure 8(d)). The most widely known of these diagrams is the so-called ‘square of oppositions’, which comprises 4 formulas and the 6 Aristotelian relations holding between them. For example, Figure 8 shows Aristotelian squares involving (a) the propositional connectives of conjunction and disjunction, (b) the universal and existential quantifiers, and (c) the modal operators of necessity and possibility.

Figure 8: Aristotelian squares: (a) conjunction-disjunction, (b) universal-existential, (c) necessity-possibility; (d) graphical representations of the Aristotelian relations.

Similarities. The Aristotelian squares in Figure 8(a–c) closely resemble the duality squares in Figure 3(a–c), respectively. In particular: (i) on the diagonals, the duality relation $\Tiny{ENEG}$ corresponds to the Aristotelian relation of contradiction, (ii) on the vertical edges, the duality relation $\Tiny{DUAL}$ corresponds to the Aristotelian relation of subalternation, and (iii) on the horizontal edges, the duality relation $\Tiny{INEG}$ corresponds to the Aristotelian relations of contrariety and subcontrariety. These strong similarities might explain why authors such as D’Alfonso (2012), Meles (2012) and Schumann (2013) have come close to straightforwardly identifying the two types of squares, for example by using Aristotelian terminology to describe the duality square (or vice versa), or by viewing one as a generalization of the other.

Furthermore, both Aristotelian and duality diagrams have been used by linguists to explain certain lexicalization patterns in natural languages. For example, Horn (1989) and Jaspers (2005) make use of the Aristotelian relations to explain the so-called non-lexicalization of the O-corner, i.e. the observation that natural languages have primitive lexical items for the quantifiers all, some and none, but not for not all (the latter’s lexicalization as a single word, for example *nall, does not occur in natural language). The same asymmetry can be found in the lexicalization pattern of the propositional connectives: natural languages have primitive lexical items for and, or and nor, but not for not and (the latter’s lexicalization as a single word, for example *nand, does not occur in natural language). These linguistic phenomena are also explained by Löbner (1990, 2011), but his phase quantification account is based on the duality relations, rather than the Aristotelian relations. Finally, it should be noted that the Aristotelian account of these lexical asymmetries has recently been generalized beyond the square by Seuren and Jaspers (2014).

Dissimilarities. As noted by Löbner (2011), Chow (2012) and Westerståhl (2012), there are also several differences between the duality square and the Aristotelian square. For example, although duality seems to correspond to subalternation, the former relation is symmetric, while the latter is asymmetric. Furthermore, although both sets of relations contain four members, there is no clean one-to-one mapping in either direction: on the one hand, the Aristotelian relations of contrariety and subcontrariety correspond to a single duality relation ($\Tiny{INEG}$), and on the other hand, the duality relation $\Tiny{ID}$ does not correspond to any Aristotelian relation whatsoever. (However, Smessaert and Demey (2014) introduce a quasi-Aristotelian relation that holds precisely between a formula and itself, and thus does correspond to the duality relation $\Tiny{ID}$.)

Another difference concerns sensitivity to the specific axioms of the background logic (Demey 2015). Consider, for example, the modal operators $\square$, $\lozenge$: $\mathbb{B}_{\textsf{S}} \rightarrow \mathbb{B}_{\textsf{S}}$, where $\mathbb{B}_{\textsf{S}}$ is the Lindenbaum-Tarski algebra of some normal modal logic $\textsf{S}$. The Aristotelian relation holding between these operators depends on the logical system $\textsf{S}$: in normal modal systems that are at least as strong as $\textsf{KD}$, there is a subalternation from $\square p$ to $\lozenge p$, but in weaker normal modal systems, there is no Aristotelian relation at all between these two formulas (Hughes and Cresswell 1996). Nevertheless, in all of these modal systems, it is the case that $\square \varphi$ is logically equivalent to $\neg \lozenge \neg \varphi$ for all formulas $\varphi \in \mathcal{L}_{\textsf{S}}$, and hence $\square [\varphi] = \neg \lozenge \neg [\varphi]$ for all $[\varphi] \in \mathbb{B}_{\textsf{S}}$. This means exactly that $\Tiny{DUAL}$($\square$, $\lozenge$) holds, and hence that the duality relation between $\square$ and $\lozenge$ holds independently of the specific axioms of the logical system $\textsf{S}$.
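The contrast between the two kinds of relations can be illustrated semantically on small Kripke frames. The sketch below is only a finite sanity check under stated assumptions, not a proof: it uses the standard facts that $\textsf{K}$ is sound for all frames and $\textsf{KD}$ for serial frames, it enumerates only frames with a fixed small number of worlds, and it tests only the entailment half of subalternation (from $\square p$ to $\lozenge p$). All function names are hypothetical helpers.

```python
from itertools import product


def box(relation, world, truth):
    """Box p at world: p holds at every accessible world."""
    return all(truth[v] for (u, v) in relation if u == world)


def diamond(relation, world, truth):
    """Diamond p at world: p holds at some accessible world."""
    return any(truth[v] for (u, v) in relation if u == world)


def frames(size):
    """Enumerate every accessibility relation on worlds 0..size-1."""
    pairs = [(u, v) for u in range(size) for v in range(size)]
    for mask in product((False, True), repeat=len(pairs)):
        yield {pair for pair, keep in zip(pairs, mask) if keep}


def is_serial(relation, size):
    """Serial frames: every world can access at least one world."""
    return all(any(u == w for (u, _) in relation) for w in range(size))


def check(size=2):
    """Test duality and the box-to-diamond entailment on all frames."""
    duality_everywhere = True
    entailment_on_serial = True
    entailment_on_all = True
    for relation in frames(size):
        for values in product((False, True), repeat=size):
            truth = list(values)
            negated = [not t for t in truth]
            for w in range(size):
                b = box(relation, w, truth)
                d = diamond(relation, w, truth)
                # Duality: box p agrees with not-diamond-not-p.
                if b != (not diamond(relation, w, negated)):
                    duality_everywhere = False
                # Entailment fails where box p is true but diamond p is false.
                if b and not d:
                    entailment_on_all = False
                    if is_serial(relation, size):
                        entailment_on_serial = False
    return duality_everywhere, entailment_on_serial, entailment_on_all
```

The three returned flags report, in order, whether the duality equivalence held at every world of every enumerated frame, whether the entailment from $\square p$ to $\lozenge p$ held on all serial frames, and whether it held on all frames. A world with no accessible worlds makes $\square p$ vacuously true and $\lozenge p$ false, which is why the last flag differs from the second.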

At this point, it might be objected that the duality relations are logic-sensitive after all; for example, conjunction and disjunction are dual to one another in classical propositional logic ($\textsf{CPL}$), but not in intuitionistic propositional logic ($\textsf{IPL}$). However, the Lindenbaum-Tarski algebra of $\textsf{IPL}$ is itself not a Boolean algebra (but rather a Heyting algebra), and thus falls outside the scope of the definition of the duality relations that was provided in Section 3.

Another difference between the duality and the Aristotelian relations is that the former, but not the latter, are functional. As was already discussed in Section 3, every formula has exactly one internal negation, exactly one external negation, and exactly one dual (up to logical equivalence). By contrast, the Aristotelian relations are not functional: for example, a given formula might be contrary to several (non-equivalent) formulas. As illustrated by Smessaert (2012), this difference becomes much more apparent if we move from squares to larger diagrams. For example, Figures 9(a–b) show an Aristotelian and a duality diagram for the same set of six modal formulas. Consider the formula $\square p$. Within the Aristotelian hexagon, this formula has two (non-equivalent) contraries, namely $\square \neg p$ and $\lozenge p \wedge \lozenge \neg p$. From a duality perspective, the first of these two formulas is the internal negation of $\square p$, but the second one stands in no duality relation at all to $\square p$. The duality ‘hexagon’ in Figure 9(b) thus ultimately turns out to consist of two independent components: the ordinary duality square in Figure 9(c) and the degenerate duality pattern (containing two formulas that are their own internal negations) in Figure 9(d).

Figure 9: (a) Aristotelian hexagon (for a modal system that is at least as strong as $\textsf{KD}$), (b) duality ‘hexagon’, and (c–d) its two components.

Finally, it should also be noted that it is perfectly possible for two operators/formulas to stand in a duality relation without standing in any Aristotelian relation, or vice versa. Moving to the level of diagrams, this means that it is possible for four operators/formulas to constitute a duality square without constituting an Aristotelian square, or vice versa (Löbner 1986). For example, the aspectual adverbs already, still, not yet and no longer constitute a duality square (see Figure 4(b)), but not an Aristotelian square: for example, already and still are each other’s duals, but there is no subalternation between them in either direction. Analogously, the modal formulas $\square p$, $\square p \vee \square \neg p$, $\lozenge \neg p$ and $\lozenge p \wedge \lozenge \neg p$ constitute an Aristotelian square (embedded inside the Aristotelian hexagon in Figure 9(a) with a counterclockwise rotation of 120°), but not a duality square: for example, $\square p$ and $\lozenge p \wedge \lozenge \neg p$ are contraries, but there is no duality relation between them. In fact, looking at these four modal formulas in the duality ‘hexagon’ in Figure 9(b), we see that $\square p \vee \square \neg p$ and $\lozenge p \wedge \lozenge \neg p$ by themselves constitute a degenerate duality pattern (Figure 9(d)), while $\square p$ and $\lozenge \neg p$ belong to another, ‘real’ duality square (Figure 9(c)).

6. References and Further Reading

Ackrill, J. (1961). Aristotle’s Categories and De Interpretatione. Clarendon Press, Oxford.
Barwise, J. and Cooper, R. (1981). Generalized quantifiers and natural language. Linguistics and Philosophy, 4:159–219.
Blackburn, P., de Rijke, M., and Venema, Y. (2001). Modal Logic. Cambridge University Press, Cambridge.
Brisson, C. (2003). Plurals, All, and the nonuniformity of collective predication. Linguistics and Philosophy, 26:129–184.
Buridan, J. (2001). Summulae de Dialectica. Translated by Gyula Klima. Yale University Press, New Haven, CT.
Chow, K. (2012). General patterns of opposition squares and 2n-gons. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 263–275. Springer, Basel.
D’Alfonso, D. (2012). The square of opposition and generalized quantifiers. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 219–227. Springer, Basel.
Davey, B. A. and Priestley, H. A. (2002). Introduction to Lattices and Order (Second Edition). Cambridge University Press, Cambridge.
Demey, L. (2012a). Algebraic aspects of duality diagrams. In Cox, P. T., Plimmer, B., and Rodgers, P., editors, Diagrammatic Representation and Inference, Lecture Notes in Computer Science (LNCS) 7352, pages 300–302. Springer, Berlin.
Demey, L. (2012b). Structures of oppositions for public announcement logic. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 313–339. Springer, Basel.
Demey, L. (2015). Interactively illustrating the context-sensitivity of Aristotelian diagrams. In Christiansen, H., Stojanovic, I., and Papadopoulos, G., editors, Modeling and Using Context, LNCS 9405, pages 331–345. Springer.
Demey, L. and Smessaert, H. (2016). Metalogical decorations of logical diagrams. Logica Universalis, 10:233–292.
Dowty, D. (1987). Collective predicates, distributive predicates, and All. In Marshall, F., editor, Proceedings of the 3rd Eastern States Conference on Linguistics (ESCOL), pages 97–115. Ohio State University, Columbus, OH.
Freudenthal, H. (1960). Lincos. Design of a Language for Cosmic Intercourse. North-Holland, Amsterdam.
Gamut, L. (1991). Logic, Language, and Meaning.
Givant, S. and Halmos, P. (2009). Introduction to Boolean Algebras. Springer, New York, NY.
Gottschalk, W. H. (1953). The theory of quaternality. Journal of Symbolic Logic, 18:193–196.
Gowers, T., editor (2008). The Princeton Companion to Mathematics. Princeton University Press, Princeton, NJ.
Hacker, E. A. (1975). The octagon of opposition. Notre Dame Journal of Formal Logic, 16:352–353.
Henkin, L., Monk, J. D., and Tarski, A. (1971). Cylindric Algebras, Part I. North-Holland, Amsterdam.
Horn, L. (2006). The border wars: A neo-Gricean perspective. In von Heusinger, K. and Turner, K., editors, Where Semantics Meets Pragmatics, pages 21–48. Elsevier, Amsterdam.
Horn, L. R. (1989). A Natural History of Negation. University of Chicago Press, Chicago, IL.
Horn, L. R. (2004). Implicature. In Horn, L. R. and Ward, G., editors, Handbook of Pragmatics, pages 3–28. Blackwell, Oxford.
Hughes, G. E. and Cresswell, M. J. (1996). A New Introduction to Modal Logic. Routledge, London.
Humberstone, L. (2011). The Connectives. MIT Press, Cambridge, MA.
Iten, C. (1998). Because and although: a case of duality? In Rouchota, V. and Jucker, A. H., editors, Current Issues in Relevance Theory, pages 59–80. John Benjamins, Amsterdam.
Iten, C. (2005). Linguistic Meaning, Truth Conditions and Relevance: The Case of Concessives. Palgrave Macmillan, Basingstoke/New York (NY).
Jaspers, D. (2005). Operators in the Lexicon. On the Negative Logic of Natural Language. LOT Publications, Utrecht.
Johnson, W. (1921). Logic. Part I. Cambridge University Press, Cambridge.
Kabakov, F. A., Parkhomenko, A. S., Voitsekhovskii, M. I., and Fofanova, T. S. (2014). Duality principle. In Encyclopedia of Mathematics. Springer, available at http://www.encyclopediaofmath.org/index.php?title=Duality_principle&oldid=35095.
Keynes, J. N. (1884). Studies and Exercises in Formal Logic. MacMillan, London.
König, E. (1991). Concessive relations as the dual of causal relations. In Zaefferer, D., editor, Semantic Universals and Universal Semantics, volume 12 of Groningen-Amsterdam Studies in Semantics, pages 190–209. Foris, Berlin.
Kripke, S. (1977). Speaker’s reference and semantic reference. In French, P., Uehling, Jr., T., and Wettstein, H., editors, Contemporary perspectives in the philosophy of language, pages 6–27. University of Minnesota Press, Minneapolis, MN.
Libert, T. (2012). Hypercubes of duality. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 293–301. Springer, Basel.
Löbner, S. (1986). Quantification as a major module. In Groenendijk, J., de Jongh, D., and Stokhof, M., editors, Studies in Discourse Representation Theory and the Theory of Generalized Quantifiers, pages 53–85. Foris, Dordrecht.
Löbner, S. (1987). Natural language and generalized quantifier theory. In Gärdenfors, P., editor, Generalized Quantifiers, pages 181–201. Reidel, Dordrecht.
Löbner, S. (1989). German schon – erst – noch: an integrated analysis. Linguistics and Philosophy, 12:167–212.
Löbner, S. (1990). Wahr neben Falsch. Duale Operatoren als die Quantoren natürlicher Sprache. Max Niemeyer Verlag, Tübingen.
Löbner, S. (1999). Why German schon and noch are still duals: a reply to van der Auwera. Linguistics and Philosophy, 22:45–107.
Löbner, S. (2011). Dual oppositions in lexical meaning. In Maienborn, C., von Heusinger, K., and Portner, P., editors, Semantics: An International Handbook of Natural Language Meaning, volume I, pages 479–506. de Gruyter Mouton, Berlin.
Mac Lane, S. (1998). Categories for the Working Mathematician. Springer, Berlin.
Meles, B. (2012). No group of opposition for constructive logics: The intuitionistic and linear cases. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 201–217. Springer, Basel.
Michaelis, L. (1996). On the use and meaning of already. Linguistics and Philosophy, 19:477–502.
Mittwoch, A. (1993). The relationship between schon/already and noch/still: A reply to Löbner. Natural Language Semantics, 2:71–82.
Moretti, A. (2012). Why the logical hexagon? Logica Universalis, 6:69–107.
Peters, S. and Westerståhl, D. (2006). Quantifiers in Language and Logic. Oxford University Press, Oxford.
Peterson, P. (1979). On the logic of “few”, “many”, and “most”. Notre Dame Journal of Formal Logic, 20:155–179.
Piaget, J. (1949). Traité de logique. Essai de logistique opératoire. Colin/Dunod, Paris.
Read, S. (2012). John Buridan’s theory of consequence and his octagons of opposition. In Beziau, J.-Y. and Jacquette, D., editors, Around and Beyond the Square of Opposition, pages 93–110. Springer, Basel.
Reichenbach, H. (1952). The syllogism revised. Philosophy of Science, 19:1–16.
Schumann, A. (2013). On two squares of opposition: the Lesniewski’s style formalization of synthetic propositions. Acta Analytica, 28:71–93.
Seuren, P. and Jaspers, D. (2014). Logico-cognitive structure in the lexicon. Language, 90:607–643.
Smessaert, H. (2012). The classical Aristotelian hexagon versus the modern duality hexagon. Logica Universalis, 6:171–199.
Smessaert, H. and Demey, L. (2014). Logical geometries and information in the square of oppositions. Journal of Logic, Language and Information, 23:527–565.
Smessaert, H. and ter Meulen, A. (2004). Temporal reasoning with aspectual adverbs. Linguistics and Philosophy, 27:209–261.
Urquhart, A. (2008). Emil Post. In Gabbay, D. M. and Woods, J., editors, Handbook of the History of Logic. Volume 5. Logic from Russell to Church. Elsevier, Amsterdam.
van Benthem, J. (1991). Linguistic universals in logical semantics. In Zaefferer, D., editor, Semantic Universals and Universal Semantics, volume 12 of Groningen-Amsterdam Studies in Semantics, pages 17–36. Foris, Berlin.
van der Auwera, J. (1993). ‘Already’ and ‘still’: beyond duality. Linguistics and Philosophy, 16:613–653.
Westerståhl, D. (2012). Classical vs. modern squares of opposition, and beyond. In Beziau, J.-Y. and Payette, G., editors, The Square of Opposition. A General Framework for Cognition, pages 195–229. Peter Lang, Bern.

Author Information

Lorenz Demey
Email: lorenz.demey@kuleuven.be
Catholic University of Leuven
Belgium

and

Hans Smessaert
Email: hans.smessaert@kuleuven.be
Catholic University of Leuven
Belgium

The Meaning of Life: Contemporary Analytic Perspectives

Depending on whom one asks, the question, “What is the meaning of life?” is either the most profound question of human existence or else nothing more than a nonsensical request built on conceptual confusion, much like, “What does the color red taste like?” or “What is heavier than the heaviest object?” Ask a non-philosopher, “What do philosophers discuss?” and a likely answer will be, “The meaning of life.” Ask the same question of a philosopher within the analytic tradition, and you will rarely get this answer. The sources of suspicion about the question within analytic philosophy, especially in earlier periods, are varied. First, the question of life’s meaning is conceptually challenging because of terms like “the,” “meaning,” and “life,” and especially given the grammatical form in which they are arranged. Second, it is often asked with transcendent, spiritual, or religious assumptions at the fore about what the world “should” be like in order for there to be a meaning of life. In so far as the question is entangled with such ideas, the worry is that even if the concept of a meaning of life is coherent, there likely is not one.

Despite such suspicions and relative disinterest in the question of life’s meaning among analytic philosophers for a large part of the twentieth century, a growing body of work on the topic has emerged over roughly the last two decades. Much of this work focuses on developing and defending theories of meaning in life (see Section 2.d. for more on the distinction between meaning in life and the meaning of life) via conceptual analyses of the necessary and sufficient conditions for meaningful life. A smaller, though no less important, subset of work in this growing field focuses on why we even use “meaning” in the first place to voice our questions and concerns about central facets of the human condition.

This article surveys important trajectories in discussions of life’s meaning within contemporary analytic philosophy. It begins by introducing key aspects of the human context in which the question is asked. The article then investigates three ideas that illumine what meaning means in this context: sense-making, purpose, and significance. The article continues by surveying important topics that provide a greater understanding of what is involved in our requests for meaning. After briefly surveying theories of meaning in life, it concludes with discussions of death and futility, followed by important areas of research that remain under-investigated.

The Human Context
The Contemporary Analytic Context: Prolegomena
Theories of Meaning in Life
Death, Futility, and a Meaningful Life
Underinvestigated Areas
References and Further Reading

1. The Human Context

The human desire for meaning finds vivid expression in the stories we tell, diaries we keep, and in our deepest hopes and fears. According to twentieth century Freudian psychoanalyst Bruno Bettelheim, “our greatest need and most difficult achievement is to find meaning in our lives” (Bettelheim 1978: 3). Holocaust survivor and psychiatrist Viktor Frankl said that the human will to meaning comes prior to either our will to pleasure or will to power (Frankl 2006: 99).

Questions about meaning arise and take shape within varied contexts: when struggling to make an important decision about what to do with our lives, when trapped in a job we hate, when wondering if there is more to life than the daily hum-drum, when diagnosed with a terminal illness, when experiencing the loss of a loved one, when feeling small while looking up at the night sky, when wondering if this universe is all there is and why it is even here in the first place, when questioning whether life and love will have a lasting place in the universe or whether the whole show will end in utter and everlasting desolation and silence.

Lurking behind many of our questions about meaning is our capacity to get outside of ourselves, to view our lives from a wider standpoint, a standpoint from which to understand the setting for our lives and question the “why?” of what we do. Humans possess self-awareness, and can take an observational, self-reflective viewpoint on our lives. In this, we are able to shift from mere automatic engagement to observation and evaluation. We do more than simply respond to streams of stimuli. We step back and question who we are and what we do. Shifting our focus to the widest standpoint—sub specie aeternitatis (literally, from the perspective of eternity; a universal perspective)—we wonder how such infinitesimally small and fleeting creatures like ourselves fit in the grand scheme of things, within vast space and time. We worry about whether a reality of such staggering magnitude, at the deepest level, cares about us (for related discussions, see Fischer 1993; Kahane 2013; Landau 2011; Nagel 1971, 1989; and Seachris 2013).

That our concerns about meaning are often cosmically-focused is instructive. Despite the current theoretical emphasis in analytic philosophy on the more terrestrially-focused idea of meaning in life, questions about meaning are very often cosmic in scope. In the words of sociologist Peter Berger, in seeking life’s meaning, many are attempting to locate it “within a sacred and cosmic frame of reference,” trying to plumb the connection “between microcosm and macrocosm” (Berger 1967: 27). This is an important reason why God, transcendence, and other ideas embodied and expressed in religion are so often thought to be relevant to life’s meaning.

2. The Contemporary Analytic Context: Prolegomena

Relatively speaking, not too long ago many analytic philosophers were suspicious that the question of life’s meaning was incoherent. Such views found expression in popular culture too, for example, in Douglas Adams’ widely read book The Hitchhiker’s Guide to the Galaxy. The story’s central characters visit the legendary planet Magrathea and learn about a race of hyper-intelligent beings who built a computer named Deep Thought. Deep Thought’s purpose was to answer the ultimate question of life, the universe, and everything, that answer being a bewildering 42. Deep Thought explained that this answer was incomprehensible because the beings who designed it, though super-intelligent, did not really know what they were asking in the first place. Asking for life’s meaning might be like this, in which case 42 is as good an answer as any other.

Some analytic philosophers in the twentieth century, in the wake of logical positivism, shared Deep Thought’s suspicion. They were particularly wary of the traditional formulation—What is the meaning of life? Meaning, it was thought, belongs in the linguistic realm. Words, sentences, and other linguistic constructions are the proper bearers of meaning, not objects, events, or states of affairs, and certainly not life itself. Some philosophers thought that in asking for life’s meaning, we use an ill-chosen expression to voice something real, perhaps an emotional response of awe or wonder at the staggering fact that anything exists at all. Yet, experiencing such feelings and asking a meaningful question are two different things altogether.

Asking what something means, though, need not be a strictly semantic activity. We ask for the meanings of all kinds of things and employ “meaning” in a wide variety of contexts in everyday life, only some of which are narrowly linguistic. Paying careful attention to the meanings of “meaning” provides important clues about what life’s meaning is all about. Three connotations in particular are instructive here: sense-making, purpose, and significance.

a. The Meanings of “Meaning”

Meaning-talk is common in everyday discourse. Most ordinary uses of “meaning” tend to cluster around three basic ideas: (1) sense-making (which can include the ideas of intelligibility, clarification, or coherence), (2) purpose, and (3) significance (which can include the idea of value). The following list of statements and questions captures the richly varied ways in which we employ the concept of meaning on a regular basis.

Meaning as Sense-Making

What you said didn’t mean a thing.
What did you mean by that statement?
Do you know what I mean?
What did you mean by that face? (overlaps with purpose)
What is the meaning of that book? (what is it about?)
What is the meaning of this? (for example, when asked upon returning home to find one’s house ransacked)

Meaning as Purpose

What did you mean by that face? (overlaps with intelligibility)
The tantrum is meant to catch his dad’s attention.
What is the meaning of that book? (why was it written?)
I really mean it!
I didn’t mean to do it. I promise!

Meaning as Significance

That was such a meaningful
This watch really means something to me.
That is a highly meaningful event in the life of that city.
What do his first six months in office mean for the country? (likely overlaps with intelligibility)
That is a meaningful
That is a meaningless
You mean nothing to me.

i. Sense-Making

This category is an important ordinary sense of meaning and connotes ideas like intelligibility, clarification, and coherence. Something has meaning if it makes sense; it lacks meaning if it does not. One way of understanding sense-making is through the idea of proper fit. Words, concepts, propositions, but also events and states of affairs, make sense and are meaningful if and when they fit together properly; if they lack such fit, they make no sense and are meaningless. This applies narrowly. For example, it makes no sense to ask, “What is brighter than the brightest light source?” It does not fit with the concept brightest to ask what is brighter. But sense-making has a broader application too. We say things like:

It does not make sense for the president to send in troops given the geopolitical situation in the region.
Asking philosophy students to perform long-division on their midterm makes no sense.

In each of these situations, we perceive a lack of fit—a lack of fit between a decision and circumstances surrounding that decision or between reasonable expectations about what one will find on a philosophy exam and what one actually finds. There is a kind of absurdity here. Perceiving this weaker lack of fit will be a product of beliefs, norms, and other epistemic, evaluative, and social commitments. Therefore, determining whether or not something, in fact, involves a lack of fit in this broader sense often will be a messier task than in cases of narrow sense-making.

Ascertaining meaning, then, is often about fitting something into a larger context or whole: words into sentences, paragraphs, novels, or monographs; musical notes into measures, movements, and symphonies (i.e., the movement from mere sound to music); parts of a photograph within the entire photograph. Meaning is about intelligibility within a wider frame, about “inserting small parts into a larger, integrated context” (Svendsen 2005: 29). Similarly, we can plausibly view our requests for the meaning of life as attempts to secure the overarching context through which to make sense of our lives in the universe (see Thomson 2003: 132-138). Our focus here is on existentially weighty matters that define and depict the human condition: questions and concerns surrounding origins, purpose, significance, value, suffering, and death and destiny. We want answers to our questions about these matters, and want these answers to fit together in an existentially satisfying way. We want life to make sense, and when it does not, we are haunted by the specter of meaninglessness.

ii. Purpose

Requests for meaning are very often requests for purpose. We want to know whether we have a purpose(s) and if so, what it is. Many assume that there is a cosmic purpose around which to order our lives. A cosmic purpose likely would require transcendence or God. Someone must intend it all in order for there to be a purpose of it all. One might reject the idea of cosmic purpose, though, and still frame the question about life’s meaning as one largely about purpose. In this case, meaningful life (or meaning in life) is about ordering one’s life around self-determined purposes.

We also distinguish actions done on purpose from those done by accident. We use meaning (or meant) to contrast willful from non-willful action. We say things like, “I really mean it” to indicate the ‘full’ operation of our will. Alternatively, our child might say, “I didn’t mean it, I promise!” to indicate that she did not intend to spill her glass of milk. This sense of “meant” is also relevant for life’s meaning. We want sufficient autonomy, and when it is absent or severely mitigated, we worry about the meaningfulness of our lives (see Mawson 2016; Sartre 1973). Most of us do not want to walk through life haphazardly, nor in a way that is largely determined apart from our own consent. Likely one aspect of meaningful life, then, is life lived with our wills sufficiently engaged, one lived on purpose. These two shades of purpose are probably related. We want to really mean it as we select and align our lives with aims that will provide the salient structural rhythms to our day-to-day existence. In other words, we do not want to be alienated from the purposes that guide our lives.

Purpose and sense-making often are connected. Purpose itself, via future-targeted goals that shape pre-goal activity, provides important aspects of the structure that serves as the framework through which life fits together and makes sense. Lives that fit together and make sense—meaningful lives—are those that are sufficiently teleological. Working to attain goals at various levels of life-centrality is likely a facet of life properly fitting together and therefore being meaningful. Teleological threads connecting discrete life episodes are then necessary for a robust kind of sense-making in life. Lives lacking this are threatened with a sort of unintelligibility that results from being insufficiently structured by a telos. In the words of philosopher Alasdair MacIntyre:

When someone complains…that his or her life is meaningless, he or she is often and perhaps characteristically complaining that the narrative of their life has become unintelligible to them, that it lacks any point, any movement toward a climax or a telos (MacIntyre 2007: 217).

iii. Significance

Meaning often conveys the idea of significance, and significance tracks a related cluster of notions like mattering, importance, impact, salience, being the object of care and concern, and value, depending on context. We contrast trivial discussions about the mundane with deep discussions about important matters, referring to the latter as meaningful or significant. Physical objects deeply enmeshed in our life stories are meaningful. We view actions and events that have salient implications as significant, and in cases where that significance has positive value, as meaningful (whether a person can lead a meaningful life in virtue of making large negative impacts is a growing topic of discussion as the field seeks to understand the connection between meaning and morality; see Campbell and Nyholm 2015). Finding the cure for that disease was meaningful because it had such a large positive impact within a certain frame of cares and concerns. This shade of meaning is also in view in cases where some piece or set of data crosses a threshold of salience against background information. That such a large percentage of the population living under certain conditions is getting a particular disease is statistically significant or statistically meaningful. In this way, sense-making and significance senses of meaning connect.

Alternatively, when something does not matter to us, we might say, “That means nothing to me.” It was just a meaningless conversation; it was inconsequential. That game did not matter because the playoffs were already set. The wrapping paper does not matter; what is on the inside of the package counts. That piece of information is not meaningful relative to the aims and questions guiding one’s inquiry. Spending your life sitting on the couch and watching sitcom re-runs on Netflix is meaningless; you do nothing that matters, you do nothing of importance or value, and so on.

Something’s significance is often and largely gauged in relation to a perspective, horizon, or point of reference, all of which can be dynamic. Something that is significant from one vantage point may, and often does, lose its significance when viewed from a broader horizon. Scraping your knee at age four is significant, at least from a four-year-old’s perspective. When looking back decades later, its significance wanes. Most events important enough to make it into local lore will not matter enough to be included in a national history, let alone world and, especially, cosmic history. One quickly sees resources available from which to generate pessimistic meaning of life concerns vis-à-vis human significance as one broadens horizons, eventually terminating in the widest cosmic perspective.

Significance is often distinctly normative and person-al. When we say that something is meaningful in the sense of being significant, important, or mattering, we make a kind of evaluative claim about what is good or valuable. Additionally, significance is often connected with being the object of a person’s evaluations, cares, and concerns. Things are, most naturally, significant to someone.

Insofar as meaning is thought to have an affective dimension, that dimension likely intersects with significance. If my grandmother’s necklace is meaningful to me, it has value, it matters, and affective states fitting a certain psychological profile, like being deeply stirred or moved, often accompany such assessments of value and mattering. Though this may not make such affective states a further type of meaning or constitutive of meaning, these states reliably track instances of significance or perceived significance.

Like sense-making and purpose, significance is relevant to life’s meaning. In broad terms, one way of construing meaningful life is as a life that matters and has positive value. This, of course, admits of various understandings of mattering that, at one level, might track the objective naturalist, subjective naturalist, hybrid naturalist, and supernaturalist debate (see Section 3 below): matters to whom and according to what standard? Additionally, some find it difficult to separate personal and cosmic concerns over significance. Cosmic concerns, for many, are also intensely personal. If the universe as a whole lacks significance, some worry that their individual lives lack significance, or at least the kind that they think a deeply meaningful life requires.

b. The Word “Life”

Understanding what life’s meaning is all about is complicated, not just because of the expansive semantic range of “meaning,” but also because it is not immediately clear how we should understand the word “life” in the question. In asking for life’s meaning, we are not, at least most of us, asking for the meaning of the word “life.” Neither are we asking about how being alive is different from being non-living or how being organic is different from being inorganic. What then are we asking, and what is the scope of that request? Our question(s) about life’s meaning likely range over the following options:

Life1 = individual human life (meaning of my life)

Life2 = humanity as a whole (meaning of human existence)

Life3 = all biological life (meaning of all living organisms collectively)

Life4 = all of space-time existence (meaning of it all)

Life5 = rough marker for those aspects of human life that have a kind of existential gravitas and are of immense concern and the subject of intense questioning by human beings (see Section 2.e. below)

Each of these options for understanding “life” in the traditional formulation tracks possible interpretations of the question. The targets of our questions and concerns about meaning are varied in scope. We ask questions about our own, personal existence as well as questions about the entire show, and one might think that questions about personal meaning are connected to questions about cosmic meaning. Life5 provides a way of bringing important aspects of each together (see Section 2.e.).

c. The Definite Article

Another thorny issue for the traditional formulation is its incorporation of the definite article—the. It implies that there is only one meaning of life, which violates common inclinations that meaning is the sort of thing that varies from person to person. What makes one life meaningful is different from what makes another meaningful. One person might derive large doses of meaning from her career, another through gardening. For this reason, many are suspicious of the definite article.

There is good reason, though, to question this suspicion. First, it might reveal confusion about what meaning even is in the first place. Indeed, one of the aims of those working in the field is to clarify just what meaning is. Here, it is worth noting that many plausible theories of meaning have an objective component, indicating that not just anything goes for meaning. However, even if meaning were solely a matter of, say, being fulfilled, notice that the following two claims are still consistent: (1) the meaning of life is about being fulfilled and (2) sources of fulfillment are exceedingly diverse. Life’s meaning in this case is about being fulfilled (consistent across persons), but sources of fulfillment vary from person to person.

Second, one might also reasonably think that there is a single meaning of life at the cosmic level that itself is consistent with a rich variety of ways to lead a meaningful life (meaning in life at the terrestrial, personal level). Thinking through possibilities like this will connect with claims about what is true about the world, for example, whether there is a God with a plan for the cosmos and whether there is an overarching meaning to it all. In a case like this, there might be a single meaning of life, but the sense of meaning in which there is a single meaning could be different from the sense of meaning in which there are varied meanings. Regardless of the complexities here, the point is that one should not too quickly dismiss the definite article as contributing to intractable theoretical and practical problems for thinking about life’s meaning.

d. Meaning of Life vs. Meaning in Life

In what has become a standard distinction in the field, philosophers distinguish two ideas: the meaning of life (MofL) and meaning in life (MinL). Claims like the following are prevalent: “one can find meaning in her life, even if there is no grand, cosmic meaning of life.” MofL is more global or cosmic in scope, and often is intertwined with ideas like God, transcendence, religion, or a spiritual, sacred realm. In asking for life’s meaning, one is often asking for some sort of cosmic meaning, though she may also be asking for the meaning of her individual life from the perspective of the cosmos since many think the meaning of their individual lives is tied to whether or not there is a meaning of it all.

MinL is focused on personal meaning; the meaning of our individual lives as located in the web of human endeavors and relationships sub specie humanitatis—within the frame of human cares and concerns. Many think that we can legitimately talk about life having meaning in this sense regardless of what is true about the meaning of the universe as a whole.

One can see how the various senses of meaning discussed earlier in this entry intersect at both levels—MofL and MinL. For example, if sense-making is in view at the cosmic level, we might ask questions like the following: “What’s it all about?” or “How does it all fit together?” At the terrestrial, personal level, our sense-making questions might, rather, take the following shape: “What is my life about?” “How does my life fit together?” or “Is my life coherent?” If significance is in view at the cosmic level, we might ask, “Do our lives really matter in the grand scheme of things?” whereas terrestrially, personally, we might ask, “Does my life matter to me, my family, friends, or my community?”

e. What is the Meaning of x?

The locution, “What is the meaning of x?” need not be understood narrowly as the request for something semantic, say, for a definition or description. There are additional non-linguistic contexts in which this request makes perfect sense (see Nozick 1981). Some of them even share striking similarities to the question of life’s meaning. One in particular is especially relevant.

Sometimes we are confronted with circumstances that we do not yet sufficiently understand, in which case we might naturally respond by asking, “What’s this all about?” or “What’s going on here?” or “What happened?” or “What’s happening?” or “What does this mean?” or “What is the meaning of this?” In asking such questions, we are in search of sense-making and intelligibility. We walk in on our children fighting and demand: “What is the meaning of this?” Mary Magdalene and Mary the mother of James come to find a stone rolled away from a Roman-guarded tomb. The burial linens are there, but Jesus’ body is nowhere to be found. One can imagine them thinking, “What is the meaning of this?”

We naturally invoke the formula “What is the meaning of x?” in situations where x is some fact, event, phenomenon, or cluster of such things, and about which we want to know, in the words of New Testament scholar and theologian, N. T. Wright, its “implication in the wider world within which this notion makes the sense it makes” (Wright 2003: 719). Such requests track our desire to make sense of a situation, to render it intelligible with the further aim of acting appropriately in response—a kind of epistemic map to aid in practical, normative navigation.

Taking our cue from these ordinary examples, to inquire about life’s meaning is plausibly understood as asking something similar to our requests for the meaning of our children’s scuffle or of Jesus’ empty tomb. Over the course of our existence, we encounter aspects of the world that have a kind of existential gravitas in virtue of their role in defining and depicting the human condition. They capture our attention in a unique way. The word “life,” then, is a rough marker for these existentially-weighty aspects (Life5 in Section 2.b. above), aspects of life that give rise to profound questions for which we seek an explanatory framework (perhaps even a narrative framework) in order to make sense of them. These aspects of the world are akin to the portion of the scuffle and empty tomb above to which we already have limited informational access: yelling and throwing in the case of the scuffle, and the various pieces and clues observed at the empty tomb. Like the parent or Mary Magdalene in those situations, we lack important parts of life’s context, and we desire to fill in these existentially relevant gaps in our knowledge, and then live accordingly. We are in search of life’s meaning, where that meaning is, at center, a kind of overarching sense-making framework for answering and fitting together answers to our questions about origins, purpose, significance, value, suffering, and destiny.

f. Interpretive Strategies

i. The Amalgam Approach

The currently favored strategy for interpreting the traditional formulation of the question—What is the meaning of life?—is the amalgam approach. On this pluralist view, the question is not thought to be a single question at all, but rather an amalgam of numerous other questions, most of which share family resemblances. The question is, on this view, simply a place-holder (some think ill-conceived) for these other questions and is, itself, not capable of being answered in this form. Though it has no answer in this form, other questions about purpose, significance, value, worth, origins, and destiny might. We at least know what we ask when we ask them, so the thought goes. Suspicion of the traditional formulation often accompanies the amalgam view since that formulation makes use of the definite article (“the”), the word “meaning,” and the word “life,” which together in the grammatical form in which they are found contribute to a thorny interpretive challenge. Perhaps the best strategy, according to many proponents of the amalgam interpretation, is simply to jettison the traditional formulation and focus on trying to answer some among this other cluster of questions that collectively embody what we are concerned about when we inquire into life’s meaning.

ii. The Single Question Approach

Though the amalgam interpretation is the most popular view among those writing on life’s meaning within analytic philosophy, a few others have favored an approach that views the traditional formulation as a single question capable of being answered in that form (see Seachris 2009, 2019; Thomson 2003). A promising strategy here is to prioritize the sense-making connotation of meaning. On this version of the interpretive approach, asking about the meaning of life is first about seeking a sense-making explanation (perhaps even narrative explanation) for our questions and concerns about origins, purpose, significance, value, suffering, and destiny. Contrary to the amalgam interpretation, on this view, the question of life’s meaning is asking for a single thing—a sense-making explanation. It is, of course, an explanation squarely focused on all this other meaning of life “stuff.” This explanation can be thought of as a worldview or metanarrative. This approach is an organic interpretive strategy that seeks a single answer (e.g., narrative explanation) that unifies or integrates answers to all the sub-questions that define and depict the human condition. It provides the conceptual resources to account for both MofL and MinL. The cosmic and the personal, the epistemic and the normative, and the theoretical and the practical are inseparable in our search for meaning. The sense-making framework that we seek links all of this as we pursue meaningful lives in light of our place within the grand scheme of it all.

This version of the single-question approach, with its emphasis on sense-making, is closely related to the concept of worldview. Worldviews provide answers to the existentially weighty set of questions that brings into relief the human condition. As philosopher Milton Munitz notes:

. . . [people] may say that what they are looking for [when asking the question of life’s meaning] is an account of the “big picture” with whose aid they would be able to see not only their own individual personal lives, but the lives of everybody else, indeed of everything of a finite or limited sort, human or not. . . . The expression of such a concern involves, at bottom, the appeal to a “worldview” or “world picture.” This undertakes to give a description of the most inclusive setting within which human life is situated . . . (Munitz 1993: 30).

To offer a worldview, then, is to offer a putative meaning of life—a sense-making framework focused squarely on the set of questions and concerns surrounding origins, purpose, significance, value, suffering, and destiny.

Looking back further into the origin of the worldview concept strengthens the connection between worldview and life’s meaning, and offers important clues that a worldview provides a kind of sense-making meaning. Nineteenth century German historian and philosopher, Wilhelm Dilthey, spoke of a worldview as a concept that “. . . constitutes an overall perspective on life that sums up what we know about the world, how we evaluate it emotionally, and how we respond to it volitionally.” Worldviews possess three distinct yet interrelated dimensions: cognitive, affective, and practical. They address both MofL and MinL. A worldview is motivated out of a desire to answer what Dilthey calls the “riddle of existence”:

The riddle of existence faces all ages of mankind with the same mysterious countenance; we catch sight of its features, but we must guess at the soul behind it. This riddle is always bound up organically with that of the world itself and with the question what I am supposed to do in this world, why I am in it, and how my life in it will end. Where did I come from? Why do I exist? What will become of me? This is the most general question of all questions and the one that most concerns me (Dilthey 1980: 81-82).

Dilthey’s cluster of questions that motivate worldview construction are those same questions to which we want answers in seeking life’s meaning. In this way, life’s meaning might just be a sense-making framework. It is not a stretch to say that life’s meaning is that which worldviews aim to provide.

3. Theories of Meaning in Life

Beyond important preliminary discussions over the nature of the question itself and its constituent parts, one will find competing theories of meaning in life. Here, the debate is over the question of what makes a person’s life meaningful, not over the question of whether there is a cosmic meaning of it all (though, again, some think the two cannot be so easily disentangled). The four most influential views of meaning in life are: (1) Supernaturalism, (2) Objective Naturalism, (3) Subjective Naturalism, and (4) Hybrid Naturalism. (5) Nihilism is not a theory of meaning; rather, it is the denial of meaning, whether cosmic or personal. Objective, subjective, and hybrid naturalism are all optimistic forms of naturalism. They allow for the possibility of a meaningful existence in a world devoid of finite and infinite spiritual realities. Pessimistic naturalism, or what is commonly called “nihilism,” is generally, though not always, thought to be an implication of an entirely naturalistic ontology, though vigorous debate exists about whether naturalism entails nihilism.

a. Supernaturalism

Roughly, supernaturalism maintains that God’s existence, along with “appropriately relating” to God, is necessary and sufficient for securing a meaningful life, although accounts diverge on the specifics. Among countless others, historic representatives of supernaturalism in the Near-Eastern ancient world and in subsequent history include Qoheleth (the one called “Teacher” in the Old Testament book of Ecclesiastes), Jesus, the Apostle Paul, Augustine, Aquinas, Jonathan Edwards, Blaise Pascal, Leo Tolstoy, C. S. Lewis, and many contemporary analytic philosophers.

Meaningful life, on supernaturalism, consists of claims along metaphysical, epistemological, and relational-axiological axes. Metaphysically, meaningful life requires God’s existence because, for example, conditions that ground properties necessary for meaning like objective value are thought to be most plausibly anchored in a being like God (see Cottingham 2005; Craig 2008). It also requires, at some level, orthodoxy (right belief) and orthopraxy (right life and practice), though again, much debate exists on the details. In addition to God’s existence, meaning in life requires that a person be appropriately related to God, perhaps as expressed in one’s beliefs and especially in one’s devotion, worship, and the quality of her life lived with and among others as, for example, embodied in Jesus’ statement of the greatest commandments (cf. Matt. 22:34-40).

Pascal captures the spirit of supernaturalism in this passage from the Pensées:

What else does this craving, and this helplessness, proclaim but that there was once in man a true happiness, of which all that now remains is the empty print and trace? This he tries in vain to fill with everything around him, seeking in things that are not there the help he cannot find in those that are, though none can help, since this infinite abyss can be filled only with an infinite and immutable object; in other words by God himself (Pascal 1995: 45).

As does St. Augustine at the beginning of his Confessions:

. . . you have made us for yourself, and our heart is restless until it rests in you (St. Augustine 1963: 17).

It is worth noting that there are versions of supernaturalism that do not view God as necessary for meaningful life, but nonetheless claim that God and relating to God in appropriate ways would significantly enhance meaning in life. This more moderate form of supernaturalism allows for the possibility of meaningful life, in some measure, on naturalism (see Metz 2019 for a helpful taxonomy of the conceptual space here).

Supernaturalist views, whether stronger or more moderate, connect with questions and concerns about the problem of evil, post-mortem survival, and ultimate justice. It is often thought that a being like God is needed to “author and direct” the narrative of the universe, and, in some sense, the narratives of our individual lives to a good and blessed ending (involving both closure and teleological senses of ending, though not an absolute termination sense; see Seachris 2011). Many worry that, on naturalism, life does not make sense or is absurd (a kind of sense-making meaning; see Section 2.a.i. above) if there is no ultimate justice and redemption for the ills of this world, and if the last word is death and dissolution, followed by silence, forever.

b. Subjective Naturalism

Subjective naturalism is an optimistic naturalistic view in claiming that life can be robustly meaningful even if there is no God, after-life, or transcendent realm. In this, it is like objective and hybrid forms of naturalism. According to subjective naturalism, what constitutes a meaningful life varies from person to person, and is a function of getting what one strongly wants, achieving self-established goals, or accomplishing what one believes to be really important. Caring about or loving something deeply has been thought by some to confer meaning in life (see Frankfurt 1988). Some subjectivist views focus on affective states of a certain psychological profile, like fulfillment or satisfaction for example, as constituting the essence of meaningful life (see Taylor 1967). Subjectivism is appealing to some in light of perceived failures to ground objective value, either naturally, non-naturally, or supernaturally, and in accounting for the widespread view that meaning and fulfillment are closely connected.

A worry for subjective naturalism, analogous to ethical worries about moral relativism, is that this view is too permissive, allowing for bizarre or even immoral activities to ground meaning in life. Many protest that surely deep care and love, by themselves, are not sufficient to confer meaningfulness in life. What if someone claims to find meaning by measuring and re-measuring blades of grass or memorizing the entire catalogue of Netflix shows or, worse, torturing people for fun? Can a life centering on such pursuits be meaningful? A strong, widespread intuition here inclines many towards requiring a condition of objective value or worth on meaning. Subjectivism still has thoughtful defenders, though, with some proposals moving towards grounding value inter-subjectively—in community and its shared values—as opposed to in the individual exclusively. It is also worth noting that one could be a subjectivist about meaning while being an objectivist about morality. In this way, a fulfilled torturer might lead a meaningful, though immoral life. Meaning and morality, on this view, are distinct values that can, in principle, come into conflict.

c. Objective Naturalism

Objective naturalism, like subjective naturalism, posits that a meaningful life is possible in a purely physical world devoid of finite and infinite spiritual realities. It differs, though, in what is required for meaning in life. Objective naturalists claim that a meaningful life is a function of appropriately connecting with mind-independent realities that are of objective worth (contra subjectivism) and entirely natural (contra supernaturalism). Theories differ on the nature of this connection. Some require mere orientation around objective value, while others require a stronger causal connection with good outcomes (see Smuts 2013). Again, objective naturalism is distinguished from subjective naturalism by its emphasis on mind-independent, objective value. One way of putting the point is to say that wanting or choosing is insufficient for a meaningful life. For example, choosing to spend one’s waking hours memorizing the inventory of one’s local Target store, even if this activity results in fulfillment, is likely insufficient for meaning on objective naturalism. Rather, meaning is a function of linking one’s life to objectively valuable, mind-independent conditions that are not themselves the sole products of what one wants and chooses. On objective naturalism it is possible to be wrong about what confers meaning on life: something is meaningful, at least partly, in virtue of its intrinsic nature, irrespective of what is believed about it. This is why spending salient portions of one’s life memorizing department store inventories is not meaningful on objective naturalism, even if the person strongly desires to do this.

One worry for objective naturalism is that it may have a harder time accounting for cases of neural atypicality, for example, that of a person with autism spectrum disorder (ASD) who is deeply fulfilled by activities that seem to lack intrinsic value or worth. Does a person who is not a plumber, and for whom pipes and interactions with pipes provide salient goals, a kind of coherence to his life, and enjoyable experiences, fail to acquire meaning because it all largely revolves around a fascination with pipes? Might subjectivist views better account for the lives of those among us whose interests and interactions with the world are strikingly different, and for whom such interests are the result of neural atypicality?

Critics of objective naturalism might also press the point that proponents of this view conflate meaning and morality or at least conflate important aspects of these two putatively different kinds of value. One value might be objectively shaped, whereas the other might not.

d. Hybrid Naturalism

Many researchers think that there is something right about both objectivist and subjectivist views, but that each on its own is incomplete. Susan Wolf has developed what has come to be one of the more influential theories of meaning in life over the last decade or so, the fitting-fulfillment view. Her view includes both objective and subjective conditions, and is captured by the slogan, “Meaning arises when subjective attraction meets objective attractiveness” (Wolf 1997: 211). Meaning is not present in a life spent believing in, being fulfilled by, or caring about worthless projects, but neither is it present in a life spent engaging in worthwhile, objectively valuable projects without also believing in, being fulfilled by, or caring about them. Many think hybridist views capture what is best about objectivism and subjectivism while avoiding the pitfalls of each.

In their naturalistic forms, such theories of meaning are inconsistent with supernaturalism. However, one can imagine supernaturalist forms of each of these views. One might be a supernaturalist who thinks that meaning wholly or largely consists in subjective fulfillment in the Divine—a kind of subjectivism, or that meaning consists in orientation around objective value, again grounded in the Divine—a kind of objectivism. One could also formulate distinctly supernaturalist hybrid views.

e. Pessimistic Naturalism: Nihilism

In opposition to all optimistic views about the possibility of meaningful life is pessimistic naturalism, more commonly called nihilism. Roughly, nihilism is the view that denies that a meaningful life is possible because, literally, nothing has any value. Nihilism may be understood as a combination of theses and assumptions drawn from both supernaturalism and naturalism: (i) God or some supernatural realm is likely necessary for value and a meaningful life, but (ii) no such entity or realm exists, and therefore (iii) nothing is ultimately of value and there is, therefore, no meaning. Other forms of nihilism focus on states like boredom or dissatisfaction, arguing that boredom sufficiently characterizes life so as to make it meaningless, or that human lives lack the requisite amount of satisfaction to confer meaning upon them.

f. Structural Contours of Meaning in Life

If meaning is a distinct kind of value that a life can have, and if the three senses of meaning discussed earlier (see Section 2.a. above) capture the range of ideas encompassed by meaning, then these ideas can help illumine the conceptual shape of meaning in life. Each of the ordinary senses of “meaning” provides strategies for conceptualizing the broad structural contours of meaningful life.

Sense-making: An intelligible life; one that makes sense (broad sense-making), that fits together properly, and exhibits a kind of coherence (for example, relationally, vocationally, morally, spiritually, and so on), perhaps even narrative coherence.

Purpose: A life saliently oriented around purposes, goals, and aims, and lived on purpose in which the person’s autonomy is sufficiently engaged.

Significance: A life that matters (and has positive value)—intrinsically in virtue of the kind of life that it is and extrinsically in virtue of its implications and impacts, especially within the narrow (e.g., familial) and broad (e.g., cultural) relational webs of which the person is a part.

Though one can view these as largely different ways of thinking about what a meaningful life is, one might think that there is a more organic relationship between them. Here is one strategy through which all three senses of meaning might coalesce and bring into relief the full structural contours of meaningful life in a unified way:

Meaningful Life = A life that makes sense, that fits together properly (sense-making) in virtue of appropriate orientation around goals (purpose), other (atelic) activities (see Setiya 2017), and relationships that matter and have positive value (significance).

Philosophers may want to follow social scientists here in thinking more about this tripartite conception of meaning. Psychologists, for example, are increasingly using similar accounts in experimental design and testing. One prominent psychologist working in the area of meaning proposes a definition of meaning in life that incorporates a similar triad that prioritizes sense-making:

Meaning is the web of connections, understandings, and interpretations that help us comprehend our experience and formulate plans directing our energies to the achievement of our desired future. Meaning provides us with the sense that our lives matter, that they make sense, and that they are more than the sum of our seconds, days, and years (Steger 2012: 165).

4. Death, Futility, and a Meaningful Life

Life’s meaning is closely linked with a cluster of related issues including death, futility, and endings in general. These are important themes in the literature on meaning, and are found in a wide array of sources ranging from the Old Testament book of Ecclesiastes to Tolstoy to Camus to contemporary analytic writing on the topic. Worries that death, as conceived on naturalism, threatens meaning lead into discussions about futility. It is a commonly held view that life is futile if all we are and do eventually comes to nothing. If naturalism is true and death is the end . . . period . . . then life is futile, so the argument goes. Left undeveloped, it is not entirely clear what people mean by this, though the sentiment behind the idea is intense and prevalent.

In order to explore the worry further, it is important to get clearer on what is meant by futility. In ordinary cases, something is futile when the accomplishment or fulfillment of what is aimed at or desired is impossible. Examples of futility include:

It is futile for a human being to try to both exist and not exist at the same time and in the same sense.

It is futile to try to jump to Mars.

It is futile to try to write an entire, 300-page novel, from start to finish, in one hour.

On the preceding account of futility, the existential angst that accompanies some instance of futility is proportional to how one feels about what it is that is futile. The extent to which one is invested—for example, emotionally and relationally—in attempting to reach some desired end will affect how she responds to real or perceived futility (“perceived” because one could be wrong about whether or not something is, in fact, futile). Imagine that a person has a curiosity to experience flying as a falcon flies. It would be futile to attempt to fly as a falcon flies. Though this person might be minimally distressed as a result of not being able to experience this, it is doubtful he would experience soul-crushing angst. Contrast this with a situation where one has trained for years to run an ironman triathlon, but one week prior to the event, she is paralyzed from the neck down in a tragic automobile accident. To now try and compete in the triathlon without mechanical assistance would be futile. Given the importance of this goal in the person’s life, she would appropriately feel significant existential angst at not being able to compete. Years of training would be unrewarded. Deep hopes would be dashed. A central life goal is now forever unfulfilled. The level of existential angst accompanying futility, then, is proportional to the level of one’s investment in some desired end and the relative desirability of that end.

The preceding analysis is relevant to futility and life’s meaning. What might people have in mind when they say that life itself is futile if naturalism is true and death is the last word of our lives and the universe? The discrepancy here from which a sense of futility emerges is between central longings of the human heart and a world devoid of God and an afterlife, which is a world incapable of fulfilling such longings. There is a stark incongruity between what we really want (even what we might say we need) and a completely and utterly silent universe that does not care. There is also a discrepancy between the final state of affairs where quite literally nothing matters, and the current state of affairs where many things seem to matter (e.g., relationships, personal and cultural achievements, and scientific advancements, among others). It seems hard to fathom that things with such existential gravitas are but a vapor in the grand scheme of things. We might also call this absurd, since absurdity and futility are connected, both of which are partly encapsulated in the idea of a profound incongruity or lack of fit.

Futility, in this way, connects to hope and expectations about fulfillment and longevity. In some circumstances, we are inclined to think that something is characterized by futility if it does not last as long as we think it should last given the kind of thing that it is. If you spend half a day building a snow fort and your children destroy it in five minutes, you will be inclined to think that your efforts were futile even though you accomplished your goal of building the fort. You will not, however, think your efforts were futile if the fort lasts a few days and provides you and your children with several fun adventures and a classic snowball fight. It needs to last long enough to serve its purpose.

Some say that an average human lifetime with average human experiences is sufficient to satiate core human longings and for us to accomplish central purposes (see Trisel 2004). Others, however, think that only eternity is long enough to do justice to those aspects of the human condition of superlative value, primarily and especially, happiness and love, the latter understood roughly as commitment to the true good or well-being of another. Some things are of such sublime character that for them to be extinguished, even after eons upon eons, is truly tragic, so the thinking goes. Anything less than forever is less than enough time, and leads to a sense of futility. We want the most important things in life—especially happiness, love and relationships—to last indefinitely. But if naturalism is true, all will be dissolved in the death of ourselves and the universe; it will be as if none of this ever happened. If the important stuff of life that we are so invested in lasts only a short while, many worry that life itself is deeply and ultimately futile.

Futility, then, is sometimes linked with how something ends. With life’s meaning in view, many worry that its meaning is jeopardized if, in the end, all comes indelibly to naught. Such worries have been articulated in what some call Final Outcome Arguments (see Wielenberg 2006). A final outcome argument is one whose conclusion is that life is somewhat or wholly meaningless or absurd or futile because of a “bad” ending. Such arguments can have weaker and stronger conclusions, ranging from a “bad” ending only slightly mitigating meaning all the way to completely destroying meaning. What they all have in common, however, is that they give the ending an important say in evaluating life’s meaning.

Why think that endings have such power? Many have argued that giving them this power arbitrarily privileges the future over the past. Thomas Nagel once said that “. . . it does not matter now that in a million years nothing we do now will matter” (Nagel 1971: 716). Why should we think the future is more important than, or relevant at all to, the past and the present? But perhaps Nagel is mistaken. There may, in fact, be good reasons to think that how life ends is relevant for evaluating its meaning (see Seachris 2011). Whichever conclusion one adopts, principled reasons must be offered to settle the question of which viewpoint (the distant future or the immediate present) takes priority in appraisals of life’s meaning.

5. Under-investigated Areas

Within value theory, an under-investigated area is how meaning fits within the overall normative landscape. How is it connected, if at all, with ethical, aesthetic, and eudaimonistic value, for example? What sorts of relationships, conceptual, causal or otherwise, exist between the various values? Do some reduce to others? Can profoundly unethical lives still count as meaningful? What about profoundly unhappy lives? These and other questions are on the table as a growing number of researchers investigate them.

Another area in need of increased attention is the relationship between meaning and suffering. Suffering intersects with our attempts to make sense of our lives in this universe, motivates our questions about why we are here, and gives rise to our concerns about whether or not we ultimately matter. We wonder if there is an intelligible, existentially satisfying narrative in which to locate (that is, make sense of) our visceral experience of suffering, and to give us solace and hope. Evil in a meaningful universe does not cease to be evil, but it can be more bearable within these hospitable conditions. Perhaps the problem of meaning is more fundamental than the problem of evil. Also relevant is what can be called the eschatological dimension of the problem of evil: is there any hope in the face of pain, suffering, and death, and if so, in what does this hope consist? Addressing future-oriented considerations of suffering will naturally link to perennial meaning of life topics like death and futility. Additionally, it will motivate further discussion over whether the inherent human desire for a felicitous ending to life’s narrative (including, for example, post-mortem survival and enjoyment of the beatific vision or some other blessed state) is mere wishful thinking or a cousin to our desire for water, and thus a truly natural desire that points to an object capable of fulfilling it.

Equally under-investigated is how the concept of narrative (and meta-narrative) might shed light on the meaning of life, and especially what talk of life’s meaning is often all about. Historically, most of the satisfying narratives that in some way narrated the meaning of life were also religious or quasi-religious. Additionally, many of these narratives count as narratives in the paradigmatic sense as opposed to non-narrative modes of discourse. However, with the rise of naturalism in the West, these narratives, and the religious or quasi-religious worldviews embedded within them, began to lose traction in certain sectors. Out of this milieu emerged more angst-laden questioning of life’s meaning accompanied by the fear that a naturalistic meta-narrative of the universe fails to be existentially satisfying. More work is needed by cognitive scientists, theologians, and philosophers on our narrative proclivities as human beings, and how these proclivities shape and illumine our pursuit of meaning.

Finally, a number of pressing practical and ethical questions, especially focusing on marginalized populations, deserve more careful attention. For example, how might the actual lives and experiences of persons with disabilities inform and constrain theories of meaning in life? Do their lives call into question certain theories of meaning? What does the practice of solitary confinement reveal about the human need for meaning? Does the profound lack of meaning in such circumstances provide a reason to impose stricter limitations on its use? How might the human need for meaning (see Bettelheim 1978; Frankl 2006) be leveraged to understand and then address systemic societal issues like homelessness and opioid addiction? How can understanding seemingly pathological expressions of our yearning for meaning help make sense of and respond to nationalism and terrorism?

Analytic philosophy, once deeply skeptical of and indifferent to the meaning of life, is now the source of important and interesting new theorization on the topic. There is even something of a subfield emerging, consisting of researchers devoting significant time and energy to understanding conceptual and practical aspects of life’s meaning. The topic is being approached with an analytic rigor that is leading to progress and opening exciting avenues for promising new breakthroughs. The philosophical waters, though still murky, are clearing.

6. References and Further Reading

Adams, E. M. “The Meaning of Life.” International Journal for Philosophy of Religion 51 (April 2002): 71-81.
Antony, Louise M., ed. Philosophers Without Gods: Meditations on Atheism and the Secular Life. Oxford: Oxford University Press, 2007.
Audi, Robert. “Intrinsic Value and Meaningful Life.” Philosophical Papers 34 (2005): 331-55.
Augustine. The Confessions of St. Augustine. Trans. by Rex Warner. New York: Mentor, 1963.
Baggini, Julian. What’s It All About? Philosophy & the Meaning Of Life. Oxford: Oxford University Press, 2004.
Baumeister, Roy F. Meanings of Life. New York: The Guilford Press, 1991.
Baumeister, Roy F., Kathleen D. Vohs, Jennifer Aaker, and Emily N. Garbinsky. “Some Key Differences between a Happy Life and a Meaningful Life.” Journal of Positive Psychology 8:6 (2013): 505-516.
Benatar, David. Better Never to Have Been: The Harm of Coming into Existence. Oxford: Oxford University Press, 2009.
Benatar, David. The Human Predicament: A Candid Guide to Life’s Biggest Questions. New York: Oxford University Press, 2017.
Benatar, David, ed. Life, Death & Meaning: Key Philosophical Readings on the Big Questions. Lanham, MD: Rowman & Littlefield Publishers, 2004.
Berger, Peter. The Sacred Canopy. New York: Doubleday, 1967.
Bernstein, J. M. “Grand Narratives.” in Paul Ricoeur: Narrative and Interpretation, ed. David Wood, 102-23. London: Routledge, 1991.
Bettelheim, Bruno. The Uses of Enchantment. New York: Knopf, 1978.
Bielskis, Andrius. Existence, Meaning, Excellence: Aristotelian Reflections on the Meaning of Life. London: Routledge, 2017.
Blessing, Kimberly A. “Atheism and the Meaningfulness of Life.” in The Oxford Handbook of Atheism. New York: Oxford University Press, 2013: 104-118.
Bortolotti, Lisa, ed. Philosophy and Happiness. Hampshire, UK: Palgrave Macmillan, 2009.
Bradley, Ben. “Existential Terror.” Journal of Ethics 19 (2015): 409-18.
Britton, Karl. Philosophy and the Meaning of Life. Cambridge: Cambridge University Press, 1969.
Calhoun, Cheshire. “Geographies of Meaningful Living.” Journal of Applied Philosophy 32:1 (2015): 15-34.
Campbell, Stephen M., and Sven Nyholm. “Anti-Meaning and Why It Matters.” Journal of the American Philosophical Association 1:4 (Winter 2015): 694-711.
Camus, Albert. The Myth of Sisyphus and Other Essays. Translated by Justin O’Brien. New York: Vintage International, 1983.
Chappell, Timothy. “Infinity Goes Up On Trial: Must Immortality Be Meaningless?” European Journal of Philosophy 17 (March 2009): 30-44.
Cottingham, John. On the Meaning of Life. London: Routledge, 2003.
Cottingham, John. The Spiritual Dimension: Religion, Philosophy and Human Value. Cambridge: Cambridge University Press, 2005.
Craig, William Lane. “The Absurdity of Life Without God.” in Reasonable Faith: Christian Truth and Apologetics, 3rd Ed., 65-90. Wheaton, IL: Crossway Books, 2008.
Crane, Tim. The Meaning of Belief: Religion from an Atheist’s Point of View. Cambridge, MA: Harvard University Press, 2017.
Davis, William H. “The Meaning of Life.” Metaphilosophy 18 (July/October 1987): 288-305.
Dilthey, Wilhelm. Gesammelte Schriften, 8:208-9, quoted by Theodore Plantinga, Historical Understanding in the Thought of Wilhelm Dilthey. Toronto: University of Toronto Press, 1980.
Eagleton, Terry. The Meaning of Life. Oxford: Oxford University Press, 2007.
The Book of Ecclesiastes.
Edwards, Paul. “Life, Meaning and Value of.” in The Encyclopedia of Philosophy, Vol. 4, ed. Paul Edwards, 467-477. New York: Macmillan Publishing Company, 1967.
Edwards, Paul. “Why.” in The Encyclopedia of Philosophy, Vols. 7 & 8, ed. Paul Edwards, 296-302. New York: Macmillan Publishing Company, 1972.
Flew, Antony. “Tolstoi and the Meaning of Life.” Ethics 73 (January 1963): 110-18.
Fischer, John Martin. “Free Will, Death, and Immortality: The Role of Narrative.” Philosophical Papers 34 (November 2005): 379-403.
Fischer, John Martin. “Recent Work on Death and the Meaning of Life.” Philosophical Books 34 (April 1993): 65-74.
Fischer, John Martin. “Why Immortality is Not So Bad.” International Journal of Philosophical Studies 2 (September 1994): 257-70.
Flanagan, Owen. The Really Hard Problem: Meaning in a Material World. Cambridge, MA: MIT, 2007.
Ford, David. The Search for Meaning: A Short History. Berkeley, CA: University of California Press, 2007.
Frankfurt, Harry. The Importance of What We Care About. New York: Cambridge University Press, 1988.
Frankl, Viktor. Man’s Search for Meaning. Boston: Beacon Press, 2006.
Friend, David, and the Editors of LIFE. More Reflections on The Meaning of Life. Boston: Little Brown and Company, 1992.
Froese, Paul. On Purpose: How We Create the Meaning of Life. Oxford: Oxford University Press, 2016.
Gillespie, Ryan. “Cosmic Meaning, Awe, and Absurdity in the Secular Age: A Critique of Religious Non-Theism.” Harvard Theological Review 111:4 (2018): 461-487.
Goetz, Stewart. The Purpose of Life: A Theistic Perspective. London: Continuum, 2012.
Goetz, Stewart, and Joshua W. Seachris. What is This Thing Called The Meaning of Life? New York: Routledge, 2020.
Goldman, Alan H. Life’s Values: Pleasure, Happiness, Well-Being, & Meaning. Oxford: Oxford University Press, 2018.
Goodenough, Ursula W. “The Religious Dimensions of the Biological Narrative.” Zygon 29 (December 1994): 603-18.
Gordon, Jeffrey. “Is the Existence of God Relevant to the Meaning of Life?” Modern Schoolman 60 (May 1983): 227-46.
Gordon, Jeffrey. “Nagel or Camus on the Absurd?” Philosophy and Phenomenological Research 45 (September 1984): 15-28.
Haldane, John. Seeking Meaning and Making Sense. Exeter, UK: Imprint Academic, 2008.
Hamilton, Christopher. Living Philosophy: Reflections on Life, Meaning and Morality. Edinburgh: Edinburgh University Press, 2009.
Haught, John F. Is Nature Enough? Meaning and Truth in the Age of Science. Cambridge: Cambridge University Press, 2006.
Hepburn, R. W. “Questions about the Meaning of Life.” Religious Studies 1 (April 1966): 125-40.
Himmelmann, Beatrix, ed. On Meaning in Life. Boston: De Gruyter, 2013.
Holland, Alan. “Darwin and the Meaning of Life.” Environmental Values 18:4 (2009): 503-518.
Holley, David M. Meaning and Mystery: What it Means to Believe in God. Malden, MA: Wiley-Blackwell, 2010.
Kahane, Guy. “Our Cosmic Insignificance,” Noûs (2013): 1-28.
Karlsson, Niklas, George Loewenstein, and Jane McCafferty. “The Economics of Meaning.” Nordic Journal of Political Economy 30:1: 61-75.
Kauppinen, Antti. “Meaningfulness and Time.” Philosophy and Phenomenological Research 84 (2012): 345-77.
Kekes, John. “The Meaning of Life.” Midwest Studies in Philosophy 24 (2000): 17-34.
Kekes, John. The Human Condition. New York: Oxford University Press, 2010.
King, Laura A., Samantha J. Heintzelman, and Sarah J. Ward, “Beyond the Search for Meaning: A Contemporary Science of the Experience of Meaning in Life,” Current Directions in Psychological Science, 25:4 (2016): 211-216.
Klemke, E. D., and Steven M. Cahn, eds. The Meaning of Life. 4th edn. New York: Oxford University Press, 2017.
Kraay, Klaas J. Does God Matter? Essays on the Axiological Consequences of Theism. New York: Routledge, 2018.
Lacey, Alan. “The Meaning of Life,” in The Oxford Companion to Philosophy, 2nd ed., ed. Ted Honderich. New York: Oxford University Press, 2005.
Landau, Iddo. Finding Meaning in an Imperfect World. New York: Oxford University Press, 2017.
Landau, Iddo. “Life, Meaning of” in The International Encyclopedia of Ethics. Wiley-Blackwell, 2013: 3043-3047.
Landau, Iddo. “The Meaning of Life Sub Specie Aeternitatis.” Australasian Journal of Philosophy 89:4 (2011): 727-734.
Law, Stephen. “The Meaning of Life.” Think 11 (2012): 25-38.
Leach, Stephen, and James Tartaglia, eds. The Meaning of Life and the Great Philosophers. London: Routledge, 2018.
Levine, Michael. “What Does Death Have to Do with the Meaning of Life?” Religious Studies 23 (1987): 457-65.
Levy, Neil. “Downshifting and Meaning in Life.” Ratio 18 (June 2005): 176-89.
Lewis, C. S. “De Futilitate.” in Christian Reflections. Grand Rapids, MI: William B. Eerdmans Publishing Company, 1995.
Lewis, C. S. “On Living in an Atomic Age,” in Present Concerns. San Diego: Harcourt, Inc., 1986.
Luper-Foy, Stephen. “The Absurdity of Life.” Philosophy and Phenomenological Research 52 (1992): 85-101.
Lurie, Yuval. Tracking the Meaning of Life: A Philosophical Journey. Columbia, MO: University of Missouri Press, 2006.
MacIntyre, Alasdair. After Virtue, 3rd Edn. Notre Dame, IN: University of Notre Dame Press, 2007.
Makkreel, Rudolf A. “Dilthey, Wilhelm,” in The Cambridge Dictionary of Philosophy, ed. Robert Audi. Cambridge: Cambridge University Press, 2001.
Markus, Arjan. “Assessing Views of Life: A Subjective Affair?” Religious Studies 39 (2003): 125-43.
Martela, Frank, and Michael F. Steger, “The Three Meanings of Meaning in Life: Distinguishing Coherence, Purpose, and Significance,” The Journal of Positive Psychology, 11:5 (2016): 531-45.
Martin, Michael. Atheism, Morality, and Meaning. Amherst, NY: Prometheus Books, 2002.
Mawson, Timothy. God and the Meanings of Life: What God Could and Couldn’t do to Make Our Lives More Meaningful. London: Bloomsbury, 2016.
Mawson, Timothy. “Recent Work on the Meaning of Life and Philosophy of Religion.” Philosophy Compass 8 (2013): 1138-1146.
Mawson, Timothy. “Sources of Dissatisfaction with Answers to the Question of the Meaning of Life.” European Journal for Philosophy of Religion 2 (Autumn 2010): 19-41.
May, Todd. A Significant Life: Human Meaning in a Silent Universe. Chicago: University of Chicago Press, 2016.
McDermott, John J. “Why Bother: Is Life Worth Living?” The Journal of Philosophy 88 (November 1991): 677-83.
McGrath, Alister E. Surprised by Meaning. Louisville, KY: Westminster John Knox, 2011.
Metz, Thaddeus. “The Concept of a Meaningful Life.” American Philosophical Quarterly 38 (April 2001): 137-53.
Metz, Thaddeus. “Could God’s Purpose be the Source of Life’s Meaning?” Religious Studies 36 (2000): 293-313.
Metz, Thaddeus. “God’s Purpose as Irrelevant to Life’s Meaning: Reply to Affolter.” Religious Studies 43 (December 2007): 457-64.
Metz, Thaddeus. God, Soul and the Meaning of Life (Elements in the Philosophy of Religion). Cambridge: Cambridge University Press, 2019.
Metz, Thaddeus. “The Immortality Requirement for Life’s Meaning.” Ratio 16 (June 2003): 161-77.
Metz, Thaddeus. Meaning in Life. Oxford: Oxford University Press, 2016.
Metz, Thaddeus. “The Meaning of Life,” The Stanford Encyclopedia of Philosophy (Summer 2007 Edition), Edward N. Zalta (ed.).
Metz, Thaddeus. “New Developments in the Meaning of Life.” Philosophy Compass 2 (2007): 196-217.
Metz, Thaddeus. “Recent Work on the Meaning of Life,” Ethics 112 (July 2002): 781-814.
Metz, Thaddeus. “Utilitarianism and the Meaning of Life.” Utilitas 15 (March 2003): 50-70.
Morris, Thomas V. Making Sense of It All: Pascal and the Meaning of Life. Grand Rapids: William B. Eerdmans Publishing Company, 2002.
Moser, Paul K. “Divine Hiddenness, Death, and Meaning,” in Philosophy of Religion: Classic and Contemporary Issues, ed. Paul Copan and Chad Meister, 215-27. Malden, MA: Blackwell Publishers, 2008.
Munitz, Milton K. Does Life Have A Meaning? Buffalo, NY: Prometheus Books, 1993.
Nagel, Thomas. “The Absurd.” The Journal of Philosophy 68 (1971): 716-27.
Nagel, Thomas. Secular Philosophy and the Religious Temperament: Essays 2002-2008. Oxford: Oxford University Press, 2010.
Nozick, Robert. “Philosophy and the Meaning of Life.” in Philosophical Explanations. Cambridge, MA: Belknap, 1981. 571-79; 585-600.
O’Brien, Wendell. “Meaning and Mattering.” The Southern Journal of Philosophy 34 (1996): 339-60.
Oliva, Mirela. “Hermeneutics and the Meaning of Life.” Epoché 22:2 (Spring 2018): 523-39.
Pascal, Blaise. Pensées. Translated by A. J. Krailsheimer. London: Penguin Books, 1995.
Perrett, Roy W. “Tolstoy, Death and the Meaning of Life.” Philosophy 60 (April 1985): 231-45.
Pritchard, Duncan. “Absurdity, Angst, and the Meaning of Life.” The Monist 93 (January 2010): 3-16.
Rosenberg, Alex. The Atheist’s Guide to Reality: Enjoying Life Without Illusions. New York: Norton, 2011.
Rosenberg, Alex, and Tamler Sommers. “Darwin’s Nihilistic Idea.” Biology and Philosophy 18 (2003): 653-68.
Ruse, Michael. A Meaning to Life. Oxford: Oxford University Press, 2019.
Ruse, Michael. On Purpose. Princeton: Princeton University Press, 2017.
Russell, Bertrand. “A Free Man’s Worship.” in Why I Am Not a Christian and Other Essays on Religion and Related Subjects. New York: Touchstone, 1957. 104-16.
Russell, L. J. “The Meaning of Life.” Philosophy 28 (January 1953): 30-40.
Sartre, Jean-Paul. Existentialism & Humanism. Translated by Philip Mairet. London: Methuen, 1973.
Sartre, Jean-Paul. Nausea. Translated by Lloyd Alexander. New York: New Directions, 1964.
Schopenhauer, Arthur. Essays and Aphorisms. Translated by R. J. Hollingdale. London: Penguin Books, 2004.
Seachris, Joshua. “Death, Futility, and the Proleptic Power of Narrative Ending.” Religious Studies 47:2 (June 2011): 141-63.
Seachris, Joshua. “From the Meaning Triad to Meaning Holism: Unifying Life’s Meaning.” Human Affairs 49:4 (2019).
Seachris, Joshua W. “The Meaning of Life as Narrative: A New Proposal for Interpreting Philosophy’s ‘Primary’ Question.” Philo 12 (Spring-Summer 2009): 5-23.
Seachris, Joshua W. “The Sub Specie Aeternitatis Perspective and Normative Evaluations of Life’s Meaningfulness: A Closer Look,” Ethical Theory and Moral Practice 16 (2013): 605-620.
Seachris, Joshua, ed. Exploring the Meaning of Life: An Anthology and Guide. Malden, MA: Blackwell, 2012.
Seachris, Joshua, and Stewart Goetz. eds. God and Meaning: New Essays. New York: Bloomsbury Academic, 2016.
Setiya, Kieran. Midlife: A Philosophical Guide. Princeton: Princeton University Press, 2017.
Sharpe, R. A. “In Praise of the Meaningless Life.” Philosophy Now 25 (Summer 1999): 15.
Sherry, Patrick. “A Neglected Argument for Immortality.” Religious Studies 19 (March 1983): 13-24.
Sigrist, Michael J. “Death and the Meaning of Life.” Philosophical Papers 44:1 (March 2015): 83-102.
Singer, Irving. The Creation of Value. Volume 1 of Meaning in Life. Baltimore: The Johns Hopkins University Press, 1996.
Smart, J. J. C. “Meaning and Purpose.” Philosophy Now 24 (Summer 1999): 16.
Smith, Michael. “Is That All There Is?” The Journal of Ethics 10 (January 2006): 75-106.
Smuts, Aaron. “The Good Cause Account of the Meaning of Life.” Southern Journal of Philosophy 51:4 (2013): 536-62.
Steger, Michael F. “Experiencing meaning in life: Optimal functioning at the nexus of well-being, psychopathology, and spirituality,” (pp. 165-184) in The Human Quest for Meaning, Ed. P. T. P. Wong. New York: Routledge, 2012.
Suckiel, Ellen Kappy. “William James on the Cognitivity of Feelings, Religious Pessimism, and the Meaning of Life.” The Journal of Speculative Philosophy 17 (2003): 30-39.
Svendsen, Lars. A Philosophy of Boredom. Trans. by John Irons. London: Reaktion Books, 2005.
Tartaglia, James. Philosophy in a Meaningless Life. London: Bloomsbury Academic, 2015.
Taylor, Richard. “Time and Life’s Meaning.” Review of Metaphysics 40 (1987): 675-86.
Taylor, Richard. “The Meaning of Life.” in Good and Evil. New York: Macmillan Publishing, 1967.
Thomas, Joshua Lewis. “Meaningfulness as Sensefulness.” Philosophia (2019): https://doi.org/10.1007/s11406-019-00063-x.
Thomson, Garrett. On the Meaning of Life. London: Wadsworth, 2003.
Tolstoy, Leo. “A Confession.” in Spiritual Writings. Maryknoll, NY: Orbis Books, 2006.
Trisel, Brooke Alan. “Futility and the Meaning of Life Debate.” Sorites 14 (2002): 70-84.
Trisel, Brooke Alan. “Human Extinction and the Value of Our Efforts.” The Philosophical Forum 35 (Fall 2004): 371-91.
Trisel, Brooke Alan. “Human Extinction, Narrative Ending, and Meaning of Life.” Journal of Philosophy of Life 6:1 (April 2016): 1-22.
Vernon, Mark. After Atheism: Science, Religion, and the Meaning of Life. New York: Palgrave Macmillan, 2008.
Waghorn, Nicholas. Nothingness and the Meaning of Life: Philosophical Approaches to Ultimate Meaning Through Nothing and Reflexivity. London: Bloomsbury, 2014.
White, Heath. “Mattering and Mechanism: Must a Mechanistic Universe be Depressing?” Ratio 24 (September 2011): 326-39.
Wielenberg, Erik J. Value and Virtue in a Godless Universe. Cambridge: Cambridge University Press, 2006.
Williams, Bernard. “The Makropulos Case: Reflections on the Tedium of Immortality.” in The Metaphysics of Death, ed. John Martin Fischer, 73-92. Stanford, CA: Stanford University Press, 1993.
Wisnewski, J. J. “Is the Immortal Life Worth Living?” International Journal for Philosophy of Religion 58 (2005): 27-36.
Wolf, Susan. “Happiness and Meaning: Two Aspects of the Good Life.” Social Philosophy and Policy 14 (December 1997): 207-25.
Wolf, Susan. Meaning in Life and Why It Matters. Princeton: Princeton University Press, 2010.
Wolf, Susan. “Meaningful Lives in a Meaningless World,” Quaestiones Infinitae 14 (June 1997): 1-22.
Wright, N. T. The Resurrection of the Son of God. Vol. 3. Christian Origins and the Question of God. Minneapolis: Fortress Press, 2003.
Young, Julian. The Death of God and the Meaning of Life. London: Routledge, 2004.
Young, Julian. “Nihilism and the Meaning of Life.” in The Oxford Handbook of Continental Philosophy, eds. Brian Leiter and Michael Rosen. Oxford: Oxford University Press, 2007.

Author Information

Joshua Seachris
Email: jseachris@nd.edu
University of Notre Dame
U. S. A.

Cognitive Penetrability of Perception and Epistemic Justification

Perceptual experience is one of our fundamental sources of epistemic justification—roughly, justification for believing that a proposition is true. The ability of perceptual experience to justify beliefs can nevertheless be questioned. This article focuses on an important challenge that arises from countenancing that perceptual experience is cognitively penetrable.

The thesis of cognitive penetrability of perception states that the content of perceptual experience can be influenced by prior or concurrent psychological factors, such as beliefs, fears and desires. Advocates of this thesis could, for instance, claim that your desire to have a tall daughter might influence your perception, so that she appears to you to be taller than she is. Although cognitive penetrability of perception is a controversial empirical hypothesis, it does not appear implausible. The possibility of its veracity has been cited in order to challenge positions that maintain that perceptual experience has inherent justifying power.

This article presents some of the most influential positions in contemporary literature about whether cognitive penetration would undermine perceptual justification and why it would or would not do so.

Some sections of this article focus on phenomenal conservatism, a popular conception of epistemic justification that more than any other has been targeted with objections that appeal to the cognitive penetrability of experience.

1. Cognitive Penetrability of Perception and its Consequences
a. What is Cognitive Penetrability?
b. The Epistemic Problem of Cognitive Penetrability
2. Responses to the Epistemic Problem of Cognitive Penetrability
3. Conclusion
4. References and Further Reading

1. Cognitive Penetrability of Perception and its Consequences

a. What is Cognitive Penetrability?

Our perceptual experiences present to us (accurately or not) facts in the world. For instance, you can have an experience as if a bird is singing or as if this ball is red. In these cases, that a bird is singing and that this ball is red can be said to be the representational contents of your experiences.

The cognitive penetrability of perception is a controversial empirical thesis that holds that the content of perceptual experience can partly be shaped by prior or concurrent psychological factors, such as beliefs, desires, traits, moods, entertained hypotheses, conjectures, emotions, expectations, hopes, wishes, doubts, suspicions, attitudes or knowledge that can be acquired through the right training. Whether cognitive penetrability of perception is a real phenomenon is investigated by cognitive science (Raftopoulos and Zeimbekis 2015). Relevant scientific experiments are described for instance in Payne (2001), Hansen et al. (2006), and Stokes and Payne (2011).

To familiarize ourselves with the notion of cognitive penetrability of perception, let us consider two imaginary cases of cognitive penetration: Siegel’s (2013a, 2017) Angry Jack and Markie’s (2005, 2006, 2013) Expert and Novice case (adjusted for the purposes of this article).

Angry Jack

Jill believes without good reason that Jack is angry. When she meets Jack, under the influence of her unjustified belief that Jack is angry, she sees Jack as being angry. Based on her perceptual experience as if Jack is angry, she retains the same belief and, perhaps, her confidence that Jack is angry is even enhanced. Had she not had the prior belief that Jack is angry with her, she would not have seen him as being angry.

Expert and Novice

Two friends are gold prospectors. One of them is an expert at identifying gold. He has learned to do so through long experience. He began with a list of identification rules and consciously applied them. He then reached the point where he could “just see” that a nugget is gold. The other friend is a novice. He has a general sense of what gold looks like, but he is not very good at its visual identification. He nevertheless craves to make a discovery. When the two friends happen to look at a nugget in a pan, the expert’s developed gold-identification abilities come into play, and he has the perceptual experience as if the nugget is gold. The expert believes accordingly. The novice’s strong desire that it be gold comes into play too, and he also has the perceptual experience as if the nugget is gold. The novice believes accordingly. Had the novice not had a strong desire to find gold, he would not have had the perceptual experience as if the nugget is gold. Had the expert not had very developed gold-identification abilities, he would not have had the experience as if the nugget is gold.

These two cases are supposed to be situations in which the contents of the relevant perceptual experiences are somewhat influenced by the subject’s prior mental states. Jill’s experience is influenced by her prior belief that Jack is angry. The novice’s experience is influenced by his strong desire to find gold, and the expert’s experience is influenced by his knowledge and experience. They are possible cases of cognitive penetration of perception.

As we see in the next section, the problem that cognitive penetrability poses to theories of perceptual justification rests on the intuition that in at least some cases in which perceptual experience is cognitively penetrated, justification is affected negatively. For instance, despite her experience as if Jack is angry, there seems to be something wrong in claiming that Jill has justification for believing that Jack is angry. The same applies to the novice’s case.

Arguably, there are also cases of good cognitive penetration of perception: namely, situations in which the subject’s experience is actually a good basis for some of her beliefs just because it is cognitively penetrated.

An example might be the expert’s cognitively penetrated experience as if the nugget is gold in Expert and Novice. Siegel (2012) provides another possible example in which a cognitively penetrated experience of an expert radiologist inspecting the X-ray of a patient is contrasted with a non-penetrated experience of a non-expert who attends to the same X-ray. Lyons (2011) suggests further examples involving perceptual learning as cases of good cognitive penetration. Perceptual learning is a process based on training and experience that ends up producing changes in the subject’s perceptual abilities (Connolly 2017). Perceptual learning is a form of diachronic cognitive penetration. Lyons also imagines a case of synchronic good cognitive penetration, the Snake Case, involving the sharpening of one’s snake-detection skills in virtue of one’s unjustified belief or fear that there are snakes on one’s trail.

Before going deeper into the relations between cognitive penetration and epistemic justification, we need to have a more accurate picture of what cognitive penetration of perceptual experience consists of.

Not just any kind of influence on perception by psychological states produces cognitive penetration. Some mental states might influence perceptual experience indirectly simply because they change the location from where the subject receives the perceptual stimuli. For example, if I desire to watch TV, I will turn my head towards the TV. So my experience will change from representing the monitor of my laptop to representing the TV. The change in perception imputable to cognitive penetration must not be explainable in terms of a reception of different perceptual stimuli due to body movements, defects of our sensory organs or—more controversially—a difference in the spatio-temporal locations attended to by the subject’s covert attention (Stokes 2012 and Vance 2014).

Siegel (2012), for instance, excludes voluntary shifts of attention from the definition of cognitive penetration. Nevertheless, she mentions as interesting those cases of cognitive penetration that involve relative indifference to stimuli or an attentional selection bias in favor of only particular loci of the stimuli.

For the time being, let us follow Siegel (2012) in accepting that in most cases of cognitive penetration this counterfactual would be satisfied: if S had a cognitive mental state different from the one she actually has, but attended to the same perceptual stimuli as those she actually attends to, S would not have the same perceptual experience. For instance, if the belief that Jack is angry were not part of Jill’s mental state, but Jill still attended to the very same features of Jack’s face, she would not have the perceptual experience as if Jack is angry.

Many philosophers of mind and epistemologists agree that perceptual experience has at least two interplaying components: sensory impressions (for example, colors, smells and tastes), and concepts (for example, the concept of bird and the concept of ball). These philosophers would claim that in order to have the experience as if, say, this ball is red, you need to combine a round and a red sense impression together with the concepts of ball and red into one suitable representational state.

As we will see later, the thesis that the perceptual experience of a subject S can be cognitively penetrated is often interpreted in a disjunctive fashion as stating that the sensory impression component or the conceptual component of S’s experience can be cognitively penetrated. In the first case, S’s prior or concurrent mental states would directly change the low-level, non-conceptual part/stage of S’s experience. For instance, suppose that under the influence of her belief that Jack is angry, Jill comes to have visual sensations that typically lead to the formation of a higher-level conceptual angry-face representation. On the grounds of these sensations, it does appear to her that Jack is angry. In the second case, S’s prior or concurrent states would directly affect the part/stage of S’s experience that is conceptual. One could interpret the novice prospector case as an example of this: the novice’s strong desire to find gold produces an experience that, thanks to the concepts embedded in it, represents the nugget before him as gold.

It is important to distinguish S’s perceptual experiences and S’s doxastic states that can accompany these experiences. A perceptual experience as if P may be accompanied by a belief or judgment that P, but this belief or judgment would not be a part of the perceptual experience. Suppose for instance that S does have a perceptual experience as if this ball is red. Concurrently, S may or may not believe or judge that this ball is red. In the same way, one’s perceptual experience as if P may be accompanied by one’s reflective belief that one has a perceptual experience as if P, but this reflective belief would be something distinct from the perceptual experience. Suppose again that S has a perceptual experience as if this ball is red. Concurrently, S may or may not entertain a reflective belief that she has an experience as if this ball is red.

It does not seem implausible that S’s previous or concurrent mental states could directly influence S’s perceptual or reflective beliefs without affecting S’s perceptual experiences. Imagine, for instance, that though Jill does have a perceptual experience as if Jack is not angry, she forms an inaccurate perceptual belief that Jack is angry because she fears that Jack is angry. Alternatively, imagine that although Jill has a perceptual experience as if Jack is not angry, she forms a mistaken reflective belief that she has a perceptual experience as if Jack is angry, due to her belief that Jack is angry.

Most of the philosophers involved in the debate on cognitive penetrability would not consider cases like those just described to be genuine examples of cognitive penetration of perceptual experience. The basic problem is that they do not concern effects of S’s mental states on S’s perceptual experience.

Nevertheless, for a comprehensive conception of cognitive penetrability of perception that includes cases like the ones just described, see Lyons (2011). Siegel (2015, 2017) discusses another comprehensive view according to which previous or concurrent mental states of the subject can affect the subject’s perceptions, conceived of in a broadened sense to include also, for instance, experiential judgments and patterns of attention. However, Siegel is careful in using the expression “perceptual farce” just to refer to this general view and in distinguishing it from the more specific view that perceptual experience is cognitively penetrable.

The remainder of this article takes cognitive penetrability to be a phenomenon pertaining to the conceptual component or the sensory impression component of experience.

b. The Epistemic Problem of Cognitive Penetrability

Perceptual experience is, so to speak, the tribunal by which most beliefs can be checked with respect to their epistemic status. The epistemological problem of cognitive penetrability essentially stems from a clash of two conflicting intuitions about the credentials of this tribunal. The first intuition says that perceptual experiences in general possess the kind of intrinsic features that would make the beliefs based on them justified. The second, contrasting intuition says that badly cognitively penetrated experiences, such as the experiences of Jill in Angry Jack and the novice in Expert and Novice, cannot actually justify the beliefs based on them (see Lyons 2016). As will shortly become clear, the philosophical question underlying this clash of intuitions is whether the causal history, or etiology, of an experience can affect its justificatory power.

It is important to appreciate that although cognitive penetrability is a controversial empirical hypothesis, scientific investigation is not crucially relevant to this epistemological debate. Those who share the intuition that perceptual experiences have intrinsic features that make the beliefs based on them justified typically take this claim to be true a priori of any possible contentful experience as such. In consequence, if cognitive penetration were incompatible with the justificatory power of perceptual experience, even if our hardwiring ruled out cognitive penetrability, the mere possibility of a rational being suffering from cognitive penetration of perception would constitute a threat to that intuition (Markie 2013 and Tucker 2019).

To probe these complex issues, we need now to introduce some basic epistemological notions and individuate one theory of perceptual justification to use as a good example.

Internalists about epistemic justification claim that all the factors that make a subject S possess justification for believing a proposition are (i) reflectively accessible to S or (ii) mental states of S. In case (i), the view is called accessibilism; in case (ii), it is called mentalism. Factors that provide S with justification could for instance be other beliefs of S or her experiences. Externalists about justification deny both (i) and (ii) (see Pappas 2014 and Poston 2018). For example, according to a prominent form of externalism called reliabilism, what renders a belief of S justified is its being produced by a (statistically) reliable process, regardless of whether the process is reflectively accessible to S or not, and of its being wholly mental or not (see Goldman 1979).

Phenomenal conservatism (Huemer 2001 and 2007) is the theory of epistemic justification that many if not most early twenty-first century internalists invoke to account for the justificatory power of experiences. (See Audi 1993 and Pryor 2000 for similar views.) In accordance with it, it is a priori true that:

(PC) If S has a seeming that P, S thereby has prima facie justification for believing P.

Seemings (or appearances) are typically conceived of as experiences provided with a propositional content. (Some phenomenal conservatives think of a perceptual seeming as, specifically, the conceptual component of an experience. For others, a perceptual seeming is made of the conceptual component together with the sensory impression component of an experience.) Although seemings may include more than perceptual experiences—some philosophers think there are, for example, rational, moral and mnemonic seemings—we focus here on perceptual seemings.

(PC) is to be interpreted as stating that if S has a seeming that P and no defeating evidence, S possesses both prima facie and all things considered justification for believing P; whereas if S does have defeating evidence, S possesses only prima facie justification for believing P. Defeating evidence can be any reason for S to believe that P is false or that the seeming that P is deceptive. The ‘thereby’ in (PC) indicates that S’s justification for P comes solely from her seeming that P. Since it does not result from any belief of S, this justification is immediate.
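The two-tier reading of (PC) just given can be made concrete with a toy model. The following is only an illustrative sketch, not a formalization found in the literature: the class `Subject` and both function names are hypothetical, propositions are represented as plain strings, and the sketch covers only justification that has its source in seemings.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    # Propositions that seem true to the subject (hypothetical representation).
    seemings: set = field(default_factory=set)
    # Propositions for which the subject has defeating evidence, that is,
    # a reason to think P is false or that the seeming that P is deceptive.
    defeaters: set = field(default_factory=set)


def prima_facie_justified(s: Subject, p: str) -> bool:
    # (PC): having a seeming that P is enough for prima facie justification.
    return p in s.seemings


def all_things_considered_justified(s: Subject, p: str) -> bool:
    # Prima facie justification survives as all things considered
    # justification only when no defeating evidence is present.
    return prima_facie_justified(s, p) and p not in s.defeaters
```

On this sketch, adding a defeater never removes prima facie justification; it only blocks the step to all things considered justification, which is the feature of (PC) that the objections from cognitive penetration target.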

Phenomenal conservatism is customarily taken to be an internalist—both accessibilist and mentalist—theory of justification because it fits with (though it does not entail) the assumption that S’s justification depends only on mental factors reflectively accessible to S—namely, S’s appearances and the absence of defeating evidence.

Let us now investigate the problem of cognitive penetrability in relation to phenomenal conservatism. This is indeed the theory of justification that has been mostly discussed in this context. (See Siegel 2012 and Tucker 2014 about the significance of cognitive penetrability for other theories of epistemic justification.)

Phenomenal conservatism accounts for the internalist intuition that perceptual experiences in general have intrinsic features capable of justifying the beliefs based on them. Suppose S has an experience with content P. If (PC) is correct, S thereby has at least prima facie justification for believing P. Phenomenal conservatism has attracted objections by many epistemologists, both internalist and externalist, who share the contrasting intuition that it is in many cases implausible that a cognitively penetrated experience can justify a belief, even only prima facie.

Siegel (2012) has described a way in which this intuition becomes palpable: cognitive penetration of perceptual experience seems to allow for the elevation of S from a worse epistemic position to a better one in cases in which such an elevation appears illegitimate or impossible. This epistemic elevation may occur when the penetrating state is unjustified or when it is justified. An instance of the first case is the one in which S gets support for an initially unjustified belief B entertained by her from B itself, through the mediation of an experience cognitively penetrated by B. This is what arguably happens to Jill in Angry Jack: Jill gets support for her initially unjustified belief (B) that Jack is angry from the very same belief B, thanks to the mediation of the perceptual experience as if Jack is angry, cognitively penetrated by B. An instance of the second case would be one where S gets additional support for a justified belief on the basis of a perceptual experience cognitively penetrated by that very same belief. Imagine that, before meeting Jack, Jill forms a justified belief (B) that Jack is angry, for she receives a furious email from him. This prior justified belief B makes Jill have the experience as if Jack is angry when she meets him later on. Thanks to this experience, Jill would get additional support for B.

To facilitate our discussion let us introduce the downgrade thesis (Siegel 2013a and Teng 2016). This thesis holds that a badly cognitively penetrated perceptual experience as if P provides less prima facie justification for believing P than a non-penetrated perceptual experience sharing the same content P. Precisely, if the whole content of the experience is badly cognitively penetrated, the experience as a whole is downgraded; and if only a part of it is badly cognitively penetrated, only that part of the experience is downgraded. For example, suppose S has a badly cognitively penetrated experience as if there is a red car before her. If what is badly cognitively penetrated is just the part of S’s experience that represents the car’s color, S’s experience is downgraded only with respect to the color. Thus, S has prima facie justification for believing that there is a car before her, but less or no prima facie justification for believing that the car is red (Teng 2016).
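The part-by-part character of the downgrade thesis can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the function name `prima_facie_support` is hypothetical, and the numeric weights (1.0 for undiminished support, a lower value for downgraded support) are a modelling convenience, since the thesis itself says only that downgraded parts provide less, or no, prima facie justification.

```python
def prima_facie_support(parts, badly_penetrated, downgraded_weight=0.0):
    """Map each part of an experience's content to a degree of support.

    parts: iterable of content parts, given as strings.
    badly_penetrated: the parts that are badly cognitively penetrated.
    downgraded_weight: support left to a downgraded part; it must be
        lower than the undiminished value 1.0 (0.0 means no support).
    """
    if not 0.0 <= downgraded_weight < 1.0:
        raise ValueError("a downgraded part must receive less than full support")
    return {
        part: (downgraded_weight if part in badly_penetrated else 1.0)
        for part in parts
    }
```

Applied to the red car example, only the color part is downgraded: calling the function with the parts "there is a car" and "the car is red", and marking only the second as badly penetrated, leaves the first part with full support.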

There is an interesting similarity between the cognitively penetrated experiences of a subject S and the experiences that S would have if she were a victim of a skeptical scenario (such as the Matrix scenario or the evil demon scenario envisaged by Descartes). In both cases, S’s experiences would have anomalous etiologies. In the first case, some mental state of S would interfere with S’s normal causal chains that produce experiences of a certain type. For example, the novice prospector’s craving for gold interferes with his normal visual processes. In the second case, the distal causes of S’s perceptual experiences would be unnatural. For example, if S were in the Matrix, the external cause of her visual experience of a cat would be the Matrix rather than a cat. Despite this similarity, many internalists tend to treat the cases of bad cognitive penetration and the cases of skeptical scenarios differently.

Internalists generally agree that when S is in a skeptical scenario, the anomalous etiologies of S’s perceptual experiences do not downgrade these experiences, so they do not affect their justifying power. The reason is that the segments of the etiologies of the perceptual experiences that make S a victim of a skeptical scenario are neither accessible to S nor mental states of S, which means they could not affect S’s perceptual justification. Internalists agree that if S were in a skeptical scenario, her perceptual beliefs would be at least prima facie justified when based on appropriate experiences. Internalists have long been using this argument to attack externalists about justification. Externalists seem in fact to be committed to holding that S’s perceptual beliefs would all be unjustified if S were deceived by the Matrix or a Cartesian demon. These beliefs would all be false, which would entail that they are produced by unreliable processes (Poston 2018).

When it comes to cognitive penetrability, nevertheless, many internalists and externalists agree that if a perceptual experience of S were badly cognitively penetrated, it would be downgraded to the effect that S would lack prima facie justification for believing its content (Siegel 2012 and Tucker 2013). Externalists could defend this view by insisting that the anomalous etiologies of these perceptual experiences make the processes producing the correlated perceptual beliefs unreliable. Nevertheless, it is not immediately clear why the etiologies of cognitively penetrated experiences and the etiologies of experiences in skeptical scenarios should be considered to be so relevantly different from an internalist point of view. As we see later in the article, certain responses to the epistemic problem of cognitive penetration aim to illuminate this issue too.

2. Responses to the Epistemic Problem of Cognitive Penetrability

The debate on cognitive penetrability and perceptual justification has at least three basic and influential sides. One is the internalist resolute side, which aims to reject the downgrade thesis. For the most part, this is the side of the advocates of phenomenal conservatism. Another side is the externalist reliabilist one, which rejects (PC), does subscribe to the downgrade thesis and explains the weakening or annihilation of the justificatory power of badly cognitively penetrated experiences in terms of unreliability. The third side belongs to the internalist camp, but it deviates from the resolute one. This third side—called here the internalist concessive side—accepts the downgrade thesis but attempts to explain why perceptual justification is undermined in bad cognitive penetration cases, with the aim of, simultaneously, respecting internalist principles. The epistemologists belonging to this side all reject (PC), but some propose views that could be described as variants of phenomenal conservatism. Beyond these three fundamental sides, there are accounts that offer solutions to the problem of cognitive penetrability that do not fit the internalism-externalism dichotomy. The following sub-sections are dedicated to the presentation of key arguments that have been developed within all the aforementioned approaches, as well as to important objections to them.

a. Internalist Resolute Solutions

There are at least three distinct but not incompatible approaches adopted by internalists who reject the downgrade thesis: (i) the defeasibility approach, according to which cognitive penetration does not affect prima facie justification but can only influence all things considered justification; (ii) the intuitive plausibility approach, which rejects the downgrade thesis by heavily relying on internalist intuitions about the irrelevance of etiology as a justificatory factor and intuitions about the justifying power that perceptual experiences have thanks to their intrinsic features; and (iii) the different epistemic status approach, according to which in bad cognitive penetration cases the subject lacks not epistemic justification but rather some other epistemic property or status.

i. The Defeasibility Approach

According to the defeasibility approach, all cases of bad cognitive penetration can be construed as situations where S does have defeating evidence; namely, S suspects, believes or is in some other sense aware that (1) her perceptual experience would have been different if her prior mental state had been different; or S suspects, believes or is in some other sense aware that (1) and that (2) her prior mental state was unjustified or an unreliable guide to truth (see Siegel 2012 and Huemer 2013b). For instance, in Expert and Novice, arguably, the novice is in some sense aware that (1) he would not have had the experience as if the nugget is gold if he had not had the desire to find gold; or he is in some sense aware of both (1) and that (2) one’s craving for gold can make one’s perceptual experience of gold unreliable.

The advocates of this strategy contend that in all cases of bad cognitive penetration, S’s prima facie justification remains untouched. S would instead lack all things considered justification in virtue of having an evidential defeater. These epistemologists emphasize that this is consistent with the account of prima facie justification based on (PC) (Huemer 2013b).

An expected criticism says that in many cases of bad cognitive penetration, S is not actually aware that her perceptual experience is cognitively penetrated or that her cognitively penetrated experience is an unreliable guide to truth, though S could become aware of it (McGrath 2013b and Markie 2013). In response one might appeal to a weaker notion of evidential defeater. One might contend that S would have an evidential defeater even if S were merely able to become aware of it, without actually being aware of it (see Siegel 2012). But this would not resolve all problems because the mental state that should work as an evidential defeater might be such that S could not possibly become aware of it (Siegel 2012 and Markie 2013). For example, think of a variant of Angry Jack in which Jill, because of inborn or induced cognitive deficiencies, is incapable of coming to believe that her perceptual experience would have been different if she had had a different prior cognitive state.

The main reason for concern about the defeasibility approach, however, stems from the intuition, which some epistemologists have, that in the case of bad cognitive penetration the subject would lack even prima facie justification (Markie 2005, Huemer 2013b and Tucker 2014).

ii. The Intuitive Plausibility Approach

Phenomenal conservatives may try to defend the contention that in the case of bad cognitive penetration, S would at least have prima facie justification by highlighting its plausibility against a background of internalist intuitions. A key thesis adduced in this context is that perceptual experiences have justifying power in virtue of being experiences, rather than in virtue of having a particular sort of etiology (see Lyons 2016). In accordance with this view, perceptual experiences can differ in their epistemic power only in virtue of their intrinsic factors, not because of their etiologies.

Let us see how this response can be developed. The intuitive plausibility approach aims to support the claim that (i) reflectively inaccessible etiologies of perceptual experiences in cognitive penetration cases play no role in determining whether or not perceptual experiences provide prima facie justification, and the claim that (ii) perceptual experiences possess intrinsic justificatory force. (i) and (ii) are two sides of the same coin.

A way to support (i) is to appeal to the absence of essential differences between bad cognitive penetration cases and zap-like cases (Siegel 2012). ‘Zap-like’ is the expression used by Siegel (2013a) to refer indifferently to scenarios involving bump-on-the-head situations (that is, cases in which S has a hallucination caused by a knock or bump on her head) and skeptical scenarios (involving, for instance, evil demons or the Matrix). Internalists may insist that cognitive penetration cases are not substantially different from zap-like cases. After all, the etiologies of perceptual experiences in cognitive penetration cases are processes reflectively inaccessible to the subject S, just as the etiologies of experiences in zap-like cases are. Furthermore, the etiologies of perceptual experiences in cognitive penetration cases are processes that do not seem to be subject to S’s rational control, just as the etiologies of experiences in zap-like cases are not. It may appear plausible that the etiology of S’s perceptual experience in a zap-like scenario plays no role in determining whether or not S’s perceptual experience provides S with prima facie justification for her beliefs. (For instance, it may appear plausible that if an evil demon causes Jill’s perceptual experience as if Jack is angry, this fact cannot interfere with the prima facie justification for believing that Jack is angry, which Jill possesses in virtue of her experience. For the evil demon’s actions are reflectively inaccessible to Jill and are not subject to Jill’s rational control.) Since the cases of cognitive penetration are not relevantly different from the zap-like cases in terms of their etiologies, it can be argued that the etiologies in cognitive penetration cases likewise play no role in determining whether or not S’s perceptual experience provides S with prima facie justification for her beliefs.

Although internalists may welcome this defense of (i), many externalists will not concede at the outset that justification is not negatively affected in zap-like cases. They will contend that since the relevant perceptual experiences are misleading in these cases, the correlated belief-formation processes are unreliable. These externalists would conclude that if we appeal to absence of essential differences, we must accept that prima facie justification is negatively affected in cases of bad cognitive penetration too.

A different criticism of this defense of (i) targets the claim that the etiologies of perceptual experiences in cognitive penetration cases are not subject to S’s rational control, just as the etiologies of experiences in zap-like cases are not. The objection is that whereas S may in certain cases be able to avoid bad cognitive penetration by controlling known factors that lead to it, S could not by assumption control the factors that make her a victim of zap-like cases (Siegel 2012 and 2013a). But even if it were established that the etiologies of perceptual experiences in cases of cognitive penetration are not subject to S’s rational control, there could be a debate about whether the etiologies of perceptual experiences in cases of cognitive penetration are in some sense attributable to S in a way that the etiologies of experiences in zap-like cases are not (Siegel 2013a). Internalist accessibilists can nevertheless insist that despite these complications, it is the shared inaccessibility of the etiologies of zap-like cases and cognitive penetration cases that makes these cases homologous. So the claim would be that if S is unaware of the defective etiology in bad cognitive penetration cases, just as S is in zap-like cases, the etiology must be irrelevant to S’s justification in those cases.

A more direct way to defend (i) is adducing the phenomenology (or subjective features) that a cognitively penetrated perceptual experience shares with a non-penetrated perceptual experience with the same content (see Siegel 2012). For instance, Jill’s perceptual experience as if Jack is angry looks the same when it is the effect of cognitive penetration and when it is not. The perceptual experiences in these two cases are identical in terms of what is introspectively accessible. It could therefore be argued that whether or not an experience is the effect of cognitive penetration is irrelevant to what one has prima facie reason to believe or not. Only evidence of a distorting etiology could be a defeater and affect all things considered justification (Huemer 2013a, see also Silins 2016).

Another way to support (i) is appealing to the intuition that it is implausible that S’s justification for an attitude A could depend on reasons that S could not adduce to explain whether A is justified or not. For instance, an argument by Huemer in defense of (i) considers a case in which S is unable to draw an epistemically significant distinction between the penetrated part and the non-penetrated part of the content of one and the same perceptual experience. Imagine I have one partly cognitively penetrated perceptual experience as if there is a gun and a box with eggs in the fridge. The gun-like part of my perceptual experience is cognitively penetrated, whereas the box-like part is not.

I accept E [that there is a box with eggs in the fridge] on the basis of my visual experience. G [that there is a gun in the fridge] also appears to be equally well supported by my visual experience, and I have no reason for thinking the experience representing G to be any less reliable, nor epistemically inferior in any manner whatsoever, to the experience representing E. Nor have I any other grounds for doubting G. Nevertheless, while I accept E, I refuse to accept G, for no apparent reason . . . This attitude . . . strikes me as obviously irrational. I conclude that . . . [I] epistemically ought to accept G . . . If S would have no rational way of explaining why she believed E while refusing to accept G, then S would be irrational to believe E while refusing to accept G (Huemer 2013a, pp. 745–746).

This argument assumes that whether S is justified or unjustified in believing P depends on whether S can potentially appeal to the reasons that make her justified or unjustified (Siegel 2013b). Given this assumption, S is not unjustified in believing P unless she can rationally explain why she is so. According to this line of thought, justification depends only on reflectively accessible factors. For S’s being in principle able to appeal to the reasons that determine whether she is justified or not in believing P requires S to be able to reflectively access those reasons. Given this, the etiology of perceptual experiences in cognitive penetration cases is irrelevant to S’s justification insofar as it is reflectively inaccessible to S. Setting aside general criticism of accessibilism, a concern about this strategy is that it is not uncontroversial that S can be justified or unjustified in adopting an attitude A only if S is potentially able to rationally explain why she is justified or unjustified in adopting A. (See two apparent counterexamples in McGrath 2013a and in Siegel 2013b.)

We have considered ways of supporting or questioning (i)—the thesis that reflectively inaccessible etiologies of perceptual experiences in cognitive penetration cases are irrelevant to prima facie justification. Let us turn to (ii)—the thesis that perceptual experiences possess intrinsic justificatory force. (ii) is directly supported by an apparently straightforward argument resting on an intuition about what attitude S is rationally supposed to adopt, from her point of view, when S entertains a given mental state (McGrath 2013a). If S has an experience as if P and no evidence against P, the most reasonable attitude to take from S’s point of view is belief, rather than disbelief or a suspension of judgment. A parallel argumentative line interprets perceptual experiences as evidence (McGrath 2013a). Considering that S, as a rational believer, has to match her belief to the evidence E available to her, S should form only beliefs that fit E, whatever E might be. Even if, unbeknownst to S, E were acquired through a biased search or flawed method of evidence-gathering, E would constitute the evidence available to S. So, S should adjust her doxastic attitude in a way to fit E, independently of its etiology.

A further way of defending (ii) might be appealing to coherence requirements derived from an experience as if P. Suppose S does not have justification for believing P, but nevertheless S does believe P. In this case it is rational for S to believe, say, P-or-Q and disbelieve, say, Not-P. In general, if S is in a mental state M, S is rationally required to think in a particular way in virtue of coherence requirements derived from being in M, regardless of the credentials of M. One could argue that, in the same way, S has prima facie justification for believing R if S has a perceptual experience as if R, in virtue of coherence requirements and independently of the credentials of the experience—so independently of its etiology (see McGrath 2013a).

However, a reply would be that even if it is rational for S to believe P-or-Q when S believes P, in this case S does not necessarily have justification for believing P-or-Q. For S may not have justification for believing P in the first instance (McGrath 2013a and Ghijsen 2016). The intuition that this reply exploits is that the kind of rationality that would provide S with justification for believing P-or-Q is not reducible to coherence requirements. The rationality resting solely on coherence is a sort of conditional rationality: it can provide S with justification for P-or-Q only if S has justification for believing P in the first instance.

An illuminating distinction is the one between rational commitment and justification. If S believes P without justification, she is rationally committed to, for instance, disbelieving not-P and believing P-or-Q, but she does not have justification for disbelieving not-P and believing P-or-Q. Rational commitment is a mere coherence requirement (Tucker 2013 and McGrath 2013a, 2013b).

iii. The Different Epistemic Status Approach

This approach aims to substantiate the thesis that if S is in a case of bad cognitive penetration, ordinarily S does not lack (prima facie) justification but some other epistemic status. Various epistemic statuses have been proposed.

A popular candidate is knowledge, or else warrant—namely, the additional property that a true belief needs to have in order to become knowledge (Tucker 2010 and Huemer 2013a). The no knowledge/warrant approach says that in bad cognitive penetration cases S does not lack justification. Rather, S possesses justification without having knowledge or warrant. For instance, S could have justification for believing P without her belief tracking the truth, or without her belief arising from a reliable belief-forming mechanism, or without her belief arising from a belief-forming mechanism that works properly (Huemer 2013a). This is what presumably happens in evil demon cases or Gettier-style scenarios (see Siegel 2013a for a formulation of cognitive penetration cases as Gettier cases). A general concern about this strategy stems from the mentioned impression that there are substantial differences between perceptual experiences badly cognitively penetrated and the perceptual experiences of a victim of a skeptical scenario or a Gettier-style scenario (Tucker 2010 and Markie 2013). In all these cases, the subject S basing her beliefs on her perceptual experiences lacks knowledge and warrant. Nevertheless, in bad cognitive penetration cases, S may also appear to be blameworthy for having her experiences in a way that the victim of a skeptical scenario or a Gettier-style scenario may not (Tucker 2010). If justification essentially depended on the absence of blameworthiness, the fact that S lacks knowledge or warrant in bad cognitive penetration cases would be redundant or insufficient to explain our intuitions.

To dispel this concern Tucker (2010) adduces the Weirdo thought experiment. Weirdo successfully begs a demon to turn him into a victim of a skeptical scenario and to erase this request from his memory. Tucker insists that it is intuitive that when Weirdo becomes a victim of a skeptical scenario, though he is blameworthy (or lacks blamelessness) for having his deceptive perceptual experiences and he lacks knowledge and warrant, Weirdo is nevertheless justified in his beliefs about the external world (Tucker 2010, 2011). This suggests that S’s being blameworthy (or lacking blamelessness) plays no role in determining whether S is justified in bad cognitive penetration cases (assuming that there is no principled distinction between Weirdo’s blameworthiness and S’s blameworthiness due to cognitive penetration).

To question the no knowledge/warrant approach, Markie (2013) uses a different thought experiment. Suppose a novice gold prospector and an expert are in the same skeptical scenario. The expert’s experience as if the nugget before him is gold is a non-penetrated perceptual experience or a case of good cognitive penetration (given the external stimuli provided by the demon), whereas the novice’s perceptual experience as if the nugget is gold is partly caused by his “wishful seeing,” so it is a case of bad cognitive penetration (see also Tucker 2010 and McGrath 2013b). Markie stresses that the novice’s epistemic status appears worse than the expert’s despite their both lacking knowledge and warrant due to the skeptical scenario. This suggests that what explains the intuitive inadequacy of the epistemic status of the novice, and the intuitive inadequacy of the epistemic status in any bad cognitive penetration case, must be something different from knowledge and warrant.

Tucker (2010) observes that Markie’s case does not necessarily show that bad cognitive penetration affects justification. He suggests that although both the novice and the expert in the skeptical scenario lack knowledge and warrant, what renders them different from an epistemic point of view is simply this: only the novice is epistemically blameworthy in having his experience. Tucker thus proposes a novel candidate for rescuing justification: a victim of bad cognitive penetration does not lack epistemic justification but epistemic blamelessness. She is both justified and blameworthy.

Epistemologists have considered appealing to the absence of other candidates to explain why bad cognitive penetration cases are epistemically defective; for instance: epistemically virtuous belief or proper function of the cognitive faculty (McGrath 2013b); positive evaluation of the subject’s cognitive character (Tucker 2013 drawing from Skene 2013); practical appropriateness of belief-formation (Fumerton 2013).

b. Externalist Concessive Solutions

Externalist reliabilists—like Lyons (2011, 2016) and Ghijsen (2016)—typically agree with concessive internalists (which we consider in Section 2.c) on the truth of the downgrade thesis (Teng 2016). The major point of departure of the concessive reliabilists from the concessive internalists regards the explanation of why prima facie justification is negatively affected by bad cognitive penetration. Concessive reliabilists offer a traditional externalist account, which adduces the unreliability of the processes that produce bad cognitive penetration.

Cognitive penetration is epistemically bad—when it is bad—because and when it cuts us off from the world around us, when it makes us less sensitive to our environments, when it makes us more likely to believe p whether or not p is actually true (Lyons 2016, p. 3).

Bad cognitive penetration of perceptual experience can be construed as a phenomenon that renders the process of belief-formation unreliable with respect to its statistically tracking the truth, or as a phenomenon that makes a perceptual experience as if P an inappropriate ground for S’s belief that P (see Lyons 2011, 2016).

The contemporary debate on cognitive penetration and epistemic justification typically presupposes that cognitive penetration may either worsen or enhance the epistemic status of perceptual experience (see Section 1.a). A virtue of concessive reliabilism is the illuminating explanation that it offers for distinguishing the cases of bad cognitive penetration from the cases of good cognitive penetration (Ghijsen 2016). According to Lyons (2011, 2016), whereas the cases of bad cognitive penetration are those that affect reliability negatively, the cases of good cognitive penetration are those that affect reliability positively. And this is so regardless of whether the penetrating state is a (justified or unjustified) belief or a non-doxastic state like a desire or a fear.

Another asserted virtue of the concessive reliabilist account is that it offers a unitary solution to the problem of cognitive penetration and the problem of why perceptual experiences can have or lack justificatory power when experience is unpenetrated. In particular, it explains the cases in which S is affected by bad cognitive penetration and the cases in which S is a victim of a skeptical scenario by claiming that both situations are essentially cases in which S’s belief-production processes are unreliable (Ghijsen 2016). As we see in Section 2.c, the responses to the cognitive penetration problem by concessive internalists do not offer unitary solutions of this type. One might adduce this consideration to argue that the reliabilist accounts are preferable (see Ghijsen 2016).

A way to question this reliabilist response to the cognitive penetrability problem is raising standard objections to reliabilism about justification (see Becker 2018). Moreover, Tucker (2014) has argued that this reliabilist response fares no better than internalist resolute solutions. Suppose S’s perceptual experience as if P is cognitively penetrated by her desire that P but P happens to be actually true most of the time when this cognitive penetration obtains. To accommodate suppositions of this type, reliabilists might need to bite the bullet and claim that the output-beliefs of such processes would be actually justified, though this may appear counterintuitive. In a similar fashion, resolute internalists insist that justification is safe from the threat of cognitive penetration. For further criticism see, for instance, Vahid (2014).

c. Internalist Concessive Solutions

This section surveys the principal internalist concessive solutions to the cognitive penetrability problem. As previously mentioned, these accounts accept the downgrade thesis and reject (PC), but they might be described as modifications of phenomenal conservatism that confine the existence of the justificatory power of perceptual experiences to particular circumstances: when certain enabling factors are present or some disabling factors are absent (Chudnoff 2019).

We first examine three versions of what Lyons (2016) calls inferentialism: Siegel’s process inferentialism, McGrath’s receptivity approach, and Markie’s knowledge-how account. Inferentialism rests on the assumption that the proper way to assess epistemically a perceptual experience of S (and S’s beliefs based on it) is checking the way in which S has produced the perceptual experience, roughly in the same way in which we epistemically assess a belief B of S by checking the way in which S has inferred B from other beliefs. A key assumption is that in any case of bad cognitive penetration, the epistemic status of the relevant experience is downgraded as a result of the experience having a rationally assessable etiology but failing to meet certain standards of epistemic rationality. Whether a perceptual experience has justificatory power thus depends on its causal history (Lyons 2011, 2016). Since the factors that determine S’s perceptual justification—the etiologies of S’s perceptual experiences—are thought of as mental processes of S which are possibly reflectively inaccessible to S, inferentialism is typically considered to be an internalist mentalist view (Lyons 2016).

At the end of this section we examine Chudnoff’s presentational conservatism, an internalist (partly) concessive account that does not qualify as inferentialist.

i. Process Inferentialism

Siegel (2013a, 2013b) maintains that a perceptual experience gets epistemically downgraded whenever it has a checkered past; namely, its etiology is similar with respect to its psychological elements to the etiology of a (possible) belief that has the same content and proves unjustified. Consider this example that draws an analogy between wishful seeing and wishful thinking. John’s wishfully seeing that Jack is angry consists of John’s visual experience as if Jack is angry, produced by an etiology involving cognitive penetration by John’s desire that Jack is angry. John’s experience has a checkered past because its etiology is similar with respect to its psychological elements to the etiology of an unjustified belief that Jack is angry, which John could have out of his wishful thinking.

Note that a cognitively penetrated perceptual experience may not have a checkered past. Nevertheless, all beliefs based on cognitively penetrated experiences with checkered past are ill-formed, and so unjustified (Siegel 2013a).

The internalist who, like Siegel, endorses the downgrade thesis must explain why a perceptual experience may lose its justificatory force because of cognitive penetration but not when the subject is simply in a zap-like state. Siegel (2013a) maintains that the etiology of a perceptual experience when the subject is in a zap-like state results from an arational process, whereas the etiology of a badly cognitively penetrated perceptual experience results from a rationally assessable but irrational process. People might find it counterintuitive that these processes are rationally assessable. A process inferentialist may insist, however, that rationally assessable etiologies are those that lie within the cognitive system of the subject, whereas arational etiologies are external to the subject’s cognitive system. Another possibility is that rationally assessable etiologies are those over which the subject has some type of rational control, which is impossible in zap-like cases (Siegel 2012, 2013a).

Process inferentialism has further problems. It is to a good extent indeterminate, by this account, which etiologies of perceptual experiences are defective and why. For it is unclear in what precise respects and to what extent the etiologies of perceptual experiences should share similarity in structure with the etiologies of ill-formed beliefs to qualify as defective (Lyons 2016). Furthermore, although there are paradigmatic instances of ill-formed beliefs (for example, those based on wishful thinking or jumping to conclusions), the distinction between well-formed and ill-formed beliefs is not always clear-cut. So, the only way to draw these distinctions might ultimately be by relying on people’s intuitions, which might diverge (Siegel 2013a). If bad etiologies cannot be identified by means of an effective criterion, process inferentialism is ineffective in distinguishing good cognitive penetration cases from bad ones. If the only way to draw this distinction with precision were appealing to a reliabilist criterion, process inferentialism would not fulfill its internalist ambitions (Lyons 2016).

Another possible source of difficulty for process inferentialism turns on relevant dissimilarities between experiences and beliefs. Many epistemologists contend that all perceptual experiences possess a distinctive phenomenology capable of turning them into justification-providing states, but this phenomenology is not to be found in any belief. This might indicate that the features of the etiologies of perceptual experiences are irrelevant to their justificatory power, and that drawing epistemological conclusions from analogies between perceptual experiences and beliefs is ultimately misguided (see Vance 2014 and Silins 2016).

For responses to these and other concerns, and an updated defense of process inferentialism, see Siegel (2017, 2018).

ii. The Receptivity Approach

McGrath’s (2013a, 2013b) receptivity approach puts emphasis on the relation between perceptual experiences and their bases. Beliefs can be based on other mental states. In this account, perceptual experiences can be too. McGrath maintains that one’s seemings can produce other seemings in one’s mind, and draws a distinction between receptive and non-receptive seemings. A receptive seeming is the input and a non-receptive seeming is the output of a quasi-inference, that is, a process that constitutes the transition from one seeming to another. More precisely,

A transition from a seeming that P to a seeming that Q is “quasi-inferential” just in case the transition that would result from replacing these seemings with corresponding beliefs that P and Q would count as genuine inference by the person (McGrath 2013b, p. 237).

Receptive seemings are unconditional justification-providing states of a subject S, whereas non-receptive seemings give S justification only if the relevant quasi-inference is good. Receptive seemings are given to S, whereas non-receptive seemings arise after S’s own doing. The former seemings provide S with justification without being epistemically assessable. The latter seemings are epistemically assessable due to their stemming from S’s own making (McGrath 2013b).

A good quasi-inference can be characterized by a comparison with a good inference between beliefs. A good inference is one that results in a justified output-belief. Assuming for simplicity that only two beliefs participate in the inference, what is involved in a good inference is a transmission of justification from one belief to another. This happens only if the first belief is justified and sufficiently supports the second. Furthermore, a good inference requires for the subject S some sort of appropriate rationalization (which need not involve higher-order thinking or justification)—for example, S’s correct grasp of the epistemic relation of support between the two beliefs, S’s correct use of background information stored in S’s cognitive system as relevant knowledge-how, or a mix of these two. This rationalization would not be appropriate, for instance, if it depended on factors that would make S jump to conclusions, such as expectations, desires and moods (McGrath 2013b). Analogously, in a good quasi-inference between seemings, what is involved is the transmission of the property, which a seeming might possess or lack, of making S have justification for believing its content. Only receptive seemings have this property by default. In a good quasi-inference, the receptive seeming transmits this property to the non-receptive seeming. As a result, S can be justified in believing the content of the non-receptive seeming. Yet, if the non-receptive seeming is not sufficiently supported by the receptive seeming—because an output-belief with the content of the first seeming would not be sufficiently supported by an input-belief with the content of the second seeming—the non-receptive seeming does not receive the relevant epistemic property. In this case, the quasi-inference is not good, and S does not wind up having justification for believing the content of the non-receptive seeming (McGrath 2013a, 2013b).

The receptivity approach explains the downgrade of perceptual experience affected by bad cognitive penetration by adducing the features of a correlated quasi-inference: the downgrade happens when the quasi-inference is bad (McGrath 2013a, 2013b). Take Angry Jack. In the receptivity approach, Jill initially entertains a receptive seeming about Jack’s face that has the intrinsic property of giving Jill justification for believing that Jack is not angry. Under the influence of cognitive penetration by her unjustified belief that Jack is angry, this receptive seeming is replaced in Jill’s mind with a non-receptive seeming that Jack is angry. This is a bad quasi-inference because the receptive seeming does not support the non-receptive seeming, as belief in the content of the first would not support belief in the content of the second. Hence, Jill is not justified in believing that Jack is angry.

It is unclear whether this approach can accommodate a disunified view of perception—one that distinguishes between sensations (low-level and non-conceptual) and seemings (high-level and conceptual) (Lyons 2016). What McGrath calls non-receptive seemings are states with conceptual content—so proper seemings. However, McGrath seems to concede that receptive seemings are not necessarily states with conceptual content—they may be sensations. This means that, for McGrath, a perceptual experience may arise from a quasi-inference whose input—the receptive seeming—is constituted by mere sensations. Yet a quasi-inference requires all seemings involved to have believable contents, and thus conceptual contents (see Lyons 2016). Moreover, suppose that perception is actually disunified and that the proponent of the receptivity approach denies that mere sensations can be the inputs of quasi-inferences. They should conclude that, for example, the transition in Jill’s mind leading to her perceptual experience that Jack is angry is not a quasi-inference. A consequence would be that this perceptual experience would be a receptive seeming, and thus a justification-provider. Many would find this counterintuitive (see McGrath 2013b and Lyons 2016).

Another concern is that the receptivity approach does not address what might actually be at stake in cases of bad cognitive penetration: the cognitive penetration of receptive seemings, rather than non-receptive seemings (Lyons 2016). Take again Angry Jack. Suppose the correct description of what happens is this: because of her unjustified belief that Jack is angry, Jill has a cognitively penetrated receptive seeming that Jack’s face has anger features. This receptive seeming produces in Jill’s mind, via a quasi-inference, a non-receptive seeming that Jack is angry. If this were the correct description of what happens in Angry Jack, the proponents of the receptivity approach should conclude that Jill is justified in believing that Jack is angry on the basis of her non-receptive seeming that Jack is angry. For this non-receptive seeming is actually supported by Jill’s receptive seeming that Jack’s face has anger features.

Lyons (2016) complains that the receptivity approach treats cognitively penetrated non-receptive seemings as person-level phenomena, though it is intuitive that perceptual experiences do not result from our own doing. According to Lyons, transitions between seemings cannot be controlled by the subject and could at best be thought of as produced by unconscious inferential mechanisms—this would explain the impression that all seemings are given to us. Advocates of the receptivity approach might concede that all seeming-to-seeming transitions are produced by sub-personal mechanisms. An unpalatable consequence for the receptivity approach (which claims that all seemings produced by sub-personal mechanisms are receptive seemings) would be, however, that all seemings should be thought of as receptive, and thus as always capable of conferring prima facie justification.

Ghijsen (2016) notes that it is hard to find a coherent characterization of the background knowledge that the subject must have to carry out good quasi-inferences. Suppose the background knowledge required to appropriately rationalize the transition from a receptive seeming that this nugget is yellowish in a given way F to a non-receptive seeming that this nugget is gold is the propositional knowledge that whatever looks yellowish in a way F is gold. How could this knowledge be acquired by the subject? It could not be acquired via quasi-inferences from receptive seemings of objects looking yellowish in a way F to non-receptive seemings of objects looking gold. For these quasi-inferences presuppose the background knowledge that we want to characterize. If this background knowledge were conceived of in terms of knowledge-how, it would have better prospects for helping. However, what exactly would this knowledge-how consist of? If this account is meant to be internalist, it cannot coincide with the subject’s mere ability to reliably recognize gold when she comes across it. Thus, the problem remains open.

iii. The Knowledge-How Account

The last inferentialist account we survey, developed by Markie (2013), holds that S’s perceptual experience as if P is epistemically appropriate—namely, it provides S with prima facie justification for believing P—if it results from S’s knowledge-how about the proposition that P. This knowledge-how consists of S’s being disposed to have the perceptual experience as if P in response to S’s attending to particular features of her overall experience and S’s being disposed to do so in virtue of her having background knowledge-that these particular features of her experience indicate that P is true (Markie 2013). Consider an expert orthopedist who has a perceptual experience as if (P) the X-ray shows a knee suffering from osteochondritis. The experience provides the orthopedist with prima facie justification for believing P, for the experience is epistemically appropriate. This is so because the experience results from her knowledge-how about P. This knowledge-how involves both her being disposed to entertain that specific perceptual experience in response to her attending to the particular features of her overall experience, and her having that disposition in virtue of having background knowledge that these particular features of her experience indicate that P is true.

More accurately, Markie analyzes S’s knowing-how as being constituted by (i) S’s disposition to have a perceptual experience as if P after her shift of attention to relevant features of her overall experience, (ii) S’s possession of background information that anything displaying those features is appropriately connected in some factual sense with P (for example, background evidence or justification that any object provided with these features is actually gold), and (iii) the character of S’s disposition being at least partly determined by S’s background information.

For Markie, S’s knowledge-how about P need not be accompanied by S’s reliable practice. (In the evil demon scenarios, the expert knows how to identify gold, though he fails to identify it reliably.) Furthermore, even when S’s practice is reliable, this alone does not provide S with the relevant knowledge-how. S’s reliable practice must be accompanied by S’s understanding that the right object or type of object (for example, gold) has been identified by her.

Markie himself acknowledges that both the method of S’s acquiring the relevant knowledge-that and the latter’s relationship with S’s knowledge-how require further specification. One might also doubt that knowledge-how always coexists with knowledge-that, and that knowledge-how depends on knowledge-that when the two coexist (Lyons 2016). Furthermore, the knowledge-how account of cognitive penetration is afflicted by a problem analogous to one that affects McGrath’s. Markie’s account requires all epistemically appropriate perceptual experiences to depend on S’s understanding and doing. For it is S’s knowledge-that which determines S’s disposition to form appropriate perceptual experiences in response to given features of her experience. But this knowledge-that is an agent-level factor. So, the knowledge-how account holds that the formation of appropriate perceptual experiences happens at the personal level, which is implausible (Lyons 2016).

Another difficulty of McGrath’s receptivity account also seems to afflict the knowledge-how account. Markie’s account might not address what is really at stake in cases of bad cognitive penetration. For bad cognitive penetration might directly affect the features of S’s experience that S attends to and in response to which she forms her perceptual experiences (Lyons 2016). Markie considers this criticism and bites the bullet: for him, if cognitive penetration directly affected these features, S’s experiences would still be capable of conferring justification, provided they were produced through the exercise of S’s relevant knowing-how.

iv. Presentational Conservatism

Chudnoff’s (2019) presentational conservatism is a restrained version of phenomenal conservatism that is both accessibilist and mentalist. Presentational conservatism imposes the following additional condition necessary for a perceptual experience to supply immediate justification: the experience must have a presentational phenomenology.

Suppose you see a picture of a dog with an occluded middle part. Your perceptual experience is presentational with respect to the left part of the dog and its right part, but not with respect to the middle part of the dog. This is so even though the middle part of the dog is somehow represented in the picture.

According to Presentational Conservatism it is only those contents with respect to which an experience has presentational phenomenology that prima facie justifies on its own, that is, immediately. If it justifies other contents, then it does so mediately. That the justification is mediate does not mean that it is remote or difficult to attain. Your experience of the partly occluded dog, for example, justifies you in believing various things about the dog’s middle both because they are made likely by the propositions about the dog’s rightward and leftward parts that it immediately justifies, and even entailed by some of the propositions about the whole dog that it immediately justifies (Chudnoff 2019, p. 6).

Chudnoff suggests three different ways in which presentational conservatism can account for cases of bad cognitive penetration, depending on what proposition is taken to be the target and what part of one’s experience cognitive penetration is taken to affect. Chudnoff focuses on the Angry Jack example. Let us consider all three accounts in turn.

Here is the first. Consider the proposition (a) Jack’s eyes and mouth are neutrally shaped, and the proposition (b) Jack is angry.

Jill’s experience immediately justifies her in believing (a) because it is both represented and presented; Jill’s experience doesn’t immediately justify her in believing (b) because though represented it isn’t presented; Jill’s experience would mediately justify her in believing (b) if she had reason to think that if (a) is true then (b) is true; but she doesn’t; so it doesn’t (Chudnoff 2019, p. 10).

Chudnoff suggests that Jill’s experience does not have presentational phenomenology with respect to (b) because anger is a mental state and, as such, is invisible. So, it cannot presentationally seem to Jill that Jack is angry.

This account could be extended to other cases of cognitive penetration in which the penetrated perceptual experience results in a mental state without presentational phenomenology. In all these cases the perceptual experiences would be downgraded (see Brogaard 2018 for a similar strategy).

Epistemologists and philosophers of mind who believe that high-level properties are genuinely presented in our experiences might insist, however, that Jill’s experience that Jack is angry does have presentational phenomenology. These philosophers might raise similar objections to analogous accounts of experience downgrade. This exposes a general weakness of presentational conservatism: since it is somewhat controversial what things and features can genuinely be presented in perceptual experience (Siegel 2016), if presentational conservatism is endorsed, it becomes equally controversial what sort of beliefs can be immediately justified by our perceptual experiences.

Here is Chudnoff’s second explanation. Consider again proposition (a) and the proposition (c) Jack’s eyes and mouth express anger. Chudnoff thinks that although anger is not visible, one can see facial organs expressing anger. That facial organs express anger is something that can presentationally seem to one to be the case. By these lights, a presentational conservative can claim that Jill’s experience has presentational phenomenology with respect to both (a) and (c). Hence,

Jill’s experience immediately justifies her in believing (a) because it is both represented and presented; Jill’s experience immediately justifies her in believing [c] because it is both represented and presented; but Jill’s justification for believing (a) defeats Jill’s justification for believing [c] because she knows that if (a) is true, then [c] is not true . . . Though Jill’s experience prima facie justifies her in believing that Jack’s eyes and mouth express anger, all things considered Jill does not have justification for believing that Jack’s eyes and mouth express anger because she has justification for thinking that Jack’s eyes are horizontal, as is his mouth and she knows that horizontal eyes and mouth do not express anger (Chudnoff 2019, pp. 10–11).

What is affected in this case is only all-things-considered justification. Chudnoff suggests that the justification for (a) defeats the justification for (c), and not the other way around, because Jill’s experience has stronger presentational phenomenology with respect to (a). Had Jill’s experience had stronger presentational phenomenology with respect to (c), the justification for (c) would defeat that for (a).

Both explanations above assume that cognitive penetration does not change Jill’s experience with respect to the low-level neutral characteristics of Jack’s face. Chudnoff’s third explanation assumes that cognitive penetration causes Jill’s experience of Jack to have low-level, angry-face features. Chudnoff acknowledges that in this case Jill’s experience would have presentational phenomenology with respect to the proposition that the features of Jack’s face express anger. Therefore, her perceptual experience would provide immediate justification for (c) and, indirectly, for (b). Some epistemologists would find this result counterintuitive.

d. Other Options

This section presents four miscellaneous responses to the epistemic problem of cognitive penetrability that do not clearly fit the internalism-externalism dichotomy.

i. Sensible Dogmatism

Brogaard’s (2013) sensible dogmatism holds that experiences are mere collections of sensory impressions. Brogaard calls phenomenal contents of an experience the sensory impressions that constitute the experience. Furthermore, Brogaard calls phenomenal seemings the “interpretations” of experiences—that is to say, the conceptual or propositional ingredients of perception.

Sensible dogmatism is a special version of phenomenal conservatism that implies the downgrade thesis. This is its core principle:

If it seems to S as if [P] and the seeming is grounded in the content of S’s . . . experience, then, in the absence of defeaters, S thereby has at least some degree of justification for believing that [P] (Brogaard 2013, p. 278).

S’s seeming that P is grounded in a phenomenal content Q of an experience E that S has just in case (i) reliably, if Q is a content of S’s experience E, it seems to S as if P and (ii) reliably, if it seems to S as if P, P is true. (i) can be understood as: in most ‘hypothetical situations’ closest to the actual one in which Q is a content of S’s experience E, it seems to S as if P. (ii) prevents seemings from being grounded in the content of experiences by ‘sheer’ coincidence. (ii) does not require P to be actually true; it just requires P to be true in most of the closest ‘hypothetical situations’ where S has the seeming that P (Brogaard 2013).

Sensible dogmatism can explain the novice prospector case as follows: the novice is not justified in his belief that P because (i) is not met. Since the desire to find gold is not present in most of the closest possible situations where the novice has the same sensory experience of the pebble, this sensory experience does not lead him, in those situations, to have a seeming that the pebble is gold (Brogaard 2013). Another way in which sensible dogmatism can explain the novice case is this: suppose the novice’s desire is present in most or all of the closest possible situations where he has the sensory experience of the pebble, leading him to have the seeming that the pebble is gold even in cases where it is not so. Then, (ii) is not satisfied. For the content of his seeming that the pebble is gold would not be true in most of the closest possible situations where it would seem to him that the pebble is gold (Brogaard 2013). In conclusion, since the novice’s seeming that this pebble is gold is not grounded in the content of his experience, his seeming does not justify his belief that the pebble is gold. It is easy to see, on the other hand, that the expert prospector’s seeming is grounded in the content of his own experience, so this seeming justifies his belief (Brogaard 2013).

Given the reliabilist component of Brogaard’s position, sensible dogmatism appears to be an externalist view. Yet Brogaard insists that it is a weak internalist position, for the mental states that provide S with justification are accessible to S, though the factors that determine whether those mental states are justification-providing are not.

The reliabilist components of Brogaard’s position make it inherit problems from externalist reliabilism. Think for instance of the consequence of sensible dogmatism that the seemings of a victim of the Matrix would not provide her with justification because (ii) would not be met in the Matrix scenario (see Vahid 2014).

ii. The Imagining Account

Teng’s (2016) imagining account bases its defense of the downgrade thesis on a possible psychological explanation, presented in Macpherson (2012), of how cognitive penetration is produced in a subject S. Suppose S entertains a perceptual experience. According to Macpherson, one possible cognitive-penetration-causing mechanism involves the interaction of imagination and perceptual experience. In particular, it involves (i) the production of an imaginative experience by some mental state of S, and (ii) the interaction of this imaginative experience with S’s perceptual experience. The upshot is a novel phenomenal state of S with both the perceptual experience and the imaginative experience as contributors. As Teng emphasizes, since imaginative experiences are experiences in a sense fabricated by S, the phenomenal states resulting from a combination of an imaginative experience and a perceptual experience of S are to be considered to be partly fabricated by S as well. Cognitively penetrated experiences could be states of this type.

Teng finds it intuitive that an experience of S supplies S with prima facie justification for believing its content only if S does not fabricate (consciously or unconsciously) the experience. She infers from this that no imaginative experience of S could be a prima facie justification-provider. Teng concludes that since any cognitively penetrated experience of S is partly fabricated by S, it must be epistemically downgraded with respect to the fabricated part (Teng 2016).

A potential difficulty of this account concerns the explanation of the cases of good cognitive penetration. Teng submits that these cases might be explained by mere attentional shifts of S involving no imagining and capable of rendering certain objective features of the world more salient to S. She also suggests that S’s imagining might explain some specific cases of good cognitive penetration. For imagining could occasionally facilitate the perception of independent reality rather than interfering with it. Consider for instance the following experiment:

J. Farah (1985 and 1989) asked her participants to detect the presence of a faint letter H or T in a square while the participants projected a mental image of H or T onto the same location. It turned out that their detection was more accurate when they were imagining the same letter than a different one (Teng 2016, p. 25).

iii. The Analogy with Emotions

Vance’s (2014) account explains why a perceptual experience can be downgraded by its inappropriate etiology through drawing an analogy between cognitively penetrated experiences and cognitively penetrated emotional states.

Suppose S has an unjustified background belief that all foreigners are dangerous. One day S meets some foreigners and her background belief causes S to feel fear. Had she not had her unjustified belief, she would not have felt fear. On the basis of her fear, she forms the belief that the people before her are dangerous. Her fear is in this case downgraded: it cannot provide justification for her belief that those people are dangerous because it is grounded in a belief constituting a defective reason for her feeling. When emotions are grounded in such a defective way, their justificatory power decreases or ceases (Vance 2014). An emotional state with an etiology starting with a non-defective reason for the emotion could nevertheless be a justification-provider. For instance, S’s fear of a snake that S spots on her trail, caused by her justified background belief that snakes are dangerous, can provide S with justification for believing that walking on the trail is unsafe (Vance 2014).

Vance stresses that emotional states and perceptual experiences share extrinsic properties—such as psychological and epistemic features of their etiological structure—and intrinsic properties—such as their intentional character and distinctive phenomenology. From this, he derives that perceptual experiences, as well as emotions, can be downgraded with respect to their justificatory power. He submits that, in analogy with emotional states, this typically happens when perceptual experiences are grounded in unjustified beliefs.

A possible criticism of Vance’s account is that it is controversial whether the similarities between emotions and experiences could outweigh their differences in such a way that they both turn out to be rationally assessable states and in a similar way (Silins 2016).

iv. The Sensorimotor Theory of Perception

Vahid’s (2014) account of the cognitive penetrability problem and defense of the downgrade thesis rely on a conception of perceptual experience different from the traditional ones that conceive of perception as something given to us. Vahid’s conception is part of the extended cognition view of mental processes, which maintains that mental processes are partly constituted by environmental components situated outside the subject’s body. Think of Otto—a memory-impaired man—who uses his notebook to take notes that help him remember things he would otherwise forget. Otto’s cognition can be said to have been extended to his notebook.

While, on the received view, the notebook is not part of Otto’s cognitive processes, [the extended cognition thesis] takes Otto and his notebook to form a cognitive system where the information stored in the notebook functions as Otto’s non-occurrent, dispositional beliefs. Cognitive processes are not, thus . . . purely in the head (Vahid 2014, p. 453).

Similarly, perceptual experiences may not be only in the subject S’s head. The sensorimotor theory of perception—one of the extended perception accounts—turns on the thought that perceptual experience is not just produced by S’s brain processes but is constituted by the ways in which these processes enable S to interact with her environment. In this account, S’s perceptual experience depends on both the features of S’s perceptual apparatus and those of the world to which this apparatus is sensitive.

[W]hen looking at a red apple, the sensation of seeing the apple . . . merely consists in our understanding or knowledge of a class of relevant counterfactuals, e.g., that if one were to move one’s eyes or body with respect to the apple, the sensory signals change in a way characteristic of red, rather than green, apples. One’s experience of seeing a red apple just is the knowledge of the class of the relevant sensorimotor contingencies (Vahid 2014, pp. 454–455).

In this view, perceptual experiences result from S’s expectations, assumptions, suppositions, understanding or implicit knowledge about what would happen in terms of new inputs from the world if S interacted in specific ways with the things the perceptual experiences are about (see Vahid 2014). (This theory is closely related to a model of the mind called predictive coding; see Hohwy 2013 and Clark 2013.)

To understand Vahid’s account of the cognitive penetrability problem, let us go back to Expert and Novice and Angry Jack. Vahid maintains that only the expert has implicit knowledge of the counterfactuals describing the perceptual consequences of his interaction with the nugget—or, at least, that the expert’s knowledge of them is more thorough than the novice’s. So, when faced with a gold nugget, the two prospectors actually have different cognitively penetrated experiences. For the expert’s experience is constituted by more numerous and detailed perceptual expectations than the novice’s experience is. This enables us to distinguish the good cognitive penetration of the expert’s perceptual experience from the bad cognitive penetration of the novice’s perceptual experience. Angry Jack is interpretable along similar lines. Jill’s initial unjustified belief that Jack is angry penetrates her experience of Jack’s face by producing in Jill all the typical perceptual expectations that constitute perception of anger. In this case, we can say that Jill’s perceptual experience is badly penetrated because most of her expectations are mistaken (Vahid 2014).

Why is the novice’s belief that the nugget is gold not justified by his perceptual experience? And why is Jill’s belief that Jack is angry not justified by her experience? To answer these questions Vahid appeals to an explanationist conception of epistemic justification according to which a proposition is justified as long as it is the best available explanation of the subject’s evidence.

In the version of the angry-looking Jack case . . . the truth of Jill’s belief is not the best explanation of her incorrect expectations and assumptions that constitute her experience of seeing Jack’s face. Only correct expectations and suppositions reflect the facts about the external world . . . Likewise, in the gold-digging case, the truth of the novice’s belief that the pebble is gold is not the best explanation of his (thin) class of sensorimotor knowledge constituting his output experience as [the] less complex and simpler hypothesis [that the novice does desire to find gold] can discharge this function (Vahid 2014, p. 457).

See Ghijsen (2018) and Macpherson (2017) for discussion and criticism.

3. Conclusion

This article has provided an introductory map to the contemporary debate on the problem of cognitive penetrability of perception for epistemic justification. Internalist accessibilists typically do not concede that justification is hostage to cognitive penetration and put forward resolute responses to the cognitive penetration problem. On the other hand, externalist reliabilists together with some internalists from the mentalist camp concede that cognitive penetration may affect justification negatively and attempt to provide explanations of why and how this can happen. There are a few alternative accounts of the cognitive penetration problem that cannot easily be classified within the internalism-externalism framework.

4. References and Further Reading

Audi, Robert. 1993. The Structure of Justification. Cambridge: Cambridge University Press.
Becker, Kelly. 2018. “Reliabilism.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/reliabil/
Brogaard, Berit. 2013. “Phenomenal Seemings and Sensible Dogmatism.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
Brogaard, Berit. 2018. “Bias-Driven Attention, Cognitive Penetration and Epistemic Downgrading.” In Christoph Limbeck and Friedrich Stadler (eds.), Philosophy of Perception. Publications of the Austrian Ludwig Wittgenstein Society. De Gruyter.
Chudnoff, Elijah. 2019. “Experience and Epistemic Structure: Can Cognitive Penetration Result in Epistemic Downgrade?” https://philpapers.org/archive/CHUEAE-2.pdf (accessed on 1/5/2019).
Clark, Andy. 2013. “Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science.” Behavioral and Brain Sciences 36: 3, 181–204.
Connolly, Kevin. 2017. “Perceptual Learning.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/sum2017/entries/perceptual-learning/.
Farah, Martha J. 1985. “Psychophysical Evidence for a Shared Representational Medium for Mental Images and Percepts.” Journal of Experimental Psychology 114:1, 91–103.
Farah, Martha J. 1989. “Mechanisms of Imagery-Perception Interaction.” Journal of Experimental Psychology: Human Perception and Performance 15:2, 203–211.
Fumerton, Richard. 2013. “Siegel on the Epistemic Impact of ‘Checkered’ Experience.” Philosophical Studies 162:3, 733–739.
Ghijsen, Harmen. 2016. “The Real Epistemic Problem of Cognitive Penetration.” Philosophical Studies 173:6, 1457–1475.
Ghijsen, Harmen. 2018. “Predictive processing and foundationalism about perception.” Synthese. Open access. https://doi.org/10.1007/s11229-018-1715-x
Goldman, Alvin I. 1979. “What is Justified Belief?” In George Pappas (ed.), Justification and Knowledge. Dordrecht: Reidel.
Hansen, Thorsten, Olkkonen, Maria, Walter, Sebastian and Gegenfurtner, Karl R. 2006. “Memory Modulates Color Appearance.” Nature Neuroscience 9:11, 1367–1368.
Hohwy, Jakob. 2013. The Predictive Mind. Oxford: Oxford University Press.
Huemer, Michael. 2001. Skepticism and the veil of perception. Lanham, MD: Rowman and Littlefield.
Huemer, Michael. 2007. “Compassionate Phenomenal Conservatism.” Philosophy and Phenomenological Research 74:1, 30–55.
Huemer, Michael. 2013a. “Epistemological Asymmetries between Belief and Experience.” Philosophical Studies 162:3, 741–748.
Huemer, Michael. 2013b. “Phenomenal Conservatism Über Alles.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
Lyons, Jack C. 2011. “Circularity, reliability, and the cognitive penetrability of perception.” Philosophical Issues 21, 289–311.
Lyons, Jack C. 2016. “Inferentialism and Cognitive Penetration of Perception.” Episteme 13:1, 1–28.
Macpherson, Fiona. 2012. “Cognitive Penetration of Color Experience: Rethinking the Issue in Light of an Indirect Mechanism.” Philosophy and Phenomenological Research 84:1, 24–62.
Macpherson, Fiona. 2017. “The relationship between cognitive penetration and predictive coding.” Consciousness and Cognition 47, 6–16.
Markie, Peter J. 2005. “The mystery of direct perceptual justification.” Philosophical Studies 126, 347–373.
Markie, Peter J. 2006. “Epistemically appropriate perceptual belief.” Noûs 40:1, 118–142.
Markie, Peter J. 2013. “Searching for true dogmatism.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
McGrath, Matthew. 2013a. “Siegel and the Impact for Epistemological Internalism.” Philosophical Studies 162:3, 723–732.
McGrath, Matthew. 2013b. “Phenomenal Conservatism and Cognitive Penetration: The ‘Bad Basis’ Counterexamples.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
Pappas, George. 2014. “Internalist vs. Externalist Conceptions of Epistemic Justification.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/fall2017/entries/justep-intext/.
Payne, Keith B. 2001. “Prejudice and Perception: The Role of Automatic and Controlled Processes in Misperceiving a Weapon.” Journal of Personality and Social Psychology 81:2, 181.
Poston, Ted. 2018. “Internalism and Externalism in Epistemology.” Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/int-ext/
Pryor, James. 2000. “The Skeptic and the Dogmatist.” Noûs 34:4, 517–549.
Raftopoulos, Athanassios and Zeimbekis, John. 2015. “Cognitive Penetrability of Perception: An Overview.” In Athanassios Raftopoulos and John Zeimbekis (eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives. Oxford University Press.
Siegel, Susanna. 2012. “Cognitive penetrability and perceptual justification.” Noûs 46, 201–22.
Siegel, Susanna. 2013a. “The Epistemic Impact of the Etiology of Experience.” Philosophical Studies 162:3, 697–722.
Siegel, Susanna. 2013b. “Reply to Fumerton, Huemer, and McGrath.” Philosophical Studies 162:3, 749–757.
Siegel, Susanna. 2015. “Epistemic Evaluability and Perceptual Farce.” In Athanassios Raftopoulos and John Zeimbekis (eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives. Oxford University Press.
Siegel, Susanna. 2016. “The Contents of Perceptions.” In Edward N. Zalta (ed.), Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/win2016/entries/perception-contents/.
Siegel, Susanna. 2017. The Rationality of Perception. New York: Oxford University Press.
Siegel, Susanna. 2018. “Can Perceptual Experiences Be Rational?” Analytic Philosophy 59:1, 149–174.
Silins, Nicholas. 2016. “Cognitive Penetration and the Epistemology of Perception.” Philosophy Compass 11:1, 24–42.
Skene, Matthew. 2013. “Seemings and the Possibility of Epistemic Justification.” Philosophical Studies 163: 539–59.
Stokes, Dustin. 2012. “Perceiving and desiring: a new look at the cognitive penetrability of experience.” Philosophical Studies 158: 3, 477–492.
Stokes, Mark B. and Payne, Keith B. 2011. “Mental control and visual illusions: Errors of action and construal in race-based weapon misidentification.” The Science of Social Vision 7, 275–295.
Teng, Lu. 2016. “Cognitive Penetration, Imagining, and the Downgrade Thesis.” Philosophical Topics 44:2, 405–426.
Tucker, Chris. 2010. “Why open-minded people should endorse dogmatism.” Philosophical Perspectives 24:1, 529–545.
Tucker, Chris. 2011. “Phenomenal Conservatism and Evidentialism in Religious Epistemology.” In Kelly James Clark and Raymond J. VanArragon (eds.), Evidence and Religious Belief. Oxford: Oxford University Press.
Tucker, Chris. 2013. “Seemings and Justification: An Introduction.” In Chris Tucker (ed.), Seemings and Justification. Oxford: Oxford University Press.
Tucker, Chris. 2014. “If Dogmatists Have a Problem with Cognitive Penetration, You Do Too.” Dialectica 68:1, 35–62.
Tucker, Chris. 2019. “Dogmatism and the epistemology of covert selection.” https://philpapers.org/rec/TUCDAT (accessed on 1/5/2019)
Vahid, Hamid. 2014. “Cognitive penetration, the Downgrade Principle, and Extended Cognition.” Philosophical Issues 24:1, 439–459.
Vance, Jonna. 2014. “Emotion and the new epistemic challenge from cognitive penetrability.” Philosophical Studies 169:2, 257–283.

Author Information

Christos Georgakakis
Email: c.georgakakis@abdn.ac.uk
University of Aberdeen
United Kingdom

and

Luca Moretti
Email: l.moretti@abdn.ac.uk
University of Aberdeen
United Kingdom

Simone Weil (1909—1943)

The French philosopher Simone Weil is a confronting and disconcerting figure in modern philosophy. This is not simply because she was so many things at once—ascetic and mystic, teacher and factory worker, labour activist and political militant, social thinker and piercing moral psychologist, critical Marxist and heterodox Christian theologian—but because of the striking “untimeliness” of her thought. For unlike philosophers in the analytic tradition, she insisted that life and philosophical reflection are connected on the deepest ethical level; and, unlike those in the postmodern tradition, she felt free to draw on terms like “truth,” “reality,” “the sacred,” “justice,” “soul,” and “God.”

Weil, of course, was not an analytic philosopher, nor a proto-postmodernist. She came to philosophy in the interwar years in a philosophical milieu of political radicalism, phenomenology, and emerging existentialism. As did most of her contemporaries, she saw philosophy in terms of the nature and challenges of the human condition, though she differed from the existentialists as to what this meant.

Whereas Jean-Paul Sartre and Simone de Beauvoir saw things in terms of the individual’s radical freedom to choose their values in a Godless world, Weil took a different path. Her concern was not to perfect herself as a replacement God figure, creating values out of a supposed absolute freedom, but to face up to, to attend to, the real existence of other people. Whereas the existentialism of Sartre saw him faced with the challenge of showing how morality was even possible, Weil took the possibility of morality as a given—as an essential and fundamental modality of human life and experience, however partial and flawed its manifestations—and sought to show what it was to take morality seriously.

Taken that way, moral life rested on our capacity to care for others, where this meant to care for them as they were, and not as a means or obstacle to any end of our own, even that of our moral perfection or virtue. To refuse this attention was to read the world so that nothing and no-one was sacred, not even oneself. This reading gave us the world of power and so the sovereignty of force, and it was the ultimate logic of force “that it turn[ed] anyone subject to it into a thing.”

Such a reading of the world denied the ethical, yet equally it was precisely this denial the ethical sought to overcome. Here, for Weil, was a fundamental contradiction at the heart of ethical life. It was not a contradiction that meant the impossibility of that life; rather, it showed us that the ethical was, ultimately, and at its foundations, something supernatural.

This article looks at Weil as a moral philosopher in a tradition that runs through Plato to Kant: one who took morality with a seriousness, with an utter commitment, alien to those philosophers tempted by scepticism or, in reaction, by a desire to find some rational foundation on which to securely rest an otherwise threatened edifice.

Life
Writings
Suffering, Oppression, Liberty
Affliction, Detachment, the Impersonal, and the Sacred
Uprootedness and the Needs of the Soul
The Moral Ground
References and Further Reading

1. Life

Simone Weil was born in Paris on February 3, 1909, the second of two children born to comfortably off agnostic and secular Jewish parents. Her father was a medical doctor, and her brother André, three years her senior, would become one of the most renowned mathematicians of the 20th century.

From the start Weil was both intellectually precocious and morally disconcerting. The intellectual capacity ran in the family (indeed, at 14, Weil would have a personal crisis in the face of what she considered her brother’s far superior abilities), but the moral sensitivity was her own and showed itself in various ways (for instance, refusing at age 5 to accept a necklace as a present on grounds of the discriminatory nature of luxury, and the very next year refusing to eat more sugar than that allotted to French troops as they battled the Germans).

She was educated at a number of schools and by private tutors before attending the Lycée Henri-IV as a pupil of the greatest philosophy teacher of the period, Émile Chartier (“Alain”). In 1928, and at her second attempt, she gained admission to the École Normale Supérieure, beating Simone de Beauvoir into second place in the Exam for General Philosophy and Logic. She studied philosophy there, graduating in 1931 with a diplôme d’études supérieures on the basis of her thesis “Science et perception dans Descartes.” The same year, she passed the French Civil Service Examination (the agrégation) and was appointed to a girls’ secondary school in the regional centre Le Puy, where she taught until 1936, with many breaks to pursue union activities, investigate Communist labour organizations in Germany, and fight on the Republican side in the Spanish Civil War.

After burning her foot badly stepping into a camouflaged pot of hot cooking oil, she left Spain and spent time in Portugal, then Italy, where she had her first mystical experiences.

The outbreak of World War II saw her in Paris, then, after the German invasion, in Marseille, publishing essays and doing what she could for those, often Jews like herself, seeking escape from Vichy France and the Nazi threat. In 1942, she accompanied her parents first to Morocco, then to New York, though she herself, determined to contribute to the Free French cause, soon returned to Europe, now to London. Weakened by inadequate nutrition and anguish, she died of tuberculosis on the evening of August 24, 1943, and, while not a baptised Catholic, was buried in a pauper’s grave in the Catholic Section of Bybrook Cemetery in Ashford, Kent.

2. Writings

Weil’s writings (collected now in 20 volumes) were produced in a mere 15 years. Much—including much of that which is most widely known—was published posthumously. Most of the work published in her lifetime was in the form of short essays for small political and literary journals, addressed to particular audiences. Such writings form only a small part of her collected work.

During her short life, she was most widely known as a political writer of the Left, an unorthodox and critical Marxist. Her most important work in this genre (though unpublished until 1955) was Reflections Concerning the Causes of Liberty and Social Oppression (1934). Around 1935, and especially after her first mystical experience in 1937, her writings took what many believed to be a new, religious direction. These writings, essays, notebooks, and letters she entrusted to the lay Catholic theologian Gustave Thibon in 1942, when, with her parents, she fled France. With the editorial help of Weil’s spiritual consultant (and sparring partner) Fr. Perrin, selections of these writings first made Weil widely known in the Anglo-American world. The serious effort for a complete publication of all Weil’s writings was largely the result of Albert Camus’ discovery of Weil’s writings while an editor at Gallimard (in 1951, he called her “the only great mind of our time”). In 1988, Gallimard completed publication of her writings.

3. Suffering, Oppression, Liberty

In Memoirs of a Dutiful Daughter, de Beauvoir reports her first and perhaps only personal interaction with Weil in, most likely, 1929. “A great famine had just begun to devastate China,” she writes, and:

I was told on hearing the news she [Weil] had wept; these tears commanded my respect even more than her philosophical talents. I envied her for having a heart that could beat right across the world. One day I managed to approach her. I don’t remember how the conversation began; she declared in no uncertain terms that one thing alone mattered in the world today: the Revolution that would feed all the people on earth. I retorted, no less peremptorily, that the problem was not to make men happy, but to find a meaning for their existence. She looked me up and down: “It is easy to see you have never gone hungry,” she said. Our relationship stopped there. (239)

In this small exchange we see much of that which would shape Weil’s thought. What was basic for human life, and so a philosophy that dealt with the concerns of such a life, was not a quest for meaning, but rather a search for sustenance, for food. The food required was, in the end, both physical and spiritual, for there were needs of the body and needs of the soul. First there was, however, the need for physical sustenance. It followed that the primordial caring constitutive of the ethical must look always and first to the physical needs of other human beings. “It is an eternal obligation toward the human being not to let him suffer from hunger when one has a chance of coming to his assistance.”

This eternal obligation (eternal because constitutive) placed us as human beings into a shared community of mutual obligations.

For the early Weil, this eternal ethical obligation seemed, as it did at the time to many others, to be clearly and equally a political obligation (“revolution”). The task was to comprehend and, so far as possible, to deliver a social order that, because it enabled us to attend to the material needs of others, allowed those needs to be met.

It was here she found Karl Marx essential. “Marx’s truly great idea,” she wrote, was “that in human society as well as in nature nothing takes place otherwise than through material transformations.” It followed that to effectively meet our fundamental obligation required we uncover “the material conditions which determine our possibilities of action… conditions… defined by the way in which man obeys material necessities in supplying his own needs, in other words, by the method of production.”

For Weil, Marx could be understood as attempting to bring about a social order that enabled all in it to live, and so to be treated as ends-in-themselves. As such, it had to be a society free from oppression; and so a society in which all could (and did) attend to others, rather than viewing them indifferently, or as facilitating or hindering some personal or sectional interest or goal.

The trouble with Marx was not his failure to see this; it was his failure to understand the ultimate roots of oppression, and so what it would mean to overcome it. Thus, he thought that what we had to do was encourage the productive forces of capitalism so that they broke asunder the chains of labouring necessity; and he thought that the way to do this was to banish private property and so the drive for surplus value extraction.

However, as she saw it, this was not enough, and she pointed out that Marx himself at times seemed clearly to appreciate this. For the roots of the oppression that diminished, even sometimes obliterated, our capacity to attend to the basic needs of others did not lie solely, even mainly, in the fact of private property. She made the point this way:

“In the factory”… [Marx] writes in Capital, “there exists a mechanism independent of the workers, which incorporates them as living cogs… The separation of the spiritual forces that play a part in production from manual labour, and the transformation of the former into power exercised by capital over labour, attain their fulfilment in big industry founded on mechanization. The detail of the individual destiny of the machine-worker fades into insignificance before the science, the tremendous natural forces and the collective labour which are incorporated in the machines as a whole and constitute with them the employer’s power.” Thus the worker’s complete subordination to the undertaking and to those who run it is founded on the factory organization and not on the system of property [emphasis added]. (OL 9-10)

For Weil, the logic of “the factory system” that Marx had pointed to, even as he had missed its importance, was not limited simply to that system. It was, rather, a matter of the division—inherent to any social order above the most rudimentary—between intellectual and physical labour. This division was, at the same time, a division between people, dividing the human world into “two categories of men: those who command and those who obey.” This division undermined the foundations of ethical life because those who commanded could not avoid “reading” those they ordered about as—in the light of their being ordered about—means (or obstacles) to the desired ends. Such power over others as instruments or obstacles did two things to those who wielded it: it “intoxicated” them so that they no longer saw their own vulnerability before the necessities and contingencies of the world (their “ultimate fragility”), nor did they see, because of this intoxicated blindness, the humanity (and so the suffering) of those they lorded it over.

Still, as she saw it at this stage (before her discovery of the “enigma” of affliction), this did not mean that the capacity to attend to, and to care for, the suffering of others demanded “a miracle,” and so was something “supernatural.” What it demanded was, rather, a certain technique of compassion. “Human beings,” she wrote, “are so made that the ones who do the crushing feel nothing; it is the person crushed who feels what is happening.” If, in such a world—that is to say, in our world—ethical life was to find its footing, the challenge was clear: “unless one has placed oneself on the side of the oppressed,” she wrote, unless one “feel[s] with them, one cannot understand.”

4. Affliction, Detachment, the Impersonal, and the Sacred

At this point, for all its elegance and clarity, Weil’s moral philosophy was, ultimately, nothing out of the ordinary. Ethical life presupposed caring for others; and caring for others counted most essentially when others were in need, and so when they were suffering. The moral task was to let it register as it registered in and on the suffering one. It demanded an attentive compassion, understood as “the rarest and purest form of generosity.”

As an intellectual or theoretical stance, all this was unobjectionable, even admirable. However, it could not be simply and completely an intellectual or theoretical stance, for ethical life was also, and fundamentally, a practical matter. Marx himself had insisted on this. He said, “the philosophers have only interpreted the world in various ways; the point, however, is to change it.” To change it in an ethical direction and from an ethical stance, however, one had to do more than simply say or think that one understood the oppression, and so the suffering, one sought to identify, alleviate, and eliminate. This was the problem with “the major Bolshevik leaders,” for they pretended “to create a free working class and yet none of them—definitely not Trotsky, and neither I think, Lenin… have… stepped foot into a factory and therefore have the least idea of the real conditions which determine the servitude or freedom of the workers.”

Obligations might be acknowledged, even fought for in revolutionary struggle, but to be truly recognised as the obligations, they had to penetrate. The point was particularly clear with suffering. For to acknowledge suffering as an ethical reality, it was not enough to endorse the description “so and so is suffering,” for that might be done by an entirely disinterested or impartial observer; rather, one needed to be penetrated by that suffering, and, out of the practical necessity involved in that penetration, to do what one could to meet the obligation that suffering imposed.

Here lay the real problem, and one that only came home to Weil when, in an effort to live up to and to live out her ethical vision, she went to work with those she saw at the time as most clearly of the class of those “who obey”: oppressed, menial, piece-working factory labourers. In this decision and project, she meant to place herself “on the side of the oppressed,” to “feel with them,” and so to understand and to act. Here she would live—and in living, demonstrate—the fundamental penetrative point of the ethical, of obligation, in (and into) the realm of force.

What happened, however, was that she found—in others and in herself—something that seemed to tear the realm of force and the ethical life irretrievably apart: she discovered that suffering that is affliction (malheur, literally “calamitous misfortune”). The suffering “seared the soul.”

It was affliction that turned her moral philosophy away from the conventional and that led her to speak of ethical life in religious terms; and it was affliction that made, or allowed, her to see that what made a human being sacred, what made them the kind of being whose suffering counted, was no ascriptive empirical fact about them, no matter how essential to their “personality,” but was, rather, the impersonal in them.

Affliction was suffering that robbed its bearer of all dignity, both in the eyes of others and in their own eyes. It left them “mutilated,” valueless, worthless. It involved the twinned and catastrophic impact of physical pain (which might be simply the fear of such pain), and social humiliation, social degradation. Affliction, she wrote in a letter to Father Perrin, “takes possession of the soul and marks it through and through with its own particular mark, the mark of slavery,” and it was what she found, in her co-workers and so in herself, as they laboured for Alsthom and Renault. “The affliction of others entered into my flesh and my soul… There I received forever the mark of slavery” (WG 66-67).

What this experience showed her was that her initial political reading of the conditions essential to the morality of attentive caring was ultimately a superficial one: one that did not take morality and its demands on us seriously enough. While there was no doubt that things could be done to reduce the opportunities and occasions for suffering, affliction showed us that human identity, and so the human sense of self dignity and the dignity of others, was inherently fragile, able to be shattered at any time by the unforeseen contingencies of necessity and force that left “the victim writhing on the ground like a half-crushed worm,” “like a butterfly pinned alive into an album.” Unless this terrible and eternal fact had been allowed to penetrate us, even the best-intentioned reforms, even especially those driven by revolutionary righteousness, would produce, in due course, their own half-crushed worms, their own pinned-alive butterflies.

To take morality seriously meant taking affliction seriously, for if suffering mattered at all, it certainly mattered here. It was just at this point, however, where everything was in the balance, that the inadequacy of her previous understanding revealed itself, for with affliction caring attention—being penetrated by the object—was “impossible.” In the essay “The Love of God and Affliction,” she wrote that the afflicted:

…have no words to express what has happened to them. Among the people they meet, even those who have suffered much, those who have never had contact with affliction (properly defined) have no idea what it is. It is something specific, irreducible to any other thing, like sounds we cannot explain at all to a deaf-mute. And those who themselves have been mutilated by affliction are in no state to bring help to anyone at all, and nearly incapable of even desiring to help. (WG 120)

In fact, it was not simply that those who had never experienced affliction could not comprehend it, it was that any normal, “healthy” human being naturally fled from such recognition, from such penetration: “thought flees from affliction as promptly, as irresistibly, as an animal flees death,” and it did so for a like reason—for affliction manifested that force that turns a human being into a thing. It might not do so by killing outright, but—in a way even more shocking—it managed the paradoxical horror of “turn[ing] a human being into a thing while he is still alive.”

To care for the afflicted, to have been penetrated by affliction, and so to have enacted and lived that point where ethical life meets force (and—the same thing—to make real the point where justice meets and condemns slavery), was to love “where there is nothing to love.” This was why “when compassion truly produces itself, it is a miracle more astonishing than walking on water, healing the sick or even the resurrection of the dead.”

To understand the miracle that gave ethical authority power in a world of amoral force and necessity meant understanding what it was to love human beings in so far as they come to be “read” by themselves and others “as nothing.”

This idea of attending to, of caring for, and so being penetrated by, a suffering that removed from its bearers “everything that makes us human” meant for Weil two things.

First, that what grounded our attention, our love, did not rest on or presuppose any positive (“valuable”) ascriptive fact about a person (for instance, their sense of rights, of freedom, their dignity or demand for respect, even their sense of hope or longing for the good). All these things, as she saw it, were matters merely of our “personality,” and it was our personality that, in affliction, was destroyed and annihilated. If there was to be any moral connection here, what was crucial could not be anything personal and individuating; as it were, something that stood there, able, as Eric O. Springsted put it, to “overcome circumstances, no matter how bad they are.” To the contrary, and as affliction showed us and the intoxication of power blinded us, “We possess nothing in the world—a mere chance can strip us of everything.”

And second, that to be penetrated by such suffering, such affliction, and so to recognise and respond to it, meant losing one’s own “personality,” one’s own individuality (“the power to say ‘I’”), and so to oneself experience the “void” of the living non-existence that is affliction. This was to be “de-created.” It was to accept the death, the absence, of all that made up our personality, and so to all that was particular in us that “attached” us to the world, and so made of it a kind of fantasy world, focally arrayed, and not something independent, impartially available, and so real. She wrote:

The reality of the world is the result of our attachment. It is the reality of the self which we transfer into things. It has nothing to do with independent reality. That is only perceptible through total detachment. Should only one thread remain, there is still attachment. (G&G 14)

Affliction destroyed the “I” of attachment, but it did not destroy or extinguish the possibility of ethical life and so the obligation to attend to such affliction. How could it? The void was real, as the necessity of avoiding, of fleeing, from it, brought home. It followed that the ultimate ground of value in us—the one that survived affliction insofar as it grounded an absolute obligation to meet and alleviate that personality-annihilating suffering—was the “impersonal” in us, not the “personal.” In the 1943 essay “Human Personality,” she wrote:

Neither the person nor the human person in him or her is holy to me… Far from it: it is that which is impersonal in a human being. All that is impersonal in humankind is holy, and that alone. (SE 10, 13)

Weil found it natural, even necessary, to speak of the impersonal in terms of our “soul,” and so of that which was “holy” in us, that which was “sacred,” and to view the de-creative capacity to attend to the impersonal in terms of “grace.” She found it equally natural, even necessary, to see the paradigm instance of this impersonality and its recognition, in the caring, afflicted, sacrifice of the Christ of the Crucifixion. However, just as often she spoke of the impersonal in terms of truth and (for her an aspect of the same thing) beauty, and it is this way of speaking that is perhaps the most instructive for philosophers, deriving as it does, and in her own unique way, from the philosopher she most valued, Plato.

For Weil, the pursuit of truth and our receptivity to beauty demanded, and so exhibited, the same kind of open, loving attention to the impersonal that was constitutive of the ethical life and its justice-bringing gaze. She pointed, as she often did, to mathematical truth to explain the point. “If a child is doing a sum and does it wrong,” she wrote, “the mistake bears the stamp of his personality. [But] if he does the sum exactly right, his personality does not enter into it at all.” Her idea was that any error here would have to be explained in terms of something individual to the child calculator—for obviously a sum, being mistaken, could not explain itself. However, a sum done “exactly right” just was explained, and completely explained, by itself; it is what, by arithmetical necessity, emerged in an act of attention filled with, penetrated by, the relevant numbers and (so) their relationships. Here there was nothing essentially personal, as there was in any mistaken calculation, only the impersonal—and so universal—truth of the sum as revealed in an act of pure attention.

Of course, a sum done rightly possessed a beauty that one done wrongly lacked, and it was here truth and beauty came together. Not only because the perception or awareness of the beautiful demanded just that impersonal attention ethical life demanded, but—and this was the astounding and contradictory, indeed the redeeming aspect of affliction—because that which we selflessly attended to, that which we allowed to penetrate us as it was in itself, and so in all its truth, was, for that very reason, seen and experienced, even in the horrors of affliction, as (also, at the same time, eternally) beautiful. This, for Weil, was just how it was when it came to loving attention.

For Weil, the internal tie between truth and beauty and loving attention—the tie that was constitutive, so “eternal,” in ethical life—found expression in the occasional miracles of compassionate awareness we might come across in life. However, we could find it expressed, too, in two works of supreme beauty: Homer’s Iliad, and the Gospels. In the authors of both, as they shaped their texts, we find expressed “the sense of human misery [that] was the precondition for justice and love.” Here was to be found “the incredible bitterness” of detached, sacred, justice as it penetrated into the ethical void of the world of force.

In the Iliad, Weil wrote, this bitter justice:

proceeds from tenderness and that spreads over the whole human race, impartial as sunlight. Never does the tone lose its coloring of bitterness; yet never does the bitterness drop into lamentation. Justice and love, which have hardly any place in this study of extremes and of unjust acts of violence, nevertheless bathe the work in their light without ever becoming noticeable themselves, except as a kind of accent. Nothing precious is scorned, whether or not death is its destiny; everyone’s unhappiness is laid bare without dissimulation or disdain; no man is set above or below the condition common to all men; whatever is destroyed is regretted. Victors and vanquished are brought equally near us; under the same head, both are seen as counterparts of the poet, and the listener as well. (25)

Homer, in the Iliad, saw the infinite value and fragility of human life with a loving, “impersonal,” and (so) unsentimental compassion. He was penetrated by all—Greek and Trojan, defeated and momentarily victorious, Achilles and Priam—and, bathed in his impersonal love, fashioned from their lives an object of supreme, eternal, beauty.

5. Uprootedness and the Needs of the Soul

In December 1942, Weil arrived in London from New York, desperate to contribute to the cause of the Free French. In nine months, she would be dead.

In those months, she returned to the political concerns first broached in Oppression and Liberty. She did so reluctantly, and only because her proposal to train and lead a corps of front-line nurses had been rejected (de Gaulle, on reading her proposal, had exclaimed, “but she’s mad!”). Instead she was set to work analysing political documents sent to London from Resistance Committees in France, many of which concerned the reconstruction of France after the hoped-for Allied victory.

Weil’s contributions to this literature—Draft for a Statement of Human Obligation and The Need for Roots: Prelude towards a declaration of duties towards mankind—were never finally completed, but what was completed lets us see how she brought the moral seriousness she had developed and explored in the years since 1934 to those political concerns she had always had. While she may not have sought the task, she embraced it as a necessity. That was because while it was one thing, and a great thing, to have attended to the suffering and affliction of others, much of that suffering was the result of “social force,” and so the obligation to respond to that suffering had to address those forces. After all—as she had acknowledged from the start—morality at any stage beyond the socially rudimentary led inevitably to politics.

The very titles brought out, in a way only implicit in Oppression and Liberty, the untimeliness of her moral and political thought. For she did not begin with rights, nor with the ideal of liberal freedom encapsulated in Hobbes’ remark that a free man “is he that… is not hindered to do what he has a will to.” She built, rather, on the internal ethical connection between need and obligation:

Obligation is concerned with the needs in this world of the souls and bodies of human beings, whoever they may be. For each need there is a corresponding obligation: for each obligation a corresponding need. There is no other kind of obligation, so far as human affairs are concerned. (SE 21)

Needs and obligations were more fundamental than rights of any kind. Indeed, to think rights fundamental to “social conflicts” was itself a grave moral error, for it “inhibit[ed] any possible impulse of charity on both sides.” She continued:

Relying almost exclusively on this notion [“rights”], it becomes impossible to keep one’s eyes on the real problem. If someone tries to browbeat a farmer to sell his eggs at a moderate price, the farmer can say ‘I have the right to keep my eggs if I don’t get a good enough price.’ But if a young girl is being forced into a brothel she will not talk about her rights. In such a situation the word would sound ludicrously inadequate. (SE 21)

For Weil, rights were “middle level” moral concepts. They were not, and could not be, fundamental or “eternal.”

An obligation which goes unrecognised by anybody loses none of the full force of its existence. A right which goes unrecognised by anybody is not worth very much… Rights are always found to be related to certain conditions. Obligations alone remain independent of conditions. They belong to a realm situated above all conditions, because it is situated above this world. (NR 18)

The fundamental political obligation imposed equally on all of us, and just because of our shared humanity, was the obligation, according to our responsibilities and the extent of our power, to work to reduce to the barest minimum “all the privations of soul and body which are liable to destroy or damage the earthly life of any human being whatsoever.”

Her early claim, as de Beauvoir reported it, “that one thing alone mattered in the world today: the Revolution that would feed all the people on earth,” had deepened and ramified through her discovery of affliction. Affliction may have been grounded in our physicality, but it was much more than that. True affliction arose from “an event that grasps a life and uproots it attacks it directly or indirectly in all its parts—social, psychological, physical.”

Thus, to counter affliction it was not enough to propose a politics that met humanity’s bodily needs (food, shelter, warmth, rest, exercise, breathable air, and potable water), though all this was essential and basic; there had, too, to be a politics that met those needs of the soul crushed, violated, and extinguished, in the deracinated degradation of the afflicted. For while it was the “impersonal” in us that was sacred, this sacredness found its sacramental expression in just that concern for the attachments of the “I” that soul-wearing affliction obliterated. If affliction involved the uprooting of life, then countering it politically meant respecting the human need for roots.

“A human being,” Weil wrote, “has roots by virtue of his real, active and natural participation in the life of a community which preserves in living shape certain particular treasures of the past and certain particular expectations for the future.” This meant that the political challenge we faced—insofar as we concerned ourselves with justice, and not merely the demands, challenges, and threats of force—was immense. This was because “in an epoch like ours”—ruled by the worship of money, driven by a false (because force-centred) conception of greatness, and committed to an assertive, individualistic, “rights”-based (mis)conception of justice in the context of the loss of any living sense of “the sacred”—we were all of us uprooted. This is something that Marx and Weber had noted, too, but without understanding it as an ethical, and so a spiritual, sickness.

Weil had, by this time, no faith in revolutionary politics as the path to a more just, more rooted, human world. Indeed, she had come to see the hope, even the pursuit, of revolution as “the opium of the people.” A politics that recognised and so opposed affliction had to be a moral politics, and ultimately therefore a supernatural politics, for it was “only what comes from heaven that can make a real impress on the earth.” What was required—as an ideal, if never, here in the material domain, as a fully achievable actuality—was a politics, so a shared political vision, that embodied and expressed “poignantly tender feelings” for the “beautiful, precious fragile and perishable object” that is a human being.

This, for Weil, was a politics of equality, not the assertive competitive equality of rights (“to place the notion of rights at the centre of social conflicts is to inhibit any possible impulse of charity on both sides”). It was the political equality of the universal, the eternal, mutual community of needs-based human obligations. Equality, she wrote, “consists in a recognition, at once public, general, effective and genuinely expressed in institutions and customs, that the same amount of respect and consideration is due to every human being because this respect is due to the human being as such and is not a matter of degree.”

Such a world, such a political society, was not, nor could it be, a world entirely without force, a world without those who give orders and those who obey. The very point of the ethical life, of justice, was to bring that life, that justice, to the recalcitrant material world of force and power; it was not to annihilate it in its own orgy of affliction producing, because affliction is blind, power.

What mattered was that the division between order and obedience, between intellectual and physical labour, was absolutely minimised, and that the division that remained rested in the real consent of those who, here, obeyed. A clear and instructive instance of such consent was, she felt, to be found in friendship, for friendship was alive and real and meaningful only when “each wished to preserve the faculty of free consent both in himself and in the other.”

Placed on the level of politics, such a demand, Weil insisted, could only ever be answered in and from the contingencies of real political history. However, as a general point, and one deeply relevant to the modern centralising state and its uprooting capitalist economics, what was called for, what was demanded, was just what she had first pointed to in Oppression and Liberty: the cooperative and systematic decentralisation of society in such a way that no human being was deprived of the “relative and mixed goods (home, country, traditions, culture, etc.) which warm and nourish the soul and without which, apart from sanctity, a human life is impossible.”

Such a cooperative and systematic decentralisation would open up the possibility of our becoming rooted in the world, so in place and in history, in a way that linked and balanced particularity and universality, the local and the global.

That possibility, if it were to be a real one, depended on our capacity to shape social force in ways that encouraged the conditions of mutual and attentive human respect, and so human self-respect. On one level, that simply meant organising our lives so as to facilitate the mutual and universal provision of our physical needs, but to be complete (and so to comprehend affliction), it had also to meet the needs of the soul. That, for Weil, meant balancing and harmonising what were, considered in themselves, antithetical needs. Indeed, it was just this antithetical character that allowed us to see the essential challenges for any politics of attention. Human beings, as beings free from the annihilating horrors of affliction, needed to organise themselves in such a way that they found an ordered world in which there was also individual freedom, a world in which there was true equality but also (for it was essential to any non-rudimentary social order) hierarchy, a world in which there was both the responsibility of command and the necessity for freely provided consensual obedience, a secure world, but one that allowed for a certain level of risk, a world shaped by an absolute and fundamental concern for truth, but also one that allowed for a real freedom of opinion, and a world that had a place for both private and collective property. These antithetical but also complementary needs of the soul constituted the principles and the challenges of political wisdom. Only through their having real effect might we have any hope for a “flowering of fraternity, joy, beauty and happiness.”

6. The Moral Ground

In one crucial sense, Weil had no time for traditional philosophical concerns for a “foundation” or a “ground” of morality and the ethical life. Any such efforts—like Kant’s attempt to ground the absolute obligation to treat people as ends-in-themselves in their “reverence for the [rational] Law,” or Aristotle’s attempt to ground our ethical concerns in the individual’s drive for self-development, or Hume’s attempt to derive ethical life from our “limited sympathies” in the context of more general prudential and utilitarian calculations—did not work and could not work. Any individual-centred account went astray from the start, for moral life was, at its heart, a matter of inter-human attention and care, while any account that, like Hume, viewed the essential inter-human aspect in terms of limited sympathies and local concerns was focally individualistic, and so provided no basis on which the “supernatural” universal mutuality of moral obligation might have arisen.

However, there was another sense in which Weil was concerned to find a ground for morality. For if she could not give an account of how the capacity for selflessly receptive attention to the suffering of others arose in and from the human condition, and so from human nature, then her moral vision would simply hang there, a fantasy interesting, if at all, only for what it revealed of its author’s personality.

Weil’s morality might invoke the supernaturalness of eternally binding human obligation, but it could only do this and avoid fantasy if that supernatural aspect had its origins in human nature, as indeed, Weil thought, it clearly did.

On what natural foundation then, on what natural primitive fact, did the human capacity, such as it was, to attend to the suffering, ultimately the affliction, of other people arise and (to the extent it did) develop? For Weil, the crucial point was that human beings—primitively, and all things being equal—reacted differently to “things” than they did to other human beings, and that this was the case because of a certain basic or fundamental “power” we exercised over each other. As she wrote in her early essay, “The Iliad or The Poem of Force”:

Anybody who is in our vicinity exercises a certain power over us by his very presence, and a power that belongs to him alone, that is, the power of halting, repressing, modifying each movement that our body sketches out. If we step aside for a passer-by on the road, it is not the same thing as stepping aside to avoid a billboard; alone in our rooms we get up, walk about, sit down again quite differently from what we do when we have a visitor. (5)

Consider the case of the passer-by; and assume a primitive situation—one where what we have is simply a passer-by, not (say) someone we already “read” as an enemy, means, or obstacle. When we see the other person, headed towards us and our path, we “hesitate” in a way we do not if we see, instead, a billboard in the way. There is, with the person, but not the billboard, a certain reciprocal power that modifies “each movement our body sketches out.” Here, in this primitive, “impersonal,” but reciprocity-recognising reaction of human to human, is found “that interval of hesitation, wherein lies all our consideration for our brothers in humanity.”

For Weil, such impersonal recognition of the human is the primitive ground of that attention that fills the space “between the impulse and the act,” and in doing this makes the other real for us, one with us, and so one of us. It was, indeed, just this hesitation and the capacity for attention it expressed and opened up for further elaboration that embedded in our (inter)relationship that fundamental equality that meant consent was essential to justice between us. And—perhaps even more fundamental—it was an impersonal hesitation before the human that presupposed and acknowledged that which—through the de-creative powers of affliction—could be destroyed and annihilated by the impact of the “empire of force.” This primitive human perception/reaction, this attentive hesitation that recognised our reciprocity and (so) mutuality, expressed the eternal moral fact on which all of obligation arose and rested. For in our hesitation in the face of the passer-by, in their power to halt, repress, and modify each movement “our body sketches out,” lies an implicit recognition: the recognition of the “supernatural” fact that:

…at the bottom of the heart of every human being, from earliest infancy until the tomb, there is something that goes on indomitably expecting, in the teeth of all experience of crimes committed, suffered, and witnessed, that good and not evil will be done to him. It is this above all that is sacred in every human being. (SE 10)

It was here, “beyond space and time,” and as revealed in our primitive natural history, that Justice, that the Good, revealed itself in its eternal purity. It was here that Weil finally brought together her two most influential historical interlocutors, Kant and Plato. For the ground of our duty to treat others never merely as means, but always as ends in themselves, arose, not from “reverence for the (moral) law,” but from our primitive and reciprocal expectation that in the world, and so “in the teeth of all experience of crimes committed, suffered, and witnessed,” “good and not evil” will be done to us. This “indomitable expectation” is where morality enters the world of force and necessity. It is where the supernatural and the natural world make contact in the sacredness of the impersonal obligation to meet human needs.

7. References and Further Reading

a. Primary

Waiting on God. tr. Emma Craufurd, (Harper & Row, New York, 1973.)
Formative Writings: 1929–1941. eds. Dorothy Tuck McFarland and Wilhelmina Van Ness, (University of Massachusetts Press, 1987.)
Intimations of Christianity Among the Greeks. tr. Elisabeth Chas Geissbuhler, (Routledge Kegan Paul, London, 1957.)
Letter to a Priest. tr. Arthur Wills, (G. P. Putnam’s Sons, New York, 1954.)
The Need for Roots. tr. Arthur Wills, (Routledge Classics, London, 2002.)
Gravity and Grace. tr. Emma Crawford and Mario von der Ruhr, (Routledge Classics, London, 2002.)
The Notebooks of Simone Weil. tr. Arthur Wills, (Routledge, London, 2003.)
On Science, Necessity, & The Love of God. tr. Richard Rees, (Oxford University Press, 1968.)
Oppression and Liberty. tr. Arthur Wills and John Petrie, (Routledge Classics, London, 2001.)
The Iliad, or the Poem of Force. tr. Mary McCarthy, (Chicago Review 18:2, 1965.)
Simone Weil: First and Last Notebooks. tr. Richard Rees, (Oxford University Press, 1970.)
Simone Weil: Lectures on Philosophy. tr. Hugh Price, (Cambridge University Press, 1978.)
Simone Weil—Selected Essays: 1934–1943. tr. Richard Rees, (Oxford University Press, 1962.)
Simone Weil: Seventy Letters. tr. Richard Rees, (Oxford University Press, 1965.)
On the Abolition of All Political Parties. tr. Simon Leys, (Black Inc., Melbourne, 2013.)

b. Biographical

The deep connection between Weil’s thought and life has seen many authors explore her philosophy through her biography. Here are some of those.

Cabaud, Jacques, Simone Weil, (Channel Press, New York, 1964.)
Fiori, Gabriella, Simone Weil: An Intellectual Biography. tr. Joseph R. Berrigan, (University of Georgia Press, 1989.)
Gray, Francine Du Plessix, Simone Weil, (Viking Press, New York, 2001.)
McLellan, David, Utopian Pessimist: The Life and Thought of Simone Weil, (Poseidon Press, New York, 1990.)
Perrin, J.B. and Thibon, G., Simone Weil as We Knew Her. tr. Emma Craufurd, (Routledge & Kegan Paul, 1953.)
Pétrement, Simone (1976) Simone Weil: A Life. tr. Raymond Rosenthal, (Pantheon, New York, 1977.)
White, George A., ed., Simone Weil: Interpretations of a Life, (University of Massachusetts Press, 1981.)
Yourgrau, Palle, Simone Weil, Critical Lives Series, (Reaktion Press, London, 2011.)
Weil, Sylvie, At Home with André and Simone Weil. tr. Benjamin Ivry, (Northwestern University Press, 2010.)

c. Secondary

Allen, Diogenes, Three Outsiders: Pascal, Kierkegaard, Simone Weil, (Wipf and Stock, Eugene, 2006.)
Blanchot, Maurice, The Infinite Conversation. tr. Susan Hanson, (University of Minnesota Press, 1993.)
Bell, Richard H., Simone Weil, (Rowman & Littlefield, 1998.)
Chenavier, Robert, Simone Weil: Attention to the Real. tr. Bernard E. Doering, (University of Notre Dame Press, 2012.)
Dietz, Mary, Between the Human and the Divine: The Political Thought of Simone Weil, (Rowman & Littlefield, 1988.)
Doering, E. Jane, Simone Weil and the Specter of Self-Perpetuating Force, (University of Notre Dame Press, 2010.)
Doering, E. Jane, and Eric O. Springsted, eds. The Christian Platonism of Simone Weil, (University of Notre Dame Press, 2004.)
Finch, Henry Leroy, Weil and the Intellect of Grace, (Continuum International, New York, 1999.)
Irwin, Alexander, Saints of the Impossible: Bataille, Weil, and the Politics of the Sacred, (University of Minnesota Press, 2002.)
McCullough, Lissa, The Religious Philosophy of Simone Weil, (I. B. Tauris, London, 2014.)
Morgan, Vance G., Weaving the World: Simone Weil on Science, Mathematics, and Love, (University of Notre Dame Press, 2005.)
Moulakis, Athanasios, Simone Weil and the Politics of Self-Denial. tr. Ruth Hein, (University of Missouri Press, 1998.)
Plant, Stephen, Simone Weil: A Brief Introduction, (Orbis Books, 2007.)
Radzins, Inese Astra, Thinking Nothing: Simone Weil’s Cosmology, (Vanderbilt University, 2005.)
Rhees, Rush, Discussions of Simone Weil, (SUNY Press, 2005.)
Rozelle-Stone, Rebecca A., and Stone, Lucien, Simone Weil and Theology, (Bloomsbury, New York, 2013.)
Springsted, Eric O., Simone Weil and the Suffering of Love, (Wipf and Stock Publishers, 2010.)
Veto, Miklos, The Religious Metaphysics of Simone Weil. tr. Joan Dargan, (State University of New York Press, 1994.)
von der Ruhr, Mario, Simone Weil: An Apprenticeship in Attention, (Continuum, London, 2006.)
Winch, Peter, Simone Weil: “The Just Balance,” (Cambridge University Press, 1989.)

Author Information

Tony Lynch
Email: alynch@une.edu.au
University of New England
Australia

Haskell Brooks Curry (1900-1982)

Haskell Brooks Curry was a mathematical logician who developed a distinct philosophy of mathematics. Most of his work was technical: he was the major developer of combinatory logic, which nowadays plays a role in theoretical computer science. This formalism was originally intended to be a basis for a system of symbolic logic in the usual sense, but the original system turned out to be inconsistent, and the core which was consistent later became a formalism that is a kind of prototype of the computer languages called functional, in which programs are allowed to apply to and change other programs. It is essentially equivalent to the lambda-calculus (λ-calculus) of Alonzo Church. (See the article on λ-calculi in this encyclopedia.)

Curry’s work on combinatory logic led him to a notion of formal system which is different in some respects from the one which has since become standard. In addition, Curry became interested in proof theory, especially the work of Gerhard Gentzen. Curry wanted to use these ideas in his search for a consistent system of logic based on combinatory logic. Curry also did some work on computing in the early days, including work on the ENIAC (one of the first electronic computers) immediately after World War II. Finally, he also became known for a philosophy of mathematics that he called formalism, which he originally considered as defining mathematics as the science of formal systems (in his sense), but which he later extended to include formal methods in general. This idea of formalism is probably better thought of today as a form of structuralism.

Biography
Combinatory Logic
Gentzen-style Proof Theory
War Work and Computing
Formalism: the Philosophy of Mathematics
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Biography

Haskell Brooks Curry was born on September 12, 1900 at Millis, Massachusetts. His father was Samuel Silas Curry, president of the School of Expression of Boston, Massachusetts. The School of Expression was originally founded by Anna Baright in 1879 as the School of Elocution and Expression. It was renamed in 1885, after Anna Baright married Samuel Silas Curry. It became Curry College in 1943. His mother was Anna Baright, who was Dean of the School of Expression. He graduated from high school in 1916 and entered Harvard University with the intention of going into medicine. During his first year, he took a mathematics course at the suggestion of his advisor and did very well. In the Spring of 1917, the United States entered World War I, and Curry responded by enlisting in the army, becoming a member of the Student Army Training Corps on October 18, 1918. He felt he would never play a direct role in the war if he continued with his pre-medical course, so he changed his major to mathematics with the idea of going into the artillery. The war ended on November 11, 1918, and Curry left the army on December 9, 1918, but he kept on in mathematics, receiving his A. B. degree in 1920.

For the next two years he studied electrical engineering at MIT in a program that involved working half-time at the General Electric Company. Because he was usually interested in why an answer was correct when the engineers seemed interested only in the fact that it was correct, he decided that he would be better off pursuing a degree in pure science, and in 1922 he switched to physics. He returned to Harvard, where for the year 1922–23 he was a half-time research assistant to P. W. Bridgman, who later won the Nobel prize in physics. In 1924 he received his A.M. in physics (from Harvard). But by this time his interests had shifted still further, and he now switched to mathematics. (During this period, both of his parents died, his father dying in 1921 and his mother in 1924.)

He continued to study mathematics at Harvard until 1927, where he was a half-time instructor during the first semester of 1926-27 but otherwise studied full-time. He was also involved in the business affairs of his family, the School of Expression.

During this period, Curry had become interested in logic. Originally, all of his logic was reading on the side, and at one point he was supposed to be working on a dissertation on a topic in differential equations assigned to him by George D. Birkhoff. Furthermore, he was getting advice from various faculty members at Harvard and elsewhere to stay away from logic. This advice was especially strong from Norbert Wiener, who was at MIT and who was a member of the same birdwatching club as Curry. But Curry had become too interested in logic to stop thinking about it. He was especially interested in the first chapter of Principia Mathematica [Russell and Whitehead 1910-1913], which he started reading in 1922 when he was 21 years old, and where a system of propositional logic is defined by means of axioms and two primitive rules. The first rule is detachment, which says that from $\neg p \vee q$ and $p$ one may deduce $q$ (this is equivalent to modus ponens, which says that from $p \supset q$ and $p$ one may deduce $q$). The second rule is substitution, which says that from any formula one may deduce any formula obtained from it by substituting another formula for a variable; for example, in the formula $p \supset p$, one can substitute $\neg q \vee r$ for $p$ to get $\neg q \vee r \supset \neg q \vee r$. Curry noticed when he first saw this that the rule of substitution is much more complicated than detachment, in the sense that today we would find it more complicated to implement in a computer language. In 1926-27, as a result of trying to analyze substitution down to its simplest elements, Curry had the idea of using operators which he called combinators, the term we still use today. He used these operators to analyze this rule of substitution, and he concluded that this idea might lead to a dissertation. When he took this idea to several professors, he got a different reaction from the earlier advice to stay away from logic. This was especially true of Norbert Wiener at MIT, who said that his opinion had been that logic was a subject to be avoided “unless you had something to say,” and since Curry clearly had something to say, “strength to your right arm!”
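
The contrast between the two rules can be made concrete with a small sketch in Python. This is only an illustration under stated assumptions, not anything drawn from Principia Mathematica or from Curry's own work: the class names (Var, Not, Or, Implies) and the functions detach and substitute are hypothetical, chosen for this example. Detachment needs a single structural comparison, whereas substitution has to walk through the entire formula.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical formula representation, invented for this illustration.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Not:
    arg: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


Formula = Union[Var, Not, Or, Implies]


def detach(disjunction: Formula, premise: Formula) -> Formula:
    """Detachment: from (not p) or q, together with p, deduce q."""
    # One structural comparison is all that is needed.
    if isinstance(disjunction, Or) and disjunction.left == Not(premise):
        return disjunction.right
    raise ValueError("detachment does not apply to these formulas")


def substitute(formula: Formula, var: str, replacement: Formula) -> Formula:
    """Substitution: replace every occurrence of `var` by `replacement`."""
    # The whole formula tree must be traversed recursively.
    if isinstance(formula, Var):
        return replacement if formula.name == var else formula
    if isinstance(formula, Not):
        return Not(substitute(formula.arg, var, replacement))
    # Remaining cases are the binary connectives Or and Implies.
    return type(formula)(
        substitute(formula.left, var, replacement),
        substitute(formula.right, var, replacement),
    )


# The example from the text: substitute (not q) or r for p in p implies p.
p, q, r = Var("p"), Var("q"), Var("r")
instance = substitute(Implies(p, p), "p", Or(Not(q), r))
conclusion = detach(Or(Not(p), q), p)
```

Here instance is the formula $\neg q \vee r \supset \neg q \vee r$ and conclusion is $q$. Even in this toy setting, substitution is the rule that requires recursion over the structure of formulas, which is one way to read Curry's observation that it is the more complicated of the two.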

However, there was no faculty member at Harvard who could supervise a dissertation on this topic. So Curry decided that it would be useful to teach for a year, and, after getting a recommendation for the position from George D. Birkhoff, assumed an instructorship at Princeton for the year 1927-28. During a library search there he found the paper by Moses Schönfinkel, [Schönfinkel 1924], a report of a talk given at Göttingen in 1920, which had clearly anticipated his ideas. Curry was shocked at this anticipation because he had thought his ideas were completely original, and he ran to the office of Oswald Veblen, who, although primarily a geometer, was interested in the foundations of mathematics and who was also the PhD supervisor of Alonzo Church, to tell him about the anticipation. Veblen calmed Curry down by saying, “Good, I am always glad when somebody has one of my ideas, for it shows that I am on the right track.” To find out more about Schönfinkel, Veblen then took Curry to see the Russian topologist Pavel Alexandroff, who was visiting Princeton that year. Alexandroff reported that Schönfinkel was in a mental hospital and was unlikely to resume his mathematical work, but that at Göttingen there were several mathematicians, including Paul Bernays, who were probably better placed to discuss these topics. It was thus decided that Curry should go to Germany.

As part of an application for financial support for that trip, Curry wrote his first published paper, [Curry 1929]. Before leaving for Germany, Curry married, on July 3, 1928, Mary Virginia Wheatley of Hurlock, Maryland. (Virginia had been a student at the School of Expression, where they met.) After the wedding, the Currys left for Germany, where they spent the year 1928-29 at Göttingen. During that year, Curry first met the logician Alonzo Church, who was there for half the year.

That year at Göttingen was enough for Curry to complete his dissertation. His referee was David Hilbert, although he actually did most of his work with Paul Bernays, and he was examined on July 24, 1929. At this examination, Hilbert asked Curry a question on another topic (automorphic functions), which Hilbert assumed that Curry would not know. As it happened, Curry had taken a course on that very subject at Harvard, and he was able to give a good answer. Hilbert responded by asking in great surprise, “Wo haben Sie das gelernt?” (“Where did you learn that?”) The dissertation was published (in German) as [Curry 1930].

Curry now needed a job, and he took up a position as an Assistant Professor at the Pennsylvania State College (Penn State, which became the Pennsylvania State University in 1953). Eventually, most people who knew Curry came to associate him with Penn State, but when he first went there he did not plan to stay long. He had been at Harvard, Princeton, and Göttingen, and at Penn State he felt cut off from most of his former academic community. Furthermore, in those days, Penn State did not support research. (Later, thanks partly to Curry’s influence, Penn State changed its policy, and it is now a major research institution.) But his arrival there coincided with the beginning of the Great Depression, and the demand for logicians in the academic world was not very high. So he remained and settled down at Penn State, staying there, with the exception of several leaves of absence, until his retirement in 1966. He progressed normally through the academic ranks, becoming an Associate Professor in 1933 and a full Professor in 1941.

Everybody who knew the Currys was aware of how friendly and helpful they always were. Curry always did more for colleagues and students than be a source of important ideas (although, of course, his ideas have been of tremendous importance). He was always willing to listen to anybody who wanted to talk to him, to discuss their ideas, and to give whatever encouragement he could. His office door was always open. Also well known wherever the Currys lived was the hospitality they both showed. There were always many parties and other, less formal, gatherings. Curry also had a playful sense of humor.

The first of his leaves of absence was a year at the University of Chicago in 1931-32 as a National Research Council Fellow. (The original award was supposed to extend into the following year, but the second year was cancelled for Curry because he had a job to go back to and there were other National Research Council Fellows who did not. It was, after all, the depths of the Great Depression.) In 1938-39, Curry was in residence at the Institute for Advanced Study in Princeton.

Otherwise, Curry spent the 1930s at Penn State teaching and carrying on his research. During this period he was on the reviewing staff of the Zentralblatt für Mathematik und ihre Grenzgebiete (1931-1939). In 1936, he became a founding member of the Association for Symbolic Logic; he was Vice President in 1936-37 and President in 1938-40 as well as being a member of the Council as ex-president during 1942-46.

During this period, the Currys also began their family: Anne Wright Curry (later Mrs. Richard S. Piper) was born on July 27, 1930, and Robert Wheatley Curry followed on July 6, 1934.

By the end of the 1930s, Curry was established as one of the most important mathematical logicians in the United States and, in fact, in the entire world. As such, he was asked to present his views on the nature of mathematics to the International Congress for the Unity of Science held at Cambridge, Massachusetts at the beginning of September 1939. The result was a long manuscript of which he presented a shorter version to the Congress, [Curry 1939]. A series of papers on the philosophy of mathematics began with this paper and continued for the rest of his life.

In the following year, 1940, Curry became a member of the Board of Trustees of Curry College, formerly the School of Expression, the institution of which his father had been president. He remained a member until 1951. Later, on June 5, 1966, the college presented him with the honorary degree of Doctor of Science in Oratory.

When the United States entered World War II, Curry decided to put logic aside for the duration of the war. From 1940 until 1942 he had been a member of the National Committee on War Preparedness of the American Mathematical Society and the Mathematical Association of America. On May 25, 1942, he left Penn State and went to the Frankford Arsenal, where he worked as an applied mathematician until January 1944; then he went to the Applied Physics Laboratory at Johns Hopkins University, where he remained until March, 1945. Next he went to the Ballistic Research Laboratories at the Aberdeen Proving Ground, where he stayed until September, 1946. During his last three months there, he was Chief of the Theory Section of the Computing Laboratory and for one month he was Acting Chief of the Computing Laboratory; it was during this period that he became involved with the ENIAC computer. As a result of this experience he was a consultant in the field of computing methods to the United States Naval Ordnance Laboratory from June 1, 1948 until June 30, 1949.

In September, 1946, Curry returned to Penn State. He wanted to pursue his work on electronic computers, and so he tried to interest the university in acquiring some computing equipment. He was unsuccessful in this. He persisted until a colleague pointed out to him that if he did succeed, he would probably be made head of the program without any increase in salary. He then decided that this colleague was right and gave up the attempt. This effectively limited him from pursuing computing theory.

He was, however, getting back to logic. In Amsterdam in the summer of 1948, during the Tenth International Congress of Philosophy, it was proposed to him that he write a little book of under 100 pages on the subject of combinatory logic for the new North-Holland series in logic. He felt that there was too much unpublished research on the subject to write such a short book, and so he sent them instead his philosophical manuscript from 1939 with a few minor revisions. This appeared as [Curry 1951]. But this idea did suggest to him the project that eventually led to his two volumes with the title Combinatory Logic, [Curry and Feys 1958] and [Curry et al. 1972]. Feeling that he needed a collaborator, especially one who was better than he was at exposition, he decided to work with Robert Feys, who had published some papers on combinatory logic. Curry thus obtained a Fulbright grant and spent the year 1950-51 at Louvain in Belgium. After his return to Penn State, he and Feys continued their work, and the manuscript of [Curry and Feys 1958] was completed in 1956. The book appeared in 1958, published by North-Holland.

Meanwhile, money finally became available at Penn State for graduate students. Edward J. Cogan first approached Curry before he left for Louvain, and worked with Curry after he returned in 1951, finishing his dissertation in 1955. Kenneth L. Loewen also studied with him during this period, but left to take an academic position elsewhere in 1954 and did not finish his dissertation until 1962.

After the completion of [Curry and Feys 1958], Curry turned his attention to Gentzen-style proof theory. He had done some previous work on this, including a series of lectures delivered at Notre Dame University in Indiana in April, 1948 (which resulted in his book [Curry 1950]). He felt that it formalized the kind of reasoning used in developing the part of combinatory logic that is a system of logic in the usual sense, and so he felt that it should be settled before he began work on [Curry et al. 1972]. He thus began work on what became his book [Curry 1963]. This work was made easier when, in 1959, he became Evan Pugh Research Professor and was thus relieved of undergraduate teaching duties. The manuscript of [Curry 1963] was completed in 1961.

By this time, there were two more graduate students, Bruce Lercher and Luis E. Sanchis, both of whom completed their dissertations in 1963.

From February to September, 1962, the Currys took a trip around the world, visiting a number of universities where Curry gave lectures.

In 1964, Curry met two new future collaborators. J. Roger Hindley arrived at Penn State for a lectureship which served as something of a postdoctoral position after finishing his dissertation at Newcastle-upon-Tyne, and Jonathan P. Seldin arrived as a beginning graduate student. Curry was just beginning work on [Curry et al. 1972]. Unfortunately, Feys had died in 1961, and Curry, left to work alone, soon realized that he needed collaborators. In 1965, he invited Hindley to join him on the project.

In 1966, Curry retired from Penn State after being there for 37 years. He then went to Amsterdam, where for the next four years he was Professor of Logic, History of Logic, and Philosophy of Science, and also Director of the Instituut voor Grondslagenonderzoek en Philosophie der Exacte Wetenschappen, both at the University of Amsterdam. Seldin went to Amsterdam on a Graduate Fellowship from the United States National Science Foundation, and completed his dissertation in 1968, after which he joined Curry and Hindley as a co-author of the book they were then writing. Curry had one more graduate student in Amsterdam, Martin W. Bunder, who finished his dissertation in 1969.

The manuscript of [Curry et al. 1972] was completed in May, 1970, just before Curry retired from the University of Amsterdam. He returned to State College, Pennsylvania (the town in which Penn State is located), where he continued his mathematical work, writing reviews (especially for Mathematical Reviews) and occasional papers. John A. Lever wrote a master’s thesis with him there in 1977 after obtaining special permission from the university authorities to work under a retired professor. In 1971-72, Curry accepted a visiting position at the University of Pittsburgh. Otherwise, he and Virginia remained at State College, except for some occasional trips, until his death on September 1, 1982. Curry left his papers to the library at Penn State.

Curry’s hobby throughout his life was bird watching, and by the end of his life, Curry had a reputation as an amateur ornithologist.

2. Combinatory Logic

a. Beginning Period

Curry invented combinatory logic independently by analyzing the operation of the substitution of a well-formed formula for a propositional variable in the system of propositional logic of the first chapter of [Russell and Whitehead 1910-1913]. He intended combinatory logic to be a foundation for mathematical logic and perhaps also for all of mathematics. Much of the subject is extremely technical. This will be as non-technical an introduction as it is possible to write.

The basic idea here is that of a function, which is a mathematical operation which does something to an input. Thus, for example, there is the numerical function which squares its argument (i.e., multiplies it by itself). Mathematicians usually write that if $f$ is the squaring function, then for each possible argument (input) $x$, $f(x) = x^{2}$. Then, if this function is applied to the number 3, we get $f(3) = 3^{2} = 9$.

In combinatory logic, the application of a function to an argument, such as $f(3)$, is written $(f3)$ or $f3$. Also, the need for functions of more than one variable is avoided by allowing the value of a function to be another function. For example, suppose, in traditional notation, $f(x,y) = x - y$. Then let $g(x) = h_x$, where $h_x(y) = x - y$. Then $f(3,y) = 3 - y = h_3(y)$. In combinatory notation, $(gx)y = x - y$ and $(g3)y = 3 - y$. In this notation, we use association to the left for application, so that $gxy = (gx)y$.

This method of using only functions of one argument has come to be called currying, and the function $g$ of the previous paragraph is often called $(\mathrm{curry}\ f)$. Curry himself learned of this use of his name in his last years, and he protested because he had gotten the idea from Schönfinkel, but this use of Curry’s name has stuck.
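The idea can be illustrated with a minimal Python sketch. The names f, g, and h_3 follow the text above; the helper curry is written here only for illustration and is not taken from any library.

```python
def f(x, y):
    """The two-argument function f(x, y) = x - y from the text."""
    return x - y


def curry(f2):
    """Turn a two-argument function into a function that returns a function."""
    return lambda x: lambda y: f2(x, y)


g = curry(f)   # g plays the role of (curry f)
h_3 = g(3)     # h_3 is the one-argument function sending y to 3 - y

print(g(3)(1))   # (g3)1, that is, 3 - 1
print(h_3(10))   # h_3 applied to 10, that is, 3 - 10
```

Here g(3)(1) mirrors the left-associated combinatory notation $g31 = (g3)1$.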

Some basic combinators are:

The identity operator $\mathsf{I}$, with the property that $\mathsf{I} x = x$.
The constancy operator $\mathsf{K}$ with the property that $\mathsf{K} xy = x$. Thus, $\mathsf{K} x$ is a constant function whose value for any argument is $x$.
The compositor $\mathsf{B}$, with the property that $\mathsf{B} xyz = x(yz)$. This says that to apply $\mathsf{B} xy$ to $z$, first apply $y$ to $z$ and then apply $x$ to the result.
The diagonalizer $\mathsf{W}$ with the property that $\mathsf{W} xy = xyy$.
The distributor $\mathsf{S}$ with the property that $\mathsf{S} xyz = xz(yz)$.

Note that $\mathsf{I}$ can be defined in terms of the other operators, since $\mathsf{W} \mathsf{K} x = \mathsf{K} xx = x$, so $\mathsf{I} = \mathsf{W} \mathsf{K}$. Also, since $\mathsf{S} \mathsf{K} \mathsf{K} x = \mathsf{K} x(\mathsf{K} x) = x$, $\mathsf{I}$ can be defined as $\mathsf{S} \mathsf{K} \mathsf{K}$.
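As an informal illustration, the defining properties of these operators can be mimicked with curried Python functions. This is only a sketch: Python evaluates arguments eagerly and works with ordinary values rather than formal terms, and the sample functions succ and double are assumptions introduced for the example. The checks below test the identities on sample values and are not a proof.

```python
I = lambda x: x
K = lambda x: lambda y: x
B = lambda x: lambda y: lambda z: x(y(z))
W = lambda x: lambda y: x(y)(y)
S = lambda x: lambda y: lambda z: x(z)(y(z))

# Hypothetical sample functions used only to exercise the combinators.
succ = lambda n: n + 1
double = lambda n: 2 * n

# B x y z = x (y z): first apply double, then succ.
print(B(succ)(double)(3))

# W K and S K K both behave like I on the sample value 5.
print(W(K)(5), S(K)(K)(5), I(5))
```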

Now suppose we want to say that an operation, say addition, is commutative (i.e., the order of adding does not matter). The traditional way of writing this in mathematics is $x + y = y + x$. But this is not a property of $x$ and $y$; it is a property of +. To say this in the language of combinatory logic, we would write $+xy = +yx$. Now suppose we have an operator $\mathsf{C}$ (for “commutator”) with the property that $(\mathsf{C} x)yz = xzy$. Then $+yx = (\mathsf{C} +)xy$, and we can say that + is commutative by writing $(\mathsf{C} +) = +$. This operator $\mathsf{C}$ is called a combinator.
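A small Python sketch may make the role of $\mathsf{C}$ concrete. The curried operations add and sub and the helper agree are assumptions made for this example; agree compares two operations only on a finite set of sample values, so it can refute commutativity but cannot prove it.

```python
# C x y z = x z y: C swaps the two arguments of a curried operation.
C = lambda x: lambda y: lambda z: x(z)(y)

add = lambda x: lambda y: x + y   # curried addition
sub = lambda x: lambda y: x - y   # curried subtraction


def agree(op1, op2, samples):
    """Check that two curried binary operations agree on all sample pairs."""
    return all(op1(a)(b) == op2(a)(b) for a in samples for b in samples)


samples = range(-3, 4)
print(agree(C(add), add, samples))  # does (C +) behave like + on the samples?
print(agree(C(sub), sub, samples))  # does (C -) behave like - on the samples?
```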

The defining rules for these combinators have been written above with the equality symbol, which is symmetric. But it is often useful to read these equations only from left to right. Then these equations would be called contractions, so that $\mathsf{I} x$ contracts to $x$, $\mathsf{C} xyz$ contracts to $xzy$, $\mathsf{K} xy$ contracts to $x$, $\mathsf{B} xyz$ contracts to $x(yz)$, $\mathsf{W} xy$ contracts to $xyy$, and $\mathsf{S} xyz$ contracts to $xz(yz)$. Terms are reduced to other terms by performing sequences of 0 or more contractions on subterms of the original term. For example, the reduction of $\mathsf{S} \mathsf{K} \mathsf{K} x$ to $x$ is as follows:

$\mathsf{S} \mathsf{K} \mathsf{K} x \rhd \mathsf{K} x (\mathsf{K} x) \rhd x$.

(Here I am using the symbol ‘$\rhd$’ to indicate a reduction.) Note that there are some terms which cannot be reduced. These terms are said to be in normal form. On the other hand, some terms can lead to infinite reductions, for example

$\mathsf{W} \mathsf{W} \mathsf{W} \rhd \mathsf{W} \mathsf{W} \mathsf{W} \rhd \ldots$.
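The contraction rules above can be sketched as a small term rewriter in Python. The representation is an assumption of this sketch: an atom is a string and an application $(XY)$ is the pair (X, Y). The function names app, spine, step, and normalize are hypothetical, and the choice to contract the leftmost available redex first is one strategy among several.

```python
# Each rule gives the number of arguments needed and the result of contraction.
RULES = {
    "I": (1, lambda x: x),
    "K": (2, lambda x, y: x),
    "B": (3, lambda x, y, z: (x, (y, z))),
    "C": (3, lambda x, y, z: ((x, z), y)),
    "W": (2, lambda x, y: ((x, y), y)),
    "S": (3, lambda x, y, z: ((x, z), (y, z))),
}


def app(*terms):
    """Left-associated application: app(a, b, c) is ((a, b), c)."""
    result = terms[0]
    for t in terms[1:]:
        result = (result, t)
    return result


def spine(term):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]


def step(term):
    """Do one leftmost contraction, or return None if term is in normal form."""
    head, args = spine(term)
    if head in RULES:
        arity, rule = RULES[head]
        if len(args) >= arity:
            return app(rule(*args[:arity]), *args[arity:])
    # The head cannot be contracted, so look inside the arguments.
    for i, arg in enumerate(args):
        reduced = step(arg)
        if reduced is not None:
            return app(head, *args[:i], reduced, *args[i + 1:])
    return None


def normalize(term, max_steps=50):
    """Reduce until a normal form is found; None if the step budget runs out."""
    for _ in range(max_steps):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    return None


print(normalize(app("S", "K", "K", "x")))               # S K K x reduces to x
print(step(app("W", "W", "W")) == app("W", "W", "W"))   # W W W contracts to itself
print(normalize(app("W", "W", "W")))                    # no normal form is found
```

The step budget in normalize is needed because, as the $\mathsf{W} \mathsf{W} \mathsf{W}$ example shows, some reductions never terminate.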

Curry decided to found mathematical logic on a system of combinators whose primitive combinators were $\mathsf{B}, \mathsf{C}, \mathsf{K},$ and $\mathsf{W}$. (He did not yet understand the role of $\mathsf{S}$, which he got from Schönfinkel.) The part of combinatory logic that deals with the basic properties of the combinatory terms without reference to logical connectives and quantifiers is now called pure combinatory logic. He was going to add logical connectives and quantifiers until he had developed a complete system of logic; this part of the subject he called illative combinatory logic. This word “illative” is a word Curry coined himself, based on the Latin word illatum, the past participle of infero, which means “to conclude”.

He proved several important results in this context. First of all he proved that if $X$ is any combination of combinators and the variables $x_{1}, x_{2}, \ldots x_{n}$, there is a term $F$ in which the variables $x_{1}, x_{2}, \ldots x_{n}$ do not appear such that $Fx_{1}x_{2}\ldots x_{n} = X$. Curry used the notation $ [x_{1}, x_{2}, \ldots , x_{n}]X$ for this $F$. For example, since $\mathsf{S} \mathsf{I} \mathsf{I} x \rhd \mathsf{I} x (\mathsf{I} x) \rhd xx$, we can take $\mathsf{S} \mathsf{I} \mathsf{I}$ to be $ [x]xx$. He also gave axioms for the system so that this $F$ was uniquely determined by $X$ and the variables in question. (The existence of such an abstract for every term $X$ and all variables $x_{1}, x_{2}, \ldots , x_{n}$ is called combinatory completeness.) Another of the things he proved early on (in his dissertation) is that the basic system of combinators, without any axioms for any logical connectives or quantifiers, is consistent.

Using the notation of combinators, Curry wrote what is normally written $(\forall x)A$ as $\Pi X$, where $Xx = A$. This operator $\Pi$ was present in his dissertation, but none of its properties were developed there. Instead, Curry started writing a series of papers expanding combinatory logic to include not only this universal quantifier $\Pi$, but also $\mathsf{P}$ (for implication, so that $\mathsf{P} XY = X \supset Y$, or if $X$ then $Y$) and equality $\mathsf{Q}$, so that $\mathsf{Q} xy$ means $x = y$. In 1934, Curry published [Curry 1934a] giving properties of $\mathsf{P}$ and $\mathsf{Q}$.

b. The Kleene-Rosser Paradox and its Aftermath

In 1932, Curry learned of a paper by Alonzo Church, [Church 1932]. Church’s system was based on $\lambda$-abstraction, which forms terms from variables by application and abstraction: if $x$ is a variable and $M$ is a term, then $(\lambda x \;.\; M)$ is a term. (The outermost parentheses may be omitted if no confusion results.) For example, $(\lambda x \;.\; x^{2})$ is the squaring function, and $(\lambda x \;.\; x^{2})3 = 3^{2} = 9$. Here, $(\lambda x_{1} x_{2} \ldots x_{n} \;.\; M)$, which is an abbreviation for $(\lambda x_{1} \;.\; (\lambda x_{2} \;.\; \ldots (\lambda x_{n} \;.\; M) \ldots ))$, plays the role of Curry’s $[x_{1}, x_{2}, \ldots , x_{n}]X$. (For a complete introduction to both $\lambda$-calculus and combinatory logic, see [Hindley and Seldin 2008]. See also the article on $\lambda$-calculi in this Encyclopedia.) Also, the variable $x$ in $\lambda x \;.\; M$ is called bound; variables not within the scope of a $\lambda$ are called free.

Reduction for Church’s system is defined by a rule that Curry called $(\beta)$: $(\lambda x \;.\; M)N$ contracts to $ [N/x]M$, which is the result of substituting $N$ for $x$ in $M$, where other bound variables are changed to avoid capture. In ordinary predicate logic, this sort of change is made by changing $ (\forall x)(x < y)$ to $ (\forall z)(z < y)$ if a term in which $x$ occurs free is substituted for $y$.
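Rule $(\beta)$ and the renaming of bound variables can be sketched in Python as follows. The term encoding (tagged tuples "var", "lam", "app"), the function names, and the scheme for generating fresh names v0, v1, and so on are all assumptions of this sketch rather than anything in Church’s or Curry’s presentation.

```python
import itertools


def free_vars(t):
    """Return the set of variables occurring free in the term t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def fresh(avoid):
    """Return a variable name v0, v1, ... that is not in the set avoid."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name


def subst(n, x, m):
    """Return [n/x]m, renaming bound variables of m to avoid capture."""
    if m[0] == "var":
        return n if m[1] == x else m
    if m[0] == "app":
        return ("app", subst(n, x, m[1]), subst(n, x, m[2]))
    y, body = m[1], m[2]
    if y == x:
        return m  # x is rebound here, so there is nothing to substitute
    if y in free_vars(n) and x in free_vars(body):
        # A free y in n would be captured: rename the bound y first.
        z = fresh(free_vars(n) | free_vars(body) | {x})
        body = subst(("var", z), y, body)
        y = z
    return ("lam", y, subst(n, x, body))


def beta(t):
    """Contract a redex (lambda x . M) N at the root of t."""
    assert t[0] == "app" and t[1][0] == "lam"
    _, (_, x, m), n = t
    return subst(n, x, m)


# (lambda x . lambda y . x) y: the bound y must be renamed before substituting.
redex = ("app", ("lam", "x", ("lam", "y", ("var", "x"))), ("var", "y"))
print(beta(redex))
```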

Note that reduction in Church’s system differs from reduction in combinatory logic in that if $M$ reduces to $N$, then $\lambda x \;.\; M$ reduces to $\lambda x \;.\; N$, but in combinatory logic the fact that $X$ reduces to $Y$ does not automatically imply that $ [x]X$ reduces to $ [x]Y$, since subterms of $X$ often do not really occur in $[x]X$.

In 1934, Curry received a letter from Rosser informing him that Kleene and Rosser had proved inconsistent the system of [Church 1932] and the system of [Curry 1934]. They did this by deriving Richard’s Paradox in both systems. (See the article on Richard’s Paradox in this Encyclopedia.)

Church and his students, Kleene and Rosser, then gave up on the idea of building a system of mathematical logic adequate for all of mathematics by basing the system on $\lambda$-terms. Instead, they took that part of Church’s system involving only $\lambda$-terms and treated it separately as the $\lambda$-calculus. (See the article on $\lambda$-calculi in this Encyclopedia.) But Curry had a different reaction. He had always considered the possibility that some systems he would propose might be inconsistent, and so he reacted by beginning a careful analysis of the paradox with the idea of finding a way to define a consistent system.

This analysis lasted for several years, and by the time he took a leave of absence from Penn State to do applied mathematics for the U.S. government during World War II, he had developed a plan for research to look for consistent systems. He had already published [Curry 1941], and he had found a much simpler paradox (now known as Curry’s Paradox; see [Curry 1942b]). The plan he had developed was to look at three different kinds of systems, which differed in the logical connectives and quantifiers that were taken as primitive. The kinds of systems will be discussed here in the order Curry gave them in [Curry 1942a].

Systems based on the theory of functionality. This was Curry’s idea, dating back to 1930, that led to type assignment. He wrote $\mathsf{F} \alpha \beta$ for the predicate of functions which take arguments in $\alpha$ with values in $\beta$, and he intended $\mathsf{F} \alpha \beta X$ to mean $ (\forall x)(\alpha x \supset \beta (Xx))$. Nowadays, the category (or predicate) $\mathsf{F} \alpha \beta$ is considered a type rather than a predicate, and is usually written $\alpha \rightarrow \beta$.
Systems based on the theory of restricted generality. Curry had noted that most universal quantification is not absolute, but is over some restricted domain. (This seems obvious nowadays, but in the 1930s it ran counter to the generalising tendency of Frege and Russell.) He defined an operator $\Xi$ to stand for this restricted quantification, so that $\Xi X Y$ would stand for $ (\forall x : X)(Yx)$, or $ (\forall x)(Xx \supset Yx)$ (where here $x$ does not occur free in $X$ or $Y$).
Systems based on the theory of universal generality. These were systems based on $\Pi$ and $\mathsf{P}$, where $\Pi X$ means $(\forall x)(Xx)$ (where $x$ does not occur free in $X$) and $\mathsf{P} XY$ means $X \supset Y$.
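The intended meanings of $\mathsf{F}$, $\Xi$, $\Pi$, and $\mathsf{P}$ can be illustrated by a Python sketch over a small finite domain. This is only a reading of the informal glosses given above: in Curry’s systems these operators are constants of combinatory logic governed by axioms and rules, not truth functions, and the finite DOMAIN and the sample predicates below are assumptions made for the example.

```python
DOMAIN = range(-5, 6)   # assumed finite universe, for illustration only


def P(a, b):
    """Implication on truth values: P X Y means X implies Y."""
    return (not a) or b


def Pi(X):
    """Universal generality: Pi X means (forall x)(X x)."""
    return all(X(x) for x in DOMAIN)


def Xi(X, Y):
    """Restricted generality: Xi X Y means (forall x)(X x implies Y x)."""
    return Pi(lambda x: P(X(x), Y(x)))


def F(alpha, beta, X):
    """Functionality: F alpha beta X means (forall x)(alpha x implies beta (X x))."""
    return Pi(lambda x: P(alpha(x), beta(X(x))))


# Hypothetical sample predicates and functions.
anything = lambda x: True
nonneg = lambda x: x >= 0
positive = lambda x: x > 0
square = lambda x: x * x
pred = lambda x: x - 1

print(F(anything, nonneg, square))  # squaring sends every element to a nonnegative one
print(F(nonneg, nonneg, pred))      # fails at 0, whose predecessor is negative
print(Xi(positive, nonneg))         # every positive element is nonnegative
print(Pi(nonneg))                   # not every element is nonnegative
```

Defining Xi and F in terms of Pi and P, as done here, follows the glosses in the list above.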

In 1942, Curry assumed that these kinds of systems increased in strength in the order given above. The paper [Curry 1942a] was really an abstract of future research rather than a report on completed work.

In the late spring of 1942, Curry finally came to understand the combinator $\mathsf{S}$. Rosser had published a paper on combinatory logic (based on different basic combinators from those Curry used), and he had shown how to define $[x]X$ by induction on the structure of $X$. When Curry read this paper and translated the results into his own formalism, he realized why Schönfinkel had defined all combinators in terms of $\mathsf{K}$ and $\mathsf{S}$, and he started to do the same. The use of $\mathsf{S}$ greatly increased the lengths of definitions of $[x]X$ compared with Curry’s original definition, but greatly simplified the algorithm for building them. With computer implementation has come a reversal of values: an algorithm’s speed of action is now valued more than its simplicity or “beauty”.
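A definition of $[x]X$ by induction on the structure of $X$ can be sketched in Python. The three clauses used below are the ones commonly given in textbooks for a basis containing $\mathsf{I}$, $\mathsf{K}$, and $\mathsf{S}$; they are an assumption of this sketch and are not claimed to be the exact clauses of Rosser’s paper or of Curry’s own definition. Terms are encoded as strings (atoms) and pairs (applications), and the names occurs and abstract are hypothetical.

```python
def occurs(x, term):
    """Does the variable x occur in term?"""
    if isinstance(term, tuple):
        return occurs(x, term[0]) or occurs(x, term[1])
    return term == x


def abstract(x, term):
    """Return a term [x]term, in which x does not occur."""
    if term == x:
        return "I"                      # [x]x = I
    if not occurs(x, term):
        return ("K", term)              # [x]X = K X when x is not in X
    fun, arg = term
    return (("S", abstract(x, fun)), abstract(x, arg))  # [x](U V) = S ([x]U) ([x]V)


print(abstract("x", ("x", "x")))                   # the term S I I for [x]xx
print(abstract("x", abstract("y", ("y", "x"))))    # [x]([y]yx), noticeably longer
```

The second example gives a sense of how quickly the abstracts grow when several variables are abstracted in turn.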

c. Late Period (after World War II)

After World War II, when Curry returned to Penn State, he slowly got back into logic. (For details, see the part of the Bibliography section of this article on Curry’s work during World War II.) He attended the Tenth International Congress of Philosophy in the summer of 1948, and as a result of a proposal made to him there, he decided to write a long work on combinatory logic, which he intended to include everything known on the subject. Feeling he needed a collaborator, he approached Robert Feys at Louvain in Belgium. Curry used a Fulbright grant, which he was awarded for the year 1950-51, to start this work, and Curry and Feys continued to work on it after Curry returned to Penn State in 1951. Curry wound up working on this book and a second volume for most of the rest of his life.

The earliest work on this book was on the basic exposition. Curry and Feys completely revised the foundations of combinatory logic, and spent a lot of time explaining Curry’s approach to formal reasoning and formal systems. They then introduced Church’s $\lambda$-calculus, and gave a new proof and analysis of the Church-Rosser Theorem, which proves pure $\lambda$-calculus consistent. The book then took up combinatory logic itself, first pure combinatory logic and then illative combinatory logic. The book finishes with two chapters on the theory of functionality.

However, Curry soon began new research to be included. At first, this included work expanding the theory of functionality. There was always more than one such theory, and different theories depended on which terms could be what we would now call types, but which Curry called F-obs. There is the basic theory of functionality, in which types are formed from atomic types by the operation that forms $\mathsf{F} \alpha \beta$ from $\alpha$ and $\beta$. (This is equivalent to forming the type $\alpha \rightarrow \beta$ from $\alpha$ and $\beta$.) This system is easily proved consistent.

Then there is the full free theory of functionality, in which any combinatory term can be a type. Curry thought that this system was consistent, and in 1954 he tried to prove that consistency. He spent over four months at this attempt by trying to prove that if, from a set of typing assumptions $\xi_{1} X_{1}, \xi_{2} X_{2}, \ldots , \xi _{n} X_{n}$ (where $X_{1}, X_{2}, \ldots , X_{n}$ may be any combinatory terms), one can prove $\xi X,$ then the deduction must take a certain specific form. After almost five months, he realized that if $\xi X$ is the conclusion of any deduction in this special form, then the term $X$ is irreducible in some sense. But the sense involved was not the sense of reduction in combinatory logic, but rather the sense of $\lambda$-calculus. The difference is that in $\lambda$-calculus, if $M \rhd N$ then $\lambda x \;.\; M \rhd \lambda x \;.\; N$, which is what one would expect. But in combinatory logic, the fact that $X \rhd Y$ does not automatically imply that $ [x]X \rhd [x]Y$, for subterms of $X$ do not necessarily occur in $[x]X$.

For Curry, the fact that the term $X$ in the conclusion of a deduction in the theory of functionality must be irreducible in the sense of $\lambda$-calculus was not very satisfactory. Curry usually thought in combinators rather than $\lambda$-terms. Thus, he set out to find a reduction among combinatory terms that would be more like $\lambda$-reduction. He began with $\lambda \beta \eta$-reduction, which is $\lambda$-calculus in which the reduction rules include ($\alpha$), the rule for changes of bound variables, ($\beta$), the basic reduction for $\lambda$-calculus, which says that $(\lambda x \;.\; M)N \rhd [N/x]M$, the result of substituting $N$ for $x$ in $M$, and ($\eta$), the rule which says that if $x$ is not free in $M$, then $\lambda x \;.\; Mx \rhd M$. He then defined a strong reduction for combinatory logic that is equivalent to $\lambda \beta \eta$-reduction. For technical reasons, he needed to take $\mathsf{I}$ as a primitive combinator instead of defining it as $\mathsf{S} \mathsf{K} \mathsf{K}$ as he had done previously, so now combinatory logic is usually defined by taking the three combinators $\mathsf{I}$, $\mathsf{K}$, and $\mathsf{S}$ as primitive combinators.

Curry soon managed to prove that the full free theory of functionality is, in fact, inconsistent. The book [Curry and Feys 1958] ends with a chapter including the proof that the full free theory is inconsistent and also some results that are true that were proved as part of the failed attempt to prove it consistent.

This volume also includes the first published proof of the Normal Form Theorem, which says that every term with a type has a normal form. (A term is said to be in normal form if it cannot be reduced. It is said to have a normal form if it can be reduced to a term in normal form.) This result has become more and more important in various systems of typed $\lambda$-calculi in the decades since this volume was published.

In the years immediately after the publication of [Curry and Feys 1958], Curry began to work on systems of restricted generality. But he only published a couple of papers on this before he began work on [Curry et al. 1972]. This volume begins with addenda to pure combinatory logic, most of which are highly technical. Curry did try to devise a general framework that would include both combinatory logic and $\lambda$-calculus by defining what he called C-systems. The idea was to set up a framework that could be used to prove results in illative systems that were based either on $\lambda$-calculus or on combinatory logic without having to give separate proofs for the two cases. But this attempt was not completely successful, since it was later found that many results still needed one proof for $\lambda$-calculus and another for combinatory logic.

Curry also extended the definition of illative combinatory logic to include any systems with new atomic constants that have special postulates associated with them, even if these new constants do not represent logical connectives or quantifiers. This allowed him to include systems of combinatory arithmetic. Arithmetic had first been represented by Alonzo Church in combinatory logic and $\lambda$-calculus by defining natural numbers as iterators: the number $n$ is represented by $\lambda f x \;.\; \underbrace{f ( f ( f \ldots (f}_{n} x) \ldots ))$, which applies $f$ to $x$ $n$ times. But by the 1960s, other ways of representing numbers as combinators or $\lambda$-terms had appeared. For this reason, Curry suggested representing numbers by taking new atomic constants to represent 0 and the successor function ($\sigma$) and including a combinator that mapped one of these numbers to the corresponding iterator. With any of these representations, a function can be represented by a combinator or $\lambda$-term if and only if it is partial recursive, or, equivalently, Turing-computable. (This result was first proved for $\lambda$-calculus independently by Church, Kleene, and Turing in 1936; see, for example, [Kleene 1936c].)
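The representation of natural numbers as iterators can be illustrated with curried Python functions. This is an informal sketch: the definitions of succ and add are the ones commonly given for such numerals, and the helpers church and to_int are hypothetical conveniences for moving between Python integers and iterators.

```python
zero = lambda f: lambda x: x                       # applies f to x zero times
succ = lambda n: lambda f: lambda x: f(n(f)(x))    # one more application of f
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def church(k):
    """Build the iterator for the natural number k by repeated succ."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Read off a numeral by iterating 'add one' starting from 0."""
    return n(lambda i: i + 1)(0)


three = church(3)
print(to_int(three))                        # the numeral applies its argument 3 times
print(three(lambda s: s + "a")(""))         # iterating 'append a' on the empty string
print(to_int(add(church(2))(three)))        # addition composes the two iterations
```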

Curry also considered extensions of the results on the theory of functionality, including the introduction of a new typing operator $\mathsf{G}$ with the rule that from $\mathsf{G} \alpha \beta X$ and $\alpha Y$ follows $\beta Y (XY)$, so that the type of the value of a function may depend on the argument as well as on the type of the argument. The type $\mathsf{G} \alpha \beta$ is the type that is now usually denoted $ ( \Pi x : \alpha \;.\; \beta x)$, and is usually called the dependent function type. However, the type was only introduced, and no systems based on it were developed by Curry.
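For comparison, the rule for $\mathsf{G}$ stated above can be set beside the corresponding rule for $\mathsf{F}$. The rule for $\mathsf{F}$ shown here is read off from the intended meaning of $\mathsf{F} \alpha \beta X$ as $(\forall x)(\alpha x \supset \beta (Xx))$; the fraction layout is only a presentational choice.

$$
\frac{\mathsf{F} \alpha \beta X \qquad \alpha Y}{\beta (XY)}
\qquad\qquad
\frac{\mathsf{G} \alpha \beta X \qquad \alpha Y}{\beta Y (XY)}
$$

In the rule for $\mathsf{F}$ the type of the value $XY$ is always $\beta$, whereas in the rule for $\mathsf{G}$ it is $\beta Y$, which varies with the argument $Y$.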

The rest of the book includes material on the theory of restricted generality and universal generality. It was shown that these kinds of systems are essentially equivalent. Systems were proved consistent that are essentially equivalent to first-order systems of logic by defining classes of canonical terms which are supposed to represent propositions and propositional functions. Attempts to find consistent systems in which the assumptions for terms to be canonical were stated as axioms of the logic were made, but most of the systems involved were later proved to be inconsistent. Finally, the theory of functionality was used to define systems of type theory in the traditional sense.

Curry spent the rest of his life continuing this work and other work he had done. The last problem he worked on was an attempt to find a reduction for combinatory terms that is equivalent to $\lambda \beta$-reduction, $\lambda$-reduction in which the contraction rules are only ($\alpha$) and ($\beta$). As of this writing, this problem is not yet settled. See Seldin’s papers [Seldin 2011] and [Seldin 2017].

3. Gentzen-style Proof Theory

Curry read Gentzen’s work [Gentzen 1934] two years after it appeared, and it did not take him long to realize that the ideas of that paper could be useful in finding a system of logic based on combinatory logic that could be proved consistent.

Gentzen had introduced two new formulations of logical systems: natural deduction systems and sequent calculi (L-systems). Natural deduction systems are covered in the article Deductive-Theoretic Conceptions of Logical Consequence in this encyclopedia. Sequent calculi are equivalent to natural deduction systems and are designed to search for proofs.

The consistency of natural deduction systems for propositional calculus and first-order predicate calculus follows from what is called the normalization theorem (due originally to Prawitz, [Prawitz 1965]). This result is equivalent to a result of Gentzen on sequent calculi: the cut elimination theorem. Curry worked out his own proof of the latter theorem. He also used a version of it to give the first published proof of the normal form theorem for ordinary basic functionality. (A proof by Turing from 1941 was not published until 1980; see [Gandy 1980b].)

Curry became convinced that a system of formal logic is not properly formalized unless there is a sequent calculus for it for which the cut elimination theorem can be proven.

Another feature of Curry’s approach is that he considered these systems as formalizing the elementary metatheory of what he called an elementary formal system. An elementary formal system is one in which there are no rules which discharge assumptions. Curry had such a formal system for combinatory logic. He used the idea that he was formalizing the elementary metatheory of an elementary formal system to justify all the operational rules. This illustrates that Curry was concerned with semantics.

4. War Work and Computing

When Curry first left Penn State to do applied mathematics for the U.S. Government, he began working on the mathematics of aiming a projectile at a moving target, the so-called fire control problem. Curry had studied this kind of mathematics as a student, and so he had little trouble doing this work during World War II.

By 1945, when Curry was at the Aberdeen Proving Ground, there was word that an electronic computer, the ENIAC, was being built for the purpose of calculating firing tables for the artillery. Curry was named to the committee that was being set up to evaluate the ENIAC when it was delivered. This committee first met in July, 1945, and early that month Curry attended a lecture on the ENIAC by Herman Goldstine. The next day, he decided to write a program to calculate the digits of $e$, the base of the natural logarithms. He finished the program in early 1946, but whether it was ever run is uncertain. Curry later reported that nobody else that he knew at the time who was working on the ENIAC in 1946 could see the point of using a computer for a result assumed to be known.

In 1949, John von Neumann and some colleagues wrote and ran programs to calculate the digits of $\pi$ and $e$. (See [Reitwiesner 1950a] and [Reitwiesner et al. 1950b].) As a result, they discovered that the amateur mathematician William Shanks, who had spent over two decades starting in the middle of the 19th century calculating digits of $\pi$, and who had calculated to 707 digits, had made a mistake on digit number 528. The people who wrote the program in 1949 seem to have had no idea that Curry wrote such a program just a few years earlier. On the other hand, by 1949 there had been some changes in the ENIAC, and the program Curry wrote in 1945–46 might no longer have been compatible with the ENIAC.

Curry also became involved in writing programs to do inverse interpolation on the ENIAC, programs useful for dealing with firing tables. See [de Mol et al. 2010].

Curry’s work on programming inverse interpolation on the ENIAC led him to develop a theory of programming. Curry’s basic approach was very similar to the approach he had taken two decades earlier in analyzing the process of substitution. He broke programs down into the simplest possible elementary components and then proposed using program composition to put them together again. This approach has been compared to the later development of compilers for user languages. See [Curry 1954].

However, Curry was not able to continue to work on this development because he could not persuade Penn State to buy any computer equipment in the late 1940s.

5. Formalism: the Philosophy of Mathematics

Curry developed a distinctive philosophy of mathematics. His views developed considerably over the course of his career, but he is mostly known for his earlier works on the subject.

Curry’s earliest philosophical work, dating from 1939, proposed to define mathematics as the science of formal systems. But Curry’s approach to formal systems was not quite the same as that of most others in the field.

The usual definition of a formal system begins by defining the formal objects as words on an alphabet of symbols, or, to use the terminology more current in computer science today, strings of characters. But then some of these words are picked out as “well formed formulas” by an inductive definition with the property that each well formed formula has a unique construction from the “atomic formulas”. For example, for the propositional calculus, we are given a possibly infinite set of atomic formulas $p_{1}, p_{2}, \ldots , p_{n} , \ldots$, and a typical definition of well formed formula goes as follows:

Every atomic formula is a well formed formula.
If $P$ is a well formed formula, then $\neg P$ is a well formed formula.
If $P$ and $Q$ are well formed formulas, then $P \wedge Q$, $P \vee Q$, and $P \supset Q$ are well formed formulas.
Nothing else is a well formed formula.
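The inductive definition above can be sketched as a recursive check in Python. The encoding is an assumption of this sketch: atomic formulas are strings such as "p1", and compound formulas are tagged tuples rather than strings of symbols, so that each formula carries its construction with it. The function name is_wff and the tags "not", "and", "or", and "implies" are hypothetical.

```python
def is_wff(t):
    """Check t against the four clauses of the definition."""
    # Clause 1: every atomic formula p1, p2, ... is a well formed formula.
    if isinstance(t, str):
        return t.startswith("p") and t[1:].isdigit()
    if not isinstance(t, tuple):
        return False
    # Clause 2: the negation of a well formed formula.
    if len(t) == 2 and t[0] == "not":
        return is_wff(t[1])
    # Clause 3: conjunction, disjunction, and implication.
    if len(t) == 3 and t[0] in ("and", "or", "implies"):
        return is_wff(t[1]) and is_wff(t[2])
    # Clause 4: nothing else is a well formed formula.
    return False


print(is_wff(("implies", ("not", "p1"), ("and", "p2", "p3"))))
print(is_wff(("not", "p1", "p2")))
```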

If the logical system involved includes quantifiers, then the atomic formulas are themselves defined, and that definition may depend on inductive definitions. For example, if we are defining a formal system for first-order logic, we start with terms, which are built up out of atomic terms and individual variables by using basic functions, and then we have predicates, from which the atomic formulas are obtained by applying them to terms. If the first order system is a system of arithmetic, we start with the atomic term 0 and functions denoted by $\prime$ (as a superscript) and $+$ and $\cdot$ (as infixes), and then terms are defined as follows:

Every individual variable is a term.
0 is a term.
If $t$ is a term, then so is $t^{\prime}$. (This is intended to denote the number that is one more than $t$.)
If $s$ and $t$ are terms, then $s+t$ and $s\cdot t$ are terms. (The term $s\cdot t$ is often abbreviated as $st$.)
Nothing else is a term.

Once terms have been defined, atomic formulas are defined as follows:

If $s$ and $t$ are terms, then $s = t$ is an atomic formula.

And then the following clause is added to the definition of well formed formula:

If $x$ is an individual variable and $A$ is a well formed formula, then $(\forall x)A$ and $ (\exists x)A$ are well formed formulas.

Curry noted that although these definitions of term, atomic formula, and well formed formula say they are about strings of symbols on some alphabet, they do not really depend on that fact. For him, the crucial thing was that each term and well formed formula have a unique construction, whereas any word of three or more letters has more than one construction. (For example, the string $abc$ can be formed in two ways: $c$ can be added to $ab$ on the right, or $a$ can be added to $bc$ on the left.) So while we obviously represent formal objects on a page or on a blackboard as strings of characters, it is not necessary that they actually be such strings. The strings may only be the names for these formal objects. It is only necessary that they are defined inductively so that each one has a unique construction.

Also, formal systems do not need to be systems of logic in the ordinary sense with logical connectives and quantifiers. It is possible to have a simpler formal system. An example Curry gave is what he called the “system of Sams” for natural numbers. (He got this name from the Hungarian word for number, which is szám.) In this system, the formal objects are interpreted as natural numbers. There is one primitive formal object, which I will name “0”. There is one operation, which forms $X|$ from $X$. The rules for forming the sams are as follows:

0 is a sam.
If $X$ is a sam, then so is $X|$.
Nothing else is a sam.

There is one predicate, which forms $X = Y$ from sams $X$ and $Y$. Thus, the elementary statements are those of the form $X = Y$, where $X$ and $Y$ are sams. There is one axiom, namely

0 = 0

There is also one rule of inference: From $X = Y$ to deduce $X| = Y|$. This is a very simple formal system, and it is easy to show that the theorems (provable elementary statements) are those of the form $X = X$, where $X$ is a sam.
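The system of sams is small enough to be simulated directly. The sketch below is an illustration under stated assumptions rather than Curry's own presentation: sams are written as Python strings purely as names for the formal objects, an elementary statement $X = Y$ is represented as a pair `(X, Y)`, and the function names `sams`, `infer`, and `theorems` are hypothetical.

```python
def sams(n):
    """Return the first n sams: 0, 0|, 0||, ..."""
    result = []
    current = "0"                 # the one primitive formal object
    for _ in range(n):
        result.append(current)
        current = current + "|"   # the one operation: form X| from X
    return result


# The single axiom 0 = 0, represented as a pair (X, Y).
AXIOM = ("0", "0")


def infer(statement):
    """The single rule of inference: from X = Y deduce X| = Y|."""
    x, y = statement
    return (x + "|", y + "|")


def theorems(n):
    """Return the first n theorems, obtained by repeatedly applying the rule to the axiom."""
    result = []
    current = AXIOM
    for _ in range(n):
        result.append(current)
        current = infer(current)
    return result


for x, y in theorems(3):
    print(f"{x} = {y}")
# 0 = 0
# 0| = 0|
# 0|| = 0||
```

Since the axiom has identical sides and the rule adds the same stroke to both sides, every statement generated this way has the form $X = X$, which matches the characterization of the theorems given above.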

In saying that mathematics is the science of formal systems, Curry was claiming that (pure) mathematics does not really have a subject matter. It was not what he called a contensive topic. (The word contensive is a word Curry coined to express the idea of the German word inhaltlich.) Of course, mathematical statements do have subjects and therefore subject matter, but Curry claimed that the only subject matter any mathematical statements had was other mathematics.

Curry’s attitude towards truth was that truth comes in two kinds:

Truth within a formal system (or within a given theory). This depends on how the system or theory is defined.
The acceptability of a system (or theory) for some purpose. This depends on the purpose, and Curry took this pragmatically.

In his work on combinatory logic and Gentzen proof theory, he preferred to use only constructive logic in the metatheory, on the grounds that this would be accepted by more people than classical logic. (In this, he did not see that most mathematicians were not familiar with constructive mathematics.) On the other hand, he had no trouble accepting classical logic in the mathematics to be used in physics. In a sense, Curry did not really believe in one absolute notion of truth.

On the other hand, once formal systems (or any other kind of theories) are created, they have properties which can be investigated, and hence have objective existence. In this sense, Curry believed in the idea of the third world that Karl Popper later introduced. In fact, Popper presented this idea at a session of the Third International Congress of Logic, Methodology, and Philosophy of Science in Amsterdam in 1967, and as it happened Curry was the chair of the session. (See [Popper 1968].) After Popper’s presentation was over, Curry told his graduate student Jonathan P. Seldin, who was also present, that he thought that Popper had made a big deal out of something that was trivially and obviously true.

Over his career, Curry changed several times the words he used to denote the formal objects of a formal system. In his earliest work on combinatory logic, he called them “entities” (using the German word Etwas as a noun in his dissertation, which was written in German). However, in a discussion with a philosopher (whom he did not name in his later years), he was told that his use of that word implied some philosophical conclusions with which he disagreed. At that point, he decided to use the word “term” instead. It is now common to refer to “combinatory terms” and “$\lambda$-terms”. However, this caused him a problem when he was dealing with a formal system of logic with quantifiers, since the terms would be what are usually called “formulas”, and there are other formal objects called “terms”. So in the end, he coined his own word by taking the first syllable of the word “object”, and started calling them obs. To some people, the word ob appeared to refer specifically to combinatory logic, but in fact Curry used the word for formal objects of any kind of formal system.

In his later work, Curry extended his definition of formal system to allow for systems whose formal objects are strings of characters. He called such systems syntactical systems, and called his earlier kind of formal systems ob systems.

Also in his later work, Curry extended his definition of mathematics from saying that mathematics is the science of formal systems to saying that mathematics is the science of formal methods. This definition should be sufficiently broad to include all of mathematics, since if we compare piles of apples and oranges by seeing if there is a one-to-one correspondence between them, we are looking at the forms of the piles rather than the content (apples or oranges).

Curry chose the name “formalism” for his philosophy of mathematics because of David Hilbert. However, Curry’s idea of formalism is very different from the idea of other philosophers of mathematics who call themselves formalists. It is probably better to think of Curry’s formalism as a kind of structuralism.

6. References and Further Reading

a. Primary Sources

[Curry 1929] Curry, Haskell B., “An analysis of logical substitution”, American Journal of Mathematics 51, 363-384.
- Curry’s first published paper, written as part of an application for a grant to go to Göttingen.
[Curry 1930] Curry, Haskell B., “Grundlagen der kombinatorischen Logik”, American Journal of Mathematics 52 (1930) 509-536, 789-834.
- Curry’s dissertation, written in German at Göttingen in 1928-1929. Republished with a translation into English and an introduction on Curry’s work by Fairouz Kamareddine and Jonathan P. Seldin as Foundations of Combinatory Logic by College Publications, 2016.
[Curry 1934a] Curry, Haskell B., “Some properties of equality and implication in combinatory logic”, Annals of Mathematics (2) 34, 381-404.
- This is the paper that gave Kleene and Rosser what they needed to prove inconsistent the systems of Church and Curry.
[Curry 1934b] Curry, Haskell B., “Functionality in combinatory logic”, Proceedings of the National Academy of Sciences U.S.A., 20, 584-590.
- An extended abstract of [Curry 1936] below, which Curry had some trouble getting accepted for publication because the approach originally looked strange.
[Curry 1936] Curry, Haskell B., “First properties of functionality in combinatory logic”, Tohoku Mathematical Journal 41 Part II, 371-401.
- Curry’s first paper on functionality. He originally wrote it in 1932, but had trouble getting it accepted for publication. The version published in 1936 contains many misprints.
[Curry 1939] Curry, Haskell B., “Remarks on the definition and nature of mathematics”, Journal of Unified Science 9, 164-169, and reprinted many times since.
- Curry’s first work on the philosophy of mathematics.
[Curry 1941] Curry, Haskell B., “The paradox of Kleene and Rosser”, Transactions of the American Mathematical Society, 50, 454-516.
- Curry’s study of the paradox mentioned in the title.
[Curry 1942a] Curry, Haskell B., “Some advances in the combinatory theory of quantification”, Proceedings of the National Academy of Sciences U.S.A. 28, 564-569.
- This is the paper Curry wrote just before his leave of absence from Penn State to do war work, in which he set out his plans to try to find consistent systems of logic based on combinatory logic.
[Curry 1942b] Curry, Haskell B., “The inconsistency of certain formal logics”, Journal of Symbolic Logic 7, 115-117.
[Curry 1949] Curry, Haskell B., “A simplification of the theory of combinators”, Synthese 7, 391-399.
- The paper in which Curry published his understanding of the combinator S.
[Curry 1950] Curry, Haskell B., A Theory of Formal Deducibility, (Indiana University Press).
- Curry’s first book on Gentzen-style proof theory.
[Curry 1951] Curry, Haskell B., Outlines of a Formalist Philosophy of Mathematics (Amsterdam, North-Holland).
- This was mostly written in 1939 and is essentially the long manuscript from which the paper of 1939 was prepared as a shorter version.
[Curry 1954] Curry, Haskell B., “The logic of program composition”, in Applications Scientifiques de la Logique Mathematique, Actes du 2e Colloque International de Logique Mathematiques, Paris 25-30 Aout 1952, Institut Henri Poincare, (Paris: Gauthier-Villars and Louvain: Nauwelaerts).
- Curry’s summary of his theory of programming.
[Curry and Feys 1958] Curry, Haskell B. and Feys, Robert, Combinatory Logic, Volume I, (Amsterdam, North-Holland).
- The first volume of Curry’s great work on combinatory logic.
[Curry 1963] Curry, Haskell B., Foundations of Mathematical Logic, (McGraw-Hill, and since reprinted by Dover).
- Curry’s major work on Gentzen-style proof theory.
[Curry et al. 1972] Curry, Haskell B., Hindley, J. Roger, and Seldin, Jonathan P., Combinatory Logic, Volume II, (Amsterdam, North-Holland).
- The second volume of Curry’s great work on combinatory logic.

b. Secondary Sources (by year)

[Russell and Whitehead 1910-1913] Russell, Bertrand and Whitehead, Alfred North, Principia Mathematica, 3 volumes (Cambridge University Press).
- The first major work on logic that Curry read.
[Schönfinkel 1924] Schönfinkel, Moses, “Über die Bausteine der mathematischen Logik”, Mathematische Annalen 92, 305-306.
- A work that Curry first encountered in 1927-28 which, much to his surprise, had anticipated his own idea for combinators. The paper was written by Behmann, and was a report on a seminar talk Schönfinkel had given at Göttingen in 1920. An English translation has appeared as “On the building blocks of mathematical logic”, in From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, edited by Jean van Heijenoort (Harvard University Press, 1967), pp. 355-366.
[Hilbert 1925] Hilbert, David, “Über das Unendliche”, Mathematische Annalen 95 (1925) 161-190.
- One of the most important papers Hilbert wrote on the foundations of mathematics. Reprinted (in German) in David Hilbert, Hilbertiana: Fünf Aufsätze (Darmstadt: Wissenschaftliche Buchgesellschaft, 1964), pp. 79-108. Translation into English published as “On the infinite” in Jean van Heijenoort (editor), From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, (Cambridge, MA and London, England: Harvard University Press 1967), pp. 367-392.
[Heyting 1930] Heyting, Arend, “Die formalen Regeln der intuitionistischen Logik”, Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-Mathematische Klasse 1930, 42-56.
- The paper in which Heyting introduced his formal system of intuitionistic logic.
[Church 1932] Church, Alonzo, “A set of postulates for the foundation of logic”, Annals of Mathematics (2) 33, 346-366.
- The paper in which Church first introduced abstraction as part of a larger system.
[Gentzen 1934] Gentzen, G., “Untersuchungen über das logische Schliessen”, Mathematische Zeitschrift 39, 405-431.
- The paper in which Gentzen introduced his systems of natural deduction and his L-systems (sequent calculi).
[Kleene 1935] Kleene, Steven C. and Rosser, J. Barkley, “The inconsistency of certain formal logics”, Annals of Mathematics (2) 36, 630-636.
- The paper in which Kleene and Rosser published their proof of the contradiction in the systems of Church and Curry.
[Church and Rosser 1936a] Church, Alonzo and Rosser, J. Barkley, “Some properties of conversion”, Transactions of the American Mathematical Society 39, 472-482.
- The paper in which the Church-Rosser Theorem was first proved for lambda-calculus.
[Church 1936b] Church, Alonzo, “An undecidable problem in elementary number theory”, American Journal of Mathematics 58, 345-363.
- The paper in which Church proved that there is a problem in elementary number theory which cannot be decided by an algorithm. The paper includes a statement by Church that a function is partial recursive if and only if it can be represented by a $\lambda$-term, a result that he and Kleene obtained independently about the same time.
[Kleene 1936c] Kleene, Steven C., “$\lambda$-definability and recursiveness”, Duke Mathematical Journal 2, 340-353.
- The paper in which Kleene first proved that a function is partial recursive if and only if it can be represented by a $\lambda$-term, a result he discovered independently at the same time Alonzo Church did. This formed part of the justification of the Church-Turing thesis, that a function is mechanically computable if and only if it is partial recursive if and only if it is Turing computable if and only if it is $\lambda$-definable.
[Rosser 1942] Rosser, J. Barkley, “New sets of postulates for combinatory logics”, Journal of Symbolic Logic 7, 18-27.
- Rosser’s paper that enabled Curry to understand the combinator S, although Rosser did not use that combinator.
[Reitwiesner 1950a] Reitwiesner, George W., “An ENIAC determination of pi and e to more than 2000 decimal places”, Mathematical Tables and Other Aids to Computation, 4, 11-15.
- A paper on the program run on the ENIAC to calculate digits of $\pi$ and $e$ in 1949-1950. The paper shows no indication of any knowledge of the program Curry wrote to do this for $e$ in 1945-46.
[Reitwiesner et al. 1950b] Metropolis, N. C., Reitwiesner, G., and von Neumann, J., “Statistical treatment of the values of first 2,000 decimal digits of $e$ and $\pi$ calculated on the ENIAC”, Mathematical Tables and Other Aids to Computation, 4, 109-111.
- The statistical analysis of the results of the program run on the ENIAC as described by George W. Reitwiesner.
[Prawitz 1965] Prawitz, Dag, Natural Deduction: A Proof-Theoretical Study, Almqvist & Wiksell, 1965. Reprinted by Dover in 2006.
- This was originally Prawitz’ doctoral dissertation, and introduced Prawitz’ ideas of proof reduction and proof normalization.
[Popper 1968] Popper, K. R., “Epistemology without a knowing subject”, in van Rootselaar, B. and Staal, J. F. (editors), Logic, Methodology and Philosophy of Science III: Proceedings of the Third International Congress for Logic, Methodology and Philosophy of Science, Amsterdam 1967, (Amsterdam: North-Holland), pp. 333-373.
- This is the paper in which Popper introduced his idea of the third world. The paper had been presented in the first session of the congress (11:15 a.m. to 12:00 noon, with H. B. Curry in the chair) under the title “Epistemology and scientific knowledge”. See the program of the congress on p. 543 of the proceedings.
[Hindley and Seldin 1980a] Hindley, J. Roger and Seldin, Jonathan P. (editors), To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, (Academic Press).
- A collection of papers related to Curry’s work. Includes a short biography and a complete list of Curry’s publications.
[Gandy 1980b] Gandy, R. O., “An early proof of normalization by A. M. Turing”, in [Hindley and Seldin 1980a], pp. 453-455.
- This is Turing’s earliest proof of the normal form theorem for typed $\lambda$-calculus with an introduction by Gandy.
[Hindley and Seldin 2008] Hindley, J. Roger and Seldin, Jonathan P., Lambda-Calculus and Combinators, An Introduction, (Cambridge University Press).
- A general introduction to lambda-calculus and combinatory logic.
[de Mol et al. 2010] de Mol, Liesbeth, Bullynck, Maarten, and Martin, Carle, “Haskell before Haskell. Curry’s contribution to programming (1946-1950)”, in Ferreira, F., Lowe, B, Mayordomo, E., and Gomes, L.M. (Eds.), Programs, Proofs, Processes, 6th Conference on Computability in Europe, CIE, 2010, Ponta Delgada, Azores, Portugal, June 30-July 4, 2010, Springer Lecture Notes in Computer Science, vol. 6158, pp. 108-117.
- A paper on Curry’s theory of programming.
[Seldin 2011] Seldin, Jonathan P., “The search for a reduction in combinatory logic equivalent to $\lambda\beta$-reduction”, Theoretical Computer Science 412, 4905-4918.
- A paper describing the attempt to find a reduction in combinatory logic equivalent to $\lambda\beta$-reduction, including a discussion of the technical problems involved.
[Seldin 2017] Seldin, Jonathan P., “The search for a reduction in combinatory logic equivalent to $\lambda\beta$-reduction, Part II”, Theoretical Computer Science 663, 34-58.
- A paper giving the proofs of the key properties of the proposals given in Seldin 2011.

Author Information

Jonathan P. Seldin
Email: jonathan.seldin@uleth.ca
University of Lethbridge
Canada

Cognitive Phenomenology

Phenomenal states are mental states such that there is something it is like for their subjects to be in them; they are states with a phenomenology. What it is like to be in a mental state is that state’s phenomenal character. There is general agreement among philosophers of mind that the category of phenomenal states includes at least some sensory states. For example, there is something that it is like to taste chocolate, to smell coffee, to feel the wind in one’s hair, to see the blue sky and to feel a pain in one’s toe. Is there also something that it is like to consciously think, to consciously judge and to consciously believe something? Are such cognitive states, when conscious, phenomenal states? Is there a clear distinction between sensory states and cognitive states? Or can our knowledge, thoughts and beliefs influence our sensory experiences? Is there a cognitive phenomenology?

It is challenging to give a clear characterization of the cognitive phenomenology debate, since different contributors conceive of the debate in different ways. Central to the debate is the question of whether conscious thoughts possess a non-sensory phenomenology. Intuitively, there is something that it is like to consciously think, consciously judge and consciously believe something. However, the debate about cognitive phenomenology is not, strictly speaking, about whether there is something that it is like to consciously think. Rather, the debate concerns the nature of cognitive phenomenology. Is the phenomenology of cognitive states reducible to purely sensory phenomenology? Or is there an irreducible cognitive phenomenology? A sceptic about cognitive phenomenology claims that conscious cognitive states are non-phenomenal, and that they merely seem to be phenomenal because they are accompanied by sensory states. For instance, when one thinks that ‘Paris is a beautiful city’, one’s thought may be expressed in inner speech and an image of Paris may accompany it. On the sceptic’s view, it is these accompanying sensory states, and not the thought itself, that are phenomenal. Contrary to this, the proponent of cognitive phenomenology claims that a conscious cognitive state can have a phenomenology that is irreducible to purely sensory phenomenology.

Other debates have also been placed under the ‘cognitive phenomenology’ label. There is an ongoing debate within the philosophy of perception about how cognition influences our sensory experiences. Philosophers tend to agree that, for example, an expert ornithologist’s perceptual experience of a type of bird can differ from that of a novice, even if the viewing conditions for both expert and novice are the same. The expert’s knowledge of birds can influence her experience. However, what philosophers disagree about is how the expert’s knowledge influences her experience, and how her knowledge contributes to what her experience is like.

Background
The Nature of Cognitive Phenomenology
Arguments for Cognitive Phenomenology
Implications of the Cognitive Phenomenology Debate
References and Further Reading

1. Background

a. Terminological Clarifications

When this article talks about a state being conscious, being conscious should be understood as being phenomenally conscious. A phenomenal state is a mental state that is phenomenally conscious in that there is something that it is like for the subject of that state to be in that state. Phenomenal states are states with phenomenology. What it is like to be in a phenomenal state is that state’s phenomenal character. An example of a phenomenal state is a visual experience of the blueness of the sea. Another example is an auditory experience of the sound of waves. There is something that it is like to have these experiences. There is also something that it is like to simultaneously visually experience the blueness of the sea and auditorily experience the sound of the waves (Bayne & Chalmers 2003). Our everyday conscious experiences are often complex in that they involve simultaneously thinking, feeling and experiencing within different sensory modalities. Such a complex experience is referred to as an overall phenomenal state.

Examples of sensory mental states are perceptual states, proprioception, bodily feelings and pains. Examples of cognitive states are thoughts, judgments and beliefs. According to some views, emotions and categorical perceptual experiences (such as experiencing something as being a type of bird) should also be categorized as cognitive states, or as partly cognitive and partly sensory states (see Chudnoff 2015a, Montague 2017).

b. Two Kinds of Mental States

Traditionally, it was common to distinguish between two kinds of mental states, namely sensory states and propositional attitudes. Paradigmatic examples of propositional attitudes are cognitive states such as beliefs, desires, thoughts and judgements. Propositional attitudes are intentional states since they are about or represent objects, properties or states of affairs. They are states with propositional contents that can be linguistically expressed by using a ‘that-clause’. The content of my belief ‘that it will rain tomorrow’ is ‘that it will rain tomorrow’. When I believe ‘that it will rain tomorrow’ I am having a certain attitude towards that content, namely the attitude of belief. I could have had a different attitude towards the same content; I could, for instance, desire ‘that it will rain tomorrow’.

According to the traditional view, sensory mental states, unlike cognitive states, have qualia. On this view, qualia are seen as phenomenal properties that can be separated from intentional or representational properties. For example, my visual experience of a red rose in front of me is intentional in that it is about or represents ‘that there is a red rose in front of me’, but there is also something that it is like for me to experience the red rose. The redness that I experience is a property of my experience, a quale. While conscious sensory states are regarded as phenomenal states with qualia, conscious cognitive states are said to lack qualia. They are seen as non-phenomenal states.

Lately, this traditional view has been challenged. Firstly, proponents of intentionalism argue that when I experience a red rose I experience the redness as a property of the rose itself, and not as a property of my experience of the rose. My experience of the red rose has a phenomenal character, but this phenomenal character is embedded in the intentional content of my experience. Secondly, proponents of cognitive phenomenology challenge the assumption that cognitive states are non-phenomenal states when conscious.

c. Phenomenal Intentionality

In their seminal 2002 paper, ‘The Intentionality of Phenomenology and the Phenomenology of Intentionality’, Horgan and Tienson argue against the traditional view and in favour of intentionalism and cognitive phenomenology. They also argue for a view about the relation between the intentional and the phenomenal that has recently gained popularity, Phenomenal intentionalism.

According to intentionalism, all mental states are intentional, including phenomenal states. A mental state is commonly regarded as intentional if it is about or directed towards some objects or states of affairs, and if it has a content.

Phenomenal intentionality is a kind of intentionality that is said to be grounded in phenomenal consciousness (Kriegel 2011, Mendelovici 2018). According to proponents of Phenomenal intentionalism, there is a Phenomenal intentionality and all other forms of intentionality are derived from Phenomenal intentionality. While other proponents of intentionalism hold that intentionality is primary to phenomenology (see for example, Tye 1995 and Dretske 1995), proponents of Phenomenal intentionalism claim that phenomenology or Phenomenal intentionality is primary to all other forms of intentionality (Horgan & Tienson 2002, Kriegel 2011, Mendelovici 2018).

While most proponents of Phenomenal intentionalism also claim that there is a cognitive phenomenology, the two views should not be conflated. Phenomenal intentionalism is a view about what it is that grounds the relation between phenomenal consciousness and intentionality, while cognitive phenomenology is a view about the scope of phenomenal consciousness. A proponent of cognitive phenomenology need not accept Phenomenal intentionalism, and it is not necessary for a proponent of Phenomenal intentionalism to hold that there is a cognitive phenomenology. However, since proponents of Phenomenal intentionalism claim that all intentionality is derived from Phenomenal intentionality, it is easier to explain the intentionality of cognitive states if one holds that conscious cognitive states are phenomenal states. If one denies that there is a cognitive phenomenology and accepts Phenomenal intentionalism, one needs to tell a story about how the intentionality of cognitive states is derived from the Phenomenal intentionality of sensory states. By contrast, if one holds that there is a cognitive phenomenology, one can simply claim that the intentionality of non-conscious cognitive states (such as dispositional beliefs) is derived from the Phenomenal intentionality of conscious cognitive states.

2. The Nature of Cognitive Phenomenology

The debate about whether or not there is a cognitive phenomenology can seem bewildering since there are different claims about what cognitive phenomenology is, and these claims may vary in both strength and generality.

a. Irreducible Cognitive Phenomenology

According to Elijah Chudnoff (2015a), a proponent of cognitive phenomenology should minimally accept the irreducibility thesis.

Irreducibility: ‘Some cognitive states put one in phenomenal states for which no wholly sensory states suffices’ (Chudnoff 2015a: 15).

It follows from Irreducibility that some cognitive states are such that because one is in them one is in a phenomenal state for which no wholly sensory states suffice. That is, there is a phenomenal character that is over and above the phenomenal character that accrues to sensory states. Putting one in a phenomenal state should be understood as a non-causal explanatory relation that can alternatively be picked out by ‘in virtue of’ or ‘constitutively dependent on’ (see Chudnoff 2015b).

In order to get a better grip on the Irreducibility thesis we can contrast it with an alternative view on the relation between cognitive states and phenomenal states. It is uncontroversial to claim that cognitive states can make an impact on our sensory states. For instance, judging that the sum of the angles of a triangle is 180 degrees can lead one to visualize the triangle or to express sentences such as ‘the sum of the angles of a triangle is 180 degrees’ in inner speech. In this case, one is in a phenomenal state since one is in a certain cognitive state, but the phenomenal state one is in is not different from the phenomenal state various wholly sensory states can put one in (Chudnoff 2015a). What Irreducibility claims is that some cognitive states can put one in phenomenal states that are different from those phenomenal states that wholly sensory states can put one in. Chudnoff uses an example from mathematics to illustrate how Irreducibility differs from the view that cognitive states merely cause one to be in a certain phenomenal state. At first you read that ‘If a < 1, then 2 – 2a > 0’, and you wonder whether this is true (Chudnoff 2015a: 15). Then you realise how a’s being less than 1 makes 2a smaller than 2 and so 2 – 2a greater than 0. When you realise the truth of this mathematical proposition you might say to yourself in inner speech ‘If a < 1, then 2 – 2a > 0’ and you might visualize the variable ‘a’ and the numeral ‘1’. You might also feel satisfied because you got it right. These states that you are put in are all sensory phenomenal states. However, if you believe Irreducibility and if you think that this case of realising the truth of this mathematical proposition involves cognitive phenomenology, then you also believe that these sensory states taken together cannot account for the overall phenomenal state you are in. You think that there is some phenomenal state that is left over which only the cognitive states of ‘realising’ or ‘intuiting’ can put you in.

According to Chudnoff, Irreducibility is the thesis that a proponent of cognitive phenomenology must minimally accept. There are other theses figuring within the cognitive phenomenology debate that go beyond Irreducibility and make stronger and more specific claims about the nature of cognitive phenomenology.

b. Proprietary Cognitive Phenomenology

According to Irreducibility, some cognitive states put one in phenomenal states for which no wholly sensory states suffice. However, it does not follow from Irreducibility that only cognitive states put one in these phenomenal states. Neither does it follow from Irreducibility that the phenomenal character of the phenomenal states that cognitive states put one in is cognitively grounded, that is, that their phenomenal character is different in kind from sensory phenomenal character (Levine 2011).

Many proponents of cognitive phenomenology hold that there is a proprietary cognitive phenomenology (see Horgan & Tienson 2002, Horgan 2011, Kriegel 2011, Kriegel 2015a, Kriegel 2015b, Pitt 2004, Pitt 2011, Siewert 1998, Siewert 2011). On this view, the kind of phenomenology that philosophers are talking about when they are talking about cognitive phenomenology must differ in kind from the kind of phenomenology one is familiar with through one’s sensory experiences. As David Pitt puts it:

I believe that the phenomenology of occurrent conscious thought is proprietary: It’s a sui generis sort of phenomenology, as unlike, say, auditory or visual phenomenology as they are unlike each other—a cognitive phenomenology. (Pitt 2011: 141)

There is something that it is like to be in a conscious cognitive state and/or to consciously entertain a cognitive content, and this phenomenology is distinct from the phenomenology one experiences when one is consciously perceiving something or feeling something. Cognitive phenomenology is, on this view, proprietary and sui generis.

Proprietary: Conscious cognitive states have proprietary or sui generis phenomenal character.

Someone who accepts Proprietary also accepts Irreducibility, but one may accept Irreducibility and deny Proprietary. For example, one could claim that knowing a lot about sparrows may influence the way one visually experiences sparrows so that one can be put in phenomenal states for which no wholly sensory states suffice. One’s knowledge does not merely cause one to attend to sparrows in a particular way. Rather, one’s knowledge puts one in a phenomenal state that one could not have been put in by wholly sensory states. In such a case, cognitive states can make a constitutive contribution to one’s perceptual experience by, for example, structuring the experience, without thereby producing a phenomenal state that is non-sensory in kind (see Levine 2011, Nes 2011). However, most philosophers hold that cognitive states can cause one to be in certain sensory states by influencing attention. Carruthers and Veillet (2011) argue that it is not clear that the sparrow expert’s experience involves irreducible cognitive phenomenology, since it is possible that her knowledge simply causes her to attend to sparrows in a different way compared with a novice. She will notice certain properties of the sparrows that the novice fails to notice, but the phenomenal state she is in is a state that wholly sensory states suffice to put her in. How should we decide between these views?

If cognitive phenomenology is proprietary, it should in principle also be possible to pick it out via introspection. Holding that cognitive phenomenology is proprietary allows one to appeal to introspection in cases where there is a dispute about whether cognitive phenomenology is involved or not. This may serve as a motivation for holding that cognitive phenomenology is proprietary, and not merely irreducible.

c. Pure and Impure Cognitive Phenomenology

We can further distinguish between three different ways of characterizing the nature of a phenomenal state: 1) A phenomenal state is purely sensory in case wholly sensory states suffice to put one in that state; 2) A phenomenal state can be partly cognitive (and partly sensory) if no wholly sensory states suffice to put one in that state and no wholly cognitive states suffice to put one in that state; 3) A phenomenal state is purely cognitive in case cognitive states suffice to put one in that state (Chudnoff 2015b). A cognitive phenomenal state is an impure cognitive phenomenal state if 2 holds but not 3. A cognitive phenomenal state is a pure cognitive phenomenal state if 3 holds. In other words, a cognitive phenomenal state is a pure cognitive phenomenal state if it is independent of sensory states.

A proponent of cognitive phenomenology need not accept that there is pure cognitive phenomenology. It is compatible with Irreducibility that there is merely impure cognitive phenomenology. Many of the cases that are commonly appealed to in arguments for cognitive phenomenology seem to involve impure cognitive phenomenology. For instance, the overall phenomenal state one is in when one suddenly grasps a mathematical proposition arguably depends on both sensory experiences and intuiting. Proposed candidates for pure cognitive phenomenology are imageless thoughts and beliefs.

It is compatible with Irreducibility to deny that there is pure cognitive phenomenology. However, if one holds Proprietary, one seems committed to accepting that pure cognitive phenomenology is at least possible. According to Proprietary, cognitive phenomenology is different in kind from other kinds of phenomenology, and it should in principle be possible to pick out this kind of phenomenology via introspection. When one is in a phenomenal state that involves different sensory modalities, such as the state one is in when watching a TV show, one seems able, at least roughly, to pick out and separate visual phenomenology from auditory phenomenology. This is because visual phenomenology is quite unlike auditory phenomenology. Similarly, when one is consciously thinking that p, one should be able to separate the phenomenology of thinking from the auditory phenomenology involved when expressing the content in inner speech. On this view, cognitive phenomenology is a sui generis kind of phenomenology, as unlike auditory and visual phenomenology as they are unlike each other (Pitt 2004, Pitt 2011).

d. Attitudinal Phenomenology and Content Phenomenology

Cognitive states such as thoughts, beliefs, judgements and inferences are propositional attitudes. One may think that conscious cognitive states have attitudinal cognitive phenomenology PA:

PA: There is something that it is like to have a conscious cognitive attitude towards a content, and no wholly sensory states suffice to put one in a state with this phenomenal character.

PA is compatible with Irreducibility and Proprietary.

The claim that there is a cognitive phenomenology can also be a claim about the cognitive content that one is consciously entertaining when one is in a cognitive state. One may think that conscious cognitive states have content cognitive phenomenology CA:

CA: There is something that it is like to consciously entertain a cognitive content, and no wholly sensory states suffice to put one in a state with this phenomenal character.

A proponent of cognitive phenomenology can accept that there is an attitudinal cognitive phenomenology and deny that there is a content cognitive phenomenology. One can also hold that there is a content cognitive phenomenology, but not an attitudinal cognitive phenomenology. Or, one can accept that there is both an attitudinal cognitive phenomenology and a content cognitive phenomenology.

e. General, Particular and Individuative Cognitive Phenomenology

Cognitive phenomenology claims can be general claims such as the claim that conscious cognitive attitudes have attitudinal cognitive phenomenology, where this attitudinal cognitive phenomenology is common for all cognitive attitudes. Alternatively, cognitive phenomenology claims can be claims about there being a particular cognitive phenomenology involved when one is consciously believing, and this attitudinal cognitive phenomenology is different from the attitudinal cognitive phenomenology involved when one is having other conscious cognitive attitudes. One may also think of attitudinal cognitive phenomenology as even more fine-grained: for example, that there are different attitudinal cognitive phenomenologies involved in having different kinds of conscious beliefs.

The claim that there is a content phenomenology can be more or less general. The most general claim is that there is a content cognitive phenomenology that is common for all cognitive contents. A more particular view claims that the cognitive content phenomenology involved in consciously entertaining the content that p, say, differs from the cognitive content phenomenology involved in consciously entertaining that q. An even more particular view holds that the content cognitive phenomenology involved in consciously entertaining the content that p is different from the content phenomenology involved in consciously entertaining any other cognitive contents. Further, one could hold that the phenomenology involved in consciously entertaining the cognitive content that p may differ from person to person. For example, the content phenomenology involved when John consciously entertains the cognitive content that p, differs from the content phenomenology involved when Jane consciously entertains the cognitive content that p.

Particular claims about either attitudinal cognitive phenomenology or content cognitive phenomenology are often motivated by the view that phenomenology is individuative. That is, in virtue of having the phenomenal character it has, my belief is a belief as opposed to a judgment, a thought or an intuition. And, in virtue of having the phenomenal character it has, the content that I am entertaining, the content that p, is the very content that p as opposed to the content that q. By claiming that phenomenology is individuative one can elegantly explain how one can determine the content of one’s own phenomenal state. One knows which phenomenal state one is in, and its content, because it has the phenomenal character that it has. For instance, when I am having a visual experience of a red rose I come to know, via introspection, that I am having a visual experience of a red rose. Similarly, I come to know that I am consciously believing that p due to the phenomenal character that the belief that p has (Pitt 2004, Horgan 2011, Kriegel 2011, Kriegel 2013).

3. Arguments for Cognitive Phenomenology

We can distinguish between different types of arguments for cognitive phenomenology. These arguments are generally arguments for Irreducibility, but some of them also defend stronger claims about the nature of cognitive phenomenology. This section presents the types of arguments that are most commonly used and common responses to them.

a. Arguments from Examples

Arguments from examples appeal to cases or circumstances where one seems to be in phenomenal states that involve cognitive phenomenology. For instance, there is something that it is like for me to suddenly remember that I have an appointment with a student in 5 minutes. The state that I am in when I suddenly remember something is a cognitive state. There can be sensory states involved as well; a visual image of my student may pop up, or I may feel annoyed because I almost forgot about the appointment. The cognitive state I am in when I suddenly remember my appointment puts me in a phenomenal state, and no wholly sensory states suffice to put me in that state.

Another argument from example appeals to tip-of-the-tongue experiences, the kind of experiences one has when searching for a word that one knows but fails to retrieve (Goldman 1993). There is something that it is like to have such experiences: cognitive states play a role in putting one in that state, and no wholly sensory states suffice to put one in that state.

A sceptic about cognitive phenomenology may agree with the proponent of cognitive phenomenology that the states these arguments appeal to are phenomenal states, while denying that they are cognitive phenomenal states. According to the sceptic, there are always some sensory states involved when one suddenly remembers something. When I remember that I have an appointment with my student in 5 minutes, I may visualize my student and feel annoyed with myself for almost forgetting about the appointment. The sensory states that I am in can, according to the sceptic, fully account for the phenomenal character of the state that I am in.

One can make a similar response to the tip-of-the-tongue example. When having a tip-of-the-tongue experience I am making an effort to retrieve a word, and it is the sensory feeling of making an effort that accounts for the phenomenal character of the experience.

A proponent of cognitive phenomenology can insist that if one carefully introspects one’s phenomenal states, it becomes apparent that these states involve cognitive phenomenology. However, such appeals to introspection are problematic because a sceptic may simply claim that she is carefully introspecting the phenomenal state she is in when she suddenly remembers something, but she finds only sensory phenomenology. Nevertheless, it seems wrong to completely dismiss appeals to introspection, as some such appeals appear more convincing than others. Charles Siewert (1998) argues that the sensory states involved in cases where one suddenly remembers something occur after the state of suddenly remembering. The state that one is in when suddenly remembering something need not involve any sensory phenomenology at all. On Siewert’s view, the state of suddenly remembering is a pure cognitive phenomenal state.

b. Contrast Arguments

Contrast arguments are among the most commonly used types of argument for cognitive phenomenology. Contrast arguments for cognitive phenomenology appeal to two contrasting phenomenal states, s1 and s2, where there appears to be a difference in the phenomenal character of s1 and s2, and where this difference is best explained as a difference in cognitive phenomenology. Contrast arguments can be used when arguing for attitudinal cognitive phenomenology, content cognitive phenomenology, and pure or impure cognitive phenomenology. The expert/novice argument introduced earlier in this article can be seen as a contrast argument.

When contrast arguments are used as arguments for attitudinal cognitive phenomenology one typically appeals to cases where there is a slight change in one’s attitude towards a content. An example is the change of attitude one experiences when one suddenly grasps a mathematical proof. There is something that it is like to grasp a mathematical proof, and the state one is in when one suddenly grasps it differs from the state one was in before grasping it.

When contrast arguments are used as arguments for content phenomenology one typically appeals to a pair of situations where one is attending to the meaning of an ambiguous utterance in natural language, and where there appears to be a phenomenal difference in the states one is in depending on which proposition one takes the utterance to express (Horgan & Tienson 2002).

Contrast arguments can be more or less convincing, depending on how easy it is to give an alternative explanation of the contrast, and on whether the claim that there is a contrast is convincing.

‘The foreign language argument’, due to Galen Strawson (1994), is perhaps the most famous contrast argument for cognitive phenomenology: Jack is a native English speaker who does not understand French, while Jacques is a native French speaker. Both Jack and Jacques hear the same instance of the utterance ‘La vie est belle’. There is something that it is like for both Jack and Jacques to hear the utterance, though what it is like for Jacques differs from what it is like for Jack. So, Jack and Jacques are put in different phenomenal states. The difference in the phenomenal character of their states can be explained by the fact that Jacques, unlike Jack, understands what is being said. Jacques, unlike Jack, has an attitude of understanding towards the content, and he is able to consciously entertain the content that is being expressed. In the case of Jacques, unlike Jack, cognitive states of understanding and entertaining a content put him in a phenomenal state, and this explains why the phenomenal state he is in differs from the phenomenal state Jack is in. In order to make the foreign language argument into an argument for cognitive phenomenology one needs to add that the phenomenal difference between Jack’s and Jacques’ states is a difference in cognitive phenomenology.

However, in this case, at least some of the differences between the two phenomenal states involve differences in sensory phenomenology. From phonetic studies, we know that a sentence expressed in a language sounds different for a person who understands that language, compared to what it sounds like for a person who does not understand the language (Pinker 1995). This difference is at least partly auditory. The person who understands the language attends differently to the phonemes and prosody of the utterance compared with the person who does not understand the language. A sceptic about cognitive phenomenology may therefore agree that there is a phenomenal difference between the states that Jack and Jacques are in, but claim that the difference is a difference in purely sensory phenomenology (Lormand 1996). The proponent of cognitive phenomenology may insist that though the phenomenal states of Jack and Jacques also differ in sensory phenomenology, the differences in sensory phenomenology do not sufficiently explain the whole phenomenal difference.

A different type of contrast argument that appeals to ambiguous utterances in a familiar language has been proposed by, among others, Kriegel (2011), Horgan (2011), Horgan & Tienson (2002) and Siewert (1998). For example, there is something that it is like to hear the ambiguous utterance ‘I am going to the bank’ and understand it as being about the financial institution, and this differs from what it is like to hear the same instance of the utterance and understand it as being about the river bank. One is in different phenomenal states depending on which proposition one consciously entertains. Arguably, given that one accepts that there is a phenomenal difference between these states, this difference is best explained as a difference in cognitive phenomenology.

In this case, the argument appeals to the same instance of an utterance in a language that one does understand. A sceptic who agrees that there is a phenomenal difference between the two states may claim that the different understandings cause one to be in different sensory states, and that the phenomenal difference is due to this. However, it is harder than in the foreign language argument to see what the candidates for such states would be. Certainly, hearing the utterance and understanding it as ‘I am going to the financial institution’ may cause some emotional responses in someone who has financial problems, but it need not have such an effect. One need not respond emotionally to either of the two understandings of the utterance. Also, one may, but need not, visualize the financial institution or the river bank when hearing the utterance. Arguably, one’s sensory states can remain the same, regardless of which of the two understandings one consciously entertains, and still there is a phenomenal difference. Therefore, if there is a phenomenal difference in this contrast case, the most plausible candidate for explaining the difference is that there is a difference in cognitive phenomenology. One is put in different phenomenal states, and no wholly sensory states suffice to put one in these phenomenal states. Contrast arguments involving ambiguous utterances of this type have the virtue that if there is a phenomenal contrast in these cases, this contrast is difficult to explain away as a contrast in sensory phenomenology. One way of responding to such contrast arguments is to deny that there is a phenomenal contrast, that is, to deny that one is in different phenomenal states in such cases.

c. The Self-Knowledge Argument

The self-knowledge argument, originally presented by David Pitt (2004), is very complex, and this article presents only a rough version of it.

The argument from self-knowledge differs from the types of arguments introduced above in that it explicitly supports a strong cognitive phenomenology claim: the claim that there is a proprietary, distinctive and individuative cognitive phenomenology. According to the argument, we can have immediate knowledge of the content of our own conscious thoughts, and the only way we can explain how such knowledge is possible is by assuming that there is a proprietary, distinctive and individuative cognitive phenomenology of thought. From this it follows that one is able to consciously do three distinct things: a) to distinguish one’s occurrent conscious thoughts from one’s other occurrent conscious mental states (cognitive phenomenology is proprietary); b) to distinguish one’s occurrent conscious thoughts from each other (cognitive phenomenology is distinctive); c) to identify each of one’s occurrent conscious thoughts as the thought it is (cognitive phenomenology is individuative).

According to the self-knowledge argument (Pitt 2004):

P1: It is possible immediately to identify one’s occurrent conscious thoughts: one can know by acquaintance (via introspection) which thought a particular occurrent thought is; but

P2: It would not be possible immediately to identify one’s conscious thought unless each type of conscious thought had a proprietary, distinctive, individuative phenomenology, so

C: Each type of conscious thought—each state of consciously thinking that p, for all thinkable contents p—has a proprietary, distinctive, individuative phenomenology.
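
The logical form of the argument can be sketched as follows. This is only a simplified propositional reading: the letters I and Ph are labels introduced here for illustration and are not Pitt’s notation, and the sketch sets aside the modal and quantificational detail of the premises.

```latex
% I  : it is possible immediately to identify one's occurrent conscious thoughts
% Ph : each type of conscious thought has a proprietary, distinctive,
%      individuative phenomenology
% (both labels are assumptions made for this sketch)
\begin{align*}
\text{P1:}\quad & I \\
\text{P2:}\quad & \neg Ph \rightarrow \neg I \\
\text{C:}\quad  & Ph \qquad \text{(from P1 and P2 by modus tollens)}
\end{align*}
```

On this reading, P2 says that without the relevant phenomenology immediate identification would be impossible, so anyone who grants P1 is committed to C. A critic must therefore reject one of the premises.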

The argument is valid. Before questioning the premises, we should say something about what it is that motivates them.

Intuitively, one does know the content of one’s conscious thoughts, and one has a privileged introspective access to one’s own thoughts that other people lack. I know when I am thinking ‘that pizza is good’, and I know that the mental state I am in is a thought and not a perceptual state. So, I am able to identify my thought as a thought, and I am able to identify the content of my thought and distinguish it from other thoughts.

However, according to the premises of the argument, it is possible to ‘immediately’ identify one’s occurrent conscious thoughts (P1). This premise relies on a particular and controversial view of the introspection of phenomenal states, the acquaintance theory. On this view, introspection makes one directly or immediately aware of one’s phenomenal states and their contents. No inferences are made and no causal processes are involved. If one holds a different view of introspection, one can simply deny P1 and thereby reject the self-knowledge argument. In his article, Pitt strongly defends the acquaintance theory of introspection. For further reading consult Pitt 2004 and Pitt 2011.

d. An Argument for Pure Cognitive Phenomenology

Contrast arguments and arguments from examples are generally neutral when it comes to whether they are arguments for pure or impure cognitive phenomenology.

However, Kriegel’s cognitive zombie argument is an argument for pure cognitive phenomenology (see Kriegel 2015b and Chudnoff 2015b). A philosophical zombie is a being that acts and talks like a phenomenally conscious being, but who completely lacks phenomenal states. In other words, there is nothing that it is like to be a zombie (see Chalmers 1996).

Imagine a partial zombie, Zoe, who is an expert mathematician. Zoe is also a sensory zombie, in that there is nothing that it is like for her to have sensory experiences. Still, there is something that it is like for her to gain new mathematical insights. Since Zoe is a sensory zombie the phenomenal states she is in when gaining new mathematical insights are purely cognitive phenomenal states.

A sceptic may respond to this thought experiment by claiming that since Zoe is a sensory zombie, there is nothing that it is like for her to gain these insights. One may insist that cognitive states do not suffice to put one in the phenomenal states that one is normally put in when one grasps something or gains a new insight. The sceptic can either claim that the phenomenology involved in being in such phenomenal states is purely sensory, or hold that it is impurely cognitive.

In order to strengthen the appeal of this thought experiment, one can turn it into a contrast argument. Imagine that Zoe turns into a full zombie. As a full zombie, there is nothing that it is like for her to gain mathematical insights. Intuitively, there is a phenomenal contrast between the states of sensory zombie Zoe, and the states of full zombie Zoe. While there is something that it is like for the sensory zombie Zoe to gain mathematical insights, there is nothing that it is like for the full zombie Zoe to do so. If we share the intuition that there is such a contrast between the two zombies, we should also accept that pure cognitive phenomenology is possible.

Interestingly, the cognitive zombie argument appears more challenging for proponents of impure cognitive phenomenology who deny that there is pure cognitive phenomenology than for a sceptic who denies that there is cognitive phenomenology. Sensory states within different sensory modalities can put one in certain phenomenal states. We can imagine a zombie who lacks sensory phenomenology in all sensory modalities apart from audition. Intuitively, since she has auditory phenomenal states there is something that it is like for her to watch a movie, though her experience is clearly not as rich as that of an ordinary person. Similarly, even if it is normally the case that the phenomenal state one is in when grasping a mathematical proof is a phenomenal state that both sensory and cognitive states put one in, still there is something that it is like for Zoe the sensory zombie to grasp mathematical proofs, though Zoe’s phenomenal states may not be as rich as those of a normal person. (For further reading, consult Kriegel 2015b and Chudnoff 2015b.)

e. Individual Differences

Philosophers of mind generally agree that conscious sensory states have phenomenal characters. We come to know what it is like to be in a certain conscious sensory state simply by being in that state. But, when it comes to irreducible cognitive phenomenology, philosophers strongly disagree about whether it exists or not. Why do they disagree?

Perhaps philosophers disagree so strongly because people simply differ: some people have cognitive phenomenal states, while others do not (see Schwitzgebel 2008). If this is the case, it can explain why highly competent philosophers on both sides of the debate come to different conclusions when introspecting their own conscious states. However, most philosophers seem to dismiss this possibility. What are the reasons for thinking that people differ so greatly in their phenomenal states? Why are there no similar controversies when it comes to disputes about sensory phenomenology?

4. Implications of the Cognitive Phenomenology Debate

What are the implications of the cognitive phenomenology debate? Why should we care about cognitive phenomenology?

One issue that arises from the cognitive phenomenology debate concerns the trustworthiness of introspection. If there is a cognitive phenomenology, then the opponents have overlooked a range of phenomenal states that they enjoy. On the other hand, if there is no cognitive phenomenology, the proponents have been positing a range of phenomenal states that they do not enjoy (Bayne & Montague 2011). Such considerations may lead us to question the reliability of introspection (Schwitzgebel 2008).

The cognitive phenomenology debate also has implications for the general debate about consciousness, since there are certain theories of consciousness that are at odds with the existence of cognitive phenomenology. Examples are accounts that identify phenomenal states with intentional states with non-conceptual contents (see Tye 1995). Such views are not compatible with thoughts having a distinctive phenomenal character, since the content of a thought is conceptual.

Further, the cognitive phenomenology debate has implications for our view on the relationship between phenomenology and intentionality. Proponents of phenomenal intentionalism take phenomenology to be the source of intentionality (Kriegel 2013, Mendelovici 2018). Most proponents of phenomenal intentionalism hold that there is a cognitive phenomenology. If phenomenology is the source of intentionality, cognitive phenomenology is the source of the intentionality of cognitive states. If there is no cognitive phenomenology, the proponents of phenomenal intentionalism need to tell a different story of how phenomenology can be the source of the intentionality of cognitive states.

The cognitive phenomenology debate also has implications for the debate about whether consciousness can be naturalized. If only sensory states are phenomenal states, naturalizing cognition is part of what Chalmers (1996) labels ‘the easy problem of consciousness’, while naturalizing conscious sensory states is part of ‘the hard problem of consciousness’. The easy problems of consciousness are those that can be solved (in the future) by using the standard methods of cognitive science, whereas the hard problem is that of explaining phenomenal consciousness (see “The Hard Problem of Consciousness”). If there is a cognitive phenomenology, the hard problem of consciousness becomes more expansive as it will include both sensory and cognitive phenomenal states. Arguably, therefore, if there is a cognitive phenomenology, naturalizing consciousness becomes harder. However, the hard problem remains ‘hard’ whether we accept that there is a cognitive phenomenology or not. If arguments convince us that there is a cognitive phenomenology, we should accept these independently of the fact that it has the consequence of expanding the hard problem.

5. References and Further Reading

Bayne, T and Chalmers, D. J. 2003. “What is the Unity of Consciousness?” In Cleeremans, A (ed.) The Unity of Consciousness. Oxford University Press.
Bayne, T. 2009. “Perception and the Reach of Phenomenal Content.” Philosophical Quarterly 59 (235): 385-404.
Bayne, T and Montague, M. 2011. “Cognitive Phenomenology: An Introduction”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Carruthers, P and Veillet, B. 2011. “The Case against Cognitive Phenomenology”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Chalmers, D. 1996. The Conscious Mind. Oxford University Press.
Chudnoff, E. 2015a. Cognitive Phenomenology. Routledge.
Chudnoff, E. 2015b. “Phenomenal Contrast Arguments for Cognitive Phenomenology.” Philosophy and Phenomenological Research 90 (2): 82-104.
Dretske, F. 1995. Naturalizing the Mind. MIT Press.
Goldman, A. 1993. “Consciousness, Folk Psychology, and Cognitive Science.” Consciousness and Cognition 2 (4):364-382.
Horgan, T. 2011. “From agentive phenomenology to Cognitive Phenomenology: A guide for the perplexed”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Horgan, T and Graham, G. 2012. “Phenomenal Intentionality and Content determinacy”. In Richard Schantz (ed.) Prospects of Meaning. De Gruyter.
Horgan, T and Tienson, J L. 2002. “The Intentionality of Phenomenology and the Phenomenology of Intentionality”. In Chalmers, D (ed.) Philosophy of Mind: Classical and Contemporary readings. Oxford University Press.
Kriegel, U. 2011. The Sources of Intentionality. Oxford University Press.
Kriegel, U. 2013. “The Phenomenal Intentionality Research Program”. In Kriegel, U (ed.) Phenomenal Intentionality. Oxford University Press.
Kriegel, U. 2015a. “The Character of Cognitive Phenomenology”. In Breyer, T and Gutland, C (eds.) Phenomenology of Thinking. Routledge.
Kriegel, U. 2015b. The Varieties of Consciousness. Oxford University Press.
Levine, J. 2011. “On the Phenomenology of Thoughts”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Lormand, E. 1996. “Nonphenomenal Consciousness.” Noûs 30 (2): 242-261.
Mendelovici, A. 2018. The Phenomenal Basis of Intentionality. Oxford University Press.
Montague, M. 2017. “Perception and Cognitive Phenomenology” Philosophical Studies 174: 2045-2062.
Nes, A. 2011. “Thematic Unity in the Phenomenology of Thinking.” Philosophical Quarterly 62: 84-105.
Pitt, D. 2004. “The Phenomenology of Cognition, or What it is Like to Think That P?” Philosophy and Phenomenological Research 69(1): 1-36.
Pitt, D. 2011. “Introspection, Phenomenality, and the Availability of Intentional Content”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Prinz, J. 2011. “The Sensory Basis of Cognitive Phenomenology”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Schwitzgebel, E. 2008. “The unreliability of naïve introspection” The Philosophical Review 117 (2): 245-273.
Siegel, S. 2010. The Contents of Visual Experience. Oxford University Press.
Siewert, C. 1998. The Significance of Consciousness. Princeton University Press.
Siewert, C. 2011. “Phenomenal Thought”. In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Smithies, D. 2013a. “The Significance of Cognitive Phenomenology” Philosophy Compass 8(8): 731-743.
Smithies, D. 2013b. “The Nature of Cognitive Phenomenology” Philosophy Compass 8(8): 744-754.
Spener, M. 2011. “Disagreement about Cognitive Phenomenology.” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Strawson, G. 1994. Mental Reality. MIT Press.
Strawson, G. 2011. “Cognitive Phenomenology: Real life” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.
Tye, M. 1995. Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. MIT Press.
Tye, M and Wright, B. 2011. “Is there a Phenomenology of Thought?” In Bayne, T and Montague, M (eds.) Cognitive Phenomenology. Oxford University Press.

Author Information

Mette Kristine Hansen
Email: Mette.Hansen@uib.no
University of Bergen
Norway

Sigmund Freud: Religion

This article explores attempts by Sigmund Freud (1856-1939) to provide a naturalistic account of religion enhanced by insights and theoretical constructs derived from the discipline of psychoanalysis which he had pioneered. Freud was an Austrian neurologist and psychologist who is widely regarded as the father of psychoanalysis, which is both a psychological theory and a therapeutic system. As a theory, psychoanalysis conceptualizes the mind as a system composed of three constituent elements: id, ego, and superego. It focuses on the interaction between those elements, and includes such key concepts as infantile sexuality, repression, latency and transference. Psychoanalytic therapy is an application of this conceptual schema, in which the interaction of the mind’s conscious and unconscious elements in individual cases is explored using the techniques of dream interpretation, free association and the analysis of resistance to identify repressed conflicts and bring them into the conscious mind.

Freud’s thought on religion is, perhaps fittingly, rather complex and ambivalent: while there can be little doubt as to its roundly skeptical, and at times hostile, character, it is nonetheless clear that he had a firm grounding in Jewish religious thought and that the religious impulse held a life-long fascination for him. This article charts the evolution of his views on religion from Totem and Taboo (1913), through The Future of an Illusion (1927) and Civilization and its Discontents (1930) to Moses and Monotheism (1939), focusing in particular on the parallels drawn by him between religious belief and neurosis, and on his account of the role which the father complex plays in the genesis of religious belief. The article concludes with a review of some of the main critical responses which the Freudian account has elicited.

Psychoanalysis and Religion
Freud’s Jewish Heritage
Philosophical Connections
The Orientation of Freud’s Approach to Religion
Totemism and the Father Complex
Religion and Civilization
The Moses Narrative: The Origins of Judaic Monotheism
Critical Responses
References and Further Reading
1. References
2. Further Reading

1. Psychoanalysis and Religion

At the heart of Freud’s psychoanalysis is his theory of infantile sexuality, which represents individual psychological human development as a progression through a number of stages in which the libidinal drives are directed towards particular pleasure-release loci, from the oral to the anal to the phallic and, after a latency period, in maturity to the genital. He thus saw the psychosexual development of every individual as consisting essentially of a movement through a series of conflicts which are resolved by the internalization, through the operation of the superego, of control mechanisms derived originally from an authoritative, usually parental, source. In infancy, such a progression entails a process whereby parental control involves the introduction to the child of behavioral prohibitions and limitations and necessitates the repression, displacement or sublimation of the libidinal drives.

Central to this account is the idea that neuroses, which may include the formation of psychosomatic symptoms in the individual, arise essentially either out of external trauma or through a failure to effect a resolution of the internal conflict between libidinal urges and the key psychological control mechanisms. Symptomatically, these often present as compulsive and debilitating patterns of behavior (as in hysteria, repetitive ceremonial movements or an obsession with personal hygiene) which make a normal healthy life impossible, requiring psychotherapeutic intervention in the form of such techniques as dream analysis and free association. Of particular importance, he held, is the resolution of the Oedipus complex, which arises at the phallic stage, in which the male child forms a sexual attachment with the mother and comes to view the father as a hated and feared sexual rival. That resolution, which Freud saw as essential to the formation of sexuality, entails the repression of the drive away from the mother as libidinal object and the male child’s identification with the father. The cluster of associations relating to the multifaceted relationship between son and father Freud termed “the father complex” (1957, 144) and, as we shall see, viewed as central to a correct understanding both of the developmental psychology of human beings and of many of the most important social phenomena in human life, including religious belief and practice.

In his account of religion Freud deployed what Paul Ricoeur (1913—2005) terms a hermeneutic “of suspicion” (Ricoeur 1970, 32), a reductive and demystifying style of interpretation that repudiated what he saw as a masquerade of conventional meanings operating at the level of common discourse in favor of deeper, less conventional truths relating to human psychology. He sought to demonstrate by this means the true origins and significance of religion in human life, in effect utilizing the techniques of psychotherapy to achieve that goal. Freud’s general position on religion stands firmly in the naturalistic tradition of projectionism stretching from Xenophanes (c.570—c.475 B.C.E.) and Lucretius (c.99—c.55 B.C.E.) through Thomas Hobbes (1588—1679) and David Hume (1711—76) to Ludwig Feuerbach (1804—1872) in holding that the concept of God is essentially the product of an unconscious anthropomorphic construct, which Freud saw as a function of the underlying father complex operating in social groups. “The psycho-analysis of individual human beings,” he thus stated boldly in Totem and Taboo, “teaches us with quite special insistence that the god of each of them is formed in the likeness of his father, that his personal relation to God depends on his relation to his father in the flesh and oscillates and changes along with that relation, and that at bottom God is nothing other than an exalted father” (Freud 2001, 171).

The following sections examine the considerations which led him to this view, the manner in which it found articulation in his writings on religion, and the main criticisms which it has encountered.

2. Freud’s Jewish Heritage

Freud was born to Jewish parents in the town of Freiberg, then in the Austro-Hungarian Empire. His father Jacob was a businessman descended from a long line of rabbinical scholars; a textile merchant, he went bankrupt when Sigmund was four years of age and the family were forced to move to Vienna, where they lived in genteel poverty for many years, dependent in part upon the generosity of relatives. The young Sigmund found it difficult to come to terms with the new urban surroundings and the family’s reduced financial circumstances. Experience of the latter left him with a life-long fear of poverty; his overweening ambition to establish psychoanalysis as a new science and successful treatment for hysteria was, as a result, partially motivated by the desire to achieve financial security for his family.

In the preface to the Hebrew edition of Totem and Taboo, published in 1930, Freud described himself as being “in his essential nature a Jew and who has no desire to alter that nature,” but one who is “completely estranged from the religion of his fathers—as well as from every other religion” (Freud 2001 Preface, xiii). This phrasing marks Freud’s recognition that, notwithstanding his skepticism regarding religion, his character had largely been formed by a Judaic cultural heritage passed on to him by his father Jacob, with whom he had a rather fraught relationship. Freud’s ancestors were affiliates of Hasidic Judaism going back many generations, and included several rabbis and distinguished scholars among their number (Berke 2015, xii). While Jacob was liberal and progressive in his outlook, he retained a deep reverence for the Talmud and the Torah and had overseen Sigmund’s childhood study of the Philippson family Bible, which generated in the young Sigmund a life-long fascination with the story of Moses and his connection with Egypt. He also ensured that the boy had a traditional Jewish schooling in which he was steeped in Biblical studies in the original Hebrew. In that connection the young Freud developed a deep admiration for, and friendship with, one of his religion teachers, Rabbi Samuel Hammerschlag, who was a strong proponent of humanistic Reform Judaism. Such was his admiration for his teacher that Freud ultimately named his fifth and sixth children, Sophie and Anna, after Hammerschlag’s niece and daughter; commentators now generally agree that the patient referred to as ‘Irma’ in Freud’s pivotal The Interpretation of Dreams was in fact Anna Hammerschlag. It was Rabbi Hammerschlag’s deep humanism, more than any other feature of his character, which Freud found inspiring, inculcating in him a lasting commitment to the universality of Enlightenment values. 
It is notable that, in seeking to pay Hammerschlag the highest compliment possible in the obituary which he wrote for him in 1904, Freud compared him to the Hebrew prophets, but also highlighted the extent to which that aspect of his character was integrated with humanistic ideals: “A part from the same fire which animated the great Jewish seers and prophets burned in him … but the passionate side of his nature was happily tempered by the ideal of humanism of our classical German period, which governed him and his method of education” (Freud 1976 IX, 256).

Notwithstanding the positive impact of such religious influences, from adolescence onwards Freud apparently found the observances and strictures required by orthodox Jewish belief increasingly burdensome and he became overtly hostile to the religion of his forefathers and to religion in general (Goodnick 1992, 352); it is likely that this was the principal cause of the estrangement between Sigmund and his father Jacob. That the estrangement ran deep and was a source of distress to Jacob became evident on the occasion of his son’s 35th birthday, when, in a gesture conforming with an established Jewish custom, he presented Sigmund with the family Bible which he had studied so closely as a child, newly rebound in leather. This was accompanied by a richly lyrical dedication in Hebrew, written in the style of melitzah, a literary tradition of Biblical allusion (Alter 1988, 23), referencing the relationship between them and their shared Jewish heritage. In part, the verse ran:

Son who is dear to me, Shelomoh. In the seventh in the days of the years of your life the Spirit of the Lord began to move you and spoke within you: Go, read in my Book that I have written and there will burst open for you the wellsprings of understanding, knowledge, and wisdom… For the day on which your years were filled to five and thirty I have put upon it a cover of new skin and have called it: “Spring up, O well, sing ye unto it!” And I have presented it to you as a memorial and as a reminder of love from your father, who loves you with everlasting love. (trans. and cited by Yerushalmi 1993, 71)

This attempt at effecting a rapprochement, which gently sought to remind Freud of his father’s love for him and of their shared religious and cultural heritage—implying, as one commentator puts it, “that their Bible embodies both the Jewish tradition and this love” (Gresser 1994, 31)—appeared initially not to have been successful. Freud never mentioned his father’s birthday dedication in his writings, though it was found after his death perfectly preserved in the Philippson Bible with which he had been presented, and his reductive critique of institutional religion became instead ever more sustained and pointed. Yet, at the deepest level, an ambivalence remained; as Freud acknowledged in his Autobiographical Study, “My deep engrossment in the Bible story (almost as soon as I had learnt the art of reading) had, as I recognised much later, an enduring effect upon the direction of my interest” (Freud 1959, XX 8).

The death of Jacob on 23rd October 1896 was one of the most important events in Sigmund Freud’s life and precipitated a lengthy period of reflective contemplation on their relationship. As he confessed later that year in a letter to his friend Wilhelm Fliess, “… the old man’s death has affected me deeply. I valued him highly, understood him very well, and with his peculiar mixture of deep wisdom and fantastic light-heartedness he had a significant effect on my life… in my inner self the whole past has been awakened by this event. I now feel quite uprooted” (Freud 1986, 202). The importance of the event cannot be overestimated; Jacob’s death triggered a period of sustained self-analysis in which Freud had what he considered an epiphany: the hostility which he had often felt towards his father, which had at one point made him suspect that Jacob had been guilty of sexually abusing him, was due to the fact that as a child he saw Jacob as a rival for his mother’s love. Thus was born the idea of the Oedipus complex to which we have referred above, which, universalized by Freud, became one of the cornerstones of psychoanalytic theory. In his 1908 preface to the second edition of The Interpretation of Dreams, the work which made his reputation globally and brought him the financial security which he had craved, Freud made clear the extent to which his articulation of the new science owed to his analytical resolution of the crisis generated by Jacob’s death: “It was a portion of my own self-analysis, my reaction to my father’s death—that is to say, to the most important event, the most poignant loss of a man’s life” (Freud 2010, xxvi). Still awaiting resolution at that point, however, was the conflict generated in Freud’s life by the demand to find a means of affirming the richness and particularity of his Jewish cultural heritage, as his father had urged in his dedication, without acceding to the Biblical and theological orthodoxies associated with it.
A number of scholars (Rice, 1990; Gresser, 1994) have suggested that this problem is one of the keys to an understanding of his final work, Moses and Monotheism.

3. Philosophical Connections

Two of the major formative influences upon Freud were those of the philosophers/psychologists Franz Brentano (1838—1917) and Theodor Lipps (1851—1914). Brentano was author of the seminal Psychology From an Empirical Standpoint (1973, orig. 1874); Freud took two philosophy courses under his direction when he first enrolled at the University of Vienna, as part of which he encountered Feuerbach’s writings on religion. Freud was captivated by the scope and clarity of Brentano’s lectures and found the latter’s emphasis on the need for empirical methods in psychology and for philosophy to be informed by logical rigour and scientific findings highly congenial. Less congenial to him, perhaps, were Brentano’s rational theism and his dismissal of the notion of unconscious mental states; these were two key issues on which Freud was subsequently to diverge sharply from him.

Freud—like other gifted students of Brentano such as Edmund Husserl (1859—1938) and Alexius Meinong (1853—1920)—was enthralled by him as a teacher and scholar, describing him in correspondence as “a darned clever fellow, a genius” (in Boehlich (ed.) 1992, 95). Such was the impact of Brentano’s influence that, at one stage, Freud resolved to take his doctorate in philosophy and zoology, a proposal towards which Brentano was favourably disposed but which faculty regulations at the University prevented from being realised.

In seeking to modernise psychology, Brentano had returned to the Aristotelian definition of the subject, understanding it as “the science which studies the properties and laws of the soul, which we discover within ourselves directly by means of inner perception, and which we infer, by analogy, to exist in others” (Brentano 1973, 5). In that connection, he revitalised the famous principle of intentionality from scholasticism as the defining criterion of mental phenomena and processes: unlike the physical counterparts from which they must be distinguished, mental or psychical phenomena, he argued, are necessarily directed towards intentional objects. Further, since such phenomena are accessible to us directly by means of “inner perception,” their existence and nature come, he argued, guaranteed with an epistemic certainty and transparency that is markedly lacking in relation to our perception of physical phenomena, where, for example, we sometimes misapprehend such subjective characteristics as colour and taste as objective properties of things.

Given this distinction between the physical and the mental, Brentano considered that one of the key problems for an empirical psychology was that of constructing an adequate picture of the internal dynamics of the mind from an analysis of the complex interplay between diverse mental phenomena, on the one hand, and the interactions between the mind and the external world, on the other. This conception was to have a profound influence upon the development of Freudian psychoanalysis, into which it was to become prominently incorporated. However, Brentano set his face implacably against admitting the notion of unconscious mental states and processes into a fully scientific psychology. In this he was in part motivated by his conviction that all mental states are known directly in introspection or “inner perception” and are thus, by definition, conscious; mental acts, he considered, are pellucid in the sense that they take themselves as secondary objects and so are consciously apprehended as they occur. Further, the positing of the existence of unconscious mental states also seemed to him to introduce uncertainty and vagueness into the field of psychology and to carry with it an implication of the impossibility of the very rigorous, empirically-based science of mind which he sought to establish.

While Freud adopted Brentano’s characterisation of the intentional nature of mental phenomena throughout his work, he did not, of course, accept that all such phenomena are conscious, and indeed extended the very notion of intentionality, in the guise of symbolic meaning, to the level of the unconscious. For the primary focus of Freud’s interest was medical and his therapeutic practice was, from the outset, predicated upon the assumption of a level of scientific understanding of aberrant behaviour and abnormal mental states. And it seemed evident to him from an early stage that the restriction of psychology to the level of conscious processes and events had made, and would continue to make, such a goal unattainable, and that it was precisely because traditional psychology had operated with that restriction that it found such occurrences problematic and inexplicable. Thus, while both Brentano and Freud were motivated by the desire to create a fully scientific science of mind, they reached diametrically opposed positions on the question of the inclusion of the unconscious in its terms of reference. In contrast with Brentano’s belief that the very notion of the unconscious lacks intellectual validity, Freud was convinced that a scientific approach to the area of the mental requires the concept of the unconscious as a critical presupposition.

Freud found strong support for this conviction in Theodor Lipps, a thinker who was as committed as Brentano to the ideal of an empirically grounded psychology governed by an experimental methodology, but who, unlike Brentano, considered that this necessitated, at a fundamental level, reference to the unconscious. Lipps’ account of the nature of the unconscious was of particular importance to the development of Freud’s thought for two reasons: In the first instance, when Freud encountered Lipps’ view that consciousness is an “organ” which mediates the inner reality of unconscious mental processes, he found in it a theory which was almost identical to one at which he had independently arrived. Secondly, in his account of humor—which also anticipated much of Freud’s later work on that subject—Lipps had extended the notion of aesthetic empathy (Einfühlung; “in-feeling” or “feeling-into”) from Robert Vischer (1847—1933) into the psychological realm to designate the process that allows us to comprehend and respond to the mental lives of others by putting ourselves in their place, which involved the key notion that meaningful interaction between humans necessitates the projection of mental states and occurrences from the self to others.

Freud adopted and integrated Lipps’ account of projection centrally in his psychoanalytic theory, regarding it as a precondition for establishing the relationship between patient and analyst which alone makes the interpretation of unconscious processes possible. But perhaps of even greater consequence in connection with the analysis of religion is the fact that concomitant to the idea of psychological projection is the notion that the human need to ascribe psychological states to others can and does readily lead to situations in which such ascriptions are extended beyond their legitimate boundaries in the human realm. As David Hume had observed, “There is an universal tendency among mankind to conceive all beings like themselves, and to transfer to every object those qualities with which they are familiarly acquainted, and of which they are intimately conscious” (Hume 1956, Section III). It is in that way that personifications or anthropomorphisms arise: human beings, particularly at the early stage of their development, have an innate tendency to go beyond the legitimate boundaries of application of the psychological concept-range and thus to misapply human-being concepts. A child relates to its environment at large most readily through such a process: in the narratives provided by storybooks, school text-books and film and televisual animation, the child’s interest, attention, and above all, its understanding, are engaged through the attribution of anthropomorphic qualities to non-human objects and organisms: bees worry, trees are sad, ants are curious, and so on.

In his Essence of Christianity (1841; English trans. 1881), Ludwig Feuerbach had offered a sustained critique of religion predicated upon the notion that the very idea of God is such an anthropomorphic construct, with no reality beyond the human mind, and that specific characteristics attributed to God in religion (Love, Benevolence, Power, Knowledge, and so forth) embody an idealized conception of human nature and of the values esteemed by human beings. This projectionist view, which he first encountered under Brentano’s—no doubt, critical—tutelage, was one which Freud came to accept implicitly and indeed to extend, holding that the insights offered by psychoanalysis into the workings of the human mind can explain just why and how religious anthropomorphisms arise. Freud accordingly integrated his account of religion into the broader project of psychoanalysis, suggesting that “a large portion of the mythological conception of the world which reaches far into the most modern religions is nothing but psychology projected into the outer world… We venture to explain in this way the myths of paradise and the fall of man, of God, of good and evil, of immortality and the like—that is, to transform metaphysics into meta-psychology” (Freud 1914, 309. Italics in original).

4. The Orientation of Freud’s Approach to Religion

In articulating this project, Freud drew deeply upon a wide variety of anthropological sources, particularly the work of such contemporary luminaries as John Ferguson McLennan (1827—1881), Edward Burnett Tylor (1832—1917), John Lubbock (1834—1913), Andrew Lang (1844—1912), James George Frazer (1854—1941) and Robert Ranulph Marett (1866—1943) on the connection between social structures and primitive religions. Freud’s claim to originality in this context resides in his attempt to situate projectionism within the framework of psychoanalysis, ultimately interpreting the social origins and cultural significance of the religious impulse in terms paralleling his account of the father-son relationship in individual psychology.

The evolutionist paradigm, which projected a universal linear cultural development from the primitive to the civilized, with the differences found in human societies reflecting stages in that development, came to function as a background assumption in Freud’s thought from an early stage. Tylor, whose Primitive Culture (1871) and Anthropology (1881) are generally regarded as foundational to the then emergent science of cultural anthropology, held that, in terms of human interaction with the world at large, civilization progresses through three developmental “stages,” from magic through religion to science, with contemporary Western culture representative of the final stage. This view was rearticulated by Frazer in his famous Golden Bough and referenced approvingly by Freud (2001, 90), though he emphasized that elements of the first two stages continue to operate in contemporary life. Accordingly, Freud gradually adopted the position of one who seeks to explicate the significance of religion in the context of a cultural milieu in which, having supplanted attempts to control the world through sympathetic magic, it has itself been superseded by science. Furthermore, Freud found in Tylor’s and Frazer’s evolutionist account of cultural progress an implication which had been affirmed explicitly by Feuerbach: “Religion is the childlike condition of humanity” (Feuerbach 1881, 13); it belongs to a social developmental stage paralleling that of the individual, through which each civilization must pass en route to the maturity of scientific understanding. It was perhaps this latter, more than any other factor, which was to suggest to Freud that the psychoanalytical techniques which he pioneered in his account of individual psychology could be applied socially, to explain the nature of the religious impulse in human life generally.

5. Totemism and the Father Complex

Some of Freud’s earliest comments on religion give immediate evidence of the psychologically reductionist direction which his thought was to take, which represented the dynamic underpinning religion as deriving from the powerfully ambivalent relationship between the child and his apparently omnipotent father. For example, in his 1907 paper “Obsessive Actions and Religious Practices” he drew attention to similarities between neurotic behavior and religious rituals, suggesting that the formation of a religion has, as its “pathological counterpart,” obsessional neurosis, such that it might be appropriate to describe neurosis “as an individual religiosity and religion as a universal obsessional neurosis” (Freud 1976 S.E. IX, 125-6), a view which he was to retain for the remainder of his life.

Freud’s first sustained treatment of religion in these terms occurs in his 1913 Totem and Taboo, in the context of his account, heavily influenced in particular by the work of James George Frazer, Andrew Lang and J.J. Atkinson, of the relationship between totemism and the incest prohibition in primitive social groupings. The prominence and strength of the incest taboo was of considerable interest to him as a psychologist, not least because he saw it as one of the keys to an understanding of human culture and as deeply linked to the concepts of infantile sexuality, Oedipal desire, repression and sublimation which play such a key role in psychoanalytic theory. In tribal groups the incest taboo was usually associated with the totem animal with which the group identified and after which it was named. This identification led to a ban on killing the totem animal or consuming its flesh and to other restrictions on the range of permissible behaviors; in particular, it led to the practice of exogamy, the prohibition of sexual relations between members of the totem group.

Such prohibitions, Freud believed, are extremely important as they constitute the origins of human morality, and he offered a reconstruction of the genesis of totem religions in human culture in terms which are at once forensically psychoanalytical and rather egregiously speculative. The primal social state of our pre-human ancestors, he argued, closely following J.J. Atkinson’s account in his Primal Law, was that of a patriarchal “horde” in which a single male jealously maintained sexual hegemony over all of the females in the group, prohibiting his sons and other male rivals from engaging in sexual congress with them. In this account, the psycho-sexual dynamic operating within the group led to the violent rebellion of the sons, their murder of the father and their consumption of his flesh (Atkinson 1903, chapters I-III; Freud 2001, 164). However, the sons’ subsequent recognition that no one of them had the power to take the place of the father led them to create a sacred totem with which to identify him and to reinstate the practice of the exogamy which the parricide was designed to abolish: the creation of the totem yielded a totem clan within which sexual congress between members was forbidden. The identification of the totem animal with the father arose out of a displacement of the deep sense of guilt generated by the murder, while simultaneously being an attempt at reconciliation and a retrospective renunciation of the crime by creating a taboo around the killing of the totem. “They revoked their deed by forbidding the killing of the totem, the substitute for their father; and they renounced its fruits by resigning their claim to the women who had now been set free” (Freud 2001, 166). 
This identification, Freud asserted, confirmed the link between neurosis and religion suggested by him in 1907: given that the totem animal represents the father, then the two main taboo prohibitions of totemism, the ban on killing the totem animal and the incest prohibition, “coincide in their content with … the two primal wishes of children [to kill the father and have sexual intercourse with the mother], the insufficient repression or re-awakening of which forms the nucleus of perhaps all psychoneuroses” (Freud 2001, 153).

The parricidal deed, Freud asserted, is the single “great event with which culture began and which, since it occurred, has not let mankind a moment’s rest” (Freud 2001, 168), the acquired memory traces of which underpin the whole of human culture, including, and in particular, both totem and developed religions. Such a view, of course, presupposes the validity of the essentially Lamarckian idea that traits acquired by individuals, including psychological traits such as memory, can be inherited and thus passed through the generations. This was a controversial notion to which Freud, who never fully accepted the Darwinian account of evolution through natural selection, steadfastly adhered throughout his life, in the face of scientific criticism. He also took it as being consistent with Ernst Haeckel’s (1834–1919) view that ontogeny recapitulates phylogeny, that is, that the stages of individual human development repeat those of the evolution of humanity, a view which he took as scientific justification of his belief that psychoanalytical techniques could be applied with equal validity to the social as to the individual.

The counterpart to the primary taboo against killing or eating the totem animal, Freud pointed out, is the annual totem feast, in which that very prohibition is solemnly and ritualistically violated by the tribal community, and he followed the Orientalist William Robertson Smith (1846—1894) in linking such totem feasts with the rituals of sacrifice in developed religions. Such feasts involved the entire community and were, Freud argued, a mechanism for the affirmation of tribal identity through the sharing of the totem’s body, which was simultaneously an affirmation of kinship with the father. Freud saw no contradiction in such a ritual, holding that the ambivalence contained in the father-complex pervades both totemic and developed religions: “Totemic religion not only comprises expressions of remorse and attempts at atonement, it also serves as a remembrance of the triumph over the father” (Freud 2001, 169). The father is thus represented twice in primitive sacrifice, as god and as totem animal, the totem being the first form taken by the father substitute and the god a later one in which the father reassumes his human identity. The dynamic which operates in totem religions, Freud argued, is sustained by and underpins the evolution of religion into its modern forms, where the need for communal sacrifice to expiate an original sin should also be understood in terms of parricide guilt.

6. Religion and Civilization

In time Freud came to consider that the account which he had given in Totem and Taboo did not fully address the issue of the origins of developed religion, the human needs which religion is designed to meet and, consequently, the psychological motivations underpinning religious belief. He turned to these questions in his The Future of an Illusion (1927; reprinted 1961) and Civilization and its Discontents (1930; reprinted 1962). In the two works he represented the structures of civilization, which permit men to live in mutually beneficial communal relationships, as emerging only as a consequence of the imposition of restrictive processes on individual human instinct. In order for civilization to emerge, limiting regulations must be created to frustrate the satisfaction of destructive libidinal drives, examples of which are those directed towards incest, cannibalism and murder. Even the religious injunction to love one’s neighbor as oneself, Freud argued, springs from the need to protect civilization from disintegration. Given that history demonstrates that man is “a savage beast to whom consideration towards his own kind is something alien” (Freud 1962, 59), the fashioning of a value system based upon the requirement to develop loving relationships with one’s fellow man is a social and cultural necessity, without which we would be reduced to living in a state of nature. For Freud, the principal task of civilization is thus to defend us against nature, for without it we would be entirely exposed to natural forces which have almost unlimited power to destroy us.

Extending his account of repression from individual to group psychology, Freud contended that, with the refinement of culture, the external coercive measures inhibiting the instincts become largely internalized. Humans become social and moral beings through the functioning of the superego in effecting a renunciation of the more antisocial drives: “external coercion gradually becomes internalized; for a special mental agency, man’s super-ego, takes it over and includes it among its commandments… Those in whom it has taken place are turned from being opponents of civilization into being its vehicles” (Freud 1961, 11). However, the effect of such renunciations is to create a state of cultural privation “resembling repression” (Freud 1961, 43), which in order to foster social harmony must in turn be dissipated by sublimation, the creation of substitute satisfactions for the drives.

Professional work, Freud argued, is one area in which such substitutions take place, while the aesthetic appreciation of art is another significant one; for art, though it is inaccessible to all but a privileged few, serves to reconcile human beings to the individual sacrifices that have been made for the sake of civilization. However, the effects of art, even on those who appreciate it, are transient, with experience demonstrating that they are insufficiently strong to reconcile us to misery and loss. For that effect, in particular for the achievement of consolation for the suffering and tribulations of life, religious ideas are invoked; these ideas, he held, consequently become of the greatest importance to a culture in terms of the range of substitute satisfactions which they provide.

The role which religion has played in human culture was thus described by Freud in his 1932 lecture “On the Question of a Weltanschauung” as nothing less than grandiose; because it purports to offer information about the origins of the universe and assures human beings of divine protection and of the achievement of ultimate personal happiness, religion “is an immense power, which has the strongest emotions of human beings at its service” (Freud 1990, 199). Since religious ideas thus address the most fundamental problems of existence, they are regarded as the most precious assets civilization has to offer, and the religious worldview, which Freud acknowledged as possessing incomparable consistency and coherence, makes the claim that it alone can answer the question of the meaning of life.

For Freud, then, the cultural and social importance of religion resides both in reconciling men to the limitations which membership of the community places upon them and in mitigating their sense of powerlessness in the face of a recalcitrant and ever-threatening nature. In this respect again, Freud held, group psychology is an extension of individual psychology, with the powerful father figure in patriarchal monotheistic religions providing the required protection against the threat of destruction: “Now that God was a single person, man’s relations to him could recover the intimacy and intensity of the child’s relation to his father” (Freud 1961, 19). It is in this sense, he argued, that the father-son relationship so crucial to psychoanalysis demands the projection of a deity configured as an all-powerful, benevolent father figure.

Genetically, Freud argued, religious ideas thus owe their origin neither to reason nor experience but to an atavistic need to overcome the fear of an ever-threatening nature: “[they] are not precipitates of experience or end results of thinking: they are illusions, fulfilments of the oldest, strongest and most urgent wishes of mankind. The secret of their strength lies in the strength of those wishes” (Freud 1961, 30). In declaring such ideas illusory Freud did not initially seek to suggest or imply that they are thereby necessarily false; an illusory belief he defined simply as one which is motivated in part by wish-fulfillment, which in itself implied nothing about its relation to reality. He gives the example of a middle-class girl who believes that a prince will marry her; such a belief is clearly inspired by a wish-fantasy and is unlikely to prove justified, but such marriages do occasionally happen. Religious beliefs, he suggested in The Future of an Illusion, are illusions in that sense; unlike delusions, they are not, or are not necessarily, “in contradiction with reality” (Freud 1961, 31). However, by the time he wrote Civilization and its Discontents he was prepared to take his religious skepticism a stage further, explicitly declaring religious beliefs to be delusional, not only on an individual but on a mass scale: “A special importance attaches to the case in which [the] attempt to procure a certainty of happiness and a protection against suffering through a delusional remolding of reality is made by a considerable number of people in common. The religions of mankind must be classed among the mass-delusions of this kind” (Freud 1962, 28).

Given that religion has, as Freud acknowledged, made very significant contributions to the development of civilization, and that religious beliefs are not strictly refutable, the question arises as to why he came to consider that religious beliefs are delusional and that a turning away from religion is both desirable and inevitable in advanced social groupings. The answer given in Civilization and its Discontents is that, in the final analysis, religion has failed to deliver on its promise of human happiness and fulfillment; it seeks to impose a belief structure on humans which has no rational evidential base but requires unquestioning acceptance in the face of countervailing empirical evidence: “Its technique consists in depressing the value of life and distorting the picture of the real world in a delusional manner—which presupposes an intimidation of the intelligence” (Freud 1962, 31). He took this as confirming his belief that religion is akin to a universal obsessional neurosis generated by an unresolved father complex and is situated on an evolutionary trajectory which can only lead to its general abandonment in favor of science. “If this view is right,” he concluded, “it is to be supposed that a turning-away from religion is bound to occur with the fatal inevitability of a process of growth, and that we find ourselves at this very juncture in the middle of that phase of development” (Freud 1961, 43). That Freud saw the movement from religious to scientific modes of understanding as a positive cultural development cannot be doubted; indeed, it is one which he saw himself facilitating in a process analogous to the therapeutic resolution of individual neuroses: “Men cannot remain children for ever; they must in the end go out into ‘hostile life’. We may call this education to reality. Need I confess to you that the sole purpose of my book is to point out the necessity for this forward step?” (Freud 1961, 49).

In Civilization Freud mentions that he had sent a copy of The Future of an Illusion to an admired friend, subsequently identified as the French novelist and social critic Romain Rolland. In his response, Rolland indicated broad agreement with Freud’s critique of organized religion, but suggested that Freud had failed in his attempt to identify the true experiential source of religious sentiments: a mystical, numinous feeling of oneness with the universe, “a sensation of ‘eternity’, a feeling as of something limitless, unbounded—as it were, ‘oceanic’” (In Freud 1962, 11). The occurrence of this feeling, Rolland argued, is a subjective fact about the human mind rather than an article of faith; it is common to millions of people and is undoubtedly “the source of the religious energy which is seized upon by the various Churches and religious systems” (In Freud 1962, 11). Thus, he suggested, it would be entirely appropriate to count oneself as religious “on the ground of this oceanic feeling alone, even if one rejects every belief and every illusion” (In Freud 1962, 11). In that sense, he concluded, Freud’s account of the origins of religion missed its mark to a significant degree.

Freud was clearly troubled by Rolland’s challenge, confessing that it caused him no small difficulty. On the one hand his respect for Rolland’s intellectual honesty made him take seriously the possibility that his analysis of religion might be deficient in failing to take cognizance of mystical feelings of the kind described. On the other hand, he was confronted with the obvious problem that feelings are notoriously difficult to deal with in a scientific manner. Additionally—and perhaps more importantly—Freud admitted to being unable to discover the oceanic feeling in himself, though he was not disposed on that ground to deny the occurrence of it in others. Given that such a feeling exists, even on the scale suggested by Rolland, the only question to be faced, Freud declared, is “whether it ought to be regarded as the fons et origo of the whole need for religion” (Freud 1962, 12).

Dismissing the possibility of accounting for the oceanic feeling in terms of an underlying physiology, Freud’s response was to focus on its “ideational content,” that is, the conscious ideas most readily associated with its feeling-tone. In that connection, he offered an account of the oceanic feeling as being a revival of an infantile experience associated with the narcissistic union between mother and child, in which the awareness of an ego or self as differentiated from the mother and world at large has yet to emerge in the child. In that sense, he contended, it would be implausible to take it as the foundational source of religion, since only a feeling which is an expression of a strong need could function as a motivational drive. The oceanic feeling, he conceded, may have become connected with religion later on, but he insisted that it is the experience of infantile helplessness and the longing for the father occasioned by it which is the original source from which religion derives (Freud 1962, 19).

However, while this analysis of the relation between religion and mystical experience is acknowledged as important and influential, few commentators have deemed it entirely adequate, the self-confessed absence of any direct experience of the oceanic feeling in Freud’s own case seeming to many to have led to an underestimation on his part of the significance of such feelings in the genesis of religion.

A very significant body of literature has since grown up around the idea that religion might have emerged genetically, and might derive its dynamic energy, as Rolland suggested, from mystical feelings of oneness with the universe in which fear and anxiety are transcended and time and space are eclipsed. The work of thinkers as diverse as Paul Tillich (1886–1965), Ludwig Wittgenstein (1889–1951) and Paul Ricoeur (1913–2005) in this connection has proven influential and has established an ongoing dialogue between psychology and philosophy/theology (compare Parsons, 1998, 501). Additionally, Freud’s dismissal of the possibility of a physiological approach to mystical experience has been questioned. Recent scientific investigation of the neurophysiological correlates of mystical or spiritual experiences, utilizing magnetic resonance imaging (MRI) and related technologies, while extremely controversial, appears to demonstrate that some deep meditative practices trigger alterations in brain metabolism, occasioning the kind of numinous feelings specified by Rolland (compare d’Aquili & Newberg 1999, ch. 6; Saarinen 2015, 19).

7. The Moses Narrative: The Origins of Judaic Monotheism

In 1939, while exiled in Britain and suffering from the throat cancer which was to lead to his death, Freud published his final and most controversial work, Moses and Monotheism. Written over a period of many years and sub-divided into discrete segments, two of which were published independently in the periodical Imago in 1937, the book has an inelegant structure. The many repetitions that it contains, coupled with the initial strangeness of the arguments advanced, persuaded some that it was the product of a man whose intellectual powers had fallen into serious decline. The analysis of Judaism offered in the text also evoked a vitriolic response from some quarters and even led to allegations of Jewish self-hatred on Freud’s part. However, in more recent times the book has become recognized as one of the most important in the Freudian canon, offering an innovative contribution to the understanding of the nature of religious truth and of the role played by tradition in religious thought.

The focal point of the work is the figure of Moses and his connection with Egypt, which had exerted a fascination on Freud since his childhood study of the Philippson bible, as evidenced also in his publication of the essay “The Moses of Michelangelo” in 1914. Accordingly, at this late juncture in his life and with the threat of fascist antisemitism looming over Europe, he turned his attention once more to the religion of his forefathers, constructing an alternative narrative to the orthodox Biblical one on the origins of Judaism and the emergence from it of Christianity. Developing a thesis partly suggested by work of the Protestant theologian Ernst Sellin (1867–1946) in 1922, Freud argued that the historical Moses was not born Jewish but was rather an aristocratic Egyptian who functioned as a senior official or priest to the Pharaoh Amenhotep IV. The latter had introduced revolutionary changes to almost all aspects of Egyptian culture in the 14th century B.C.E., changing his name to Akhenaten, centralizing governmental administration and moving the capital from Thebes to the new city of Akhetaten. More significantly, he had also introduced a strict new universal monotheistic religion to Egypt, the religion of the god Aton or Aten, in the process outlawing as idolatrous the veneration of the traditional Egyptian polytheistic deities, including the then dominant religion of Amun-Ra, removing all references to the possibility of an afterlife and prohibiting the creation of graven images. He had also proscribed all forms of magic and sorcery, closed all the temples and suppressed established religious practice, thereby undermining the social status and political power of the Amun priests. In Freud’s words, “This king undertook to force upon his subjects a new religion, one contrary to their ancient traditions and to all their familiar habits. It was a strict monotheism, the first attempt of its kind in the history of the world as far as we know and religious intolerance, which was foreign to antiquity before this and for long after, was inevitably born with the belief in one God” (Freud 1939, 34-5). This religion was represented as a universal rather than a local one, reflective of the fact that imperial conquest had extended the Pharaoh’s rule beyond the borders of Egypt into Nubia, Syria and parts of Mesopotamia, which brought with it the novel idea of exclusivity: that the God Aton was not merely the supreme god, but the only god.

These radical innovations were not well received either by the disempowered Amun priestly caste or by the Egyptian general populace; predictably, they produced a fanatical desire for retribution and the return of the traditional religious practices on the part of the priests and the discontented people, “a reaction which was able to find a free outlet after the king’s death” (Freud 1939, 39). Thus, when the Pharaoh died in 1358 B.C.E. the religion of Aton was ruthlessly suppressed in Egypt and Akhenaten became known to his successors as the “heretic king” whose memory they sought to expunge from the historical record. In his narrative, Freud depicts a despairing Moses, a devotee of the Aton religion, seeing “his hopes and prospects destroyed” (Freud 1939, 46), responding to these events by placing himself at the head of an enslaved Semitic tribe which had long been in bondage in Egypt and leading them to freedom across the Sinai. In the process he converted them to an even more spiritualized, rigorous and demanding form of monotheism, which involved the Egyptian custom of circumcision, a symbolic act of submission to the Divine Will.

In the Freudian narrative the onerous demands of the new religion ultimately led his followers to rebel and to kill Moses, an effective repetition of the original father murder outlined in Totem and Taboo, after which they turned to the cult of the volcano god Yahweh. But the memory of the Egyptian Moses remained a powerful latent force until, several generations later, a second Moses, the son-in-law of the Midianite priest Jethro, shaped the development of Judaism by integrating the monotheism of his predecessor with the worship of Yahweh. By this means the guilt deriving from the murder of the original Moses survived in the collective unconscious of the Jewish people and led to the hope of a messiah who would redeem them for their forefathers’ murderous act.

While Freud evidently retained his view of religion as the analogue of an obsessional neurosis, this account now contained the recognition that, as such, its effects are not necessarily pathological, but, on the contrary, can also be socially and culturally beneficial in a marked way. Thus he points out in his narrative that, through the example and guidance of the great prophets, there arose an ethical tradition within Judaism, ultimately traceable back to Moses the Egyptian, which proscribed iconic representation and ceremonial performance, demanding in their place belief and “a life of truth and justice” (Freud 1939, 82), a tradition with which Freud evidently had deep affinity. In his view, the Judaic ethic was one which demanded restrictions on the gratification of certain instincts as being incompatible with its spiritualised view of human nature and dignity, in a manner paralleling that in which the totem laws had imposed the rule of exogamy within the totem clan. Such restrictions, he argued, enabled Jewish culture to flourish and to take on its unique character. The prophets “did not tire of maintaining that God demands nothing else from his people but a just and virtuous life: that is to say, abstention from the gratification of all impulses that according to our present-day moral standards are to be condemned as vicious” (Freud 1939, 187). In this account, the murder of Moses was thus the initial event which provoked a sense of guilt that in turn shaped the ethical content of Judaic monotheism. This guilt, Freud argued, marked what he termed “the return of the repressed” (Freud 1939, 197), the emergence of compulsive patterns of behavior in the life of a social group generated by a dynamic originating in a traumatic event lying in the distant past but mediated and transmitted to the present in covert form by a tradition inspired, and partly shaped, by unconscious memory-traces. 
“All phenomena of symptom-formation can be fairly described as ‘the return of the repressed’,” he argued; “The distinctive character of them, however, lies in the extensive distortion the returning elements have undergone, compared with their original form” (Freud 1939, 201). This is something, he held, which constitutes an “archaic heritage” that does not need to be reacquired by each generation, but merely to be reawakened, and he charted the development of that heritage by means of an enumeration of the stages by means of which the repressed returns, from the primeval father through to the totem, to the hero, then to the polytheistic gods and finally to the monotheistic concept of a single Highest Being.

On this account, the obsessional sense of guilt governing and shaping the ascetic, highly spiritualized ethic implicit in Judaism has been passed on through the generations, such that it has become the very essence of the Jewish character: “The origin … of this ethics in feelings of guilt, due to the repressed hostility to God, cannot be gainsaid. It bears the characteristic of being never concluded and never able to be concluded with which we are familiar in the reaction-formations of the obsessional neurosis” (Freud 1939, 212). To recognize, through this form of (psycho)analysis, the genesis of the ethical system in the guilt arising from a nefarious historical deed is, he suggested, to free oneself from its obsessive features while simultaneously accepting its entirely human origins. But such a recognition does not entail an abandonment of the core value system, as there is a sense, as Freud acknowledged to be true in his own case, in which that ethical heritage cannot be repudiated once it is acquired.

This narrative account of the rootedness of the Jewish monotheistic tradition in the life and murder of the man Moses captures what Freud believed to be its most essential feature, something “majestic,” an eternal truth, “historic” rather than “material,” that “in primaeval times there was one person who must needs appear gigantic and who, raised to the status of a deity, returned to the memory of men” (Freud 1939, 204). For this reason, a number of commentators, in particular Gresser and Friedman, argue persuasively that the Moses text should be seen as a response to the question posed by many of Freud’s critics after the publication of the Hebrew edition of Totem and Taboo as to the sense in which he remained, as he claimed, “in his essential nature a Jew,” given his psychologically reductive analysis of religion and his perceived hostility to religious orthodoxy. The answer, they suggest, could be offered by him in Moses and Monotheism only in terms of what he saw as essential to Judaism itself, a rigorous, spiritually intellectualized life ethic, centering on the virtues of truth and justice, derived from the man Moses, its human creator, through the work and influence of the prophets (compare Whitebook 2017, 68-9).

In early Christianity, Freud argued, the guilt of Moses’ murder became reconfigured in the Pauline tradition as the notion of an original sin for which atonement must be sought through a sacrificial death, the effect of which was to abolish the feeling of guilt and supplant Judaism with Christianity: “Paul, by developing the Jewish religion further, became its destroyer. His success was certainly mainly due to the fact that through the idea of salvation he laid the ghost of the feeling of guilt” (Freud 1939, 141). Once again, this historical transition was interpreted by Freud in clear Oedipal terms: “Originally a Father religion, Christianity became a Son religion. The fate of having to displace the Father it could not escape” (Freud 1939, 215). However, he held that the advent of Christianity was in some respects a step back from monotheism and a reversion to a covert form of polytheism, with the panoply of saints standing as a surrogate for the lesser gods of pagan antiquity. He accordingly saw the process whereby Christianity supplanted Judaism as comparable to the historical expunging of the monotheistic religion of Aton in ancient Egypt after the death of the Pharaoh Akhenaten: “The triumph of Christianity was a renewed victory of the Amon priests over the God of Ikhnaton” (Freud 1939, 142).

What is arguably of most importance in the Moses narrative is that it constitutes a final effort by Freud to reconcile himself with his own Jewish heritage; as one critic suggests, “Freud uses Moses to re-affirm his loyalty to a people whose religion he does not share but whose claim on him he steadfastly refuses to disavow” (Friedman, 1998, 148). The Jewish people, Freud pointed out, have a self-confidence which springs from the idea of being chosen by God from amongst the peoples of the world, an idea which derives strength from the related notion of participation in the reality of a supreme Deity. But the tenet of the Judaic religion which historically has had perhaps the most significant effect of all, he contended, has been the prohibition, derived from the religion of Aton, of graven images as idolatrous. That forces the believer into worship of a dematerialized God, an abstraction apprehensible only to the intellect, a movement described by Freud as “a triumph of spirituality over the senses” (Freud 1939, 178). This shift from the sensible to the conceptual was, he believed, “unquestionably one of the most important stages on the way to becoming human” (Freud 1939, 180), and it gave a preeminence to abstractions in Jewish intellectual life that made possible some of its key contributions to Western mathematics, science and literature, including, of course, the discipline of psychoanalysis. In that sense, he ultimately recognized that the very science of mind which he had pioneered and with which he sought to expose the Oedipal nature of religion was itself a cultural product of the Judaic religious impulse.

8. Critical Responses

Freud’s utilization of the conceptual apparatus of psychoanalysis in his treatment of religion yields a naturalistic account rooted in psychoanalytic theory which, while being arguably one of the more self-consistent to be found in the modern age, is also one of the most controversial. In its main features it strongly anticipated, and almost certainly influenced, contemporary critiques of religion associated with the “New Atheism” movement of the late 20th and early 21st centuries, such as those of Daniel Dennett, Richard Dawkins, Sam Harris and Christopher Hitchens (1949–2011). The impact of Freud’s psychoanalytical projectionism can also be traced in the development of contemporary radical theology, particularly in the work of Don Cupitt and Lloyd Geering. The responses to it, in turn, occupy a very wide spectrum, from enthusiastic affirmation to condemnatory repudiation. A representative sample of these would include the following.

a. The Anthropological Critique

The idea of the “primal horde” was derived by Atkinson and Freud from what was no more than a cautious suggestion by Darwin in his Descent of Man that, amongst several possibilities regarding the social organization of “primeval” humans, one was that it might have consisted of small patriarchal groups led by a single dominant male, “each with as many wives as he could support and obtain, whom he would have jealously guarded against all other men” (Darwin 1981, II 362). This suggestion, which became one of the linchpins of Freud’s account of totem religion, has not received scientific corroboration, and it remains questionable whether the idea has any basis in reality (compare Smith, R.J. 2016). Further, the progressivist evolutionary paradigm championed by Freud, with its projection of a universal linear cultural development from the primitive to the civilized, is largely rejected by contemporary ethnologists and social anthropologists, in particular those influenced by the work of Franz Boas (1858–1942). The assimilation of prehistoric humans with contemporary “primitive” humans on which it is based, and the narrative constructed out of that assimilation, is generally regarded as Eurocentric in its presuppositions and as deriving from the mindset of 19th-century imperialism (Kenny, R. 2015). Thus, in his influential review of Freud’s Totem and Taboo in 1919, the eminent American anthropologist Alfred L. Kroeber, who was a disciple of Boas, subjected Freud’s account of totemism to an extended and trenchant critique, suggesting that the method employed in it amounted to “multiplying into one another, as it were, fractional certainties … without recognition that the multiplicity of factors must successively decrease the probability of their product” (Kroeber 1920, 51). 
Kroeber attributed this almost entirely to the reliance by Freud on the speculative approach taken by such nineteenth century ethnologists as Tylor and Frazer; their anthropological work, he stated bluntly, “is not so much ethnology as an attempt to psychologize with ethnological data” (Kroeber 1920, 55). In a less trenchantly-worded retrospective review written 20 years later, Kroeber—who had in the interim spent some time as a practicing lay psychoanalyst—sought to make conceptual space for a reconciliation of Freud’s theory with scientific ethnology by making a distinction between “historical” and “psychological” thinking, suggesting that Freud’s account should be understood as involving the latter rather than the former (Kroeber 1939, 447). However, notwithstanding that, Kroeber’s strongly negative assessment in his original review of Freud’s incursion into the field of scientific anthropology is now generally accepted within the discipline. Accordingly, Freud’s account of totemism, considered as a direct contribution to an understanding of the development of human culture, would now be viewed with considerable suspicion by professional anthropologists.

b. Myth or Science?

For these reasons, Freud’s projectionist theory of religion as evolving from a primal parricide has been called into serious question as a scientific or historical hypothesis, and with it, the status of psychoanalysis itself. Karl Popper (1902—1994) and Ludwig Wittgenstein have both argued against Freud’s repeated claim for the scientific status of psychoanalysis and—by implication—the account of religion which he developed from it. Popper did so on the grounds that the terms in which psychoanalytic theory is couched make it unfalsifiable in principle and thus unscientific. The theories of Freud and Adler, he argued, describe some facts, but “in the manner of myths. They contain most interesting psychological suggestions, but not in a testable form” (Popper 1963, 37), unlike, for example, the propositions of the natural sciences which almost certainly served as a model for Freud. Wittgenstein, who considered Freud to be one of the few contemporary thinkers with “something to say” (Wittgenstein 1966, 41), albeit one whose whole way of thinking “wants combatting” (ibid., 50), was intrigued by Freud’s focus on mythology in his narratives, and saw that much of the persuasive force of his work derived from the claim that it has constructed a scientific explanation of ancient myths. However, he considered that what Freud had effected was of a different order: “What he has done is propound a new myth” (Wittgenstein 1966, 51).

In a similar vein, Paul Ricoeur, in conceding that the primal parricide depicted by Freud is constructed out of ethnological scraps “on the pattern of the fantasy deciphered by analysis” (Ricoeur 1970, 208), proposed that it, and indeed the entire edifice of Freud’s psychoanalytic theory, should itself be read as being essentially mythical rather than scientific. He thus argued that “one does psychoanalysis a service, not by defending its scientific myth as science, but by interpreting it as myth” (Ricoeur 1970, 20). This latter stratagem, with some variations, has subsequently been adopted by a number of other commentators who seek a mechanism to validate the Freudian cultural narrative in the face of its undeniable ethnological shortcomings (compare, for example, Paul, 1996). It is worth noting that Ricoeur’s conception of the mythic is complex, and occurs within the context of his construction of a religious hermeneutics that engages and intersects with the Freudian psychoanalytic one while seeking to go beyond it, a hermeneutics that regards myths not as fables, “but rather as the symbolic exploration of our relationship to beings and to Being” (Ricoeur 1970, 551). On such a view, the deficiencies presented by the Freudian narrative are read as being hermeneutic rather than scientific, open to further articulation and refinement through a more nuanced and balanced interpretation of the symbolic structure of religious discourse.

However, the hermeneutic construal of the Freudian enterprise is itself open to the charge that it fails utterly to acknowledge the over-arching importance attributed by Freud to his claim that psychoanalysis is to be properly regarded as a rigorous science of the mind, and it has been vigorously critiqued on those and related grounds by Adolf Grünbaum (1923–2018). For Grünbaum, the hermeneutic approach to Freud constitutes a serious distortion of its subject matter and is reflective of an objectionable scientophobia; rather immoderately, he accused it of having “all of the earmarks of an investigative cul-de-sac, a blind alley rather than a citadel for psychoanalytic apologetics” (Grünbaum 1984, 93). By contrast, he insisted on seeing psychoanalysis precisely as a testable theory, but one which is based upon clinical reports from therapeutic practice rather than rigorous experimentally-derived evidence. He pointed out that Freud, whom he considered “a sophisticated scientific methodologist” (ibid., 128), was fully aware of and highly sensitive to the question of the logic of the confirmation and disconfirmation of psychoanalytic interpretations, but contended that his utilization of the notion of consilience in that connection could not meet the demands of full scientific probity. Grünbaum accordingly came to view psychoanalysis as being based upon an inadequate conception of scientific confirmation; the clinical data ostensibly adduced in its favor from therapeutic sessions, which Ernest Jones had described as “the real basis” of psychoanalysis (Jones 1959, 1:3), are, he argued, the products of a shared influence and are irremediably contaminated by suggestion on the part of the analyst. They cannot therefore properly be regarded as providing confirmatory evidence for the theory, while contemporary psychoanalysis has not met the objection that successful therapy operates as a placebo.

c. Lamarckian vs. Darwinian Evolutionary Principles

As we have seen, Freud’s transposition of the father complex from individual infantile development to the social order relied heavily on Haeckel’s thesis that ontogeny recapitulates phylogeny. That thesis is now largely rejected by contemporary science, as is, in particular, the manner in which Freudians have adopted it to model the social evolution of human beings by analogy with the psychological development of children. Further, it seems evident that Freud’s transposition is deeply problematic and leaves psychoanalysis unable to explain the wide variety of culturally determined personality structures which are demonstrated by contemporary empirical research. Freud’s commitment to Lamarckian evolutionary principles has, of course, also received significant critical comment from the scientific community (Slavet 2009, Ch. 2; Yerushalmi 1993, Ch. 2), though it must be noted that his account of acquired memory traces as being partly constitutive of Jewish identity in Moses and Monotheism owes as much to August Weismann’s germ-plasm theory of inheritance as it does to Lamarckism (Slavet 2009, 28).

d. The Primordial Religion: Polytheism or Monotheism?

The entire enterprise of accounting for the origins of religion as an evolutionary trajectory from polytheism to monotheism has been challenged by the work of the ethnologist Father Wilhelm Schmidt (1868–1954), whose multi-volume Der Ursprung der Gottesidee (The Origin of the Idea of God; 1912–1955) is a wide-ranging study of primitive religion. In it Schmidt argued that the “original” tribal religion was almost invariably a form of primitive monotheism, focused on belief in a single benevolent creator god, with polytheistic religions featuring at a later stage of cultural development. Schmidt, who was influenced by Boas and his followers, was accordingly critical of evolutionist accounts of religious development, contending that they frequently lack solid grounding in the historical and anthropological evidence, and was dismissive on those grounds of the totemic theory propagated by Freud. It must be added that Freud was aware of Schmidt’s work and was less than impressed by its quality or its scientific impartiality. He saw Schmidt, whom he held partially responsible for the abolition of the journal Rivista italiana di Psicoanalisi in Italy, as an implacable enemy of psychoanalysis, who was motivated by a desire to undermine Freud’s account of the genesis of religion. Freud feared a possible suppression of psychoanalysis in Vienna in the mid-1930s by the ruling Catholic authorities, with whom Schmidt had considerable influence. That fear, combined with the hope (which unfortunately proved ill-grounded) that those authorities might function as a bulwark against the threat of Nazism, persuaded Freud to defer publication of the full text of Moses and Monotheism until after he had taken up residence in England (see Freud 1939, Prefatory Notes to Part III), a fact which itself had a considerably negative effect on the literary coherence of the work.

The substantive issue between Freud and Schmidt on the temporal primacy of polytheism or monotheism remains unresolved and is almost certainly irresolvable; as the theologian Hans Küng puts it, the scientific search for the primordial religion should be called off, as “neither the theory of degeneration from a lofty monotheistic beginning nor the evolutionary theory of a lower animistic or preanimistic beginning can be historically substantiated” (Küng 1990, 70).

e. Religion as a Social Phenomenon

It is instructive to compare Freud’s attempts to deal with the social dimension of religion with those of his near contemporary, the sociologist Émile Durkheim (1858–1917), whose study The Elementary Forms of Religious Life (1995; orig. 1912) has been highly influential, though it should not in any way be seen as a response to Freud. In The Elementary Forms Durkheim set himself the task of analyzing religion empirically as a social phenomenon, holding that such a treatment alone can reveal its true nature. For Durkheim, the social dimension of human life is primary; human individuality itself is largely determined by, and is a function of, social interaction and organization. This was a point missed by Freud, who, as we have seen, sought to deal with the social dimension of religion by an extension of psychoanalytical principles from individual to group psychology. What Durkheim termed “social facts” play an important role in his analysis; they are the collective forces external to individuals which compel or influence them to act in particular ways. Such facts exist at the level of society as a whole and arise from social relationships and human associations, and include law, morality, contractual relationships and, perhaps most importantly, religion.

Durkheim defined religion as “a unified system of beliefs and practices relative to sacred things, that is to say, things set apart and forbidden—beliefs and practices which unite in one single moral community called a Church, all those who adhere to them” (Durkheim 1995, 44). He saw the connection between religious beliefs and practices as a necessary one; for him, religious experience is rooted more in the actions associated with rites than it is in reflective thought. Traditional accounts of religion have tended to treat religious beliefs as essentially hypothetical or quasi-scientific in nature—an approach clearly evident in Freud—which almost inevitably raises skeptical doubts about their validity, whereas Durkheim saw that what is important to the believer is the normative dimension of faith. The true function of religion is to deliver salvation by showing us how to live; as such, it originates in and receives legitimation from, moments of “general effervescence” (Durkheim 1995, 213), in which members of a group gather together to perform religious rituals. This often leads the participants into a state of psychological excitement resembling delirium, in which they come to feel transported into a higher level of existence where they make direct contact with the sacred object. Participation in such rituals has the effect of affirming and strengthening the collective identity of the group and must be renewed periodically in order to consolidate that identity.

Durkheim took pains to ensure that his use of terms like “delirium” in such contexts should not be misunderstood: the “delirium” associated with religious rituals is, he stressed, “well-founded” (Durkheim 1995, 228) in that it is produced by the operation of social factors that are both irreducibly real and crucially important. Given that it is a foundational postulate of sociology that no human institution rests upon an error or a lie, he declared it unscientific to suggest that systems of ideas of such complexity as religions could be delusory or be the product of illusion, as Freud was to do. In that clear functionalist sense, he concluded, all religions are true; “Fundamentally then, there are no religions that are false. All are true after their own fashion: All fulfil given conditions of human existence, though in different ways” (Durkheim 1995, 2).

This vindication of religion in general, however, has as its counterpart a commitment on Durkheim’s part to an account of the nature of sacred objects or gods which was no less egregiously projectionist than Freud’s. If it is impossible for religious belief, considered as a set of representations relating to the sacred, to be erroneous in its own social right, error can and does emerge, he argued, in the interpretation of what those representations mean, even within the framework of a particular culture. At that level, Durkheim conceded, false beliefs are the norm, because all collective representations are delusional and religion is merely a case in point in that regard: “The whole social world seems populated with forces that in reality exist only in our minds” (Durkheim 1995, 228), non-religious examples of which are the meanings attributed by people to flags, to blood and to humans themselves as a class of being. This point regarding the socially-imposed nature of the meanings associated with collective representations can perhaps be most clearly illustrated by reference to now-defunct cultures and religions. For example, while we readily recognize that the Moai, the deeply impressive monolithic statues of Easter Island, unquestionably had a particular political, aesthetic and religious significance for the Rapa Nui people who created them, the meaning of that symbolism largely escapes us—archeological and anthropological reconstruction aside—as we view them from a perspective external to that culture.

Durkheim contended that in a religious context, the sacred object, which is indeed greater than the individual, is nothing more or less than the power of society itself which, in order to be represented symbolically at all, has to be objectified through a process of projection. Gods or sacred objects, then, are “a figurative expression of … society” (Durkheim 1995, 227); they are society refined, idealized and apotheosized. As such, they represent a power beyond all individual humans, but are ultimately existentially interdependent with them: “while it is true that man is a dependent of his gods, this dependence is mutual. The gods also need man; without offerings and sacrifices, they would die” (Durkheim 1995, 36).

Durkheim’s treatment of religion, then, utilizes a methodology which offers a sharp contrast with Freud’s highly-individualistic, psychological approach to the subject, a contrast which highlights some of the sociological shortcomings of the latter. Unlike Freud, Durkheim also sought to provide an account of religion which achieves full scientific probity while simultaneously doing justice to the richness of the actual lived experiences of believers. Notwithstanding that, however, it seems clear that in the final analysis his anti-skeptical stratagem works satisfactorily only on its own, scientific terms; a believer could scarcely derive comfort from a view which legitimates his belief-system qua sociological fact while implying that the personal God of worship which is its intentional object is, in reality, nothing other than society personified.

f. The Projection Theory of Religion

This raises the whole question of the intellectual plausibility of the projection theory of religion. The question is a complex one, a fact which Freud scarcely acknowledges in his works. As we have seen, the theory, which has a number of related but distinct forms, arose in modernity as a response to the anthropomorphic nature of the attributes which the conceptualization of a personal God in many of the great world religions seems to necessitate. Freud, like Feuerbach, took this as entailing strict anthropotheistic consequences: Feuerbach’s argument reduced God to the essence of man, and Freud sought to go beyond him in offering a psychoanalytical explanation, in terms of the father complex, of why it is human beings have a need to hypostasize their own subjective nature. Belief in God, and the complex patterns of behavior and of rituals associated with that belief, he argued, arise essentially out of the deep psychological need for a Cosmic father.

However, it has been pointed out that such a view underestimates the logical gulf that exists between wishes and beliefs; the former may on occasion be a necessary condition for the latter, but are rarely a sufficient one: an athlete may wish to triumph in an event with every fibre of his being, but that will not necessarily generate a belief that he can do so, much less the delusion that he has done so. Thus, even if it is true that there is a universal wish for a Cosmic father, it is implausible to suggest that such a wish is a sufficient condition for religious belief and the complex practices and value systems associated with it (Kai-man Kwan 2006). Further, as Alvin Plantinga (1932—) argues, in the absence of compelling empirical evidence to support the view that such a universal wish exists, Freud was left with no option but to contend that such wishes are equally universally repressed into the unconscious, a move which opens his theory to the accusation of being empirically untestable (Plantinga 2000, 163).

It is to be noted too that concerns about anthropomorphisms in religious language are in no way restricted to religious skeptics: apophatic or negative theology, for example, grew out of recognition of the logical difficulties implicit in attempts to express the nature of the divine in language. As a result, theologians such as Maximus the Confessor (580—662), Johannes Scotus Eriugena (815—877) and—in Judaism—Maimonides (1138—1204) repudiated the positive attribution of characteristics to God in favour of “referencing” God exclusively in terms of what He is not, through the via negativa. It is also important to note that some proponents of the projection theory, such as Spinoza and possibly Xenophanes, saw the projection theory as invalidating only those forms of religious belief which are anthropotheistic in nature. Thus projectionism, so far from being hostile to all forms of religious belief and practice, is in fact consistent with themes relating to the avoidance of idolatry long central to the Abrahamic religions in particular, as evidenced in the proscription on naming God in Judaism and in aniconism, the prohibition of figurative representations of the Divine in the early Orthodox Church, in Calvinism and also in Islam (Thornton, 2015: 139-140).

It is thus perfectly consistent to accept projectionism as an account of religious concept formation without thereby repudiating religious belief. Indeed, the logical compatibility of projectionism with religious belief has led some contemporary religious thinkers to go so far as to embrace projectionism as a condition of a reflective religious commitment. The view that religious representations are products of the human imagination, it has been argued, can be accepted implicitly by believers, as the “mark of the Christian in the twilight of modernity is … trust in the faithfulness of the God who alone guarantees the conformity of our images to reality and who has given himself to us in forms that may only be grasped by imagination” (Green, 2000, 15). This argument is closely paralleled by a suggestion from Plantinga that wish-fulfillment as a mechanism could have arisen out of a divinely created human constitution. For while it may not, in general, be the function of wish-fulfillment to produce true belief, that in itself does not rule out the possibility, Plantinga contends—at least for those who believe in God—that humans have been so constituted by the creator to have a deeply-felt need and wish to believe in him. On this view, the very existence of the wish for a transcendent Father may be taken as evidence for the truth rather than the falsity of the beliefs which it inspires: “Perhaps God has designed us to know that he is present and loves us by way of creating us with a strong desire for him, a desire that leads to the belief that in fact he is there” (Plantinga 2000, 165).

Whatever level of plausibility may be assigned to these views, it is in any case clear that the projection theory is also reflective of the difficulties which certain forms of religious discourse generate: the characterization of God as possessing attributes such as Love and Wisdom, however qualified such attributions may be, seems invariably to invite the kind of challenge that is found in Feuerbach, Freud and even in Durkheim. In that sense, the projection theory highlights deep theological and philosophical issues relating to the nature and meaning of religious language. One of the more promising approaches to this issue is that suggested by the work of Wittgenstein, who, in his Philosophical Investigations (1974), propounded his language-game theory of meaning, which holds that the meaning of any term is determined by its actual use in a living language-system. In that connection, he brought out the complex interplay of linguistic and non-linguistic activities and practices in human life, in a manner analogous to Durkheim’s functionalism. An application of this to religious discourse implies that the latter cannot be understood in isolation from the broad web of cultural practices, beliefs and concerns in which it is embedded and from which it derives its meaning. This suggests that concerns that skeptical conclusions necessarily follow from our use of human-being predicates in speaking about the Divine are misguided; such concerns gain credence only when accompanied by the deeply pervasive, but uncritical, philosophical assumption (clearly evident in Freud) that the attributions of anthropomorphic predicates to God are to be understood exclusively as factual descriptions of a particular kind, an assumption which is at the very least gratuitous.

This point is made cryptically by Wittgenstein in an indirect allusion to the projection theory: “‘God’s Eye Sees Everything’—I want to say of this that it uses a picture…. [in saying this] I meant: what conclusions are you going to draw? etc. Are eyebrows going to be talked of, in connection with the Eye of God?” (Wittgenstein, 1966, 71). In other words, while in factual discourse references to human eyes have an internal relationship to references to human eyebrows, such that the occurrence of one may and frequently does give rise to the other, no such correlation is possible or necessary in religious discourse about God’s Eye (or Mercy, Anger, Love, and so forth). Thus while “God’s Eye Sees Everything” conjures up the image of a stern, judgmental all-seeing parental figure which, at one level, is amenable to the Freudian father-complex analysis, at another, arguably deeper, level it is clear that the web of relations that holds between the anthropomorphic terms used cannot meaningfully be compared with that which holds in factual discourse about earthly fathers; even the most literal-minded do not seek to speak of God’s eyebrows. The occurrence of anthropomorphisms in religious discourse, then, does not in itself necessitate the acceptance of anthropotheistic conclusions.

g. Moses and Monotheism: Interpretive Approaches

Moses and Monotheism is the most controversial of Freud’s works, seeking as it does both to utilize psychoanalytic theory to reinterpret key historical events and to embed psychoanalysis within a historiographical narrative. Not only did it contest the orthodox Biblical narrative of the role of Moses in the history of Judaism, it did so at a time when the Jews of Europe were threatened with complete annihilation. It is unsurprising, then, that it should have become the subject of very strong criticism, on the grounds both of methodology and content; indeed, because its central account of the Egyptian origins of Judaic monotheism has seemed so egregiously at odds with both tradition and the historical evidence, much of the critical interest has focused on the question of Freud’s motives in propagating it. The Freudian narrative is, of course, problematic in the extreme when considered as a putative exegesis of the Exodus story; as one commentator puts it, “There is hardly any need to state that Moses and Monotheism does not operate at the level of an exegesis of the Old Testament and in no way satisfies the most elementary requirement of a hermeneutics adapted to a text” (Ricoeur 1970, 545). Though Moses is almost certainly an Egyptian name, the evidence that Moses was an Egyptian is not conclusive and it has also been suggested that his life was not in fact contemporaneous with that of Amenhotep IV (Banks 1973, 411). Freud’s willingness, towards the very end of his life, to construct such an apparently speculative narrative on the very origins of Judaism has long puzzled scholars, but it is possible to distinguish three broad exegetical approaches relating to the Moses text in the secondary literature:

For much of his life Freud presented an image of himself to the world as an urbane, cosmopolitan intellectual, committed to the ideals of secular humanism and modern science, and at times that seemed to necessitate downplaying his Jewish background and education. Some scholars, such as Jones (1957) and, more recently Gay (1987), have accordingly represented the Moses text primarily as a critique of Judaism, a comprehensive application of the reductive analysis of religion offered in Freud’s earlier works to the religion of his forefathers. In a similar vein, Jan Assmann (1998) sees Freud as continuing the more general task, initiated by Baruch Spinoza (1632—1677), of combating monotheism and undoing the negative values, such as intolerance, religious hatred and the configuration of alternative religions as idolatrous, generated by the absolute conception of truth which monotheistic religions seem to require.
The second approach, associated in particular with Yerushalmi (1993), Bernstein (1998) and Slavet (2009; 2010) repudiates what it sees as a confusion of meaning with motivation in the secondary literature regarding Freud’s text, stressing that what is of importance is what Freud sought to convey, not what motivated him to do so. While acknowledging the resonances within the text of personal factors operating in Freud’s life at the time of publication, such as his relationship with the memory of his father, the resurgence of antisemitism and the personal and professional threat presented by Nazism from which he so narrowly escaped, this approach rejects any autobiographical interpretation of the text, focusing instead on Freud’s account of the nature of the Jewish religion and the factors which constitute and determine Jewish identity. Thus Bernstein sees in Freud’s Moses text a powerful new account of religion in general and of Judaism in particular, centering on the idea that a religious tradition derives its dynamic from a complex interplay of conscious and unconscious forces. Slavet attributes to Freud a racial theory of memory and sees Moses and Monotheism as “the culmination of a lifetime spent investigating the relationships between memory and its rivals: heredity, history, and fiction” (Slavet 2009, 7) in the context of the question of “Jewishness.” On this view, Freud sought to show that the advancement of intellectualized spirituality (Geistigkeit) has been the most important part of the legacy of Judaic monotheism, but that this owed as much to the working out of collective trauma, the return of the repressed, as it did to the conscious influence of the patriarchs and prophets.
Finally, there is the semi-autobiographical approach, largely taken in this article, which sees the text as primarily concerned with the long-standing problem for Freud of resolving his personal father complex. That, in psychoanalytical terms, amounted to the implementation of an instance of “deferred obedience” by defining in a positive way his relationship with the religion into which he was born, albeit with an emphasis on the human origins of the Judaic ethic (Rice 1990; Gresser 1994; Friedman 1998).

In a thinker as complex as Freud, these approaches can neither be taken as exhaustive nor as entirely mutually exclusive, as significant textual evidence can be invoked for all three. What seems evident, at any rate, is that Freud was seeking, at that critical point in Jewish history, to affirm his cultural and intellectual indebtedness to the ethical basis of the religion of his forefathers while simultaneously seeking to demonstrate that the validity of that ethic is not contingent upon the Biblical and theological accretions traditionally associated with it. On such a reading, the question of the accuracy of the historical detail in the Freudian narrative becomes as peripheral as it is—on a non-literal interpretation—to that of the Biblical one. The import of the book, as Friedman puts it, may reside ultimately in a purpose which can certainly be discerned in it: to preserve Judaism and articulate Freud’s own Jewish identity at a stage in a historical process in which his people come to progress from worship of a transcendent God “to the rational and self-conscious appreciation of themselves as a people of great accomplishment descended from a great but human leader” (Friedman 1998, 139).

9. References and Further Reading

a. References

Alter, R. 1988. The Invention of Hebrew Prose: Modern Fiction and the Language of Realism (Samuel and Althea Stroum Lectures in Jewish Studies). University of Washington Press.
Assmann, J. 1998. Moses the Egyptian: The Memory of Egypt in Western Monotheism. Cambridge, MA: Harvard University Press.
Banks, R. 1973. ‘Religion as Projection: A Re-Appraisal of Freud’s Theory’. Religious Studies, vol. 9 (4), 401-426.
Berke, J. 2015. The Hidden Freud: His Hassidic Roots. London: Karnac Books.
Bernstein, R.J. 1998. Freud and the Legacy of Moses. Cambridge: Cambridge University Press.
Boehlich, W. (ed.) 1992. The Letters of Sigmund Freud to Eduard Silberstein, 1871-1881 (trans. A. Pomerans). Harvard University Press.
Brentano, F. 1973 (orig. 1874). Psychology From an Empirical Standpoint (trans. A.C. Rancurello, D.B. Terrell and L.L. McAlister). London: Routledge.
d’Aquili, E.G. & Newberg, A.B. 1999. The Mystical Mind: Probing the Biology of Religious Experience. Minneapolis: Fortress Press.
Darwin, C. 1981. Descent of Man and Selection in Relation to Sex. Princeton University Press.
Durkheim, É. 1995 (orig. 1912). The Elementary Forms of Religious Life (trans. Karen Fields). New York: Free Press.
Feuerbach, L. 1881. The Essence of Christianity, 2nd edition (trans. George Eliot). London: Trübner & Co., Ludgate Hill.
Frazer, J. G. 2002 (orig. 1890). The Golden Bough. New York: Dover Publications.
Freud, S. 1914 (orig. 1901). The Psychopathology of Everyday Life (trans. A.A. Brill). London: T. Fisher Unwin.
Freud, S. 1939. Moses and Monotheism (trans. Katherine Jones). London: The Hogarth Press and Institute of Psycho-Analysis.
Freud, S. 1957 (orig. 1910). ‘The Future Prospects of Psychoanalytic Therapy’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume XI (1910). W. W. Norton & Company, 139-151.
Freud, S. 1959. ‘An Autobiographical Study’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume XX (1925-1926). London: The Hogarth Press and the Institute of Psychoanalysis, 7-70.
Freud, S. 1961 (orig. 1927). The Future of an Illusion (trans. James Strachey). New York: W.W. Norton.
Freud, S. 1962 (orig. 1930). Civilization and its Discontents (trans. James Strachey). New York: W.W. Norton.
Freud, S. 1976. ‘An Obituary for Professor S. Hammerschlag’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. J. Strachey). Volume IX (1906-1908). W. W. Norton & Company, 255-6.
Freud, S. 1976 (orig. 1907). ‘Obsessive Actions and Religious Practices’, in The Standard Edition of the Complete Psychological Works of Sigmund Freud (trans. & ed. James Strachey) Volume IX (1906-1908). W. W. Norton & Company, 115-128.
Freud, S. 1986. The Complete Letters of Sigmund Freud to Wilhelm Fliess, 1887-1904 (trans. & ed. J. Moussaieff Masson). The Belknap Press of Harvard University Press.
Freud, S. 1990 (orig. 1933). New Introductory Lectures on Psycho-analysis (trans. James Strachey). New York: W.W. Norton.
Freud, S. 2001 (orig. 1913). Totem and Taboo: Some Points of Agreement between the Mental Lives of Savages and Neurotics (trans. James Strachey). Oxford: Routledge Classics.
Freud, S. 2010 (orig. 1900, 1908) The Interpretation of Dreams (trans. James Strachey). New York: Basic Books.
Friedman, R. 1998. ‘Freud’s Religion: Oedipus and Moses’. Religious Studies, 34 (2), 135-149.
Gay, P. 1987. A Godless Jew? Freud, Atheism and the Making of Psychoanalysis. New Haven: Yale University Press.
Goodnick, B. 1992. ‘Jacob Freud’s Dedication to His Son: A Reevaluation’. The Jewish Quarterly Review, Vol. 82 (3-4), 329-360.
Green, G. 2000. Theology, Hermeneutics and Imagination: The Crisis of Interpretation at the End of Modernity. Cambridge: Cambridge University Press.
Gresser, M. 1994. Dual Allegiance: Freud as a Modern Jew. Albany, NY: State University of New York Press.
Grünbaum, A. 1984. The Foundations of Psychoanalysis. Berkeley: University of California Press.
Hume, D. 1956 (orig. 1757). The Natural History of Religion (ed. H.E. Root). London: A.C. Black.
Jones, E. 1957. Sigmund Freud. Life And Work: Volume Three – The Last Phase 1919-1939. London: Hogarth Press.
Jones, E. (ed.) 1959. Freud: Collected Papers in 5 Volumes (trans. Joan Riviere). New York: Basic Books.
Kai-man Kwan. 2006. “Are Religious Beliefs Human Projections?” in Raymond Pelly and Peter Stuart, eds., A Religious Atheist? Critical Essays on the Work of Lloyd Geering. Dunedin, New Zealand: Otago University Press, 41-66.
Kenny, R. 2015. ‘Freud, Jung and Boas: the psychoanalytic engagement with anthropology revisited’. Notes and records of the Royal Society of London. Jun 20; 69(2): 173–190. Online: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4424604/
Kroeber, A.L. 1920. ‘Totem and Taboo: An Ethnologic Psychoanalysis’, American Anthropologist, New Series, Vol. 22 (1), 48-55.
Kroeber, A.L. 1939. ‘Totem and Taboo in Retrospect’. American Journal of Sociology, Vol. 45 (3), 446-451.
Lang, A. & Atkinson, J.J. 1903. Social Origins and Primal Law. London: Longmans Green.
Parsons, W.B. 1998. “The Oceanic Feeling Revisited.” The Journal of Religion, vol. 78 (4), 501–523.
Paul, R. A. 1996. Moses and Civilization: The Meaning Behind Freud’s Myth. New Haven; London: Yale University Press.
Plantinga, A. 2000. Warranted Christian Belief. Oxford University Press.
Popper, K. 1963. Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.
Rice, E. 1990. Freud and Moses: The Long Journey Home. Albany, New York: SUNY Press.
Ricoeur, P. 1970. Freud and Philosophy: An Essay on Interpretation (trans. D. Savage). New Haven & London: Yale University Press.
Saarinen, J.A. 2015. A Conceptual Analysis of the Oceanic Feeling – With a Special Note on Painterly Aesthetics. Jyväskylä: Jyväskylä University Printing House. Online at: https://jyx.jyu.fi/dspace/bitstream/handle/123456789/45384/978-951-39-6078-0_vaitos07032015.pdf?sequence=1
Schmidt, W. 1912-1955. Der Ursprung der Gottesidee: Eine historisch-kritische und positive Studie. (12 vols.) Münster in Westfalen: Aschendorff.
Slavet, E. 2009. Racial Fever: Freud and the Jewish Question. Fordham University Press.
Slavet, E. 2010. ‘Freud’s Theory of Jewishness For Better and for Worse’. In A.D. Richards (ed.) The Jewish World of Sigmund Freud: Essays on Cultural Roots and the Problem of Religious Identity, 96-111. North Carolina: McFarland & Co.
Smith, R.J. 2016. ‘Darwin, Freud, and the Continuing Misrepresentation of the Primal Horde’, Current Anthropology 57 (6), 838-843.
Thornton, S. 2015. ‘Projection’, in R.A. Segal and K. von Stuckrad (eds.) Vocabulary for the Study of Religion (vol. 3). Leiden/Boston, 138-144.
Tylor, E.B. 1871. Primitive culture: researches into the development of mythology, philosophy, religion, language, art, and custom (2 vols). London: John Murray.
Tylor, E.B. 1881. Anthropology: an introduction to the study of man and civilization. London: Macmillan & Co.
Whitebook, J. 2017. Freud: An Intellectual Biography. Cambridge University Press.
Wittgenstein, L. 1966. Lectures & Conversations on Aesthetics, Psychology and Religious Belief (ed. C. Barrett). Oxford: Basil Blackwell.
Wittgenstein, L. 1974. Philosophical Investigations (trans. G.E.M. Anscombe). Oxford: Basil Blackwell.
Yerushalmi, Y.H. 1993. Freud’s “Moses”: Judaism Terminable and Interminable. Yale University Press.

b. Further Reading

Alston, W.P. 2003. ‘Psychoanalytic theory and theistic belief’. In C. Taliafero, & P. Griffiths (eds.). Philosophy of Religion: An anthology (123-140). Oxford: Blackwell Press.
Bingaman, K. 2012. Freud and Faith: Living in the Tension. Albany, NY: State University of New York Press.
Blass, R.B. 2004. ‘Beyond illusion: Psychoanalysis and Religious Truth’. The International Journal of Psychoanalysis, 85, 615-634.
Derrida, J. 1998. Archive Fever: A Freudian Impression (trans. E. Prenowitz). University of Chicago Press.
Gay, P. 2006. Freud: A Life for our Time. London: W.W. Norton & Company.
Ginsburg, R. et al. (eds). 2006. New Perspectives on Freud’s Moses and Monotheism (Conditio Judaica) 1st Edition. Tübingen: Max Niemeyer Verlag.
Hewitt, M.A. 2014. Freud on Religion. London & New York: Routledge.
Jones, R.A. 1986. Emile Durkheim: An Introduction to Four Major Works. Beverly Hills, CA: Sage Publications.
Kolbrener, W. (2010). ‘Death of Moses Revisited: Repetition and Creative Memory in Freud and the Rabbis’. American Imago, 67 (2), 243-262.
Milfull, J. 2002. ‘Freud, Moses and the Jewish Identity’. The European Legacy, vol. 7, 25-31.
Nobus, D. 2006. ‘Sigmund Freud and the Case of Moses Man: On the Knowledge of Trauma and the Trauma of Knowledge’. JEP: European Journal of Psychoanalysis: Humanities, Philosophy, Psychotherapies. Number 22 (1). Online at http://www.psychomedia.it/jep/number22/nobus.htm
Ofengenden, A. 2015. ‘Monotheism, the Incomplete Revolution: Narrating the Event in Freud’s and Assmann’s Moses’. Symploke, Volume 23 (1-2), 291-307.
Palmer, M. 1997. Freud and Jung on Religion. London & New York: Routledge.
Said, E. 2004. Freud and the Non-European. London: Verso.
Smith, D.L. 1999. Freud’s Philosophy of the Unconscious. Studies in Cognitive Systems, vol. 23. Dordrecht: Springer.
Tauber, A.I. 2010. Freud, The Reluctant Philosopher. New Jersey: Princeton University Press.

Author Information

Stephen Thornton
Mary Immaculate College, University of Limerick
Ireland

Frege’s Problem: Referential Opacity

The problem of referential opacity is to explain why a certain inference rule of classical logic sometimes produces invalid-seeming inferences when applied to ascriptions of mental states. The rule concerns substitution of terms for the same object, and here is one of the controversial examples. It involves the mental states of Lois Lane, who believes that Superman can fly. However, she does not know Superman is her coworker Clark Kent, and it is very natural to say that she doesn’t believe that Clark can fly. Yet the inference rule in question apparently allows the following dubious inference:

Superman is identical to Clark Kent.

Lois Lane believes that Superman can fly.

So, Lois Lane believes that Clark Kent can fly.

This inference rule is commonly called Leibniz’s Law, or Substitutivity of Identicals, or Identity Elimination. The problem it creates is often designated the problem of referential opacity, but because the word “opacity” promotes a particular theory, this article typically employs the more neutral nomenclature “(apparent) substitution-failure.” The term “Leibniz’s Law” is used here instead for the principle

(1) If x and y are the same object, then x and y have the same properties.

And the terms “Identity Elimination” (“=E”) and “Substitutivity of Identicals” are reserved for the specific substitution rule illustrated above.

To formulate this rule precisely, we specify it as a rule of natural deduction. It applies to a major premise, which is an identity sentence (for example, “Superman is identical to Clark Kent”), and a minor premise, which contains at least one occurrence of the term on the left of the major premise. The rule permits replacing at least one such occurrence with the term on the right of the major premise. For example, =E is used to make the following inference:

Istanbul is identical to Constantinople.

Istanbul straddles Europe and Asia.

So, Constantinople straddles Europe and Asia.

This particular use produces a valid argument. However, applications of the rule to other sentences sometimes produce very counter-intuitive results, as illustrated by the case of Lois Lane, and so we get the problem of apparent substitution-failure. Philosophers of language disagree about how to explain, or explain away, such seeming failures.

The problem was introduced into modern discussion by Quine (1956, 1961). Important early contributions include Marcus (1961, 1962, 1975) and Smullyan (1948). The papers (Kaplan 1986) and (Fine 1989) are influential engagements with Quine. However, the essential problem was raised in the seminal (Frege 1892), and so it is also known as Frege’s Puzzle.

1. Identity Elimination and Its Misuses
2. The De Re/De Dicto Distinction
3. Frege’s Theory of Substitution-Resistance
4. Hidden-Indexical Semantics
a. Two Kinds of Hidden-Indexical Theories
b. Kripke’s Puzzle
5. Russellianism
6. References and Further Reading

1. Identity Elimination and Its Misuses

A little more formally, the rule of inference =E can be stated as:

Identity Elimination Schema

Major: t₁ = t₂

Minor: ϕ(t₁)

Conclusion: ϕ(t₂)

Here t₁ and t₂ are expressions which refer to entities (for example, proper names of people or cities). ϕ(t₁) is a sentence containing at least one occurrence of t₁, and ϕ(t₂) is a sentence that results from replacing at least one occurrence of t₁ in ϕ(t₁) with an occurrence of t₂, eliminating the “=” of t₁ = t₂. Recurring t_i presumes that t_i is univocal throughout, and recurring ϕ presumes that the sentential context ϕ is not altered, syntactically or semantically, by the replacement. If these uniformity conditions are not met, then the inference scheme is being misapplied, and it is no wonder that false conclusions are derivable. For example, in the inference “The man behind Fred = the man in front of Bill; the man behind Fred saw him leave; therefore, the man in front of Bill saw him leave,” the context “saw him leave” is not uniform, since substitution of “the man behind Fred” by “the man in front of Bill” changes the reference of “him” (Fine 1989:222–3; Linsky 1967:104).
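The purely syntactic side of the schema can be sketched in Python. This is only an illustration under stated assumptions: sentences are represented as lists of tokens, and the function name `identity_elimination` is hypothetical. The sketch cannot check the uniformity conditions just described; nothing in it detects an equivocal name or a pronoun whose reference shifts under substitution.

```python
from typing import List, Sequence, Tuple


def identity_elimination(
    major: Tuple[str, str], minor: Sequence[str], positions: Sequence[int]
) -> List[str]:
    """Replace the chosen occurrences of t1 in the minor premise with t2.

    `major` is the pair (t1, t2) from an identity sentence "t1 = t2".
    `positions` lists the token indices of the occurrences to replace.
    """
    t1, t2 = major
    if not positions:
        raise ValueError("=E must replace at least one occurrence of t1")
    result = list(minor)
    for i in positions:
        if result[i] != t1:
            raise ValueError(f"token {i} is not an occurrence of {t1!r}")
        result[i] = t2
    return result


# The Istanbul example: one occurrence of t1, at token 0.
major = ("Istanbul", "Constantinople")
minor = ["Istanbul", "straddles", "Europe", "and", "Asia"]
conclusion = identity_elimination(major, minor, positions=[0])
print(" ".join(conclusion))
```

Because the function is blind to meaning, it would apply just as readily to the Lois Lane premises. That is the point at issue: the problem is not whether the replacement can be carried out, but whether the result is guaranteed to be true.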

In discussing the problem with apparent substitution-failure by using =E, many examples will be drawn from the fictional story of Superman, treated as if it were true. In the story, a child from the planet Krypton, Kal-El, is sent to Earth, where physical conditions cause him to acquire superpowers. Wearing specific clothing (red cape, blue jumpsuit), Kal-El prevents disasters, rescues endangered innocents, and foils would-be perpetrators of crimes, such as Lex Luthor. People call Kal-El “Superman” when talking about Kal-El’s actions of this kind.

But Kal-El also takes a day job as a reporter, using the name “Clark Kent.” A coworker, Lois Lane, treats him with indifference in the office, but has a pronounced crush on, as she would put it, Superman, unaware they are the same individual.

The problematic examples discussed below involve ascriptions of mental states to Lois (or occasionally Lex), arrived at by applying the rule =E to the major premise “Superman is Clark” and a carefully chosen minor premise. Lois has a crush on Superman (minor premise), so, by =E, Lois has a crush on Clark. But this latter seems false, and would certainly be rejected by Lois herself. Also, Lois believes that Superman can fly, but does not seem to believe that Clark can; she hopes to see Superman again soon, but seems not much to care when she next sees Clark; she would like a date with Superman, but apparently has no interest in one with Clark; and so on. For a problematic use of =E, consider this paradigm example:

(2)
a. Superman is Clark Kent. Major
b. Lois believes that Superman can fly. Minor
c. ∴ Lois believes that Clark Kent can fly. a, b =E

It is not a solution to the problem of referential opacity to say that when we apply the rule in an instance like (2), the flaw is that the major premise is one that Lois does not realize is true. No doubt her ignorance explains psychologically why she does not draw the conclusion that Clark can fly, in those very words, but it does not explain semantically how the inference rule can carry us from two truths to a seeming falsehood: “Lois realizes (2a) is true” is not itself a premise for the application of the rule in (2), so its falsehood is irrelevant to what is dubious about the application. Indeed, the rule enables the inference that Lois does realize (2a) is true: simply change the minor premise of (2) to “Lois realizes Superman is Superman,” surely unobjectionable once she has acquired the name “Superman” from watching Kal-El perform heroic deeds.

Some terminology is commonly encountered in discussions of cases like (2). Mental-state ascriptions like (2b) and (2c) are called attitude ascriptions, since the subject is being ascribed a mental attitude. When the thing the attitude is toward is specified by a “that”-clause (or by a clause complementized by “if” or “whether”), the ascription is called a propositional attitude ascription. This is because the “that”-clause is standardly taken to specify a proposition, the one expressed by the sentence which “that” prefixes (but see, for example, Davidson 1969, Bach 1997, and Moltmann 2003, 2008, 2017 for criticism of this). So (2b) says that Lois has the attitude of belief toward the proposition that Superman can fly. The sentence following the “that” in (2b) and (2c) is called the content-sentence, though in English, “that” can often be dropped (it is not obligatory in (2b) and (2c)).

a. Quotation

There is mileage to be gained from the idea that the reason we get counterintuitive instances such as (2) is that the rule of =E is being misapplied in some way, or, relatedly, that the rule as formulated is not a faithful reflection of the motivation provided by Leibniz’s Law, as stated in (1)—a better formulation would have to be misapplied to get (2). There are some well-known cases of misapplication of the rule which motivate critiques of (2) as a relevantly similar misapplication. One sort of case, emphasized by Quine (1961), is

(3)
a. Istanbul is Constantinople.
b. “Istanbul” has eight letters.
c. ∴ “Constantinople” has eight letters.

This is a misapplication of =E because the name “Istanbul” does not occur univocally in (3). In the major premise, it is used in the normal way to refer to a certain city. But in the minor premise, it is not used to refer to that city (perhaps it is not used to refer at all). Rather, it occurs as part of the complex quotation-name “‘Istanbul,’” referring to the name “Istanbul,” not the city Istanbul (this is a Tarskian rather than Fregean account of quotation—see further Richard 1986, Washington 1992, Saka 2006—but the nonuniformity objection to (3) holds on either). (3b) correctly predicates “has eight letters” of the word “Istanbul,” as opposed to unintelligibly predicating “has eight letters” of the city Istanbul. So (3) has no more force than a variant in which the minor premise reads “the first name used in (3a) has eight letters” and the conclusion reads “the second name used in (3a) has eight letters,” and which at best seems to presume the absurd principle that if two names refer to the same thing then they have the same number of letters.
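The contrast between using a name and mentioning it can be made concrete in a small Python sketch. The `City` class and the `reference` table are assumptions introduced for illustration only: the table maps each name (a string) to the object it refers to.

```python
class City:
    """A toy stand-in for a city; only the identity of the object matters here."""

    def __init__(self, continents):
        self.continents = continents


the_city = City(continents={"Europe", "Asia"})

# Two names, one referent.
reference = {"Istanbul": the_city, "Constantinople": the_city}

# (3a): both names are used, and they pick out the very same object.
same_city = reference["Istanbul"] is reference["Constantinople"]

# (3b) and (3c): the names are mentioned, so the predicate
# "has eight letters" is evaluated on the strings, not on the city.
letters_in_istanbul = len("Istanbul")
letters_in_constantinople = len("Constantinople")

print(same_city, letters_in_istanbul, letters_in_constantinople)
```

In the sketch, identity of referent is a fact about the values in the table, while letter-count is a fact about its keys, which is why the former licenses no inference about the latter.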

Quine thought examples like (3) instructive. The position of “Istanbul” in (3b) is not open to substitution, like the position of “Superman” in (2b), and “Istanbul” does not seem to be referring normally in (3b), so perhaps the same should be said of “Superman” in (2b): the position “Superman” occupies in (2b) is referentially opaque, hence the terminology. But it is unclear how instructive (3) really is. Quine suggests (1956:186) that we should give “serious consideration” to construing mental state ascriptions such as (2b) as involving quotation. (2b) so-construed would say that Lois believes-true “Superman can fly” as a sentence of English.

But he immediately hedges by adding that this “is not to suggest that the subject speaks the language of the quotation, or any language…We may treat a mouse’s fear of the cat as his fearing-true a certain English sentence.” Unfortunately, we are left in the dark about what it is to believe-true or fear-true a sentence as a sentence of L when one does not know L. Quine then admits that the quotational construal of mental state ascriptions will only yield a “systematic agreement in truth-value…and no more.” But even that is doubtful. If “believes-true … as a sentence of L” is simply jargon for “believes that … is true-in-L,” a monolingual Czech who believes that Superman can fly would not do so according to this analysis (she may not even have heard of English); conversely, she may believe that “Superman can fly” is an example of a sentence that is true in English, because she has been told so by a reliable informant; clearly, this does not mean she believes Superman can fly, since she does not know what “fly” means. (See Church 1950 for a famous discussion of quotational accounts, and Schweizer 1993 for a technical investigation of quotational accounts of modal logic.)

A quotational account that does rather better, Quine notes, is that (2b) says that Lois believes the meaning of “Superman can fly,” which avoids the problem of the monolingual Czech. But then it is not really the presence of quotation that is blocking substitution. For if this new quotational account is correct, (2) is valid reasoning if (2a) guarantees that “Superman can fly” and “Clark can fly” mean the same. So (2)’s being a fallacy will require that (2a) not be sufficient for these two sentences to mean the same. This in turn seems to require an account of names on which names can be coreferential yet, one way or another, differ in meaning; and indeed, some accounts to be considered below pursue this. And then substitution-resistance need not be pinned on the presence of quotation.

b. “So-Called”

Quine has another example of misapplication of =E, but one which tends to undermine the thought that there is something referentially peculiar about the position occupied by the substitution-resistant name (though he appears to regard the example as supporting this idea). His well-known “Giorgione” case (Quine 1961:17) is as follows:

(4)
a. Giorgione is Barbarelli.
b. Giorgione is so-called because of his size.
c. ∴ Barbarelli is so-called because of his size.

In (4), there is nothing unusual about the way in which any of the names is used: in each use, there is simply reference to a certain artist. The reason the inference fails to be a legal application of =E is that the sentential context “is so-called because of his size” does not recur uniformly, since the reference of “so” changes in moving from (4b) to (4c): in (4b), “so” refers to the name “Giorgione,” but in (4c), it refers to the name “Barbarelli.” The supposed application of =E is therefore a simple fallacy of equivocation, brought about by the substitution having a hidden truth-condition-altering side-effect (altering the reference of “so”). But it may be an instructive fallacy, if anything like a covert “so” is present in attitude ascriptions. (For other examples of nonuniformity, see Fine 1989:222–36; for more on “so-called,” Forbes 2006:154–7, Corazza 2010, and Predelli 2010.)
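A sketch of the difference between a uniform context and the “so-called” context, in Python. The tables below are toy assumptions for illustration (in particular, the recorded reason for the name “Barbarelli” is invented for the example). A uniform context consults only the referent of the name; the “so-called” context also consults the name itself, which is the hidden side-effect of substitution.

```python
from typing import Dict

# Both names refer to the same artist.
REFERENT: Dict[str, str] = {"Giorgione": "the_artist", "Barbarelli": "the_artist"}

# Toy assumption: why each name was bestowed.
NAMED_FOR: Dict[str, str] = {"Giorgione": "size", "Barbarelli": "family"}

VENETIAN_PAINTERS = {"the_artist"}


def is_a_venetian_painter(name: str) -> bool:
    # Uniform context: the truth-value depends only on the referent.
    return REFERENT[name] in VENETIAN_PAINTERS


def is_so_called_because_of_size(name: str) -> bool:
    # Non-uniform context: "so" picks up the very name that was used,
    # so the truth-value depends on the name and not only on its referent.
    return NAMED_FOR[name] == "size"


for context in (is_a_venetian_painter, is_so_called_because_of_size):
    print(context.__name__, context("Giorgione"), context("Barbarelli"))
```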

c. Modality

Our last example of misuse of =E involves intensional operators, which are operators that do not allow interchange within their scope of accidentally coextensive expressions (two predicates are coextensive if and only if (iff) they actually apply to exactly the same things, and accidentally coextensive iff they are coextensive, but there could have been something to which one applies and the other does not; two sentences are accidentally coextensive iff they have the same actual truth-value but could have differed in truth-value). The standard cases of intensional operators are modal operators such as “it is necessary that,” “it is possible that,” and “it is contingent that.”

To illustrate how intensional operators can induce failure of substitution of accidentally coextensive predicates, suppose I have in my garage three cars, all Bentley racing cars from the 1920s, and that these are the only three in existence (the only three that Bentley ever built). Then for any x, x is a car in my garage iff x is a Bentley racing car. But it surely could have been that a car in my garage is not a Bentley, in the sense that there is a way things could have gone as a result of which a car from a different manufacturer ends up in my garage. By contrast, it is not possible that a Bentley racing car is not a Bentley. The problem is that the two predicates “x is a car in my garage” and “x is a Bentley racing car” are only accidentally coextensive, while modal operators are sensitive to what might be called the “modal profile” of expressions within their scope: the array of semantic values they have, sets in the case of predicates, across ways things could have gone, or “possible worlds.” “x is a car in my garage” and “x is a Bentley racing car” would have the same modal profile iff at each world, the set of things the first applies to is the same set as the set of things the second applies to. But as we have said, there is a possible world w where the set of things one predicate applies to is different from the set of things the other applies to, since there is, say, a Bugatti in my garage in w. As the example shows, attempts to substitute predicates which are not necessarily coextensive within the scope of a modal operator easily go awry, resulting in absurdities such as a Bentley that is not a Bentley: within the scope of “possibly” or “it could have been that,” “car in my garage” cannot be replaced by the accidentally coextensive “Bentley racing car” in the sentence “a car in my garage isn’t a Bentley.”
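The garage example can be checked mechanically in a toy model. The following Python sketch assumes just two possible worlds, the actual one and a world w in which one Bentley has been swapped for a Bugatti; the world names, car labels, and function names are all hypothetical.

```python
from typing import Dict, Set

# Extension of each predicate at each world (toy assumption).
WORLDS: Dict[str, Dict[str, Set[str]]] = {
    "actual": {
        "car in my garage": {"b1", "b2", "b3"},
        "Bentley racing car": {"b1", "b2", "b3"},
        "Bentley": {"b1", "b2", "b3"},
    },
    "w": {
        "car in my garage": {"b1", "b2", "bugatti"},
        "Bentley racing car": {"b1", "b2", "b3"},
        "Bentley": {"b1", "b2", "b3"},
    },
}


def coextensive_at(world: str, p: str, q: str) -> bool:
    return WORLDS[world][p] == WORLDS[world][q]


def same_modal_profile(p: str, q: str) -> bool:
    # Same extension at every world, not merely at the actual one.
    return all(coextensive_at(w, p, q) for w in WORLDS)


def possibly_some_is_not_a_bentley(p: str) -> bool:
    # "It could have been that a P is not a Bentley."
    return any(bool(WORLDS[w][p] - WORLDS[w]["Bentley"]) for w in WORLDS)


print(coextensive_at("actual", "car in my garage", "Bentley racing car"))
print(same_modal_profile("car in my garage", "Bentley racing car"))
print(possibly_some_is_not_a_bentley("car in my garage"))
print(possibly_some_is_not_a_bentley("Bentley racing car"))
```

The two predicates agree at the actual world but not across worlds, so the modal sentence changes truth-value when one predicate is exchanged for the other.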

The same can happen with expressions which are accidentally coreferential. Suppose there are nine planets in our solar system, and that this is a contingent fact: there could have been more or fewer planets (on that definition of “planet”). Then the following use of =E derives a false conclusion from true premises:

(5)
a. The number of planets = 3²
b. It is contingent that the number of planets = 9
c. ∴ It is contingent that 3² = 9.

The conclusion is false because true mathematical identities such as “3² = 9” are the paradigm cases of necessary truths: in every way things could have gone, the number 9 is the outcome when the number 3 is multiplied by itself.

(5) differs from previous examples in that one of the terms in the major premise, “the number of planets,” is not a proper name, but rather what is called a singular definite description: “definite” because “the” coupled with a singular nominal implies exactly one, and “description” because the expression, if it picks out anything, picks out the individual that is the unique satisfier of the descriptive condition “F” in “the F,” in this case “number of planets.”

However, definite descriptions can be classified in at least two ways. One option is that they are treated as belonging to a unitary semantic category of singular terms, together with other grammatical categories such as proper names, demonstratives, and indexicals: expressions of all these types “designate” objects. The classification of definite descriptions with names goes back to Frege (1892). The other approach classifies a definite description “the F” as a first-order quantifier, like “some F,” “each F,” “no F,” and so on (the apparent structural similarity between “the F is G” and “{some/each/no} F is G” is seen as genuine). A quantifier like “some F” is a combination of a det(erminer) “some” with a predicate F, that then combines with a second predicate. In “(det F is G),” “F” is the restriction, or restrictor, in the quantifier “det F,” and “is G” is the quantifier’s scope. In symbols, to take a simple example, “no dog barked” would be represented as “(no x: x is a dog)[x barked],” and so by parallelism, “the dog barked” would be “(the x: x is a dog)[x barked]”: as in English, only det changes as we formalize “the dog barked,” “each dog barked,” “some dog barked,” and so on (for further discussion, see Davies 1981:149–52). (Russell’s Theory of Descriptions (1905) is a quantificational account in the looser sense that Russell took “the F” to be an apparent singular term in need of analysis by the standard determiners some and every. There is also a “predicate” account of some descriptions, as in Fara 2001.)
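On the quantifier approach, a determiner can be pictured as a relation between two sets, the extension of the restriction and the extension of the scope. The Python sketch below is a minimal illustration of that picture; the truth condition given for “the” (exactly one F, and it is G) is an assumption in the spirit of Russell’s analysis, and the sets of dogs and barkers are invented.

```python
from typing import Callable, Dict, Set

Determiner = Callable[[Set[str], Set[str]], bool]

# Each determiner relates the restriction set F to the scope set G.
DETERMINERS: Dict[str, Determiner] = {
    "some": lambda f, g: len(f & g) > 0,
    "each": lambda f, g: f <= g,
    "no": lambda f, g: len(f & g) == 0,
    # Assumed truth condition: there is exactly one F, and it is G.
    "the": lambda f, g: len(f) == 1 and f <= g,
}


def evaluate(det: str, restriction: Set[str], scope: Set[str]) -> bool:
    """Evaluate "(det x: x is F)[x is G]" given the two extensions."""
    return DETERMINERS[det](restriction, scope)


dogs = {"rex"}
barkers = {"rex", "a_seal"}

# Only the determiner changes from one sentence to the next.
for det in ("the", "each", "some", "no"):
    print(f"({det} x: x is a dog)[x barked] ->", evaluate(det, dogs, barkers))
```

On this representation “the dog” is not looked up as the name of an object at any point, which is why the quantifier account gives =E nothing to apply to.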

Only the singular-term account of descriptions raises the problem of referential opacity, for if the descriptions in (5a) are quantifiers rather than singular terms, they are not referential and =E could not be applied in the first place: the major premise is not of the form t₁ = t₂, but is rather “(the x: Fx)[(the y: Gy)[x = y]].”

However, even if descriptions are singular terms, they may be a special case semantically, which could make (5) not very illuminating about (2). Assuming the singular-term analysis, definite descriptions other than mathematical ones are, apart from certain unusual cases, nonrigid designators: they do not pick out the same object at all possible worlds (Kripke 1972, 1980:48ff). For example, the number nine is the unique satisfier of “number of planets” at the actual world, but in some other possible world, a different (natural) number is the unique satisfier, or, perhaps, there is no satisfier because there are no planets. “3²” is the less common case, a rigid definite description: “3²” abbreviates “the product of the number three with itself,” and nine uniquely satisfies “product of the number three with itself” at every possible world, since numbers exist in every possible world, “the number three” is another rigid description, and the product operation is the same at every possible world. (As hinted above, there are other ways of cooking up rigid descriptions; see Davies and Humberstone 1980. For more on nonrigidity, see Tichy 2004.)
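Rigidity can be illustrated by treating each singular term as a function from possible worlds to the object it designates there. The Python sketch below assumes three toy worlds that differ only in their planet count, and reads “it is contingent that p” as “p is true at the actual world and false at some world”; both choices are simplifying assumptions.

```python
from typing import Callable, Dict

# Toy worlds (assumed): each fixes how many planets there are.
PLANET_COUNT: Dict[str, int] = {"actual": 9, "w1": 8, "w2": 10}
WORLDS = list(PLANET_COUNT)

Term = Callable[[str], int]


def number_of_planets(world: str) -> int:
    return PLANET_COUNT[world]  # nonrigid: varies with the world


def three_squared(world: str) -> int:
    return 3 * 3  # rigid: the same number at every world


def nine(world: str) -> int:
    return 9  # rigid


def is_rigid(term: Term) -> bool:
    return len({term(w) for w in WORLDS}) == 1


def contingent(sentence: Callable[[str], bool]) -> bool:
    # True at the actual world, but false at some world.
    return sentence("actual") and any(not sentence(w) for w in WORLDS)


premise_5a = number_of_planets("actual") == three_squared("actual")
premise_5b = contingent(lambda w: number_of_planets(w) == nine(w))
conclusion_5c = contingent(lambda w: three_squared(w) == nine(w))

print(is_rigid(number_of_planets), is_rigid(three_squared))
print(premise_5a, premise_5b, conclusion_5c)
```

In this model both premises of (5) come out true and the conclusion false, and the only term whose designation varies across worlds is the nonrigid description.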

According to Kripke (1972), proper names, unlike typical descriptions, are rigid designators: they denote the same object with respect to every possible world. To see the case for rigidity, suppose we say that the planet Jupiter could have failed to exist. Here we are talking about a specific heavenly body which in the actual world orbits the Sun between Mars and Saturn, but which, we might say, in certain other possible worlds, is simply never formed, because of different behavior on the part of the original protoplanetary disk, or because a physical universe never comes into existence, or for whatever possible reason. When we say that Jupiter does not exist in such circumstances, we mean to be talking about our relatively familiar planet (it is the third brightest object in the night sky) and saying that it does not exist. So “Jupiter” denotes Jupiter at each possible world w, no matter what happens in w, even failure of Jupiter to exist (see further Salmon 1981:32–40).

It is crucial to problematic uses of =E in the style of (5) that at least one of the singular terms in the major premise be nonrigid. For if they are both rigid and also codesignate, then the minor premise and the conclusion will agree in truth-value. So we might propose a restriction on =E that makes the application in (5) illegal. The weakest restriction motivated by the failure of (5) is that t₁ and t₂ must have the same modal profile: for each w, either t₁ designates the same thing as t₂ at w, or neither designates anything at w. A slightly stronger restriction is that t₁ and t₂ have the same modal profile and at each w, each designates something. Here we are proposing a sui generis addition to the constraints that correct application of =E in modal languages must meet, a constraint that is required because we are treating definite descriptions as singular terms. But allowing application of =E in formal modal languages only if the terms in the major premise have the same modal profile is not workable, since two terms which have the same profile in one interpretation of the language (at each world, they denote the same thing) may have different profiles in another interpretation. So the standard approach is (i) to decree that =E is only applicable when t₁ and t₂ are proper names, and (ii) in the semantics stipulate that names are always rigid designators. (Some might object that it is illegitimate to sneak semantics into the statement of an inference rule, as the combination of (i) and (ii) does.)
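The two candidate restrictions on =E can be stated as checks on modal profiles. In the Python sketch below a term is a table from worlds to designated objects, with `None` marking a world where the term designates nothing. The three descriptions and what they designate at the non-actual worlds are invented for illustration.

```python
from typing import Dict, Optional

WORLDS = ["actual", "w1", "w2"]
Term = Dict[str, Optional[str]]

# Toy modal profiles (assumed). At w2 there are no planets at all.
the_largest_planet: Term = {"actual": "Jupiter", "w1": "Saturn", "w2": None}
the_most_massive_planet: Term = {"actual": "Jupiter", "w1": "Saturn", "w2": None}
the_fifth_planet: Term = {"actual": "Jupiter", "w1": "Jupiter", "w2": None}


def weak_restriction(t1: Term, t2: Term) -> bool:
    # At each world: same designation, or neither designates anything.
    return all(t1[w] == t2[w] for w in WORLDS)


def strong_restriction(t1: Term, t2: Term) -> bool:
    # Same profile, and each term designates something at every world.
    return weak_restriction(t1, t2) and all(t1[w] is not None for w in WORLDS)


print(weak_restriction(the_largest_planet, the_most_massive_planet))
print(strong_restriction(the_largest_planet, the_most_massive_planet))
print(weak_restriction(the_largest_planet, the_fifth_planet))
```

The checks are relative to one fixed table of worlds. A different interpretation of the same terms could assign different profiles, which is the reason given above for not building such a test into the rule itself.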

Using “□” for “necessarily,” we can then prove

(6)
c = d ⊢ □(c = d),

simply using =E once, with the minor premise “□(c = c),” which is a theorem and therefore does not need to be mentioned on the left in (6). But (using “∃!” for “there exists exactly one”) we will not be able to prove even

(7)
the F = the G ⊢ □([(∃!x)Fx & (∃!x)Gx] → (the F = the G)),

much less the version with the unconditional conclusion, “□(the F = the G).” The restriction in =E to names blocks anything like a proof of (7) analogous to that of (6) just mentioned, and there is no way of formulating sound rules for “the” to get round this. So we can classify (5) as a misuse of =E, since in (5a) at least one term is not a proper name.
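The one-step proof of (6) described above can be laid out line by line. The layout is only a sketch of that description: line 2 is taken as a theorem, as the text says, and =E replaces the second occurrence of c in line 2 with d. The `\Box` symbol assumes the `amssymb` package.

```latex
\[
\begin{array}{lll}
1. & c = d        & \text{premise (major)} \\
2. & \Box(c = c)  & \text{theorem (minor)} \\
3. & \Box(c = d)  & 1,\, 2,\ {=}\mathrm{E}
\end{array}
\]
```

No parallel derivation is available for (7), because with descriptions in place of c and d the step at line 3 is exactly what the restriction of =E to names forbids.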

The relevant question for us is whether there is anything in our discussion to justify the claim that the definite description “the number of planets” occurs opaquely in (5b). As already noted, the idea that “the F” is really a quantifier would have to be rejected before the question whether descriptions are referentially opaque in modal contexts could even arise, since quantifiers are not referential. So for “referentially opaque” to be an accurate characterization of the occurrence of “the number of planets” in (5b), we must take a side, not necessarily the most plausible side, on the singular-term/quantifier issue.

Yet even granting that definite descriptions are singular terms, it is implausible that “the number of planets” is functioning deviantly in (5b), or in some other way that merits the term “opaque.” In an extensional language, the designation of a definite description in given circumstances is calculated following the semantic structure of the description. For example, “the man who first set foot on the Moon” will designate the unique entity, if there is one, that satisfies both “is a man” and “first set foot on the Moon.” To satisfy “first set foot on the Moon,” such an entity must be the first satisfier of “set foot on the Moon,” which in turn has further semantic structure. This evaluation procedure, of following the structure to arrive at a unique object (if there is one), does not change when we move to an intensional language; it is simply that in interpreting an intensional language there are varying circumstances with respect to which an expression can be evaluated. A conjunction A & B may have different truth-values in different circumstances, but no one would accuse “&” of being problematic on account of this. Similarly, the fact that “the F” can have different designations in different circumstances is hardly a cause for concern.

Of course, (5) may seem to indicate a problem; but then, so may the sequent

(8)
A ↔ B, ◇(A & C) ⊬ ◇(B & C)

(here “◇” means “possibly”; consider the case where C = ¬B). From (8), we learn that substitution on the basis of accidental equivalence does not work in modal languages, and we must constrain any substitution rule to require necessary equivalence. In the same way, from (5) we learn that substitution on the basis of accidental codesignation is invalid in modal languages, and we must constrain =E to allow its application only if the codesignation is necessary. This is exactly what we have done, by restricting the singular terms of the major premise to individual constants, whose semantics requires them to be rigid designators.
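A countermodel for (8) with C = ¬B can be verified directly. The Python sketch below assumes a two-world model in which every world is possible relative to every other, and evaluates the premises and the conclusion at the actual world; the valuation is one choice among many that would serve.

```python
from typing import Callable, Dict

Valuation = Dict[str, bool]

# Two-world model (assumed). C is interpreted as not-B at each world.
MODEL: Dict[str, Valuation] = {
    "actual": {"A": True, "B": True, "C": False},
    "w": {"A": True, "B": False, "C": True},
}


def possibly(sentence: Callable[[Valuation], bool]) -> bool:
    # True iff the sentence holds at some world of the model.
    return any(sentence(v) for v in MODEL.values())


actual = MODEL["actual"]

premise_1 = actual["A"] == actual["B"]            # A <-> B, at the actual world
premise_2 = possibly(lambda v: v["A"] and v["C"])  # possibly (A & C)
conclusion = possibly(lambda v: v["B"] and v["C"])  # possibly (B & C)

print(premise_1, premise_2, conclusion)
```

A and B agree in truth-value at the actual world but come apart at w, so the equivalence is merely accidental, and that is enough to make the premises true and the conclusion false.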

Is there an analogous restriction on =E that we could employ to make the rule acceptable for languages with attitude verbs like “believe”? That t₁ and t₂ be rigid designators is insufficient, as (2) shows. And we want a condition that does not make it a matter of mere mental compulsion that any thinker in the minor premise’s propositional attitude comes to be in the conclusion’s propositional attitude: it has to be logically guaranteed. Plausibly, nothing weaker than identity of proposition determined by the two “that”-clauses satisfies this demand. So if we agree that a difference in the semantics of the two names would result in the two content-sentences in (2) expressing different propositions, we will have to say that the two names in a use of =E in the likes of (2) must be synonymous.

But it is not clear what it means to apply “synonymous” to a pair of names. Names are not usually found in dictionaries, so the normal notion of synonymy, on which, say, “attorney” and “lawyer” are synonyms in virtue of having the same dictionary definition, will not help. There is also a more serious objection, due to Mates (1952), to the effect that even substitution of dictionary synonyms in attitude ascriptions can produce results not much more comfortable than (5). For example, (9a) below may well be false, yet it seems (9b) could still be true:

(9)
a. I suspect that many people doubt that everyone believes all lawyers are lawyers.
b. I suspect that many people doubt that everyone believes all lawyers are attorneys.

One moral we might draw from “Mates cases” like this is that searching for a criterion which allows substitution of t₂ for t₁ in attitude reports is likely to be futile. (For further discussion of attitude reports differing by a synonym, see Burge 1978 and Kripke 1979:160–1.)

To summarize, we have considered three incorrect uses of =E, (3), (4), and (5), in the hope that understanding why they go wrong will help us gain clarity about (2). But (3) turned out not to be so useful, given the drawbacks to quotational accounts of attitude ascriptions. (5) suggests trying to modify =E by limiting its use to some favored class of singular terms, but Mates cases cast doubt on whether this line will be productive (see also Kaplan 1969, Section xi). This leaves (4), which shows how a substitution can have a hidden truth-condition-altering side-effect, a paradigm to which we will return.

For the moment, we note a distinction which emerges from the unhelpfulness of (5). (5) illustrates difficulties for =E which arise from the intensionality of certain vocabulary, primarily modal operators, difficulties resolved by a more careful statement of the rule. On the other hand, the difficulties for =E illustrated by (2) do not seem to be resolvable in a similar way. So the problem manifest in (2) is said to arise from the hyperintensionality, or fine-grained intensionality, of psychological vocabulary such as attitude verbs (a context is hyperintensional iff interchange of necessarily coextensive expressions in it can fail). However, even hyperintensional semantics does not necessarily legitimize a qualified version of =E. (For a version of hyperintensional semantics that takes propositions as primitive, see Thomason 1980, Muskens 2005; for a study of some alternatives, see Fox and Lappin 2005; for the use of “impossible worlds” to analyze hyperintensionality, see the exposition and references in Berto 2013; for a derivational account of hyperintensionality, see Bjerring and Rasmussen 2018; and for an argument that “probably” is hyperintensional, see Moss 2018:§7.5.)
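The definition of hyperintensionality can be illustrated with a deliberately crude model. In the Python sketch below, names are rigid, so “Superman can fly” and “Clark can fly” are true at exactly the same worlds. A belief relation defined on sets of worlds therefore cannot separate them, whereas one defined on something finer-grained can. Using sentences as the finer-grained objects is an assumption made only to keep the sketch short; it is not a proposal drawn from the theories cited above.

```python
from typing import Dict, FrozenSet, Set

WORLDS = ("w0", "w1")

# Rigid names: each denotes the same individual at every world.
DENOTES: Dict[str, str] = {"Superman": "kal_el", "Clark": "kal_el"}

# Who can fly at each world (toy assumption).
FLIERS: Dict[str, Set[str]] = {"w0": {"kal_el"}, "w1": set()}


def intension(name: str) -> FrozenSet[str]:
    """The set of worlds at which '<name> can fly' is true."""
    return frozenset(w for w in WORLDS if DENOTES[name] in FLIERS[w])


# Coarse-grained belief: Lois is related to sets of worlds.
lois_believed_intensions = {intension("Superman")}

# Fine-grained belief: Lois is related to sentences (stand-in only).
lois_accepted_sentences = {"Superman can fly"}


def believes_coarse(name: str) -> bool:
    return intension(name) in lois_believed_intensions


def believes_fine(name: str) -> bool:
    return f"{name} can fly" in lois_accepted_sentences


print(intension("Superman") == intension("Clark"))
print(believes_coarse("Superman"), believes_coarse("Clark"))
print(believes_fine("Superman"), believes_fine("Clark"))
```

Only the fine-grained relation lets substitution of the necessarily coextensive name change the truth-value of the ascription, which is what marks a context as hyperintensional.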

2. The De Re/De Dicto Distinction

It is possible to get oneself into a frame of mind according to which there is no such thing as hyperintensionality, and the reasoning of (2) is not flawed at all. For if Lois believes that Superman can fly, then, since Superman is Clark, she just does believe that Clark can fly, even though she would not put it that way. What you believe is one thing, which words you are inclined to use when stating your beliefs is another, and if you are ignorant of an identity, you may disprefer or even reject particular wording that nevertheless captures what you believe. So even though Lois would laugh if someone suggested to her that Clark has superpowers (in those very words), she may still believe it.

One view about this argument in favor of (2) is that it is essentially correct. We shall return to this Russellian position later. But a second view is that it exploits an ambiguity that is present in (2b), “Lois believes that Superman can fly,” and in (2c), “Lois believes that Clark can fly.” According to this view, an attitude ascription such as (2b) can be read in a way that permits substitution and in a way that does not. Normally, we understand such ascriptions in the way that does not, which is why we reject (2), but if cajoled enough (“look, she does believe Clark can fly, she just wouldn’t say it like that”), we may switch to a reading that allows substitution. In the usual terminology, this is called the de re reading, contrasting with the more common de dicto reading, which disallows substitution. Other terminology for this reading is relational, contrasting with notional; transparent, contrasting with opaque; and wide scope, contrasting with narrow scope. We turn now to explaining what distinction these labels attempt to mark.

a. Defining the Distinction

None of the above terminology is entirely happy. It is unclear in what sense the substitution-resistant reading of (2b) is any less “about the thing” (“de re”) than a putative substitution-permitting reading, nor is it clear why the truth of (2b) understood in a substitution-resistant way makes the subject of the ascription any less related to the object the attitude is about (Lois believes Superman can fly because she has seen him do it). And “transparent/opaque” employs the notion of opacity, which, if it is not just a synonym for “substitution resisting,” suggests failure to refer in the normal way, an idea we have yet to find a justification for.

But “wide scope/narrow scope” is more useful. The rationale for “wide scope” is the thought that a substitution-permitting reading of (2) can be brought out by a formulation in which the crucial name is moved to a position in front of the attitude verb (it has wide scope with respect to the verb), as illustrated in

(10)
a. Superman is such that Lois believes that he can fly.
b. Superman is someone who Lois believes can fly.

The step from (2b) to (10a) or (10b) is called exportation, and it is intuitively plausible that the exported forms permit substitution: if Superman is someone Lois believes can fly and if Superman is Clark, then indeed Clark is someone Lois believes can fly. So if we read the minor premise and conclusion of (2) in the exported way, we have an explanation of why someone might, under pressure, accept (2) after all. For (2a) and either (10a) or (10b) entail the exported variant of (2c). Note that we are not saying that exportation is valid, for example, that (2b) entails (10a) (though it seems to—for worries about existential commitment of the kind raised in Donnellan 1974, see Forbes 1996:357–62, and more generally Kvart 1984). The point here is just that (2b) and (2c) could be understood straight off in the style of (10), which would explain why (2) might be swallowed.

One advantage of the wide-scope/narrow-scope terminology is that it reflects a difference whose existence is not in doubt, insofar as it is simply syntactic, manifested in the contrast between, say, (2b) and (10a). But of course, there is a question whether the syntactic difference marks any interesting semantic one.

To argue for a semantic difference, we may observe that the same syntactic distinction arises with definite descriptions and (other) quantifiers, where a semantic difference is undeniable. For example, we have

(11)
a. Lois believes the extraterrestrial who works at The Daily Planet likes her.
b. Lois thinks that no extraterrestrial is in this conference room.
c. Lois hopes that someone born on Krypton will come to her aid.

If the quantifiers are given narrow scope, that is, if the examples in (11) are interpreted following word-order, (11a) is false, (11b) is (say) true, and (11c) is false. (11a) is false because Lois does not think that there are any extraterrestrials who work at The Daily Planet, so would not use “The extraterrestrial who works at The Daily Planet likes me” to express any belief of hers. (11b) is true even though Clark is in the conference room along with Lois and she sees and recognizes him. But since Lois presumes none of her colleagues is an extraterrestrial, she will happily use “No extraterrestrial is in this conference room” to say what she believes about the planetary origins of those in the room. And (11c) is false because (let us suppose) Lois has never heard of the planet Krypton; therefore, she will not think or say “Would that someone born on Krypton comes to my aid!” At least, these are the commonsense verdicts about the examples in (11), based, as is evident, on maintaining a close connection between the content of mental states and their verbal expression by the subject (on which, see Burge 1978:132).

However, these judgments of truth-value reverse themselves when we consider the exported forms:

(12)
a. The extraterrestrial who works at The Daily Planet is someone who Lois believes likes her.
b. No extraterrestrial is someone Lois thinks is in the conference room.
c. Someone born on Krypton is such that Lois hopes that person will come to her aid.

(12a) is true because Clark is the extraterrestrial who works at The Daily Planet and Lois believes Clark likes her; (12b) is false because Clark is an extraterrestrial and Lois thinks Clark is in the conference room; and (12c) is true because Superman was born on Krypton and Lois hopes Superman will come to her assistance. (The intuition that (12a) and (12c) are true and (12b) false suggests that what is required for the truth of, say, (12a), is that Lois have at least one name t of Kal-El such that she expresses a belief of hers with an assertion of “t likes me” literally construed. So the falsehood of (12a) would require her to have no such name; that she will not use “Superman likes me” to express a belief of hers is insufficient for the falsity of (12a).)
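The truth-conditions suggested in the preceding parenthesis can be made concrete with a toy model. The Python sketch below is only an illustration under assumptions of our own: the set `ACCEPTS` (sentences Lois would use to express her beliefs), the table `NAMES_LOIS_HAS` (names in her repertoire and their bearers), and the functions `de_dicto` and `de_re` are hypothetical devices, not part of any account discussed here. A narrow-scope ascription is counted true iff Lois accepts the content-sentence itself; an exported ascription about an object is counted true iff she accepts some instance formed with a name she has for that object.

```python
# Toy model of the (11)/(12) contrast. Every name below is an illustrative
# assumption, not an established formalism.

# Sentences Lois would sincerely assert, literally construed.
ACCEPTS = {
    "Clark likes me",
    "Superman can fly",
    "No extraterrestrial is in this conference room",
    "Clark is in this conference room",
}

# Names in Lois's repertoire, mapped to their actual bearers.
NAMES_LOIS_HAS = {
    "Clark": "Kal-El",
    "Superman": "Kal-El",
    "Perry": "Perry White",
}


def de_dicto(content_sentence: str) -> bool:
    """Narrow scope: the content-sentence fully characterizes the belief."""
    return content_sentence in ACCEPTS


def de_re(obj: str, frame: str) -> bool:
    """Wide scope: true iff Lois accepts `frame` filled in with at least one
    name she has for `obj`; her rejecting other names is irrelevant."""
    return any(
        frame.format(name) in ACCEPTS
        for name, bearer in NAMES_LOIS_HAS.items()
        if bearer == obj
    )


# (11a), narrow scope: Lois would not assert this sentence.
narrow_11a = de_dicto(
    "The extraterrestrial who works at The Daily Planet likes me"
)

# (12a), exported: the ascriber's description picks out Kal-El, and Lois has
# a name for him ("Clark") with which she accepts "... likes me".
wide_12a = de_re("Kal-El", "{} likes me")

# Lois will not assert "Superman likes me", yet (12a) is unaffected.
superman_variant = de_dicto("Superman likes me")

print(narrow_11a, wide_12a, superman_variant)  # False True False
```

On this sketch, the falsity of (12a) would require that no name Lois has for Kal-El yields an accepted instance, which is the point made above about the insufficiency of her rejecting “Superman likes me.”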

Not only does this contrast between (11) and (12) indicate that exportation makes a semantic difference, it also indicates what that difference is. The false cases in (11) are false because they make attitude attributions to Lois using concepts that either she lacks (“born on Krypton”), or thinks empty (“extraterrestrial who works at the Daily Planet”) and so would not employ positively in any belief she has; while the true case, (11b), is true precisely because “no extraterrestrial” is used to specify the content of her belief. In (12), on the other hand, problematic material is kept out of the specification of Lois’s mental states, which allows (12a) and (12c) to be true, while in (12b), we get a falsehood precisely because “no extraterrestrial” functions simply as an objectual quantifier, without characterizing the content of her belief. So in propositional attitude attributions with wide-scope material binding into the content-sentence, the content-sentence only partially characterizes the attitude, while if there is a “closed” content-sentence within the scope of the attitude verb, that is, if there is no exported material, the content-sentence fully characterizes the attitude. And we can then, if we like, resurrect the “de re/de dicto” terminology and use it in the same way as “wide scope/narrow scope.” The hallmark of a de re attribution is not that it says that the subject of the attribution stands in a special relation to the thing the attitude is about, but that the attribution designates or characterizes that thing in a way the ascriber chooses irrespective of whether the subject would accept the characterization, and the subject’s resisting the characterization is not even prima facie reason to think the attribution false; while a contested de dicto attribution is prima facie false. (See further Brogaard 2008:105–7 and Yalcin 2015:210–13; also see Marcus 1962 and Kazmi 1987 on the interpretation of exported quantifiers.)

This gives us a nontendentious way of using “de re/de dicto,” aligned with “wide scope/narrow scope,” that justifies our proposed diagnosis of any inclination to say that (2) passes muster: the diagnosis is that such judgment relies on construing the minor premise and conclusion as if they were in exported form, that is, construing them as de re attributions in the just explained sense. Still, it is worth observing that on this account we are equating the permits-substitution/resists-substitution distinction in the examples in question with a scope ambiguity. This may be too strong: there may be a substitution-permitting reading of, say, (2c), “Lois believes that Clark can fly,” which is not to be explained as involving a wide-scope reading for “Clark.” We will return to this point later, in connection with hidden-indexical semantics.

b. Skepticism about the Distinction

We have arrived at an apparently defensible way of understanding the de re/de dicto distinction, however the distinction is to be employed. We must therefore note that there are expressions of skepticism about it in the literature, for example Dennett (1982), Richard (1990:128–31), Sosa (1970), and Taylor (2002), whose points have not been addressed here. So, let us briefly consider a selection.

Taylor points out that even if using a definite description provides an accurate characterization of what a subject J believes or doubts, in the sense that the content-sentence containing the description echoes the sentence J would produce to express J’s attitude, an ascriber will in certain cases resist using the description. These are cases where the ascriber thinks that the definite description is improper (a singular definite description the F is improper iff it is not the case that there is exactly one F). Thus, on seeing Smith’s dismembered corpse, Jones may leap to the conclusion that he was murdered and say “Smith’s murderer must be insane”; this is a “whoever that is” use of a description (Donnellan 1966; I am assuming “Smith’s murderer” is a form of “the murderer of Smith”). But if Black knows or believes that Smith was in fact savaged to death by an escaped tiger, she will not make ascriptions like “Jones thinks Smith’s murderer is insane” or “Jones expects the police to capture Smith’s murderer quickly.” This is puzzling if we have the practice of making de dicto ascriptions to reflect the content of the subject’s attitudes, and there is no reason to doubt that Jones’s statement “Smith’s murderer must be insane” expresses in his mouth what he believes (see further Maier 2015).

This reluctance to ascribe may be a result of pragmatic considerations. One reason to think so is that even in the circumstances of the case, it seems that Jones can properly self-ascribe notionally with “I believe Smith’s murderer is insane.” If Black asserts “Jones believes Smith’s murderer is insane” just before realizing she should not, and if “believe Smith’s murderer is insane” is univocal between Black’s ascription and Jones’s self-ascription, the difference in assertibility most probably has to do with the shift in context of utterance, specifically the shift in speaker. One might flesh this out in terms of “the” being a presupposition-trigger, entailing, even when in the scope of normally entailment-canceling operators such as negation, that its restriction is uniquely satisfied, which in our case means that exactly one person murdered Smith. Then since Black knows that Smith was not murdered, she will not say anything that entails that he was. Nonfactive attitude verbs are often said to suppress the triggering (“projection”) of presuppositions (see Kadmon 2001:116), but in view of Taylor’s examples, this may be wrong, or at least too simple.

A weaker pragmatic approach proposes that using a definite description in a belief-ascription conveys (merely) that the ascriber grants or takes the description to be proper. And cooperative speakers who know this do not use descriptions they think improper. So the difference between Black’s ascription and Jones’s self-ascription is explained. The question would then be how this implicature arises.

So far as undermining the idea that there are de dicto or notional ascriptions goes, one might say that the use of presupposition-triggers in the content-sentence creates a principled exception. One would then expect the phenomenon noted above to recur with other triggers. Jones may say “I think I will manage to save enough money,” but Black should not report “Jones thinks that he will manage to save enough money” unless Black grants Jones’s presupposition that saving enough money will be difficult. For if Black knows that the sum is small and that Jones can easily afford it, on this account she would not want to use “manage,” unless ironically.

There is also a question about how manifest the phenomenon that Taylor isolates is with other quantifiers. If Jones says “everyone who attacked Smith will be brought to justice” (he now thinks there were multiple killers), would Black, who knows about the tiger, happily report “Jones thinks everyone who attacked Smith will be brought to justice,” even though Jones says so? If the report seems infelicitous, that may be a point in favor of a pragmatic account if it is combined with a presuppositional account of “every F” in “every F is G.” According to such an account, the restriction F, in this case “person who attacked Smith,” is presupposed to be nonempty (see Heim and Kratzer 1998:159–72).

Sosa (1970) has an interesting example which tries to undercut the de re/de dicto distinction by suggesting that there are no hard-and-fast limits on exportability and so no substantial cognitive relation invoked by the exported form. In an extreme case (Sleigh 1968), if S believes there are spies but only finitely many, and that all have heights but no two have the same height, S may infer and come to believe “the shortest spy is a spy,” and Sosa would allow the exported ascription “the shortest spy is someone S believes is a spy.” So if Phil Kimbly is the shortest spy, Phil Kimbly is someone S believes is a spy (strangely, S, though the most upright of citizens, never thinks of contacting the FBI).

The argument for this laissez-faire stance about exportation is that there are examples where it is perfectly natural. For instance (Sosa 1970:890), the Commanding Officer (CO) may say to the captain, “Tomorrow I want the shortest platoon member to go first” or “I think the shortest platoon member should go first tomorrow.” The CO has no idea who the shortest platoon member is, but in fact it is the unfortunate Smith again (this is before he meets the tiger). The captain knows Smith is the shortest, and says to the sergeant, “The CO wants Smith to go first tomorrow”/“The CO thinks Smith should go first tomorrow,” or to Smith, “The CO wants you to go first tomorrow.” It is perfectly natural for the captain to say such things, yet the ascriptions seem to be arrived at by first exporting a description used by the CO in a whoever-that-is way, and then substituting a name or pronoun. But should we not object to the exporting, on the grounds that the CO does not have a desire or belief or doubt about Smith, that such-and-such? His desire that the shortest platoon member go first seems to be no more about Smith than S’s belief that the shortest spy is a spy, arrived at as described, is about Phil Kimbly. But why then is “The CO wants Smith to go first tomorrow” so natural?

According to Kripke (2008:348), examples like these are “toy duck” cases: a child in a toy store points at a stuffed animal, asking his mother if it is a goose, and she replies “No, it’s a duck.” Kripke implies that what the mother says, no matter how natural, cannot really be true: “no dictionary should include an entry under ‘duck’ in which ducks…may not be living creatures at all” (346). Another example might be that you and I go to an exhibition of the work of a famous forger who specialized in analytic cubism. Pointing at one of his forgeries on the wall, I ask “Is that a Picasso?”, to which you reply, “No, it’s a Braque.” This is a natural conversation, but the painting is not really a Braque, and we should not explain the use of artists’ names as predicates of their works in a way that permits an NN not to be by NN. Of course, the simplest explanation of the naturalness of these dialogues is that the remarks “It’s a {duck/Braque}” are true, even though the duck is made of artificial fibers and Braque had nothing to do with the Braque (see Partee 2003 for how this could be). So if we follow Kripke in rejecting that explanation, we need to find another. Fortunately, at least in Sosa’s case of “The CO wants Smith to go first tomorrow,” it is not hard to see what the naturalness consists in: Smith is the person whose going first tomorrow will satisfy the CO’s desire that the shortest platoon member, whoever he is, go first tomorrow; and Smith is the person whose going first tomorrow would realize the quantified eventuality the CO believes should obtain. Rather than leave it up to the sergeant to find out who the relevant individual is, the captain just tells him, and rather than do so by some laborious step-by-step reasoning about how to satisfy the CO’s desire, the captain makes an attitude ascription that is strictly false, but serves both his and the sergeant’s interests in seeing that the CO’s order is obeyed; for to obey the order, an individual has to be identified. By contrast, the Phil Kimbly ascription seems unnatural because there is no surrounding context to give it a rationale. Perhaps we could invent one, but doing so would not turn an incorrect exportation into a correct one, and nor does it in Sosa’s example. An ascription can be well motivated and promote efficiency in communication, but still be literally false.

c. The de re and Leibniz’s Law

Assuming that the de re/de dicto distinction survives skeptical attack, there is one more issue we can address with its aid. At the start of this essay, we distinguished Leibniz’s Law, “if x and y are the same object, then x and y have the same properties,” from the inference rule of =E. Problem cases for the rule might suggest that the Law itself is dubious. Why have we not considered this possibility?

The reason is that the Law is formulated in terms of objects and properties, and to regard examples like (2)–(5) as threats to it, we would have to construe these inferences as specifying properties of objects in their minor premises; but when we do this, we see that the apparent threat to the Law fades, as follows.

(3) is a “wrong object” case, for (3b) ascribes a property to a word, but in (3a) the objects x and y are cities. (4) is a case of failure to specify a property of an object: (4b) seems to involve the property of being so-called because of its size, but the phrase “so-called because of its size” fails to specify a property, because of the uninterpretability of its “so”: “so” needs a context, linguistic or otherwise. There is certainly at least one property of objects in the offing, that of having a name which was endowed on the basis of size. But in conformity with the Law, that property is shared with Barbarelli, and the sentence attributing it, “Giorgione has a name endowed on the basis of his size,” falls short of what (4b) says. There is also the property of being called “Giorgione” on account of size, but this is shared with Barbarelli too.

As for (5), there is certainly a reading of (5b) in terms of properties of objects: the property of contingently being 9 is ascribed to the number that numbers the planets. But then (5b) is false, since this number is 9, and 9 is not contingently 9. In other words, this property-of-objects construal requires a de re reading of (5b), with the description “the number of planets” exported, resulting in a falsehood.

Another property-of-objects construal of (5b) is one where the property is contingency and the object is the proposition that the number of planets is 9. On this reading, (5b) is true. But this turns (5) into another wrong object case, since in the major premise the objects are numbers, not propositions. And if we change (5a) to make it about propositions, it would have to say that the proposition that the number that numbers the planets is 9 is the same proposition as the proposition that 3² is 9. If (5) is reformulated this way, it is clearly a correct use of =E, but the falsity of the conclusion, that the proposition that 3² is 9 is contingent, means the rewriting of the major premise to state an identity between propositions produced a falsehood: they are not the same proposition at all.

So what of the original (2)? Here the property-of-objects construals of the minor premise are parallel to those in (5), but we do not want to say quite the same things about them. One property-of-objects reading of (2b) is that Superman has the property of being believed by Lois to be able to fly. (2a) is an identity involving Superman, so certainly we can use =E, in this case to infer that Clark has the property of being believed by Lois to be able to fly. This is just a slightly different formulation of the way of understanding the argument that we identified above as underlying an inclination to say that (2) is valid: the crucial point is that the names that are syntactically in the scope of “believes” are interpreted semantically to be exported from its scope. But we do not arrive at (2c), understood as false: that would require importation of “Clark” back into the scope of “believes,” and the fact that (2c) is by default understood as false shows that importation is invalid.

As with (5), we can reconstrue the minor premise and conclusion of (2) to be specifically about propositions. (2b) would then say that the proposition that Superman can fly is believed by Lois, and (2c) would say that the proposition that Clark can fly is believed by Lois. To prevent this just being another wrong-object case, (2a) would then have to be changed to an identity between propositions. Specifically, it would assert that the proposition that Superman can fly is the same proposition as the proposition that Clark can fly. The =E inference is then entirely in accord with Leibniz’s Law. The problem, of course, is that one is inclined to infer that the asserted identity between the propositions is false.

Perhaps we should say, then, that (5) is partially instructive as regards (2), in that there are parallel property-of-objects readings. What (5) does not help with is the formulation of a restriction on the terms used in =E that allows syntactically unstructured individual constants to be substituted in formulations like those actually used in (2); moreover, there seems to be no way to do this.

3. Frege’s Theory of Substitution-Resistance

According to the framework for semantics of natural language sketched in Frege (1892), every meaningful phrase of natural language has potentially two sorts of meaning, a reference (Bedeutung) and a sense (Sinn, a cause of many puns in the titles of worthwhile pieces—for example, Dummett 1973 Ch. 17, Burge 1979, Forbes 1990 (if I may), Salmon 1990; for issues about the translations of these German words, see the discussion and references in Kripke 2001:254, n.1). A meaningful expression e, or a use of e, expresses a sense. Its sense determines its reference (if it has a reference) by virtue of being a way of thinking (or “mode of presentation”) of that reference, but whether there is a reference can depend on how things are in the world. In the case of a singular term, the reference is the thing it designates. For example, the sense of the name “Aristotle” might be articulated by “the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” Whether or not the name “Aristotle” has a reference then turns on whether or not there was such a person.

The same is true of sentences. A sentence expresses a thought, or, in current jargon, a proposition, and a proposition with a reference refers to a truth-value, true or false (the idea that propositions refer is a little odd, but see Dummett 1973:180–6). For example, the proposition that Aristotle was a philosopher is a way of thinking of a truth-value: this proposition is the proposition that the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics was a […] (here readers should substitute their favorite explanation of “philosopher” for the ellipsis, but please, not “one who philosophizes”). Assuming that there was such a person, this proposition is a way of thinking of the truth-value true. However, if “Aristotle” lacks a reference because there was no such person, the proposition that Aristotle was a philosopher will lack a reference because it has a part that lacks a reference.

It is an important point about this apparatus that the calculation of the reference of the whole proposition or sentence expressing it proceeds via the references of the parts. In the case of “Aristotle was a philosopher,” the reference of the whole sentence is obtained by composing the references of “Aristotle” and “was a philosopher,” as determined by their senses, in a way which results in a truth-value. So, it is easiest to think of the reference of “was a philosopher” as a function, one which, applied to an object, produces a truth-value (functions are input-output operations, so in this case the object is the input, the truth-value the output). Then if “Aristotle” provides an object, we will get a truth-value from “was a philosopher.” But if there was no such person, this procedure will hang, waiting for an object when none is going to be provided. This motivates the verdict that in case the name is empty, the sentence is neither true nor false.
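The functional picture, including the way the calculation stalls when a name is empty, can be sketched in code. The Python fragment below is a minimal illustration under assumptions of our own: the two “worlds” `ACTUAL_WORLD` and `NO_SUCH_PERSON`, the set `PHILOSOPHERS`, and the function names are all hypothetical, and `None` is used to model the absence of a reference, so that a sentence mapped to `None` is neither true nor false.

```python
from typing import Callable, Dict, Optional

# Two hypothetical ways the world might be: one in which the name
# "Aristotle" has a bearer, and one in which there was no such person.
ACTUAL_WORLD: Dict[str, str] = {"Aristotle": "aristotle"}
NO_SUCH_PERSON: Dict[str, str] = {}

# Illustrative stipulation of who counts as a philosopher.
PHILOSOPHERS = {"aristotle"}


def was_a_philosopher(obj: str) -> bool:
    """Reference of the predicate: a function from objects to truth-values."""
    return obj in PHILOSOPHERS


def ref_sentence(
    name: str,
    predicate: Callable[[str], bool],
    bearers: Dict[str, str],
) -> Optional[bool]:
    """Compose the references of the parts of a subject-predicate sentence.

    If the name supplies no object, the predicate function is never
    applied, and no truth-value results.
    """
    obj = bearers.get(name)  # reference of the singular term, if any
    if obj is None:
        return None  # neither true nor false
    return predicate(obj)


print(ref_sentence("Aristotle", was_a_philosopher, ACTUAL_WORLD))    # True
print(ref_sentence("Aristotle", was_a_philosopher, NO_SUCH_PERSON))  # None
```

The design choice of returning `None` rather than `False` is what encodes the verdict that the sentence with an empty name lacks a truth-value altogether.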

a. The Sense/Reference Distinction Applied to Attitude Ascriptions

The sense-reference distinction suggests that we may be able to explain how (13a) below can be true while (13b) is false:

(13)
a. Lois hopes Superman is nearby.
b. Lois hopes Clark is nearby.

Assuming that the names have different senses (perhaps “the red-caped superhero who flies” versus “the mild-mannered Daily Planet reporter with a crush on Lois Lane”), (13a) and (13b) will express different propositions because their embedded content-sentences do, and so (13a) and (13b) at least potentially may refer to (that is, have) different truth-values. But truth-value is at the level of reference, and the corresponding constituents of (13a) and (13b) are all coreferential (given a fixed context to determine what counts as “nearby”). Specifically, the references (truth-values) of (13a) and (13b) are calculated from the references of their three main constituents: (i) “Lois,” referring to Lois; (ii) “hopes,” referring to the hoping relation; and (iii) “Superman is nearby” and “Clark is nearby,” respectively, which refer to the same truth-value. Since (i) and (ii) are common to (13a) and (13b), (13a) and (13b) must also have the same reference, that is, same truth-value, even if they express different propositions by virtue of having content subsentences that express different propositions. So it looks as if Frege’s apparatus does not get us any closer to an account of how (13a) and (13b) might differ in truth-value.

Explanation of references as functions may be extended to expressions other than singular terms and sentences. For example, “hopes” at this point is assumed to refer to a function f that takes a truth-value as input, say the truth-value of “Superman is nearby,” and produces as output another function, g, the reference of the verb-phrase “hopes Superman is nearby.” g takes the referent of the name “Lois” as input and produces the truth-value of (13a) as output. The problem is then that “Superman is nearby” and “Clark is nearby” present the same truth-value to f, which must therefore output the same function g as the referent of the two verb-phrases “hopes Superman is nearby” and “hopes Clark is nearby” (same input requires same output). Thus, Lois is mapped to true by both verb-phrase functions, or to false by both, since they are both the function g; and so (13a) and (13b) are equivalent.
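The “same input requires same output” point can be checked mechanically. The following Python sketch is offered only as an illustration, with hypothetical names throughout (`make_f`, `results`, `all_equivalent`): it enumerates every candidate function f from truth-values to verb-phrase references, restricted for simplicity to their verdict on Lois, and confirms that no choice of f can separate (13a) from (13b) so long as the two content-sentences share a truth-value.

```python
from itertools import product
from typing import Callable


def make_f(
    out_if_true: bool, out_if_false: bool
) -> Callable[[bool], Callable[[str], bool]]:
    """Build one candidate reference for "hopes" on the extensional view.

    f maps the truth-value of the content-sentence to a function g, the
    reference of the verb-phrase, which maps an individual to a truth-value.
    Only the verdict on Lois matters here, so g ignores which subject it is
    given (a simplifying assumption).
    """

    def f(content_truth_value: bool) -> Callable[[str], bool]:
        def g(subject: str) -> bool:
            return out_if_true if content_truth_value else out_if_false

        return g

    return f


results = []
# Every possible f: two inputs, two possible outputs each, so four candidates.
for out_t, out_f in product([True, False], repeat=2):
    f = make_f(out_t, out_f)
    # Whatever the facts, "Superman is nearby" and "Clark is nearby" have
    # the same truth-value, because the names are coreferential.
    for shared_truth_value in (True, False):
        superman_is_nearby = shared_truth_value
        clark_is_nearby = shared_truth_value
        value_13a = f(superman_is_nearby)("Lois")
        value_13b = f(clark_is_nearby)("Lois")
        results.append(value_13a == value_13b)

all_equivalent = all(results)
print(all_equivalent)  # True
```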

The source of the difficulty is clear: we have taken the reference of “hope” to be a function of the truth-values of content-sentences that follow it. This is not arbitrary, for the calculation of the reference of any complex phrase uses the references of its constituent phrases along the way, and the content-sentence of the ascription does indeed refer to a truth-value, at least when asserted in isolation, or more broadly, when it occurs extensionally, not in an intensional or hyperintensional context. But this is a very unintuitive account of the reference of “hope.” The thing the attitude of hoping is taken toward is surely a proposition, not a truth-value: the proposition that Superman is nearby is what Lois hopes to be true, not the proposition’s truth-value.

So, on the one hand, we want “hope” to take the reference of its complement sentence as its input, because reference is computed from referents. On the other hand, we want “hope” to take the proposition expressed by its complement sentence as its input, because it is propositions whose truth we hope for. But the proposition is the sense of the content-sentence, not the reference.

To solve this conundrum, Frege made a move of what Kaplan called “brilliant simplicity” (Kaplan 1969:117): we attribute to attitude verbs the property of switching the reference of the material that follows in the ascription from the “customary” reference of that material to a different reference, namely, the customary sense (also known as the “indirect” reference). So in (13a), the (customary) reference of “hopes Superman is nearby” is obtained by applying the (customary) reference of “hope” to the reference “Superman is nearby” has in (13a), its indirect reference, that is, its customary sense. Thus, the reference of “hope” gets the proposition that Superman is nearby as input, as we wanted. This means reference is relativized to linguistic context of occurrence. If “Superman is nearby” occurs extensionally, it refers to its truth-value. But if “Superman is nearby” is the S-part of a complex phrase V+(that)S, where V is an attitude verb, “Superman is nearby” refers to its sense, the proposition that Superman is nearby.

On this account, “hope” refers not to a function that takes a truth-value and produces, as the meaning of the verb-phrase “hopes Superman is nearby,” a function that takes individuals (such as Lois) to truth-values. Rather, “hope” refers to a function which takes a proposition as input, for example the proposition that Superman is nearby, though it still produces, as the meaning of the verb-phrase “hopes Superman is nearby,” a function which maps some individuals, like Lois, to true, and others, like Lex Luthor, to false. However, since we have already agreed that “Superman is nearby” and “Clark is nearby” express different propositions (when occurring extensionally, as we would now add) because of the different senses of “Superman” and “Clark,” this means that the input to the reference of “hope” in (13a) is different from its input in (13b): two different propositions, rather than the single truth-value which is all that is available in the absence of the switch in reference of the content-sentences. Consequently, the verb-phrases “hope Superman is nearby” and “hope Clark is nearby” can refer to different functions; “hope Superman is nearby” can refer to a function which maps Lois to true, while “hope Clark is nearby” can refer to a function which maps Lois to false. This is Frege’s account of how (13a) and (13b) can differ in truth-value, and is the first example of what is nowadays called “switcher semantics” (Gluer and Pagin 2006, 2012; Pagin and Westerståhl 2010).
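The reference switch can likewise be sketched in code. The Python fragment below is an illustration under assumptions of our own, not a rendering of Frege’s text or of any published switcher semantics: the class `Sentence`, the function `reference`, and the table `HOPES` are hypothetical, and propositions are crudely modeled as strings. What the sketch shows is only the structural point that, once reference is relativized to linguistic context, the function for “hope” receives different inputs in (13a) and (13b).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Union


@dataclass(frozen=True)
class Sentence:
    sense: str         # the proposition expressed (customary sense)
    truth_value: bool  # the customary reference


def reference(s: Sentence, under_attitude_verb: bool) -> Union[str, bool]:
    """Reference relativized to linguistic context: the truth-value when
    the sentence occurs extensionally, the customary sense when it is the
    content-sentence of an attitude verb."""
    return s.sense if under_attitude_verb else s.truth_value


# Hypothetical record of which propositions each subject hopes to be true.
HOPES: Dict[str, Set[str]] = {
    "Lois": {"that Superman is nearby"},
    "Lex Luthor": set(),
}


def hopes(proposition: str) -> Callable[[str], bool]:
    """Reference of "hope": takes a proposition and returns the reference
    of the verb-phrase, a function from individuals to truth-values."""

    def g(subject: str) -> bool:
        return proposition in HOPES.get(subject, set())

    return g


superman_nearby = Sentence("that Superman is nearby", True)
clark_nearby = Sentence("that Clark is nearby", True)

# Occurring extensionally, the two sentences are coreferential.
same_customary_reference = (
    reference(superman_nearby, False) == reference(clark_nearby, False)
)

# Under "hopes", each refers to its customary sense, so the inputs differ.
value_13a = hopes(reference(superman_nearby, True))("Lois")
value_13b = hopes(reference(clark_nearby, True))("Lois")

print(same_customary_reference, value_13a, value_13b)  # True True False
```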

The reference-switch thesis has immediate application to the question of what is wrong with (2). The Fregean answer is that (2) is a fallacy of equivocation. In (2a), “Superman” and “Clark Kent” have their customary referents, namely, Kal-El. But in (2b), “Superman” refers to its customary sense, the concept of being the red-caped superhero who flies; “Clark” also refers to its customary sense. As the example shows, identity of customary reference does not justify substituting one singular term for another in the content-sentence of an attitude attribution, since identity of customary reference falls far short of the identity of indirect reference (identity of sense) that would be needed for (2) to be valid.

Indeed, Frege’s theory predicts that it will be hard to find any nontrivial sound arguments in the style of (2), even if we change the major premise to be of the form “the sense of t₁ = the sense of t₂.” For then the major premise is true only if two different names have the same sense, and it is not clear under what circumstances that would happen. Perhaps it might be self-evident in the acquisition process that the names refer to the same person: the speaker introduces herself to x with “Hi! My name is Roberta, but people call me Bobbie.” But even if x correctly recalls this, Mates cases can be constructed: x may coherently think that everyone knows Roberta is Roberta but wonder if everyone knows Roberta is Bobbie. Perhaps we should say that for x, for a while, the two names have the same sense, but x envisages that others may use the names with different senses, and the semantics of “everyone knows that Roberta is Bobbie” allows, one way or another, for this possibility. (See also Schiffer’s discussion of the individuation of senses (1992:502–3). For a theory on which senses are never needed to deal with the likes of (2), see Millikan 2000, and for a pro-Fregean critique, Lawlor 2006.)

b. The Hierarchy Problem

There are problems of detail with Frege’s theory. One such is how to accommodate intersubjective variation in sense (see Zalta 2001). But perhaps the best known is the “infinite hierarchies” problem. As we have already seen with Mates sentences, one attitude ascription can be embedded within another. A simple case is:

(14)
a. Kal-El wonders if Lois has begun to notice that Clark is never around when Superman is.
b. Lois has begun to notice that Clark is never around when Superman is.
c. Clark is never around when Superman is.

According to Frege, “Lois has begun to notice that Clark is never around when Superman is” refers in (14a) to the sense it expresses in (14b), since it is within the scope of “wonders” in (14a). And “Clark is never around when Superman is” refers in (14b) to its customary sense, the sense it expresses in (14c) (curiously, the names in (14c) also seem to resist substitution, despite the lack of attitude verbs; we will return to this in our discussion of “simple sentences”). These sentence-senses are obtained systematically from the senses of their constituent words. So in (14b), “Clark” refers to the way of thinking of Kal-El it expresses in (14c), which we label m₁. But whenever a word refers, it does so by expressing a way of thinking of that reference. So “Clark” in (14b), referring as it does to m₁, must express a way of thinking of m₁, which we label m₂. Plausibly, m₂ cannot be m₁ over again, for (i) m₂ = m₁ would require the same way of thinking to be of both a person, Clark, and of a way of thinking of that person, m₁; and (ii) m₂ = m₁ means that m₁ is a way of thinking of itself, an idea not breathtaking in its intelligibility (see further Peacocke 2009:162–3; but see also Dummett 1973:264–9 for an attempt to get by with just m₁). So these considerations motivate the idea that in (14b), “Clark” expresses a way of thinking m₂ which is of m₁ and not identical to m₁.

Now, (14b) occurs in (14a) within the scope of the hyperintensional “wonders,” so its reference in (14a) and the referents of its constituent words in (14a) must switch; they switch from the referents they have in (14b) to the senses they express in (14b). This means that in (14a), “Clark” refers to m₂. But then, “Clark” in (14a) must express a sense which is a way of thinking of m₂, since this is the only way “Clark” could refer to m₂. Call this sense m₃. As before, it is implausible that m₃ is the same as m₂, since, first, it would have to be a way of thinking of itself, and second, it would have to be both a way of thinking of m₂, but also, since ex hypothesi it is m₂, would have to be a way of thinking of m₁. m₃, then, appears to be something new.

And so we are off. We can make (14a) the content-sentence of a new attitude ascription, say

(15)
Lex suspects that Kal-El wonders if Lois has begun to notice that Clark is never around when Superman is.

Now the sense (14a) expresses becomes the reference of (14a) in its appearance as the content-sentence of (15), and the words of (14a) will express new senses in (15), ways of thinking of the senses they express in (14a); for example, in (15), “Clark” will express m₄, a way of thinking of m₃, so that “Clark” in (15) can refer to m₃. Since there is no principled restriction on how deeply attitude verbs may be embedded within other attitude verbs, we have, apparently, an unending sequence of senses. In particular, “Clark” can express infinitely many ways of thinking, none of which are intelligible beyond the first or second. Some Frege scholars have developed formal models of sense and reference which embody such hierarchies; see, for example, Church (1951) and Anderson (1980). However, others have tried, in effect, to stop at m₂; see especially Parsons (1981, 2009).
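The bookkeeping behind the hierarchy can be made explicit in a few lines. The following is a minimal sketch, assuming that each further attitude verb raises the index by exactly one; the function name `name_semantics` and the string labels are hypothetical conveniences, and the label for depth zero simply stands in for the customary referent.

```python
from typing import Tuple


def _label(k: int) -> str:
    """m_0 is a stand-in for the customary referent; m_k (k >= 1) is a way
    of thinking of m_(k-1)."""
    return "Kal-El" if k == 0 else f"m{k}"


def name_semantics(depth: int) -> Tuple[str, str]:
    """Return (reference, sense) for "Clark" when it occurs under `depth`
    attitude verbs, on the unrestricted hierarchy described in the text."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return _label(depth), _label(depth + 1)


# (14c): no attitude verb; (14b): "notice"; (14a): "wonders" + "notice";
# (15): "suspects" + "wonders" + "notice".
EXAMPLES = {"(14c)": 0, "(14b)": 1, "(14a)": 2, "(15)": 3}
TABLE = {ex: name_semantics(d) for ex, d in EXAMPLES.items()}
```

Nothing in `name_semantics` bounds `depth`, which is the point of the objection: the sequence of labels never terminates unless, as on proposals that stop at the second level, the index is capped.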

c. The Semantic Innocence Objection

Problems of detail aside, there are two main objections to Frege’s account which have emerged in the last few decades, the semantic innocence objection and the no-such-thing-as-senses objection. We take the former first.

The semantic innocence objection is so-called because of its famous statement by Davidson (1969:172):

If we could recover our pre-Fregean semantic innocence… it would seem to us plainly incredible that…words [in the content-sentences of attitude attributions] mean anything different, or refer to anything else, than is their wont when they come in other environments.

This is, admittedly, simply an appeal to intuition, but it is a powerful one (see also Loar 1972:43). It is indeed very difficult to detect a switch in the reference of “Superman” if Lois remarks “Superman is nearby, if I’m in luck” versus if she remarks “I hope that Superman is nearby.” The reference-switch thesis also causes problems for the treatment of anaphoric pronouns. In “Galileo thought that the Earth moves, and he knew what he was talking about, so it moves,” it is undeniable that the “it” refers to the Earth. But then the pronoun does not directly inherit its reference from its antecedent (see further Segal 1989). No doubt there are epicycles which get round this, but it is questionable whether that road is worth going down, given the lack of intuitive support at its starting point.

d. Do Name-Senses Exist Anyway?

An even more damaging objection to Frege’s account of substitution-failure for names is that the entities which play the crucial role, senses or ways of thinking of individuals, are chimerical. That Fregean name-senses do not exist is the core argument of Kripke (1972). Briefly, suppose that “Aristotle” does express a reference-determining sense, captured by, say, the singular definite description “the pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” One possibility is that this description articulates the meaning of the name in much the way that a dictionary might articulate the meaning of “philosopher.” Then it should be both necessary and a priori that Aristotle tutored Alexander. But it is neither. Aristotle could have been killed in an Athenian traffic accident in his youth, so it is not necessary that he tutored Alexander; and that he did so is clearly an empirical claim, which only historical evidence can confirm or disconfirm. Similarly, not even “if Aristotle and Alexander existed, the former tutored the latter” is necessary or a priori.

A somewhat weaker thesis is that the reference of “Aristotle” is fixed by the description, without being synonymous with it. But even merely this would predict, of some perfectly intelligible statements, that they are semantically problematic. For example (based on Kripke’s “Gödel case,” 1972, 1980:83–5), suppose that someone claims on a fake-news website to have found documents showing that Aristotle was not a pupil of Plato, did not tutor Alexander and did not write the Nicomachean Ethics. The first two items Aristotle deliberately falsified on his CV in order to attract students, and though he published the Nicomachean Ethics under his own name, that was after stealing the manuscript from the true author (not a pupil of Plato), whom he murdered to ensure his silence. And as time passed, the false claims became firmly lodged in popular lore about Aristotle.

If it went viral, this story about Aristotle would outrage historians of philosophy. But the very fact that they would be outraged shows that they understand the story well enough. Yet, if the reference of the name is fixed by the description, the story is self-refuting (if it is true, then it is not true): Aristotle did not lie about tutoring Alexander, for according to the story, “Aristotle” is an empty name, so “Aristotle lied” should be either false or neither true nor false. But no historian would contest the story on the grounds that it is self-refuting: the debate would be over the existence or trustworthiness of the documents that the story is based on. The ability to debate the truth of the story, with both sides treating “Aristotle lied about Plato” as at least debatable, is hard to explain if the reference of “Aristotle” is fixed by the proposed description. And if some other description of the same “famous deeds” sort is substituted, a similar example would surely be constructible.

If the weaker, reference-fixing thesis does not support attribution of senses to names, perhaps we should go back to the stronger, meaning-giving thesis, and try a different kind of description. Kripke considers modifications like (whoever it is who is) “the person commonly thought to have been a pupil of Plato who tutored Alexander and wrote the Nicomachean Ethics.” He argues that this is vulnerable to counterexamples involving subjects who have not kept up with what is commonly thought about whom (1980:88), and he raises a circularity objection (loc. cit.).

The new description identifies Aristotle as the person commonly thought to be thus-and-so. So there is a certain range of thoughts s₁,…,s_n had by members of the linguistic community, thoughts of various people to the effect that Aristotle tutored Alexander, Aristotle was taught by Plato, and so on, and these determine the reference of “Aristotle.” But ex hypothesi, “Aristotle” as it occurs in these thoughts means “the person commonly thought to be…,” referring us back once again to s₁,…,s_n. There is an unending loop here, and we never escape from the thoughts s₁,…,s_n to a specific object as the reference of “Aristotle.”

Kripke also points out that we manage to refer easily enough even when there are no identifying descriptions we could cite. He gives the example of “Richard Feynman,” a name many people use without having an associated definite description (1980:81—this was before Feynman’s incisive testimony at the Challenger disaster inquiry). An associated indefinite description might be “a famous physicist at Caltech who won the Nobel Prize.” But “a” cannot be strengthened to “the,” since Murray Gell-Mann is also a famous physicist at Caltech who won the Nobel Prize. And if we insert “not identical to Gell-Mann” into the description, we make it impossible to refer to Feynman without having a way of thinking of Gell-Mann (not to get into the looming indeterminacy problem).

e. Alternative Accounts of the Sense of a Name

If Kripke’s arguments show that Fregean senses of names do not exist, then the Fregean solution to the problem of opacity collapses, rather like a well-worked-out theory of human behavior in which demonic possession plays a large and crucial role. However, it would be fair to say that Kripke’s counterexamples tell mainly against “famous deeds” descriptivism and some modifications of it involving qualifiers like “commonly thought.” It is reasonable to focus on famous-deeds descriptions, since Frege says that everyone who uses the name expresses a reference-determining sense with it, and so to guarantee that each individual is in possession of such a sense, one naturally looks to information that is easily come by. But perhaps there are other options for the content of name-senses besides famous deeds.

One alternative, due to Chalmers and developed in the two-dimensional framework of Stalnaker (1987), is two-dimensional sense. A two-dimensional sense is an ordered pair consisting in an epistemic sense and a subjunctive sense. For a name, the epistemic sense is a function from “scenarios” to individuals, and the subjunctive sense is a function from possible worlds to individuals (Chalmers 2011:596–9). A scenario is something like a coherent total description of how things might have turned out to be, and the epistemic sense of a name may be a nonrigid function on such items: in one scenario, a name may refer to x, while in another it may refer to a distinct y. But subjunctive senses are rigid: they denote the same object in any two worlds. The idea is then that epistemic operators are sensitive to the epistemic sense, and modal operators to the subjunctive sense, which, since it is a rigid function, may be identified with the object to which it stably refers (2011:597, T4, T5).
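The structure of a two-dimensional sense can be sketched as a pair of functions. This is an illustrative toy only: the scenario and world labels, the class name `TwoDimensionalSense`, and the helper `is_rigid` are assumptions introduced here, not notation from Chalmers.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical labels for scenarios (epistemic) and worlds (subjunctive).
SCENARIOS = ("s_actual", "s_alternative")
WORLDS = ("w_actual", "w_other")


@dataclass(frozen=True)
class TwoDimensionalSense:
    epistemic: Callable[[str], str]    # scenario -> individual, may be nonrigid
    subjunctive: Callable[[str], str]  # possible world -> individual, rigid


def is_rigid(f: Callable[[str], str], domain: Iterable[str]) -> bool:
    """A function is rigid on a domain if it returns the same object throughout."""
    return len({f(point) for point in domain}) == 1


# Toy sense for a name: the epistemic sense picks out x in one scenario and a
# distinct y in another, while the subjunctive sense is constant.
_epistemic_table = {"s_actual": "x", "s_alternative": "y"}

name_sense = TwoDimensionalSense(
    epistemic=lambda scenario: _epistemic_table[scenario],
    subjunctive=lambda world: "x",
)
```

Because `name_sense.subjunctive` is constant, it carries no more information than the object it returns, which is why a rigid subjunctive sense may be identified with that object.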

If epistemic senses are just famous-deeds descriptions or their like, Kripke’s objections arise over again. And it would certainly be unfortunate if epistemic and subjunctive senses came apart over actual reference, since then statements like “it’s a posteriori that Aristotle was a philosopher” and “it’s contingent that Aristotle was a philosopher” would be about different people. However, Chalmers has a proposal on which this difficulty and certain others will not arise. Asking what might replace a famous-deeds descriptivist account of how names refer, Kripke suggested a “historical chain” account (1972; 1980:91–4):

[S]omeone, let’s say a baby, is born; his parents call him by a certain name. They talk about him to their friends. Other people meet him. Through various sorts of talk the name is spread from link to link as if by a chain… it’s in virtue of our connection with other speakers in the community, going back to the referent himself, that we refer to a certain man.

The same idea was advanced by Geach (1969:288–9):

[F]or the use of…a proper name there must in the first instance be someone acquainted with the object named…But …the…name…can be handed on from one generation to another… Plato knew Socrates, and Aristotle knew Plato, and Theophrastus knew Aristotle, and so on in apostolic succession down to our own times. That is why we can… use “Socrates” as a name the way we do.

One thing required for x to refer to Socrates with “Socrates” nowadays, then, is that x belong to a linguistic community in which there is an apostolic succession from Socrates to x along which the name “Socrates” is passed. (Following Kripke, x also has to intend to defer in x’s use of the name to those from whom x acquired it—if x decides that “Socrates” would be a fine name for x’s pet turtle, that does not count.)

Kripke mentions that Nozick once remarked to him that if any theory of reference is correct, some descriptivist theory is immune to counterexamples in the style of Naming and Necessity. This would be a descriptivist theory on which the descriptions are theory-laden: they incorporate the reference-determining conditions the correct theory formulates (Kripke 1980:88, n.38). Chalmers exploits this option: taking the historical chain theory as a plausible account of reference-determination, he suggests that the epistemic sense of a name NN might just be “the object NN refers to in the mouths of those from whom I acquired it” or its like (Chalmers 2002:641). This will be a nonrigid function, since in some scenarios, the apostolic succession for “Socrates” will lead to contemporary users but start from an individual x who is not Socrates.

Since the description suggested above involves the term “refer,” there is an obvious circularity worry if the sense is to be reference-determining. Chalmers argues (2002:641–3) that there is no reason to worry, since the evaluation of one person’s epistemic sense takes us back to other people, and their epistemic senses will carry us back to even earlier people, until we arrive at the “initial baptism” introducing the name. The question would then be whether the concept of reference is ineliminably invoked at this point, as in “we hereby name this child NN,” and how significant a problem that would be.
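The shape of the reply to the circularity worry can be pictured as a lookup that keeps deferring to earlier speakers until it reaches the baptism. The sketch below is a toy under stated assumptions: the table `ACQUIRED_FROM`, the placeholder speaker `"baptizer"`, and the function name are hypothetical, and the chain of speakers merely echoes Geach's example.

```python
from typing import Dict, Optional

# Hypothetical acquisition record for the name "Socrates": each speaker maps
# to the speaker from whom the name was acquired; None marks the baptism.
ACQUIRED_FROM: Dict[str, Optional[str]] = {
    "baptizer": None,
    "Plato": "baptizer",
    "Aristotle": "Plato",
    "Theophrastus": "Aristotle",
    "x": "Theophrastus",
}

# What the baptizer named (assumed for illustration).
BAPTIZED: Dict[str, str] = {"baptizer": "Socrates"}


def referent_in_mouth_of(speaker: str) -> str:
    """Follow deferential links back to the initial baptism and return the
    object named there. A loop in the record would mean the reference is
    never grounded, so it is reported as an error."""
    seen = set()
    while ACQUIRED_FROM[speaker] is not None:
        if speaker in seen:
            raise ValueError("circular chain: reference is never grounded")
        seen.add(speaker)
        speaker = ACQUIRED_FROM[speaker]
    return BAPTIZED[speaker]
```

The evaluation terminates because each step moves to an earlier link. The residual question raised in the text corresponds to the `BAPTIZED` table: whether that last step can be specified without itself invoking the concept of reference.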

A second question is whether epistemic senses are otiose as far as determining reference is concerned. Is the reason why I can use “Socrates” to refer to Socrates not simply that I belong to a community in which there is a chain of uses of “Socrates” linking me to Socrates in the way the historical chain theory describes, and I have added the name to my repertoire with the intention to use it in a way that preserves the reference of those from whom I acquired it? Perhaps adding the name to my repertoire with such a deferential intention is the very same thing as attaching a theory-laden sense to it. But if not, the postulation of an epistemic sense seems redundant: the reference of the name in my mouth is already determined by my social situation, and if I express a certain epistemic sense with it, that is just a private epiphenomenon.

A second alternative to famous-deeds senses is what we might call “cognitive descriptivism,” since it is based on a (somewhat metaphorical) hypothesis about cognitive architecture. The idea is that we organize our information about what we take to be separate objects that we have encountered into separate mental files, or dossiers. This seems to have first been proposed by Grice (1969:141–4), and was used in an account of the senses of names in Forbes (1990). The neo-Fregean idea is that the sense of a name NN for x is “the subject of this dossier,” where the mental demonstrative “this dossier” refers to the dossier labeled NN by x in x’s mental filing system.

Clearly, questions about circularity and redundancy arise much as they do for two-dimensional sense (see Fine 2007:67–8). If what makes x the subject of the dossier labeled NN is that x is the referent of the name NN, then we have circularity. But if being the subject of the dossier labeled NN consists in—to use the causal theory of Evans (1973)—being the dominant causal source of the information in the dossier, why not cut out the detour through dossiers and just say that the reference of a name NN is the dominant causal source of information that would be expressed in statements of the form “NN is…”? Such issues are pursued in Recanati (2012) and Saka (2018), and are far from settled in the literature. But it is clear from these examples that famous-deeds descriptivism is not in sole possession of the field as an elaboration of Frege’s notion of the sense of a name.

However, whatever viable theory of sense may ultimately be produced, the semantic innocence objection will have to be dealt with. Thomason (1980) is unmoved by it, but we shall next consider accounts of senses that may be invoked by attitude ascriptions in a way that explains failure of =E, yet allows those senses to have their customary references, thereby meeting Davidson’s complaint.

4. Hidden-Indexical Semantics

The reference-switch hypothesis is one version of the more general notion that the words used in the content-sentence of an attitude ascription have a special role that they do not play in other contexts. If the special role does not displace their normal role, we arrive at Loar’s idea of a dual contribution (1972:52–3). On the one hand, as Davidson insists, the words of the content-sentence play their normal role. But there is another semantic mechanism at work in which they are also complicit. There is a wide range of such dual contribution accounts in the modern discussion of opacity, perhaps starting with Loar (1972). Field (1978) has the content-sentence invoking a sentence of the “language of thought.” Bealer (1993) proposes an ambiguity theory, on which the content-sentence of an ascription introduces both an entity composed of the referents of the words, thereby explaining the innocence intuition, and an entity like a Fregean proposition, thereby accounting for the intuition of substitution-resistance in the likes of (2). And Larson and Ludlow (1993) develop a semantics on which a propositional attitude is an attitude to an “interpreted logical form” (ILF) which is a tree structure in which a node is occupied by both the reference of the expression at that node and the expression itself. Consequently, “Superman can fly” and “Clark can fly” are different ILFs simply in virtue of “Superman” and “Clark” being different names.
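The way an ILF pairs each expression with its reference can be pictured with a small tree type. This is a hedged sketch, not Larson and Ludlow's formalism: the class `ILFNode`, the helper `referents_only`, and the string stand-ins for referents are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ILFNode:
    expression: str                       # the expression at this node
    referent: str                         # toy stand-in for its reference
    children: Tuple["ILFNode", ...] = ()  # daughter nodes, if any


def referents_only(node: ILFNode) -> tuple:
    """Strip the expressions, leaving only the tree of referents."""
    return (node.referent, tuple(referents_only(c) for c in node.children))


can_fly = ILFNode("can fly", "the property of flying")

superman_can_fly = ILFNode(
    "Superman can fly", "true", (ILFNode("Superman", "Kal-El"), can_fly)
)
clark_can_fly = ILFNode(
    "Clark can fly", "true", (ILFNode("Clark", "Kal-El"), can_fly)
)
```

The two trees compare as unequal solely because the expressions at their nodes differ, while `referents_only` returns the same structure for both; that is the sense in which the normal contribution of the words is preserved and a second, finer-grained contribution is added.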

a. Two Kinds of Hidden-Indexical Theories

Some versions of the dual contribution approach are known as “hidden-indexical” accounts (Schiffer 1979), because of the role context-dependence plays in determining the second contribution of the content-sentence, or because there actually is an indexical expression postulated to occur covertly in the ascription. For example, in Crimmins and Perry (1989) and Crimmins (1992), belief-ascriptions are said to be made true by items supplied by the context in which the ascription is made, items called “unarticulated constituents” because there is no expression in the ascription responsible for their intrusion into the truth-condition. Different but coreferential names may be associated with different normal notions of the same object, and an inference like (2) fails because the substitution changes which normal notion of Kal-El is, in their technical sense, “involved” (there is no reference-switch on the part of the names). Similarly, in Richard (1990), the content-sentence of a belief-ascription invokes a “Russellian annotated matrix” (RAM), which, like an ILF, is an item that contains both Fregean referents and the expressions referring to them, and the truth-condition requires that the RAM in the ascription correlate with a RAM believed by the subject of the ascription. What correlates with what is context-dependent, and (2) fails because substitution need not preserve correlation, even though it preserves Fregean reference (Richard 1990:133–41). In Forbes (1990, 1996) and Recanati (2000:137–63), by contrast, there is a hidden “so” in belief-ascriptions, as if “believes” were “so-believes,” which blocks substitution much as it does in Quine’s “Giorgione” case, (4), since the “so” refers to the content-sentence of the ascription.

One respect in which the above theories differ is over what kind of thing is believed. In Schiffer’s general scheme for hidden-indexical theories (1992:503–4), what is believed is a proposition of a non-Fregean kind, but the ascription includes as part of its literal meaning that this proposition is believed under a way w of thinking of it. Here w is something like a Fregean proposition in certain respects, and is specified by the very words used in the content-sentence of the ascription. Substitution then has the side-effect of changing the relevant way of thinking, say from the “Superman can fly”-way to the “Clark can fly”-way, and this opens the door to change of truth-value.

The kind of proposition of which w is a way of thinking is known as a “Russellian” proposition, after a famous exchange between Russell and Frege (Frege and Russell 1904). Frege had claimed that Mont Blanc “with its snowfields” is not itself a component of the thought that Mont Blanc is more than 4,000 meters high, to which Russell replied that “in spite of all its snowfields Mont Blanc itself is a component part of what is actually asserted…a certain complex.” Accounts of Russellian propositions have been given in some detail (for example, Cresswell 1985, Crimmins 1992:117–24; see Jespersen 2003 for critical discussion), and in Schiffer’s scheme, attitude ascriptions invoke quasi-Fregean ways of thinking of such complexes, while the attitude itself is to a Russellian proposition.

In the approach of Forbes (1990, 1996), however, it is a Fregean proposition to which an attitude is held, but one that is specified as the way of thinking of the referent of the content-sentence, where this way is determined by that very sentence. The referent is not a truth-value, as Frege would have had it, but rather an abstract state of affairs, which is a structured entity not unlike a Russellian proposition, though one that fits better into a Fregean scheme. So (2a) becomes

(16)
That Superman can fly is so-believed by Lois

or, more long-windedly,

(17)
Lois believes her so-labeled way of thinking of the state of affairs that Superman can fly

in which “so” refers to “Superman can fly,” sealing it off from substitution in the same way as it does for “Giorgione” in (4). (17) requires for its truth that the ascriber’s content-sentence be a “linguistic counterpart” of some sentence of Lois’s that she would use to express the belief that (17) is attempting to ascribe to her (compare Richard’s notion of correlation), a belief which is a way of thinking of the state of affairs that Superman can fly (which is equally the state of affairs that Clark can fly and equally the state of affairs that Kal-El can fly).

One problem for (17) is that it requires reference-determining senses, whereas Schiffer-style approaches need not. Additionally, (17) departs from (16) in a rather substantial, if not frequently noticed, way: the “that”-clause disappears, and the clausal form of “believes” is replaced by the transitive one (the direct object in (17) is everything following “believes”). But though there seems to be an equivalence between believing that… and believing the proposition (thought, so-labeled way of thinking) that…, it does not generalize to other attitude verbs. For example, suspecting that Lex Luthor is involved is not the same thing as suspecting the proposition that Lex Luthor is involved (is anyone so paranoid as to suspect propositions?—Moltmann (2003:82) credits Arthur Prior with first noticing this issue). The same thing occurs, though for different reasons in different cases, with such verbs as “announce,” “anticipate,” “ask,” “boast,” “calculate,” “caution,” “complain,” “conclude,” “crow,” “decide,” “detect,” “discover,” “dream,” “estimate,” “forget,” “guess,” “hope,” “insinuate,” “insist,” “interrogate” (literary theory), “judge,” “know,” “notice,” “observe,” “plan,” “prefer,” “pretend,” “rejoice,” “require,” “see,” “suggest,” “surmise,” “suspect,” “understand,” and various cognates of these. The verbs for which the equivalence holds include inference verbs like “deduce” and “infer,” plus a few other examples like “doubt,” “establish,” and “verify.” Unfortunately, it would take us far afield were we to address the issue of how to modify (17) for the verbs for which the equivalence fails (see Forbes 2018 for one account).

As the previous paragraph indicates, some hyperintensional clausal verbs that can be used to ascribe propositional attitudes have hyperintensional transitive forms that can be used to ascribe what we might call objectual attitudes. These seem to generate failures of =E much as their clausal counterparts do. For example, “Lex fears Superman” is true, but “Lex fears Clark” does not seem any more plausible than “Lex fears that Clark will crush him.” The apparatus in (17) can be employed to express a hidden-indexical theory for the transitive verb case: the substitution-resistant reading of “Lex fears Superman” is “Lex fears Superman as such,” or “Lex fears Superman so-personified,” and the references of the “such” and “so” will change if “Clark” replaces “Superman,” producing the false “Lex fears Clark {as such/so-personified}.” A fuller version of the substitution-resisting semantics for “Lex fears Superman” might be

(18)
Lex fears Superman under the way of thinking of him that is so-labeled.

Here “under” forms an adverbial phrase modifying the whole verb-phrase in (18) headed by “fears” (there is some dispute about how such an “under” is to be accommodated; see Schiffer 1996, Ludlow 1996).

Hidden-indexical theories all preserve semantic innocence in roughly the same way: there is some entity, whether Russellian proposition or abstract state of affairs, determined by the customary referents of the words of the content-sentence, so the result is compatible with a Davidsonian decrying of any theory which claims that words in attitude ascriptions abandon their customary referents for something else. The “something else” is involved in a different way, a strategy which (17) and (18) illustrate.

Hidden-indexical semantics also offers an alternative formal account of the de re/de dicto distinction. Standardly, the difference is brought out in terms of scope distinctions, as we did in (10). But another possibility is that de re readings are those in which a hidden-indexical refers only to a part of the content-sentence: if Lois believes that her coworker Mary has gone to St. Petersburg, we may point at Mary and say “Lois believes that that woman is in St. Petersburg,” meaning that she believes some way of thinking of the state of affairs, partially labeled “is in St. Petersburg.” This would explain why the awkward locutions in (10) are rarely encountered in ordinary speech and writing.

b. Kripke’s Puzzle

One application of hidden-indexical semantics is to Kripke’s “puzzle about belief” (1979). Kripke doubts that there is a specific problem of interchange of coreferential names in attitude ascriptions, to be resolved by a semantics on which such substitution is fallacious. Rather, he thinks substitutivity problems are a mere symptom of broader anomalies in psychological discourse (“It would be wrong to blame…substitutivity. The reason does not lie in any specific fallacy [for example in (2)] but rather in the nature of the realm being entered,” 1979:157). So he gives examples meant to bring out anomalies even in the absence of substitution.

His main example is that of a subject, Peter, who encounters the same individual under the same name in different contexts and does not realize it was the same person all the time. Suppose Peter goes to a recital by a pianist named Paderewski, and, picking up the name from the recital program, comes to believe on the basis of the performance that Paderewski has musical talent. Later, at a railway station, he observes an individual surrounded by reporters, and someone tells him “That’s Paderewski, the Polish Prime Minister.” Far from connecting the man he sees with the man he heard play, Peter, who believes that no politician has musical talent, remarks out loud, “Ah, a person of no musical talent, then.” But, of course, Ignacy Jan Paderewski, the Prime Minister of Poland after the First World War, was also a celebrated composer and concert pianist.

Kripke wants us to try to answer the question, “Does Peter, or does he not, believe that Paderewski has musical talent?”, and in the course of our attempting to answer it, to realize that no answer can be given, because of “the nature of the realm being entered.” However, from the Fregean perspective, the example is less troubling, as Kripke recognizes (see also Taschek 1988). Peter has two lexical entries for “Paderewski,” in the same way that the present writer has three for “Socrates”—one for the Ancient Greek philosopher, another for the late Brazilian footballer, and a third for the former Portuguese Prime Minister (the latter two individuals had different first names, but I do not know what they are, and I do not know if the first individual had any other name; on the individuation of names, see Kaplan 1990). Of course, the difference between Peter and myself is that the names in Peter’s two lexical entries are coreferential, while the names in my three are, pairwise, not, unless the footballer, on retiring from the game, moved to Portugal and went into politics.

However, an ascriber A may only have one name for Paderewski (one mental file so-labeled), which puts A at a certain expressive disadvantage relative to Peter, if the ability to make an accurate report about Peter’s beliefs requires A to use names which match Peter’s. A would then need two names for Paderewski. But there is a very natural way around this (which Kripke uses himself, in n.37): A can simply say that Peter believes that Paderewski the pianist has musical talent, while Paderewski the statesman does not (Forbes 1990:561). From the perspective of a semantics like that of (17), the appositive uses of “the pianist” and “the statesman” determine different ways of thinking of the single state of affairs that Paderewski had musical talent. And it is only the way of thinking labeled with Peter’s linguistic counterpart of A’s “Paderewski the pianist has musical talent” that he believes: the appositives help us identify which of Peter’s ways of thinking of Paderewski we wish to invoke in our ascriptions. It remains to explain why the major premise that Paderewski the pianist is Paderewski the statesman does not license the inference to “Peter believes that Paderewski the statesman has musical talent.” The explanation would partly recapitulate our discussion of (2), though of course the appositives may bring their own complications.

It is also conceivable that ascribers in the know about Peter’s situation, addressing an audience also in the know, can rely on context to fix which belief is ascribed to Peter using “Paderewski has musical talent”; for instance, if the discussion concerns Peter’s evaluations of various pianists, the possessive description “Peter’s so-labeled way of thinking” is proper, rather than improper, since the other way of thinking, labeled with Peter’s linguistic counterpart of “Paderewski the statesman has musical talent,” will not be in the domain of the context, even if the discussion takes place after the railway-station encounter.

One can therefore resist Kripke’s question whether Peter does or does not believe that Paderewski had musical talent, just as I would resist the question “Was Socrates, or was he not, a chain-smoker?” The footballer was, but (I suppose) the philosopher was not, so absent contextual clues I would require disambiguation of the question: “Are you asking whether Socrates the footballer was a chain-smoker, or Socrates the philosopher?” In the Paderewski case, there is no referential ambiguity, but there is still an ambiguity, or indeterminacy, over which way of thinking of the state of affairs in question is being invoked: “Are you asking whether Peter believes Paderewski the pianist has musical talent, or Paderewski the politician?” would be a perfectly proper response. The explanation why it is perfectly proper is clear enough on hidden-indexical theories, but may not be so on others (see also Soames 2002, Chs. 2, 3).

Obviously, this account only works if there is a viable notion of the sense of a name. For those skeptical about the prospects of such a thing, Fine (2007) offers an alternative treatment of the puzzle. Fine begins with an explanation of the difference between “Superman is Superman” and “Superman is Clark”: in “Superman is Superman,” the two names are coordinated, but not in “Superman is Clark.” One manifestation of this is that someone who wonders whether Superman is Superman thereby demonstrates a failure to grasp what is said, while Lex can wonder whether Superman is Clark without demonstrating any failure of understanding. Fine takes the coordinated/uncoordinated distinction to be of semantic import, so that “Superman is Superman” and “Superman is Clark” have different semantics. His view could therefore be regarded as neo-Fregean, though his account of how the difference arises is quite unlike Frege’s (see Pickel and Rabern 2017 on some questions that arise for Fine’s account here).

Fine then argues that the case of Peter presents us with a puzzle whose solution is to be formulated in terms of this notion of coordination (2007:100–105). The puzzle is that our normal practices of belief-reporting dictate that we report Peter as believing that Paderewski has musical talent, and that we also report him as believing that Paderewski has no musical talent. At the same time, according to Fine, we do not want to make a “composite” report, that Peter believes that Paderewski has musical talent and believes that Paderewski has no musical talent, since this represents Peter as rather unreflective, which is unjustified (more reflection will not help). Yet the composite report is a simple “and”-Introduction inference from the acceptable reports. How can it sensibly be resisted?

Fine’s suggestion (2007:102–3) is that the composite report is unacceptable precisely because the reporter (who is in the know about Peter’s situation) uses “Paderewski” in a coordinated way across the content-sentences of the composite report, while Peter does not use coordinated “Paderewski’s” in giving voice to his two beliefs. But the individual reports are acceptable, taken in isolation: there is nothing to be coordinated in an individual report, so we can simply take at face value Peter’s assertion of “Paderewski has musical talent,” even asserted after he has both entries in his lexicon, and ascribe such a belief to him. Whereas, for the Fregean, if there is nothing in the context to point toward one of “Paderewski the pianist” and “Paderewski the statesman” rather than the other, it will be indeterminate what belief is being ascribed (unless some feature of context settles it). And for the Fregean, the composite report, if it is the conjunction of two determinate ascriptions, is acceptable. Perhaps it makes Peter sound unreflective; but so does “The present writer believes Socrates was a chain-smoker and believes Socrates was not (ever) a chain-smoker,” though as I write it, it is true.

5. Russellianism

At the beginning of section 2, we noted that there is a possible response to the appearance of substitution-failure in (2) according to which the reasoning is not flawed at all: if Superman is Clark and Lois believes Superman can fly, she simply does believe that Clark can fly, even though she would not put it that way. The main motivation for this account is the view of propositions advanced by Russell in his letter to Frege quoted above, according to which Mont Blanc itself, not a way of thinking of it, is the sole constituent the name contributes to the proposition about its height. The locus classicus of this theory is Salmon (1986); other prominent contributions include Soames (1987), Saul (1997), and Braun (1998).

a. Salmon’s Theory

According to Salmon, belief-ascriptions invoke both Russellian propositions and ways of taking or of grasping those propositions. The apparently two-place attitude relation of belief unfolds into a three-place relation, with a position for a variable over ways of grasping. So for A believes p, Salmon offers (1986:111)

(19)
for some way of grasping propositions w, A grasps p by means of w and bel(A,p,w).

The correctness of the substitution inference (2) is immediate from this. If (2b) is true, Lois has a way of grasping the proposition that Superman can fly under which she believes this proposition. Ipso facto, she has a way of grasping the proposition that Clark can fly under which she believes this proposition, for it is the same proposition. Thus, (2c) is also true. Ways of grasping may be like Frege’s ways of thinking in some respects, but they are not what is believed, and they are not meant to determine reference.
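The way (19) validates the substitution can be illustrated with a small sketch. It is only a toy model under stated assumptions: the individual label "k", the way-of-grasping labels, and the function names are invented for illustration and are not drawn from Salmon. Because both names contribute the same individual, the two content-sentences yield one and the same proposition, and the existential quantification over ways of grasping does the rest.

```python
# Hypothetical model: names refer directly to individuals, so a Russellian
# proposition is just an (individual, property) pair.
REFERENT = {"Superman": "k", "Clark": "k"}  # "k" is an illustrative label


def proposition(name, prop):
    """Russellian proposition expressed by '<name> <prop>'."""
    return (REFERENT[name], prop)


P = proposition("Superman", "can fly")

# Triples (agent, proposition, way of grasping); the way labels are invented.
GRASPS = {("Lois", P, "superman-way"), ("Lois", P, "clark-way")}
BEL = {("Lois", P, "superman-way")}  # the three-place bel relation of (19)


def believes(agent, p):
    """(19): for some way w, agent grasps p by means of w and bel(agent, p, w)."""
    return any(
        (a, q, w) in BEL
        for (a, q, w) in GRASPS
        if a == agent and q == p
    )


two_b = believes("Lois", proposition("Superman", "can fly"))  # (2b)
two_c = believes("Lois", proposition("Clark", "can fly"))     # (2c)
print(two_b, two_c)  # prints: True True
```

In this model nothing in the truth-condition of the ascription records which way of grasping is involved, which is why the difference Lois herself would notice has to be located outside the semantics.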

Also note that Fine’s concern to avoid the composite ascription “Peter believes Paderewski has musical talent and believes Paderewski has no musical talent” is allayed, since the composite ascription is harmless on Salmon’s theory. For it involves two existential quantifiers over ways of grasping: there is some way of grasping the proposition that Paderewski has musical talent under which he believes it (more accurately, bels it), and some way of grasping the proposition that Paderewski has no musical talent, under which he believes it. The second way of grasping is no mere negation of the first, so there is nothing that imputes an intellectual deficiency to Peter (Salmon 1986:130–1).

The main question this account raises is why it seems so clear that there is a way of understanding (2) on which it is invalid. Salmon answers this question by distinguishing between semantically encoded and pragmatically imparted information (Salmon 1986:78). As far as what is semantically encoded is concerned, (2b) and (2c) are the same. But they differ over what they pragmatically convey, and those who think (2b) and (2c) can have opposite truth-values are mistakenly projecting the pragmatic difference onto the semantics. For example, it may be that (2c) pragmatically conveys that Lois believes that “Clark can fly” expresses a truth and that she would assent to it if asked. Loading this into the semantics would be like the mistake made by students in beginning logic classes when they reject “all Fs are G” on being informed that some Fs are G. The defeasible “not all” conveyed pragmatically by “some” obscures their view of the consistency of the two quantified statements.

A different explaining-away of the appearance of falsity in (2c) is provided by Braun (1998). Braun notes that since “Superman can fly” and “Clark can fly” express the same Russellian proposition, (2b) and (2c) express the same Russellian proposition as well. But someone judging (2b) and (2c) may take their common content in one way when judging (2b) and in another when judging (2c), which makes it at least intelligible that they resist the substitution inference.

So, there are things the Russellian can say about conversations among the screenwriters for Superman II, when they agree that at the start of the movie Lois should be shown beginning to suspect that Clark is Superman, and should then confirm that he is, by tricking him when he is personified as Clark into giving himself away. That the screenplay will thereby have Lois beginning to suspect that Clark is Clark, and then tricking him into revealing it, is overlooked by the writers: it never occurs to them (as a Russellian would say) that these are the same identity-proposition, taken in different ways.

Russellian propositions are “coarse-grained” compared to Fregean ones, for the latter are individuated in such a way that the propositions that Clark is Clark and that Clark is Superman are two. But once one accepts the distinction between proposition and way of taking the same, it is not clear what limits there are on the coarseness of grain that may be tolerated. There seems to be no obstacle to an unstructured conception of propositions as classes of possible worlds (Lewis 1979; Stalnaker 1984, 1987), and conceivably, it is defensible that true and false are the only propositions. (The same question about how much coarseness of grain is tolerable arises for hidden-indexical theorists who postulate indexically specified ways of thinking of Russellian propositions.)

b. Commonsense Psychology

Another question for Russellianism stems from the main purpose we have in ascribing attitudes: to arrive by abduction at explanations of behavior based on psychological generalizations (“those who believe Superman is present feel safer,” Rupert 2008:83). Someone who (i) feels safer if he believes that Superman is present, and (ii) sees that Clark is present, may still behave nervously or flee, which on the face of it is hard to understand if seeing that Clark is present is the same thing as seeing that Superman is present. Similarly, there are general normative principles of rationality such as

(20)
Anyone who believes a conditional proposition and its antecedent ought to infer its consequent.

This is not to say that such a person ought to believe its consequent: once the consequent is inferred, the thinker has various options, such as rejecting the conditional, or its antecedent, as alternatives to accepting its consequent. But a person who, at a minimum, does not make the inference, betrays a failure of rationality. However, Lex may believe the proposition that if Superman is nearby, then he, Lex, should hide. Lex may then notice and so come to believe that Clark is nearby, but take no steps to conceal himself. Yet if believing that Clark is nearby is the same thing as believing that Superman is nearby (bel-ing a certain proposition via some way of taking it), it seems that we should convict Lex of a failure of rationality, in that he remains unmoved by his two beliefs and so has apparently failed to use modus ponens. (The literature on logic, rationality, and closure under consequence is relevant here; see, for instance, Jago 2009, MacFarlane 2018, Staffel 2018.)

In response to this, Braun (2000) argues that psychological explanation employs ceteris paribus (other-things-equal) principles. For example, even in a case where it is clear to Lex that Superman is nearby, his making no attempt to hide does not mean, say, that he no longer believes he should hide if Superman is nearby, or no longer trusts modus ponens. He will only hide, or try to hide, other things equal. And if he already knows that he is in a location where there are no hiding places, his motivation to seek one is thereby overridden.

So far, this is just commonsense psychology. But according to Braun, there is a special way in which things might not be equal: although a conditional and its antecedent are believed, the antecedent as it occurs as minor premise of the modus ponens and the antecedent as it occurs as a constituent of the major premise may not be grasped in matching ways (2000:209). And if they are not, grounds for anticipating the expected behavior are removed. This means the principle stated in (20) is incorrect as it stands: the correct version would require a “matching ways” restriction. So there is no lapse of rationality on Lex’s part when he fails to use modus ponens in the case where he notices Clark is nearby, and so believes that Superman is nearby, and also believes he should hide if Superman is nearby. For the constituent corresponding to “Superman is nearby” in the way he takes the conditional is different from the way he takes the proposition that Superman is nearby when he comes to believe it once he has noticed that Clark is nearby. Braun admits (2000:234) that he cannot see any other way in which (20) is in need of qualification, so there is a whiff of the ad hoc about his response; but it does allow for a version of (20) acceptable to Russellians.
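The contrast between (20) as stated and a version with a “matching ways” restriction can be sketched as follows. This is a rough illustration, not Braun’s own formulation: the class `Grasped`, the two function names, and the way labels are assumptions introduced here, and the sketch presupposes that the agent already believes both the conditional and its antecedent.

```python
from typing import NamedTuple


class Grasped(NamedTuple):
    proposition: tuple  # Russellian proposition, e.g. (individual, property)
    way: str            # way of taking it; the labels below are invented


def ought_to_infer_unrestricted(antecedent_in_conditional, minor_premise):
    """(20) as stated: only the propositions need to coincide."""
    return antecedent_in_conditional.proposition == minor_premise.proposition


def ought_to_infer_matching(antecedent_in_conditional, minor_premise):
    """(20) with a 'matching ways' restriction: proposition and way must both coincide."""
    return antecedent_in_conditional == minor_premise


nearby = ("k", "is nearby")  # one proposition, whichever name is used
in_conditional = Grasped(nearby, "superman-way")  # 'if Superman is nearby, I should hide'
noticed = Grasped(nearby, "clark-way")            # Lex notices that Clark is nearby

print(ought_to_infer_unrestricted(in_conditional, noticed))  # prints: True
print(ought_to_infer_matching(in_conditional, noticed))      # prints: False
```

On the unrestricted version Lex counts as failing a rational requirement; on the restricted version he does not, because the antecedent is taken in different ways in the two premises.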

c. Saul on Simple Sentences

Another prominent defense of Russellianism, due to Saul (1997a, 1997b, 1999, 2007), focuses on “simple sentences,” sentences where we have a strong intuition of substitution-resistance, but there is no sense-invoking expression in the sentence whose semantics might underwrite the intuition. We have already noted one example, (21a) below. The other examples in (21) also manifest the phenomenon:

(21)
a. Clark is never around when Superman is.
b. Clark went into the phone booth and Superman came out.
c. Superman is more successful with women than Clark is.

There is a clear challenge to the Fregean in these examples. The inference in (2) fails, according to the Fregean, because of the semantics of “believes,” which requires its complement content-sentence to behave in a special way: to switch its reference, to make a double contribution to the truth-condition of the whole ascription, or to do whatever else one’s favored account of hyperintensionality proposes. But in the examples in (21), there is no expression which might force analogous behavior on the part of the names. Yet substitution of one name for the other in (21a) and (21c) produces something impossible, so, despite their apparent truth, (21a) and (21c) must be false. And substitution in (21b) seems to alter the meaning enough that the inference fails to be truth-preserving: (21b) appears to require a change of clothing or role, but a single substitution produces something which does not. These examples show that intuitions of substitution-failure do not depend on the presence of psychological vocabulary. And in the absence of anything else to explain them, they show that such intuitions must be mistaken.

Why, then, put any store in corresponding intuitions about (2)? However, hidden-indexical theorists can justify substitution-failure for the examples in (21) if they are willing to extend the scope of hidden-indexical introduction beyond attitude verbs. For instance, perhaps what we mean by (21b) is something along the lines of “Clark, so-attired, went into the phone booth, and Superman, so-attired, came out.” The “so” here accounts for substitution-failure as usual, since the names are associated with distinct ways of dressing: the “Superman” way (dressing as Superman) and the “Clark” way. For other examples, something more general than ways of dressing is needed, and this affords us an opportunity to make a partial unification of the cases of hyperintensional and simple sentences. A more general concept is that of personification, and using it, for (21a) we would have

(22)
Clark, so-personified, is never around when Superman, so-personified, is.

We have the same element of personification in the explanation of why fear of Superman is not the same thing as fear of Clark: to fear Superman, so-personified, is a very different thing from fearing Clark, so-personified (Forbes 2006:166–74).

A possible Fregean view, then, is that (22) is the literal meaning of (21a). According to Braun and Saul (2002) however, the intuition that (21a) can be true rests on some kind of confusion between it and the likes of (22); the latter certainly resists substitution, but differs in meaning from the former precisely because of that. Why would we suffer from such a confusion? Here Braun and Saul make use of the mental files metaphor, but they do not regard it as part of an account of difference in semantic content (see also Rupert 2008). We put information we would naturally express with one name in the file labeled with that name, and information we would naturally express with the other name goes into the file that other name labels. Then in assessing (21c), say, we compare the romantic history recounted in the entries in one file with that recounted in the other, and this task diverts our attention from the fact that the files concern the same individual. The attention-diverting element then explains why we judge (21c) to be true rather than impossible. Braun and Saul draw a parallel with the “Moses illusion” (2002:15–16), in which a large majority of subjects, when asked “How many animals of each kind did Moses take into the Ark?”, respond “Two,” partly because the “how many?” question diverts their attention from their knowledge that in the Bible it was Noah who took animals into his Ark (perhaps this happened to the reader just now).

But such an account cannot apply to speakers and writers who knowingly produce sentences like those in (21). For example, in a review of books about Shostakovich, the historian Orlando Figes wrote, “Shostakovich always signalled his connections to the classical traditions of St. Petersburg, even if he was forced to live in Leningrad” (The New York Review of Books, June 10, 2004, p.14). Far from having his attention somehow diverted from the fact that St. Petersburg is Leningrad, Figes is consciously writing for an audience aware of the identity, since only they will appreciate the rhetorical punch of his remark. And he will certainly resist an editor who proposes to replace “Leningrad” with a second “St. Petersburg,” even though there is nothing hyperintensional about being forced to live somewhere.

Another example comes from an article on the transformation of Eric Blair into George Orwell (Lingua Franca vol.9 #9). The writer of the article is hardly diverted from the fact that Blair is Orwell, since his topic is exactly how one personification came to be abandoned for another in the same individual:

Diffident in private, Blair so feared failure in the literary marketplace that he invented a pseudonym for the book he wrote based on his diaries, Down and Out in Paris and London. Criticism would be directed at George Orwell, not Eric Blair. But since the book, when published in 1933, was a literary success, Eric Blair became George Orwell.

Perhaps, “criticism would be directed at George Orwell, not Eric Blair” is hyperintensional, but “Eric Blair became George Orwell” is not; it clearly resists substitution of “George Orwell,” and it would be absurd to say that the writer only makes the claim because he has allowed himself to lose sight of the fact that Blair and Orwell are the same person.

A third example: a New Yorker cartoon in which Superman, so-personified, is talking to his therapist, and reports, “I’m doing super, but Clark can’t find a paper that’s hiring.” It is unclear who the cartoonist thought would find this funny, but knowing that it is the same person is required to get the joke.

These examples and others (including my favorite, in The New York Times’s “The Philosopher Stripper” article—see Forbes 2006:167–8) show that cases like (21)’s occur outside fiction, and that those who create them do so in full awareness of the relevant identity. That (21a) means what (22) means is certainly the most straightforward explanation of why (21a) is perfectly natural. So substitution-resistance in some simple sentences does not provide as great a threat to the claim of substitution-resistance in (2) as might at first seem, since the mechanisms producing the substitution-resistance may be seen as fundamentally the same in the two cases.

d. Richard’s Phone Booth

The final argument for Russellianism to be considered here is the well-known phone booth case in Richard (1983); I have updated it to cell phones. This example exploits the context-dependence of indexical expressions such as “I,” “here,” and “now.” The phenomenon of indexicality was one on which Frege had pronounced views: he wrote about “I” that (Frege 1967:25–6)

…everyone is presented to himself in a particular and primitive way, in which he is presented to no-one else. So when Dr. Lauben thinks he has been wounded, he will probably take as a basis this primitive way in which he is presented to himself. And only Dr. Lauben can grasp thoughts determined in this way. But now Lauben may want to communicate with others. He cannot communicate a thought which he alone can grasp. Therefore, if he now says “I have been wounded,” he must use “I” in a sense which can be grasped by others, perhaps in the sense of “he who is speaking to you at this moment”….

Whatever one thinks of the last remark, the idea that for each thinker x, “I” can be used by x to express a private first-person way of thinking of x, is one which has persisted since Frege proposed it, and is of course implicitly present in much of the history of philosophy, for example, in Descartes’ cogito. (For further discussion of first-person and more generally indexical and demonstrative thought, see Anscombe 1974, Castaneda 1968, Evans 1981, Lewis 1979, Magidor 2015, Peacocke 1983, 2008 Ch. 3, and Perry 1977, 1979.)

An example in Perry (1979) provides a dramatic illustration. Perry is pushing a grocery cart around the aisles in a store when he comes across a trail of sugar on the floor. He thinks “that person is making a mess” and sets off in pursuit to let them know that a bag of sugar in their cart has burst (“that person” is an example of “deferred ostension,” referring via the sugar trail to the person whose cart the sugar bag is in; see further Borg 2002). His pursuit brings him back to the same point in the store, and he realizes, “I am the one who is making a mess.” This appears to be a new thought, and a Fregean would say it differs from “that person is making a mess” in view of the difference between Perry’s demonstrative way of thinking expressed by “that person” and his first-person way of thinking, “I.”

Fregean first-person ways of thinking are private in the sense that if x and y are distinct thinkers, y cannot employ x’s “I”-way of thinking in y’s thoughts, certainly not as a way of thinking of y. However, this does not stop y from ascribing attitudes to x that require x to be employing x’s own first-person way of thinking (see Peacocke 1981, Percus and Sauerland 2003). y might say that Perry has just realized he himself is the one making a mess, which is to make the ascription “Perry has just so-realized that he himself is the one making a mess.” The ability to describe a Fregean proposition as one that is a special way of thinking of the state of affairs that Perry is making a mess does not imply that the constituents of that proposition are available to the ascriber to use in his or her own thoughts.

But de dicto ascriptions may not always be possible. If Perry says of some store employee, “she knows that I made the mess,” he is not ascribing knowledge to her of the proposition that is his “I made the mess”-labeled way of thinking of the state of affairs that Perry made the mess. From a Fregean point of view, the most Perry can mean is the de re “I am known by her to have made the mess,” since the store employee will probably have identified the culprit demonstratively, “that guy is making the mess,” after following the sugar trail. Perry cannot even ascribe a de dicto demonstrative belief to the employee using “she believes that guy is making a mess” pointing at his own reflection in a mirror. Ascribers using a demonstrative in the content-sentences of their ascriptions are expressing their own demonstrative ways of thinking of the relevant object, not characterizing the subject’s thought, which means that the ascriptions are de re (Forbes 1987:13–15).

Let us now return to Richard’s example. It involves switching contexts (“context-hopping”) and uses Kaplan’s (1989) apparatus to manage context-dependence. In Kaplan’s semantics for context-dependent expressions, sentences are evaluated taken in a context and with respect to a possible world, the circumstances of evaluation (1989:544). A context is a sequence of entities which provides referents for the indexicals and demonstratives in a sentence S and so determines the Russellian proposition S expresses. At a minimum, we would have an agent, a time, a place, and an addressee, to be the referents of “I,” “now,” “here,” and “you,” and an object x to be the referent of a demonstrative or demonstrative pronoun (Kaplan uses “agent” rather than “speaker” to allow for a sentence such as “I am not speaking right now” to be true with respect to silent circumstances). When contexts are systematically related, the truth-values of sentences given fixed circumstances are systematically related. For example, suppose that in circumstances w, X is listening to Y at noon Mountain Time (MT), 11/16/17, and let c be a context with X as its agent, noon 11/16/17 MT as its time, and Y as its addressee.

Then the sentence “I am now listening to you” is true taken in c with respect to w. But if we obtain a new context c* from c by switching agent and addressee, then “I am now listening to you” is false taken in c* with respect to w, since Y is speaking, not listening, to X at noon MT 11/16/17, in w. However, “you are now listening to me” is true taken in c* with respect to w, since “I am now listening to you” taken in c identifies the same state of affairs as “you are now listening to me” taken in c*, the state of affairs that X is listening to Y at noon MT, 11/16/17.
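The systematic relation between c and c* can be made concrete with a minimal sketch. It is a simplification under stated assumptions: contexts carry only an agent, an addressee, and a time; the circumstances w are modeled as a set of (listener, speaker, time) facts; and the names `Context`, `swap`, `content`, and `true_in` are introduced here for illustration rather than taken from Kaplan.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    agent: str      # referent of "I" and "me"
    addressee: str  # referent of "you"
    time: str       # referent of "now"


# Circumstances of evaluation w: X is listening to Y at noon MT, 11/16/17.
W = {("X", "Y", "noon MT 11/16/17")}


def swap(c):
    """Obtain c* from c by switching agent and addressee."""
    return replace(c, agent=c.addressee, addressee=c.agent)


def content(sentence, c):
    """State of affairs (listener, speaker, time) the sentence identifies, taken in c."""
    if sentence == "I am now listening to you":
        return (c.agent, c.addressee, c.time)
    if sentence == "you are now listening to me":
        return (c.addressee, c.agent, c.time)
    raise ValueError("sentence not covered by this sketch")


def true_in(sentence, c, w):
    """Truth of the sentence taken in context c with respect to circumstances w."""
    return content(sentence, c) in w


c = Context(agent="X", addressee="Y", time="noon MT 11/16/17")
c_star = swap(c)

print(true_in("I am now listening to you", c, W))         # prints: True
print(true_in("I am now listening to you", c_star, W))    # prints: False
print(true_in("you are now listening to me", c_star, W))  # prints: True
```

The last line comes out true because the sentence taken in c* identifies exactly the state of affairs that “I am now listening to you” identifies taken in c.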

In the circumstances w of Richard’s example, a man a is in his apartment, talking to a woman o on his cell phone. a is also looking out the window onto the street below, where he sees a woman talking on her cell phone. It does not occur to a that the woman he is talking to on his phone might be the woman he is watching through his window; but in fact both are o. Then a notices a man in the street acting suspiciously, apparently trying to sneak up on o from behind. In this situation, a could use “she is in danger” to make a sincere assertion to o on his phone about what he sees. But a would not use “you are in danger” to make a sincere assertion to o speaking into his phone (a might instead open the window and shout down to the street). So in the context c with a as agent, o as phone addressee, and o as the referent of “she,” and taking at face value the facts about what a would and would not say with which referential intention as indicative of what a does and does not believe, the following appear to be true:

(23)
a. I believe she is in danger.
b. I do not believe you are in danger.

But Richard argues (1990:117–8) that (23b) is in fact false; in other words, that a does have a belief he could express by asserting into his phone “you are in danger” with the intention to address the person he is talking to. For if we now consider a context c* in which the woman o is agent (and, if we like, a is addressee), the truth of (23a) in c guarantees the truth of

(24)
The person watching me believes I am in danger

in c*. Consequently, if we switch back to the context c,

(25)
The person watching you believes you are in danger

is true.

But there is a true identity in c which entails the falsity of (23b), namely,

(26)
I am the person watching you.

By =E, we have the anti-Fregean conclusion

(27)
I believe you are in danger

now seen to be true in c after all.

By Russellian lights, the reasoning is impeccable. But should it move the Fregean? For the Fregean, attitude ascriptions can be ambiguous between de re and de dicto construals, and this applies to (27) in particular. Does the derivability of (27) really show that in c the protagonist a can express a belief of his by asserting “you are in danger” into his phone, using “you” with the intention to refer to the woman he is talking to? Perhaps all that the derivation establishes is the truth of the de re reading of (27), “you are someone I believe to be in danger.” Note that to say that (27)’s de re reading is true in c is not to say that the agent of c believes that it is true, so a still has no grounds to say “you are in danger” into his phone.

(23a) can be understood de re as “she is someone I believe to be in danger,” and if the argument is construed de re throughout, the reasoning is correct. But of course the de re conclusion is not a problem for the Fregean. A de dicto conclusion might well be problematic, but to get one we must at least start with the reading of the premise (23a) on which it is a true de dicto self-ascription. Then, if the de re but not the de dicto reading of (27) is true, there must be some step in which there is a de dicto to de re switch. The switch appears to occur in moving from (23a) to (24).

(24) is relevantly similar to an ascription of Perry’s, “the store employee knows that I made the mess.” Here Perry is not ascribing knowledge of the proposition that is his “I made the mess”-labeled way of thinking of the state of affairs that Perry made the mess. By the same token, we should not construe (24) as o’s making an ascription to a of belief in the proposition that o expresses by “I am in danger.” For that way of thinking of the state of affairs that o is in danger is simply unavailable to a, since it involves o’s first-person way of thinking of herself. The truth of (24), then, is no more than the truth of “I am someone who the man watching me believes is in danger,” whose truth in c* is a consequence of (23a)’s truth in c. Thus, the de re conclusion follows from the de dicto starting point, but, to repeat, the de re conclusion is acceptable to the Fregean, since it is silent on what way of thinking the man watching o employs in his “she is in danger” thought.

Richard considers this kind of response (1990:128–32; see also 190–6 for his own critique of his earlier argument) and rejects it. This is partly because he thinks the response imputes opacity to subject-position in ascriptions, and partly because he is generally skeptical about the de re/de dicto distinction. But the above criticism does not seem to involve any opacity in subject-position, that is, a failure of =E when applied to the ascriber, for the use of (26) is legitimate, there is no single context in which (23a)’s “I” and (24)’s “the man watching me” are coreferential, and the content-sentence is different in (23a) and (24). Certainly, the reference of “I” in c is the same as the reference of “the man watching me” in c*, but this does not threaten the use of =E if the content-sentence is fixed and interpreted uniformly, in Fine’s sense: “the man who is agent of c believes she is in danger” and “the man who is watching the agent of c* believes she is in danger” have the same truth-value if “she” is unequivocal, and in the second ascription, “she” is not anaphoric upon the embedded “the agent of c*.”

As for general skepticism about de re/de dicto, the reader may refer to the discussion in section 2. Relevant examples arise in extensions of Richard’s case, where the apparent truth of certain statements is easily explained using the distinction, but not without. Suppose that the suspiciously behaving man turns out to be a harmless drunk who staggers on by. The phone conversation then continues in such a way that a soon realizes that the woman he is talking to is the woman he was watching. a may then say such things to o over the phone as “so it was you I thought was in danger” or “I thought you were in danger but didn’t say anything because I didn’t realize it was you I was watching.” These are perfectly natural remarks and seem to be true along with (23b). Employment of the de re/de dicto distinction provides a straightforward explanation of how they can all be true together. So there is no need to take on the obligation burdening the Russellian, of always having to explain away the appearance of truth.

6. References and Further Reading

Almog, Joseph, John Perry, and Howard Wettstein (eds.) 1989. Themes from Kaplan. Oxford University Press.
Almog, Joseph and Paolo Leonardi (eds.) 2009. The Philosophy of David Kaplan. Oxford University Press.
Anderson, C. Anthony. 1980. Some New Axioms for the Logic of Sense and Denotation: Alternative (0). Noûs 14:217–234.
Anscombe, Elizabeth. 1974. The First Person. In Mind and Language: The Wolfson Lectures, edited by Samuel Guttenplan, 45–65. Oxford University Press.
Bach, Kent. 1997. Do Belief Reports Report Beliefs? Pacific Philosophical Quarterly 78:215–241.
Bealer, George. 1993. A Solution to Frege’s Puzzle. In Philosophical Perspectives 7: Language and Logic, edited by James Tomberlin, 17–60. Ridgeview.
Berto, Francesco. 2013. Impossible Worlds. In Stanford Encyclopedia of Philosophy, edited by Edward Zalta. https://plato.stanford.edu/
Bjerring, Jens Christian, and Mattias Skipper Rasmussen. 2018. Hyperintensional Semantics: A Fregean Approach. Forthcoming in Synthese.
Borg, Emma. 2002. Pointing at Jack, Talking about Jill: Understanding Deferred Uses of Demonstratives and Pronouns. Mind & Language 17:489–512.
Braun, David. 1998. Understanding Belief Reports. The Philosophical Review 107:555–595.
Braun, David. 2000. Russellianism and Psychological Generalizations. Noûs 34:203–236.
Braun, David, and Jennifer Saul. 2002. Simple Sentences, Substitution, and Mistaken Evaluations. Philosophical Studies 111:1–41.
Brogaard, Berit. 2008. Attitude Ascriptions: Do You Mind the Gap? Philosophy Compass, Epistemology 3:93–118.
Burge, Tyler. 1978. Belief and Synonymy. The Journal of Philosophy 75:119–138.
Burge, Tyler. 1979. Sinning against Frege. The Philosophical Review 88:398–432.
Castaneda, Hector-Neri. 1968. On the Logic of Attributions of Self-Knowledge to Others. The Journal of Philosophy 65:439–456.
Chalmers, David. 2002. On Sense and Intension. In Sense and Direct Reference, edited by Matthew Davidson, 605–651. McGraw Hill.
Chalmers, David. 2011. Propositions and Attitude Ascriptions: A Fregean Account. Noûs 45:595–639.
Church, Alonzo. 1950. On Carnap’s Analysis of Statements of Assertion and Belief. Analysis 10:97–99. Also in Linsky (ed.) 1971, 168–170.
Church, Alonzo. 1951. A Formulation of the Logic of Sense and Denotation. In Structure, Method and Meaning: Essays in Honor of Henry M. Sheffer, edited by P. Henle, H. M. Kallen and S. K. Langer, 3–24. Liberal Arts Press.
Corazza, Eros. 2010. From “Giorgione” Sentences to Simple Sentences. Journal of Pragmatics 42:544–556.
Cresswell, Max. 1985. Structured Meanings. The MIT Press.
Crimmins, Mark. 1992. Talk About Belief. The MIT Press.
Crimmins, Mark, and John Perry. 1989. The Prince and the Phone Booth: Reporting Puzzling Beliefs. The Journal of Philosophy 86:685–711.
Davidson, Donald. 1969. On Saying That. In Davidson and Harman (eds.), 73–91.
Davidson, Donald, and Gilbert Harman (eds.) 1969. Words and Objections: Essays on the Work of W. V. Quine. Reidel.
Davies, Martin. 1981. Meaning, Quantification and Necessity. Routledge.
Davies, Martin, and Lloyd Humberstone. 1980. Two Notions of Necessity. Philosophical Studies 38:1–30.
Dennett, Daniel. 1982. Beyond Belief. In Thought and Object, edited by Andrew Woodfield, 1–95. Oxford University Press.
Donnellan, Keith. 1966. Reference and Definite Descriptions. The Philosophical Review 75:281–304.
Donnellan, Keith. 1974. Speaking of Nothing. The Philosophical Review 83:3–30.
Dummett, Michael. 1973. Frege: Philosophy of Language. Duckworth.
Evans, Gareth. 1973. The Causal Theory of Names. Proceedings of the Aristotelian Society, Supplementary Volume 47:187–208. Also in Evans 1985, 1–24.
Evans, Gareth. 1981. Understanding Demonstratives. In Evans 1985, 291–321. Oxford University Press.
Evans, Gareth. 1985. Collected Papers, edited by Antonia Phillips. Oxford University Press.
Fara, Delia. 2001. Descriptions as Predicates. Philosophical Studies 102:1–42.
Field, Hartry. 1978. Mental Representation. Erkenntnis 13:9–53.
Fine, Kit. 1989. The Problem of De Re Modality. In Almog, Perry and Wettstein (eds.) 1989, 197–272.
Fine, Kit. 2007. Semantic Relationism. Blackwell.
Forbes, Graeme. 1987. Indexicals and Intensionality: A Fregean Perspective. The Philosophical Review 96:3–31.
Forbes, Graeme. 1990. The Indispensability of Sinn. The Philosophical Review 99:535–563.
Forbes, Graeme. 1996. Substitutivity and the Coherence of Quantifying In. The Philosophical Review 105:337–72.
Forbes, Graeme. 2006. Attitude Problems. Oxford University Press.
Forbes, Graeme. 2018. Content and Theme in Attitude Ascriptions. In Non-Propositional Intentionality, edited by Alex Grzankowski and Michelle Montague. Oxford University Press.
Fox, Chris, and Shalom Lappin. 2005. Foundations of Intensional Semantics. Basil Blackwell.
Frege, Gottlob. 1892. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik 100:25–50. Translated as “On Sense and Reference,” in Translations from the Philosophical Writings of Gottlob Frege, edited by Peter Geach and Max Black, 1970. Basil Blackwell.
Frege, Gottlob, and Bertrand Russell. 1904. Selection from the Frege-Russell Correspondence. In Salmon, N. and Scott Soames (eds.), 56–57.
Frege, Gottlob. 1967. The Thought: A Logical Enquiry. In Philosophical Logic, edited by Peter Strawson, 17–38. Oxford University Press.
Geach, Peter. 1969. The Perils of Pauline. Review of Metaphysics 23:287–300.
Gluer, Kathrin, and Peter Pagin. 2006. Proper Names and Relational Modality. Linguistics and Philosophy 29:507–535.
Gluer, Kathrin, and Peter Pagin. 2012. General Terms and Relational Modality. Noûs 46:159–199.
Heim, Irene, and Angelika Kratzer. 1998. Semantics in Generative Grammar. Basil Blackwell.
Jago, Mark. 2009. Logical Information and Epistemic Space. Synthese 167:327–341.
Jespersen, Bjorn. 2003. Why the Tuple Theory of Structured Propositions Isn’t a Theory of Structured Propositions. Philosophia 31:171–183.
Kadmon, Nirit. 2001. Formal Pragmatics. Blackwell.
Kaplan, David. 1969. Quantifying In. In Davidson, D. and Harman G. (eds.), 206–242; also in Linsky, L. (ed.), 112–144. Page references here are to the Linsky reprint.
Kaplan, David. 1986. Opacity. In The Philosophy of W. V. Quine, edited by Lewis Edward Hahn and Paul Arthur Schilpp, 229–289. Open Court.
Kaplan, David. 1989. Demonstratives. In Joseph Almog, John Perry and Howard Wettstein (eds.) 1989, 481–563.
Kaplan, David. 1990. Words. Proceedings of the Aristotelian Society 64:93–117.
Kazmi, Ali Akhtar. 1987. Quantification and Opacity. Linguistics and Philosophy 10:77–100.
Kripke, Saul. 1972. Naming and Necessity. In Semantics of Natural Language, edited by Donald Davidson and Gilbert Harman, 252–355. Reidel. Republished with an introduction as Naming and Necessity by Saul Kripke, Harvard University Press 1980. Page references here are to the 1980 book.
Kripke, Saul. 1979. A Puzzle about Belief. In Meaning and Use, edited by Avishai Margalit, 239–283. Reidel. Also in Salmon and Soames 1988: 102–148, and in Kripke, S. (ed.) 2011, 125–161. Page references in this article are to the last of these.
Kripke, Saul. 2001. Frege’s Theory of Sense and Reference. In Kripke, S. (ed.) 2011, 254–291.
Kripke, Saul. 2008. Unrestricted Exportation and Some Morals for the Philosophy of Language. In Kripke, S. (ed.) 2011, 322–350.
Kripke, Saul. 2011. Philosophical Troubles. Oxford University Press.
Kvart, Igal. 1984. The Hesperus-Phosphorus Case. Theoria 50:1–35.
Lewis, David. 1979. Attitudes de dicto and de se. The Philosophical Review 88:513–543.
Linsky, Leonard. 1967. Referring. Humanities Press.
Linsky, Leonard, (ed.) 1971. Reference and Modality. Oxford University Press.
Larson, Richard K., and Peter Ludlow. 1993. Interpreted Logical Forms. Synthese 95:305–355.
Lawlor, Krista. 2005. Confused Thought and Modes of Presentation. The Philosophical Quarterly 55:21–36.
Loar, Brian. 1972. Reference and Propositional Attitudes. The Philosophical Review 81:43–62.
Ludlow, Peter. 1996. The Adicity of “Believes” and the Hidden Indexical Theory. Analysis 56:97–101.
MacFarlane, John. 2018. In What Sense (If Any) is Logic Normative for Thought? in translation, in Modern Logic: Its Subject Matter, Foundations and Prospects, edited by D. Zaitsev, 345–383. Forum.
Magidor, Ofra. 2015. The Myth of the De Se. In Philosophical Perspectives 29: Epistemology, edited by John Hawthorne and Jason Turner, 249–283. Wiley Blackwell.
Maier, Emar. 2015. Parasitic Attitudes. Linguistics and Philosophy 38:205–236.
Marcus, Ruth Barcan. 1961. Modalities and Intensional Languages. Synthese 13:303–322. Also in Marcus 1993, 3–35.
Marcus, Ruth Barcan. 1962. Interpreting Quantification. Inquiry 5:252–259.
Marcus, Ruth Barcan. 1975. Does the Principle of Substitutivity Rest on a Mistake? In The Logical Enterprise, edited by Alan Anderson, Ruth Barcan Marcus and Richard Martin, 31–38. Yale University Press. Also in Marcus 1993, 101–109.
Marcus, Ruth Barcan. 1993. Modalities. Oxford University Press.
Mates, Benson. 1952. Synonymity. In Semantics and the Philosophy of Language, edited by Leonard Linsky, 111–136. University of Illinois Press.
Millikan, Ruth. 2000. On Clear and Confused Ideas. Cambridge University Press.
Moltmann, Friederike. 2003. Propositional Attitudes without Propositions. Synthese 135:70–118.
Moltmann, Friederike. 2008. Intensional Verbs and Their Intentional Objects. Natural Language Semantics 16 (3):239–270.
Moltmann, Friederike. 2017. Cognitive Products and Semantics of Attitude Verbs and Deontic Modals. In Act-Based Conceptions of Propositional Content, edited by Friederike Moltmann and Mark Textor. Oxford University Press.
Moss, Sarah. 2018. Probabilistic Knowledge. Oxford University Press.
Muskens, Reinhardt. 2005. Sense and the Computation of Reference. Linguistics and Philosophy 28:473–504.
Pagin, Peter, and Dag Westerståhl. 2010. Pure Quotation and General Compositionality. Linguistics and Philosophy 33:381–415.
Parsons, Terence. 1981. Frege’s Hierarchies of Indirect Senses and the Paradox of Analysis. In Midwest Studies in Philosophy Vol. VI: The Foundations of Analytic Philosophy, edited by P. A. French, T. Uehling and H. Wettstein, 37–58. Minnesota University Press.
Parsons, Terence. 2009. Higher-Order Senses. In Almog, J., and Leonardi, P. (eds.), 45–59.
Partee, Barbara. 2003. Privative Adjectives: Subsective Plus Coercion. In Presuppositions and Discourse. Essays offered to Hans Kamp, edited by Rainer Bäuerle, Uwe Reyle and Thomas Ede Zimmerman. Elsevier.
Peacocke, Christopher. 1981. Demonstrative Thought and Psychological Explanation. Synthese 49:187–217.
Peacocke, Christopher. 1983. Sense and Content. Oxford University Press.
Peacocke, Christopher. 2008. Truly Understood. Oxford University Press.
Peacocke, Christopher. 2009. Frege’s Hierarchy: A Puzzle. In Almog, J., and Leonardi, P. (eds.), 159–186.
Percus, Orin, and Uli Sauerland. 2003. On the LFs of Attitude Reports. Proceedings of Sinn und Bedeutung 7:228–42.
Perry, John. 1977. Frege on Demonstratives. The Philosophical Review 86:474–497.
Perry, John. 1979. The Problem of the Essential Indexical. Noûs 13:3–31.
Pickel, Bryan, and Brian Rabern. 2017. Does Semantic Relationism Solve Frege’s Puzzle? The Journal of Philosophical Logic 46:97–118.
Predelli, Stefano. 2010. Substitutivity, Obstinacy, and the Case of Giorgione. The Journal of Philosophical Logic 39:5–21.
Quine, W. V. 1956. Quantifiers and Propositional Attitudes. The Journal of Philosophy 53:177–187. Also in The Ways of Paradox by W. V. Quine, 1966, 177–87. Harvard University Press.
Quine, W. V. 1961. Reference and Modality. In From a Logical Point of View by W.V. Quine, Harper and Row, 139–157. Also in Linsky, L. (ed.), 1971, 168–170. Page references here are to the Linsky reprint.
Recanati, François. 2000. Oratio Obliqua, Oratio Recta. The MIT Press.
Recanati, François. 2012. Mental Files. Oxford University Press.
Richard, Mark. 1983. Direct Reference and Ascriptions of Belief. The Journal of Philosophical Logic 12:425–452.
Richard, Mark. 1986. Quotation, Grammar and Opacity. Linguistics and Philosophy 9:383–403.
Richard, Mark. 1990. Propositional Attitudes. Cambridge University Press.
Rupert, Robert. 2008. Frege’s Puzzle and Frege Cases: Defending a Quasi-Syntactic Solution. Cognitive Systems Research 9:76–91.
Russell, Bertrand. 1905. On Denoting. Mind 14:479–493. Also in Logic and Knowledge by Bertrand Russell, edited by R. C. Marsh, 41–56. Allen & Unwin, 1956.
Saka, Paul. 2006. The Demonstrative and Identity Theories of Quotation. The Journal of Philosophy 103:452–471.
Saka, Paul. 2018. Superman Semantics. In Advances in Pragmatics and Philosophy II, edited by Alessandro Capone et al., 141–157. Springer.
Salmon, Nathan. 1981. Reference and Essence. Princeton University Press.
Salmon, Nathan. 1986. Frege’s Puzzle. The MIT Press.
Salmon, Nathan. 1990. A Millian Heir Rejects the Wages of Sinn. In Propositional Attitudes, edited by C. Anthony Anderson and Joseph Owens, 215–247. CSLI Publications.
Salmon, N. and Scott Soames (eds.) 1988. Propositions and Attitudes. Oxford University Press.
Saul, Jennifer. 1997a. Substitution and Simple Sentences. Analysis 57:102–108.
Saul, Jennifer. 1997b. Reply to Forbes. Analysis 57:114–118.
Saul, Jennifer. 1999. Substitution, Simple Sentences, and Sex-scandals. Analysis 59:106–112.
Saul, Jennifer. 2007. Simple Sentences, Substitution, and Intuitions. Oxford University Press.
Schiffer, Stephen. 1979. Naming and Knowing. In Contemporary Perspectives in the Philosophy of Language, edited by P. A. French, T. Uehling and H. Wettstein, 61–74. University of Minnesota Press.
Schiffer, Stephen. 1992. Belief Ascription. The Journal of Philosophy 89:499–521.
Schiffer, Stephen. 1996. The Hidden-Indexical Theory’s Logical-Form Problem: A Rejoinder. Analysis 56:92–97.
Schweizer, Paul. 1993. Quantified Quinean S5. The Journal of Philosophical Logic 22:589–605.
Segal, Gabriel. 1989. A Preference for Sense and Reference. The Journal of Philosophy 86:73–89.
Sleigh, R. C. 1968. On a Proposed System of Epistemic Logic. Noûs 2:391–398.
Smullyan, Arthur. 1948. Modality and Description. The Journal of Symbolic Logic 13:31–37. Also in Linsky 1971, 35–43.
Soames, Scott. 1987. Direct Reference, Propositional Attitudes, and Semantic Content. Philosophical Topics 15:47–87.
Soames, Scott. 2002. Beyond Rigidity. Oxford University Press.
Sosa, Ernest. 1970. Propositional Attitudes De Dicto and De Re. The Journal of Philosophy 67:883–896.
Staffel, Julia. 2018. Attitudes in Active Reasoning. In Reasoning. New Essays on Theoretical and Practical Thinking, edited by Magdalena Balcerak Jackson and Brendan Balcerak Jackson. Oxford University Press.
Stalnaker, Robert. 1984. Inquiry. The MIT Press.
Stalnaker, Robert. 1987. Semantics for Belief. Philosophical Topics. Also in Content and Context by Robert Stalnaker, 117–129. Oxford University Press, 1999.
Taschek, William. 1988. Would a Fregean Be Puzzled by Pierre? Mind 97:99–104.
Taylor, Kenneth. 2002. De Re and De Dicto: Against the Conventional Wisdom. Philosophical Perspectives 16:225–265.
Thomason, Richmond. 1980. A Model Theory for Propositional Attitudes. Linguistics and Philosophy 4:47–70.
Tichy, Pavel. 2004. The Myth of Non-Rigid Designators. In Pavel Tichy’s Collected Papers in Logic and Philosophy, edited by Vladimir Svoboda, Bjorn Jespersen and Colin Cheyne.
Washington, Corey. 1992. The Identity Theory of Quotation. The Journal of Philosophy 89:582–605.
Yalcin, Seth. 2015. Quantifying in from a Fregean Perspective. The Philosophical Review 124:207–253.
Zalta, Edward. 2001. Fregean Senses, Modes of Presentation, and Concepts. In Philosophical Perspectives 15: Metaphysics, edited by James Tomberlin, 335–359.

Author Information

Graeme Forbes
Email: graeme.forbes@colorado.edu
University of Colorado
U. S. A.

Sayyid Qutb (1906–1966)

Sayyid Qutb was one of the leading Islamist ideological thinkers of the twentieth century. Living and working in Egypt, he turned to Islamism in his early forties after about two decades as a secular educator and literary writer. As an Islamist, he held that all aspects of society should be conducted according to the Shari’a, that is, laws of God as derived from the Qur’an and the practice (sunna) of the Prophet Muhammad. Probably his best known and most distinctive doctrine is his interpretation of jahiliyya (pre-Islamic ignorance) as characterizing all of the societies of his time, including the Muslim ones. Another doctrine was his interpretation of faith in one God only (tawhid) as entailing the absolute sovereignty of God (hakimiyyat Allah) and the liberation of humans from service to other humans instead of God. He was executed by the Egyptian government for his Islamist activities and is thus considered a martyr, something that has added immeasurably to the impact of his ideas.

Although he did not consider himself a philosopher, he had opinions on a number of topics that interest philosophers, and he commented on the ideas of philosophers. He had a grand vision of the universe as a harmonious whole under God’s rule and of humans as called upon to be God’s deputies in managing the Earth. Humans, however, were given a measure of freedom that other beings do not have. Rightly used, this freedom would allow humans to fit in harmoniously with the rest of creation and have the highest status under God. Misused, it would introduce discord into the world and misery into human life. Jahiliyya equates to misuse of this freedom, and Qutb calls for jihad, conceived along the lines of revolution, as the response. In discussing these things, he touches on a range of topics, including the nature of God and the universe, human nature, knowledge and revelation, ethics, society, human history, death, and judgment. This article presents only the latest and most radical phase of his thought.

Biography
Basic Conception
God
Human Nature and Purpose, Other Spiritual Beings
Free Will and Predetermination, The Problem of Evil
Knowledge: Revelation, Worldly Knowledge
Ethical Values, Shari’a
The Ideal Society (Utopia), Economics, Gender Relations
Jahiliyya (Dystopia) and Jihad (Revolution)
Human History
Death, Judgment, Martyrdom
Qutb’s Legacy
Final Remarks: Aesthetics, Harmony, and Essentialism
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Biography

Sayyid Qutb (1906–1966) was and is one of the most important ideologues of the Islamist movement, which seeks to re-establish truly Islamic values and practices in Muslim societies that have become more or less Westernized. He was born and raised in an Egyptian village, attended the state primary school there, and in 1920 moved to Cairo to attend secondary school and then Dar al-‘Ulum, a teacher training institute that sought to balance traditional and modern ways. From 1933 to 1952 he worked in the Ministry of Education, first as a teacher and later as an inspector and administrator. He also became one of the secular literary elite prominent at the time, publishing more than 100 poems as well as articles and books on literary and social topics. In 1948, he rather abruptly began to publish Islamist articles and the next year published a major Islamist book, Social Justice in Islam, which was to go through a total of six editions. The reasons for this shift are not totally clear, but the chaos of Egyptian politics, the efforts of imperialist powers to reassert their position, and the establishment of the state of Israel presumably played a role. His Islamism was confirmed during a two-year (1948–1950) study tour of the United States, which he found to be technologically impressive but hopelessly corrupt morally.

After his return to Egypt he joined the Muslim Brothers, the leading Islamist organization, founded in 1928 by Hasan al-Banna, and soon became one of its leading spokespersons. The Brothers supported the Free Officers’ revolution in 1952 at first but soon withdrew support. After an attempt on the life of Abdel Nasser in 1954, the leading Brothers were imprisoned, Sayyid Qutb among them. In prison, they suffered very harsh treatment, though poor health spared Qutb the worst of it. This led to a radicalization of his ideas, including the claim that the whole world, including the “Muslim” world, is in a state of jahiliyya, that is, un-Islamic ignorance and barbarism. This radicalization was assisted by the ideas of the extremely influential Indo-Pakistani Islamist Abu’l ‘Ala’ Mawdudi (1903–1979), whose writings became known to Qutb and other Arab thinkers from about 1951. Mawdudi’s ideas about divine sovereignty, the Islamic state, jahiliyya, and other things spoke very much to Qutb’s condition and helped him to crystallize and articulate his views.

In 1964, Qutb was released from prison and published his best-known book, Milestones, effectively calling for an Islamic revolution. He also became mentor to a group of young Brothers and was soon arrested for conspiring to overthrow the government. In 1966, he was convicted of this charge and executed. He thus became a martyr to his cause, considerably multiplying his influence.

Qutb wrote a number of books during his Islamist period in addition to those mentioned, especially a multi-volume commentary on the Qur’an, In the Shadow of the Qur’an, which he began in 1952 and was still revising at the time of his death.

Qutb’s radical ideas divided the Muslim Brothers after his death. The mainline group rejected them and sought to work within the existing political system, briefly achieving the presidency in 2012–2013. Smaller groups, such as the so-called Takfir wa-Hijra group, Jama‘at al-Islamiyya (Islamic group), and Tanzim al-Jihad (Jihad organization), adopted and modified Qutb’s ideas and were responsible for considerable terrorism through the 1990s (see below). His influence spread far beyond Egypt, indeed throughout the whole of the Islamic world and its diaspora. This included extreme groups such as al-Qaeda, whose second leader, Ayman al-Zawahiri, was very much influenced by Qutb’s main ideas and his example as a martyr, and who first joined an Islamist group the year that Qutb was executed. In fact, Qutb has come to be seen by many as the spiritual “godfather” of such groups. On the other hand, it is possible to read him selectively, and so he has influenced many who do not fully accept his extreme views. There is a considerable literature on him both in Islamic and Western languages.

Qutb was not a philosopher by most definitions of the term, and he consciously rejected philosophy as he understood it, both Western philosophy and classical Islamic philosophy. He considered the discipline to be an effort to accomplish with human reason what can only be accomplished on the basis of divine revelation and also as a foreign intrusion on pure Islamic thought. Nevertheless, his thinking was quite systematic and did have a place for reason; moreover, he used rational arguments in criticizing philosophy and made reference to Western philosophers (mostly known to him through Arabic translations) in the process. He also deals with many topics that are of interest to philosophers. He is a good example of Weber’s Wertrationalität (rationality in accordance with moral demands).

The following article is based entirely on the last phase of his writing, from about 1958, during which he rejected many of his earlier ideas. This phase was the most radical, most systematic, and most influential.

2. Basic Conception

Qutb saw his ideas as a necessary interpretation and corollary of the basic Muslim creed: “There is no god but God; Muhammad is the Messenger of God.” His views fall within the wide spectrum of Sunni Islamic thinking but particularly within the forms of it commonly labelled “Islamist” (stressing the application of Islamic norms to society) and “Salafi” (broadly, those who emphasize the authority of the Qur’an, Sunna, and the earliest generations of successors, the salaf, over against later “innovations”). Like many popular writers on religious topics in modern times, he did not have the traditional education given to the ‘ulama’ (religious scholars) and was to some extent self-taught in this area.

The article focuses primarily on the more basic and theoretical aspects of Qutb’s writing (what we might call his philosophy or theology), which he calls the Islamic tasawwur, a word usually translated “concept” or “conception,” but which here could also be translated “worldview” or “vision.” Qutb, in the manner of fundamentalists and also scientists, does not consider this his conception but the true conception. He characterizes this conception as divinely sourced, and following from that: fixed in its basics, comprehensive, balanced, dynamically positive, realistic, and unified.

The tasawwur grows out of its divine source and does not need or accept significant influence from the outside. Therefore, Qutb criticizes not only contemporary modernists, who wish to “reform” Islam in terms of modern, that is, Western ideas and ideologies, but also the earlier Muslim philosophers and theologians, who made use of Greek philosophical ideas. We may note that Qutb is firmly of the view that ideas are prior to actions, which flow from them. The ideas are not ends in themselves, however, but are meant to undergird actions and activities. In fact, all of human life and activity flows from a creedal tasawwur of some kind. Qutb often describes Islam (and religion more generally) in terms of three stages: tasawwur, manhaj (method, program), nizam (social and political order). Each stage proceeds from the former one with almost logical necessity. All three are necessary for Islam to exist. Since Qutb believed that there was no Islamic nizam in his time, he often said that Islam has no “existence.” We may note that Qutb’s Islam is a highly reified concept, not just a label applied variously to diverse human ideas and practices.

3. God

The centrepiece of the tasawwur is God (Allah), that awesome being Whose essence and some of Whose attributes are beyond the reach of human understanding, though many attributes can be understood by the human mind. (Qutb does not discuss the relation between God’s essence and attributes, an important theme in traditional Muslim theology.) These attributes belong only to God and comprise His divinity; no other being shares in them. God is one and unique. This is the first and most basic constituent of the tasawwur, and recognition of it is called tawhid (the usual Arabic term for belief in one God). God is also eternal, without beginning or end.

This God is the creator and source of everything else in existence. These things are separate from God but totally dependent on Him and harmoniously obey regular laws, some of which can be and have been discovered by human science. These laws are not separate from God, however. God acts directly in all that happens, so that these “laws” are just His customary way of acting. Since His will is completely free, He can and sometimes does vary His action and produce what we call miracles. For example, fire usually burns things, but God might make it not do so on some occasion, as in the story of the prophet Ibrahim (Abraham) in the Qur’an. Such events do not disrupt the general order and harmony of the universe, however, since they are part of God’s larger plan. While most of creation obeys God necessarily, humans in their moral aspect may or may not obey. Instead, they are subject to a moral law established by God, the Shari’a, which will put them in harmony with creation if they obey it.

God is therefore the Lord and Sustainer of all creation, while all creation stands in a relation of servanthood to Him, necessarily in the case of most things, willingly or unwillingly in the case of humans (disobedient humans are still servants). It follows necessarily from all of these attributes that God is the only source of authority and the only sovereign in the universe, not only physically but also morally, legally, and politically. No human ruler or nation may claim sovereignty, a point of major importance for Qutb’s revolutionary doctrine. These central ideas reflect those of Mawdudi, though Qutb probably stresses them more. His term for the sovereignty of God, hakimiyyat Allah, comes from the Arabic translation of Mawdudi’s term for the same thing.

4. Human Nature and Purpose, Other Spiritual Beings

Humans hold a very special place in God’s creation, as already indicated. According to the Qur’an, God created the human body and breathed His spirit into it, and He gave humans a status above the angels, whom He commanded to prostrate themselves before the first man. Human nature as originally created, and in its proper state, is called fitra, and this fitra has a need for God and a predisposition to serve Him. The Islamic tasawwur is congruent with it. The fitra may be obscured by human whims, desires and negligence, but is not destroyed.

The basic purpose of humans is to serve God willingly in all aspects of life. They are to do so in the honorable role of God’s deputy, khalifa, over the earth. They are responsible for making it fruitful, developing it technologically, caring for it, and organizing a just society in accordance with God’s Shari’a. This idea is very important to Qutb.

The only significant distinctions among humans in God’s sight are based on their obedience or disobedience to His will. Otherwise all are of equal value regardless of race, ethnicity, nationality, class, or gender, although in the last case there are significant differences of function to be discussed below.

Angels are spiritual beings who serve God and are always obedient to Him. They carry God’s throne, deliver God’s messages to the prophets, watch over the gates of paradise and hell, record the actions of humans, support them in their struggle against evil, pray for them, and cause them to die when their time comes. Jinn (the “genies” of the Arabian Nights) are made of fire, can live on the face of the earth or inside it, can move very swiftly, and are invisible, though they may become visible to humans. They have the power of moral choice and are commanded to serve God just as humans are. Some are believers, and some are not. They will be resurrected on the last day and go to paradise or hell. The Devil is a jinn. Satans may be humans or jinn; they tempt human beings and are enemies to prophets. We know about all of these because the Qur’an tells us. Human science knows nothing of them, though it may discover something about them some day. Awareness of these creatures expands our world beyond the limited one of physical perception.

5. Free Will and Predetermination, The Problem of Evil

But are humans really free in their moral choices, given that God is directly involved in determining everything that happens? Like earlier Muslim theologians, Qutb seeks to affirm both (this is one of the ways the Islamic tasawwur is balanced). He states that the human will works within the bounds of divine determination and that this divine determination is realized through human will. The precise relationship between them is one of those things that are beyond the capacity of human reason to comprehend. Some degree of human freedom is necessary for moral responsibility and for the activist position that Qutb took, while certainty that God is in control is important for the small, struggling revolutionary movement of which he was a part.

But why does evil exist at all and why do good people suffer? From time to time Qutb suggests various partial answers to the latter question. People suffer because they violate the physical or moral laws, or God causes them to suffer to teach them or to provide challenges. This world is a place of trial and striving, and the suffering of a good person will be compensated in the future life, and possibly also in this life. As to why God did not create a world without suffering and evil, this question is not raised by sincere believers, who respect God too much and know that the issue is beyond the capacity of the human intellect to deal with, nor is it raised by serious atheists since they do not believe in God. It is raised by those who are argumentative or not serious.

6. Knowledge: Revelation, Worldly Knowledge

How do humans know of God and of the truths enshrined in the Islamic tasawwur? The human fitra can perceive something about God in the harmony of the universe that He has created and runs (that is, the Teleological Argument), but of primary importance is God’s word revealed to messengers to whom He has given a special nature that allows them to receive His messages, and particularly that given to the Prophet Muhammad in the Qur’an. The text of the Qur’an contains the verbatim words of God and provides information about God, the universe, aspects of human, divine moral and legal commands, and the final judgment of humans by God. It calls on humans to reflect on the signs of God in the harmony of the universe. It is from the Qur’an that the Islamic tasawwur is directly and exclusively derived.

The Qur’an speaks to all aspects of the human fitra, not only to reason but also to the emotions and the aesthetic sense. According to Qutb and most Muslims, it has the power to influence people directly through these. Qutb gives examples of this, including one in which a woman was converted to Islam by hearing the recitation of the Qur’an. In the years before he embraced Islamism, Qutb wrote two books exploring the literary nature of the Qur’an (Artistic Depiction in the Qur’an and Scenes of the Resurrection in the Qur’an) and concluded that its power comes from producing extremely evocative word pictures for the reader. He appears to have continued to hold this theory in his Islamist period though not limiting the power of the Qur’an to it.

Qutb generally insists on interpreting the text in terms of its plain meaning, but in the case of realities that are beyond human comprehension he understands it to provide allusions that inspire the human soul. These realities include the divine essence, the connection between the will of the Creator and creation, and the nature of the spirit. For the rest, reason can receive the revelation and interpret it, along with other faculties. On the whole, Qutb avoids metaphorical or esoteric interpretations of the Qur’an.

One should seek and may derive direct inspiration from the Qur’an, especially if one has a close and ongoing relation to it. Qutb claims to have lived for years “in the shadow of the Qur’an” (this is also the title of his Qur’an commentary). Especially important is the intention to act on what one reads. One is not to read the Qur’an simply as a devotional exercise, or to get information, but to find out what God wants one to do at a particular time and to do it. Qutb is convinced that the Qur’an will guide such a person. (This is part of what is meant by saying that the tasawwur is practical). One will not truly understand the Qur’an unless one is engaged in the struggle (jihad) for an Islamic society.

For most Muslims, the Sunna (words and deeds) of the Prophet Muhammad is authoritative along with the Qur’an; and also authoritative is the tradition of scholarship related to these. Qutb likewise relies on the Sunna and, somewhat selectively, on the later tradition. He emphasizes the Qur’an, however, more than most. He also emphasizes the generation of Muslims contemporary with Muhammad, the “Unique Qur’anic Generation” as he terms them. This generation was present at the time of revelation and drew their understanding of life and their duties exclusively from it; they received it with the intention to obey as a soldier would receive marching orders for the day; also, they broke completely with their former life. No later generation has equalled them, but they should be the model for Islamic activists today.

Still, there are many areas of life in which human reason is sufficient for understanding and making discoveries, and in so doing fulfilling part of the human role as God’s khalifa. These involve what Qutb calls the “pure” sciences, mainly the physical sciences insofar as they do not involve moral or metaphysical issues.

Splitting the atom would be included but not its use in atomic bombs. Biology is included but not Darwinian evolution. The Islamic tasawwur encourages this kind of science. It does not have the certainty of revelation but, properly done, it will not conflict with revelation. Qutb speaks of the “open book of the universe” (possibly echoing the 19th-century Indian modernist, Sayyid Ahmad Khan). In fact, Western science is historically rooted in the past scientific activities of Muslims. It has developed in an anti-religious direction, but Islam can purify this science and put it on the sound basis of the fitra.

7. Ethical Values, Shari’a

General ethical values are of course part of the Islamic tasawwur. They are fixed and do not “develop” over time, although their application may vary. They provide a “fixed axis” and “fixed framework” around and within which human activity takes place. These values are not scattered or ad hoc but are systematic, constituting a complete system for all of life. As they derive from the one God, they unify humans with the creation and its Creator, and integrate individual personalities. To be valid, ethical action must be accompanied by faith in this God. Because they come from God, they provide a greater sense of obligation than secular morality can. Qutb criticizes various forms of secular morality at length.

In principle, there is no grey area in Qutb’s ethics. The contrast is stark between guidance and error, faith and kufr (unbelief, wilful rejection of faith), tawhid (recognition of God’s unity) and shirk (ascribing divinity to other beings than God). Along with this, however, he recognized that although basic ethical values do not change, their application does change with changing times and situations, both of which are experienced very much by modern revolutionaries.

The specific ethical rules and values are enshrined in the Shari’a, to which Qutb makes very frequent reference. This is commonly called the law of God but is more accurately described as a moral classification by God of all human actions into five categories: obligatory, approved, neutral, reprehensible or forbidden. The human understanding of the Shari’a is called fiqh (“understanding”) and is based on the Qur’an and the Sunna of the Prophet, along with the effort (ijtihad) of later scholars to interpret and apply these. Among Sunnis, the consensus of these scholars on any ruling has been considered to guarantee its validity, with the result that the scope for ijtihad has diminished over time. One of the major issues of modern times has been the degree of freedom contemporary interpreters should have to reverse past rulings in the light of current needs. Modernists seek a high degree of freedom in order to bring fiqh in line with prevailing values derived from the West. Qutb opposes ijtihad for this purpose, which he considers defeatism in the face of the West, and insists that there should be no ijtihad where there is a clear and authoritative text. He favors it, however, where, in his view, it represents an authentic Islamic response to current conditions. He calls this fiqh haraki (that is, a fiqh that reflects changing human activities or needs of the current Islamic movement). He also indicates approval of the unfettered use of the principle of public interest (maslaha), a principle recognized in traditional fiqh but usually with restrictions. At the same time, he regularly canvasses the views of earlier scholars on specific matters and sometimes accepts them. All of this accords with his claim that the Islamic tasawwur is realistic and practical. The term Shari’a is to some extent interchangeable or correlated with the term manhaj, and he seems to see the Shari’a as part of the Islamic manhaj. 

Qutb also claims that the Shari’a is perfectly harmonious with the general laws of the universe, including the physical laws of human biology, and is the only means by which the voluntary life of humans can be integrated with them, as briefly mentioned above.

8. The Ideal Society (Utopia), Economics, Gender Relations

The ideal society is one that recognizes the sovereignty of God alone, not the people, the nation, or the human ruler, and is governed by the Islamic Shari’a. Since the Shari’a is part of God’s overall law for the universe, a society truly governed by it will be in accord with the whole of the universe and with the human nature and needs of its members. It will be just, progressive, and tolerant. Class, racial, and ethnic distinctions will not influence people’s status, but rather piety, virtue, and competence. It will be a society in which people generally know who the virtuous and competent are and can choose them for leadership. He backs this up with descriptions of the society governed by the prophet Muhammad and his earliest successors, especially in Social Justice in Islam. Though the historical critic would probably claim that he is selective in his examples, Qutb’s view is that the history of Islam is not identical to the whole history of those societies called Muslim, but to the history of those societies insofar as they were truly following the Shari’a and implementing Islam.

While class, racial, and ethnic differences will not matter, religious differences will matter since the society is based on a religious creed. Qutb sometimes states that people have absolute freedom of conscience in matters of belief and that the freedom of any individual to hold and propagate his religious belief, free of compulsion, is a fundamental human right. It is not clear just how far this goes, however. No one should be forcibly converted to Islam. Jews and Christians (and possibly others) will have a place in society as granted by the Qur’an and Sunna. They may follow their own creeds and rites of worship but are limited in some areas, as specified in the traditional idea of dhimma (protected status), which Qutb generally accepts and defends. For example, they will pay a special tax called jizya, for which Qutb gives three reasons: it is a symbol of their acceptance of Islamic rule, it is in return for their protection by the Islamic government, and it contributes to the social expenses of the state. While dhimmis would be granted freedom of belief and worship, and Qutb speaks of freedom to propagate religious belief, it seems unlikely that a state run on Qutb’s interpretation would allow non-Islamic religious views to be propagated freely among Muslims, or anti-religious views to be propagated at all. This is especially the case given Qutb’s view that Islam alone is the true religion and his statement in at least one place that abandoning the truth is corruption. Such a state would hardly accept the kind of religious pluralism, the legal equality in principle of all religions, assumed by many Westerners and others.

An Islamic government will be governed by the principle of consultation (shura). Qutb gives many examples of it from the early days of Islam. The exact form of shura varies with circumstances and, in accordance with the realistic and practical nature of the Islamic tasawwur, will be determined only when such a government is actually formed. Nevertheless, in at least one place he does outline a structure of government involving a ruler (imam) nominated by the recognized leaders of the community (literally: “people of binding and loosing”, a recognized phrase in Arabic) and chosen by the whole community. There will also be a parliament (majlis al-shura) whose members are chosen by the people locally. The high moral tone of the government is more important, however, than these details. Qutb seems to envisage the imam as a strong and righteous leader who is normally to be obeyed implicitly, but not if he commands people to disobey God. He rejects the term “democracy” because he sees it as a Western concept involving government by the people instead of by God.

For all that Qutb seems to envisage the true Islamic state and society as a kind of utopia, he recognizes that actual Islamic societies have been less than ideal, and he severely criticizes many of the historical Muslim rulers without quite calling their government and society un-Islamic. In at least one place he states a ruler may be unjust but still be considered Islamic if he basically recognizes the authority of God.

Economics in an Islamic society is based on the fact that all wealth belongs to God, who entrusts it to human societies and thence to individuals as his khalifas. On this basis, the right to private property is guaranteed as a reward for work so that individuals are encouraged to work for their own benefit and the benefit of all. This strikes a just balance between effort and reward and accords with human nature. Private property, however, is limited legally by the institution of Zakat, which requires a portion of one’s wealth to be given away and is one of the Pillars of Islam. It is also limited by the right of the political leader to tax further when this is necessary for the welfare of the community and to assist the needy, who have a recognized right to a share in the community’s wealth. Islam also opposes the concentration of wealth in a few hands, and its rules on inheritance and opposition to usury are designed to discourage this. Likewise, the community should own collectively resources needed for the general wellbeing, and these have expanded considerably in modern times. Added to all of this is the additional moral obligation on individuals to assist the needy and contribute to social causes. In discussing economics, Qutb often goes beyond what the traditional sources of authority prescribe, especially in relation to the economic power of the state. What he writes would be largely acceptable to modernists with a moderate socialist inclination.

Qutb is at pains to point out that women and men are equal in respect of their humanity as such. He even argues that Eve was not created from Adam’s rib but created in the same way as Adam (the account of Adam’s rib is not in the Qur’an but is in later sources). In temperament, however, women and men differ. Women are more emotional and men more rational. Women’s temperament fits them for raising children and other domestic tasks, whereas men are more fitted for the world of work outside the home. Hence, men have the right to leadership within the family and women the right to protection.

The family is the basic unit of society and the institution that produces human values; its place is rooted in the cosmic order. Obedience to God in matters relating to marriage, divorce, and family is service to God no less than formal prayer. Thus, women’s primary role of caring for the family is extremely important. For this reason, women should not work outside the home unless it is absolutely necessary. Moreover, those who do are likely to be exploited both sexually and economically, turned into sex objects and underpaid. He also believes that young children should be cared for within the home, not in crèches. He draws on his experiences in the United States, among other things, to support these points. All of these things characterize a jahili society, according to him. He also argued that Western women sought election to parliament because men had been making laws unfair to women, but under a system of divinely based law the laws will be fair.

Women should dress in a manner that shows only their faces and hands but not be secluded, as in some societies. They also should not mix publicly with men as this may lead to promiscuity and weaken marriages. He defends divorce and polygyny, at least under certain conditions. If these seem to make women insecure it is because the present society is jahili and not sufficiently attuned to Islamic values. Although Muslim men are permitted in traditional fiqh to marry Jewish or Christian women, Qutb is inclined to oppose this today since it may weaken Muslims’ faith and sense of identity, given that current Muslim societies are only nominally Muslim. It is worth noting that Qutb evidently had no objection to women’s involvement in the Islamic movement. Both of his sisters were involved, and one went to prison. He was also a mentor to Zaynab al-Ghazali, a well-known woman Islamic activist in Egypt who had put into her marriage contract that her husband would not interfere with her Islamist activities.

9. Jahiliyya (Dystopia) and Jihad (Revolution)

Any society that is not governed according to the Shari’a is a jahili society. The term jahiliyya literally means ignorance with a connotation of barbarism and has most often been applied to the Arabian society on the eve of Muhammad’s mission. The term and general idea come from Mawdudi, but Qutb makes it more extreme. For Mawdudi, contemporary Muslim societies are part Muslim and part jahili, while for Qutb there is no such mid-term. The contrast is stark: a society is either Islamic or jahili. A jahili society compels or at least pressures its members to serve other humans rather than God, and its leaders presume to create values and laws rather than apply the values and laws of God, effectively claiming divine attributes and making themselves gods beside God. The moral, psychological, and social results are disastrous, though it is not these results that define a jahili society. Many states claim to be Islamic and claim that their laws are based on the Shari’a or partly so when in reality the laws are man-made and they are jahili societies. In fact, Qutb claimed that all so-called Islamic countries in his time were jahili, with the result that, as he put it, Islam does not exist. This does not mean that there are no Muslims, but it does mean that they cannot live a complete Muslim life. While Qutb labels societies jahili, he is much less inclined to label individuals as unbelievers (kafir), unlike some of his Qutbist successors.

Although the line between Islam and jahiliyya is stark in principle, Qutb does not clearly indicate exactly how and where it is drawn. It seems that societies whose leaders sincerely recognize the Shari’a even if they often fall short in practice will still be Islamic, while others that appear morally superior but whose leaders do not accept the Shari’a, or who interpret it in a Westernizing way, will be jahili, though Qutb will assume that the moral difference is temporary or more apparent than real. This is consistent with Qutb’s views, mentioned above, that ideas are primary and that faith is necessary for works to be valid.

The answer to jahiliyya for Qutb is jihad. This word, which appears frequently in the Qur’an and the later tradition, means “striving” and the full phrase is “striving in the path of God”. It may take non-violent forms, such as the “greater jihad”, the struggle against evil tendencies within one’s self (referred to by the prophet), or other forms of righteous striving. In juristic and political circles, the term has mainly referred to the violent activity of war, with rules for proper behavior in warfare elaborated. Thus, the term is often translated “holy war”. This is the usage that Qutb draws on. In modern times, many Muslims have preferred to emphasize the non-violent forms of jihad and to limit violent jihad to defensive warfare. Qutb considers this defeatist and argues the need for both violence and the initiating of violence at times. Jahiliyya is not merely a condition of society but an aggressive and unrelenting force that can only finally be defeated by violence. Moreover, Muslims have an obligation not only to defend themselves but to fight tyranny wherever it appears and to remove obstacles to the preaching of Islam. Jihad is part of the Islamic mission to liberate humans from servitude to other humans and realize the rule of God on earth. This is the greatest of all human tasks and one should not apologize for using force when necessary. God knows that evil must be confronted in this way. (Perhaps this attitude is not so different from the actions of Western powers fighting to spread civilization, democracy, and/or human rights.) Qutb relates the “greater jihad” to this by describing it as the inner battle of the warrior to purify himself of personal desires and any other obstacles to his serving God and establishing God’s authority on earth.

In the present situation jihad effectively takes the form of revolution, though Qutb does not use this term. (He may be influenced by Mawdudi’s book, Jihad in Islam, which explicitly calls it “revolutionary struggle”, at least in the English translation.) Individuals or groups of Muslims must come together to organize their lives on the basis of Islam, thus giving birth to a new society and isolating themselves psychologically, though not physically, from the jahili society around them. These groups will for a long time devote themselves to studying and internalizing the basic Muslim creed, “there is no god but God.” This is what Muhammad did for thirteen years in Mecca, before any attempt to establish an Islamic society was made. Soon enough, the Muslim group will be attacked by the jahiliyya and have to respond in ways that probably include violence until it replaces or at least holds its own against the jahiliyya. In the early stages, violence is to be avoided except for self-defence, though later it may be initiated, as mentioned above. All of this, according to Qutb, is based on the example of the Prophet’s actions in Mecca and Medina and represents a realization of the second part of the creed, “Muhammad is the Messenger of God.”

10. Human History

Qutb explicitly rejects the Enlightenment idea of continuous human progress, at least in the moral area. Rather, in accordance with the traditional Muslim view, history is characterized by a series of prophetic missions, often representing moral high points, followed by decline. The first prophet was the first man, Adam. Although he and his wife disobeyed God and were expelled from Paradise, they repented and were pardoned; their pure fitra was re-established though they now lived in a world of physical and moral struggle. Many of the ensuing prophets preached to peoples who rejected them and were destroyed by God, but some, in particular Ibrahim (Abraham), Musa (Moses), Daud (David), and ‘Isa (Jesus) left continuing communities, though these communities changed the revelations they had received. Each of the messengers taught the same truths about God and the universe, though in increasingly advanced forms as befit their societies’ development, until the human race reached its maturity and Muhammad brought the final revelation and most complete and universal message, confirming but superseding the previous messages. The high point of human moral and social history was the community in Medina under the prophet and his immediate successors. The Muslim community continued for some twelve centuries, often prospering politically and culturally though declining morally.

In the West, a corrupted form of Christianity was imposed on people and this eventually led to a rebellion against religion and to the anti-religious philosophies (“Positivism”, “Dialectical Materialism”, etc.) so prevalent by the twentieth century. The West also began to attack the Muslim world militarily during the medieval crusades and this crusading continued later in the form of Western imperialism. This is a common idea among Islamists today, who regularly refer to Westerners as crusaders. As a result of Western imperialism, Muslim societies began to adopt Western ways and abandoned the Shari’a, often without admitting it, so that by Qutb’s time there was no longer a truly Islamic society anywhere. The whole world is in a state of jahiliyya, and this jahiliyya, because of its material advancement and sophistication, is deeper than previous ones. Although the previous wave of Islam has left some traces, such as the idea of the unity of the human race, that might ease the rebirth of Islam, this will happen only by God’s will working through Islamic activists. A new Islamic society will not be morally better than the “unique Qur’anic generation” except (one may note, though Qutb does not say) that its moral status will be linked to much better technology.

11. Death, Judgment, Martyrdom

Qutb held to the traditional view that death is followed by resurrection on the Last Day, by divine judgment on the basis of one’s actions, and a final abode in paradise or hell. This, finally, is the greatest motive for service to God in this life. How God will raise people to life after they are dead is one of the divine secrets that human reason cannot understand, just as it cannot understand the secret of life generally. He seems to take the Qur’anic descriptions of judgment, heaven, and hell at face value, sometimes analysing the language and literary force of the accounts. These scenes are related to this world since worldly actions lead to them and worldly joy and suffering foreshadow them. They also widen the individual’s perspective beyond the bounds of this life. He also held to the common view that God has fixed the date of each person’s death, a good reason to risk martyrdom in revolutionary action.

The situation of martyrs, those who die in jihad, is distinctive. The Qur’an says, “Do not say of those who are killed in the path of God, ‘They are dead.’ They are alive . . .” (Qur’an 2:154; 3:169). Qutb says that they are alive in the sense that they continue to be an active force directing the community, but that they also may be more literally alive on another level of existence that we cannot conceive of. Toward the end of Milestones, he says that martyrs receive three rewards: contentment and freedom from fear and sorrow, praise from angels and humans and favorable accounting in the final judgment (I have seen no mention of 72 virgins, however). Qutb is considered a martyr by many, probably most, Muslims. It is reported that on learning that he was to be executed he praised God for earning martyrdom. Both Zaynab al-Ghazali and Qutb’s sister, Hamida, claimed to have had visions just after his death assuring them that he is in paradise.

12. Qutb’s Legacy

Qutb’s ideas, strengthened by his status as a martyr, have had considerable influence among Muslims. His close linking of belief in one God with the need for the rule of a divinely derived law, and his insistence on a clear line between Islam and non-Islam, have strengthened Islamism generally. His understanding of jahiliyya has broadened the scope and depth of the struggle. His conceptualization of the movement as one for “liberation” resonates with many people, as does his view that all forms of activity should be service to God. His understanding of jihad and his own martyrdom have strengthened the willingness for both violence and self-sacrifice. One young man who was moved by his execution to join an Islamist cell was Ayman al-Zawahiri, who later became a leader in the radical group Tanzim al-Jihad, and still later leader of al-Qaeda. Within the Muslim Brothers organization, Qutb’s legacy has been ambivalent, a threat to their ability to function with some freedom, but not possible to ignore. In 2009, his ideas were at the forefront of a debate between those who wanted less accommodation to secular society and those who wanted more.

Those who particularly claim to follow his legacy, mostly outside the Brothers, have commonly been called Qutbists or Qutbians. They include the so-called Takfir wa Hijra (the label refers to their condemnation of society and separation from it), Jama‘a Islamiyya (Islamic Group), and Tanzim al-Jihad (Jihad Organization) in Egypt, and al-Qaeda. (It is not clear where “Islamic State” or ISIS stands on Qutb.) They tend to simplify Qutb’s ideas or take them to extremes that he might not have accepted. This article considers their interpretations of some of Qutb’s ideas.

Qutb’s idea of jahiliyya is a fairly easy idea to misunderstand. It has often been interpreted as takfir, the declaration of individuals as unbelievers or apostates, usually applied to enemies or government representatives. Jama‘a Islamiyya and Tanzim al-Jihad spoke more of kufr than jahiliyya. They considered Egyptian society as a whole to be Muslim and only the leaders of society to be kafirs. On this assumption, some of the Tanzim al-Jihad members assassinated the Egyptian president in 1981, hoping by this to spark a revolt and overthrow the government, something that did not happen. On Qutb’s view of jahiliyya, this effort would have been hopelessly misguided and premature.

The leader of the so-called Takfir wa Hijra group, who had reportedly studied Qutb’s writings in prison, accepted the claim that the whole Egyptian society was jahili, but with a more extreme interpretation than Qutb’s. He claimed that any of its members who left his group were abandoning Islam and that the standard Friday prayers were illicit in a jahili society. He also tried to isolate the group physically from society more than Qutb called for. Outsiders have interpreted its position as takfir and apparently insiders have too, since they came to accept the label.

The distinction between the “near enemy” (their own rulers) and the “far enemy” (for example, Israel and the United States) made by the Jama‘a Islamiyya and Tanzim al-Jihad, and their choice to attack the “near enemies” first, does owe something to Qutb’s idea of jahiliyya, since this idea removes Egyptian society from the category of Islamic. Al-Qaeda’s view of the world-wide struggle also seems to fit Qutb’s idea, though al-Qaeda changed the priority to the “far enemy”. Qutb might have accepted this as a practical example of flexibility after the attack on the “near enemy” failed.

Qutb called for a long period of preparation before engaging in jihad, but Tanzim al-Jihad and Jama‘a Islamiyya advocated immediate action. While al-Qaeda trains its recruits militarily and indoctrinates them, it does not appear to provide the sort of long-term spiritual preparation Qutb had in mind. The leader of Takfir wa-Hijra appreciated the need for a long period of preparation, which is one of the reasons he sought to isolate the group. He hoped to build a model community that would eventually be strong enough to bring down the government. Unfortunately for them, police arrested some of the group and the group in return kidnapped a former government minister, whom they killed when the government refused to release the prisoners. The government then cracked down and succeeded in capturing and executing the group’s leaders.

While Qutb defended the need for, and almost the inevitability of, violence in certain circumstances, this was to counter those who downplayed it, often for apologetic reasons. It is doubtful (though impossible to know) whether Qutb would have approved of the terrorist activities of the Qutbist groups. For the most part, they do not make sense if jahiliyya is as deeply rooted as Qutb claims and, in any case, Qutb accepted the traditional fiqh view that non-combatants should not be targeted. Also, revenge has often been a motive for violent actions, but Qutb appears to have rejected that motive. Perhaps the most important contribution of Qutb’s theories is that they remove the legitimacy from the existing authorities for his followers and make those authorities look ultimately like “paper tigers.”

Qutb has been criticized by traditional scholars on particular points of fiqh and history and generally for making judgments about religion without the sort of training they consider necessary. Also, Sunnis have generally taken the position that for this-worldly purposes a person is to be treated as Muslim if he is outwardly one, whatever his behaviour, and likewise, the government is to be treated as Muslim as long as the rulers are outwardly so. Many see Qutb’s views about jahiliyya and jihad as violations of this.

Many who are not radical Islamists still appreciate many of Qutb’s ideas and ultimate goals. Often it is argued that his extreme views were the result of his imprisonment and torture and that, had he lived longer, his ideas and activities would have developed in a more moderate direction. They also like to call attention to his earlier works, which contain less extreme views than those discussed in this article.

13. Final Remarks: Aesthetics, Harmony, and Essentialism

There is a strongly aesthetic dimension to Qutb’s writing, and one could say that its master theme is harmony. God’s universe is a perfectly harmonious system into which everything fits beautifully and practically. This universe is friendly to life, and human life can be in full harmony with it and blessed. Disharmony comes when humans act in ways that contravene the ways God has set out for them. The beauty of God’s harmony makes the disharmony introduced by humans all the worse, like a beautiful painting disfigured. Hence the horror of jahiliyya and seriousness of the effort to end it.

Connected with this is the resolutely essentialist nature of Qutb’s thinking. Everything is essentialized, including nature, humanity, gender, Islam, the West, jahiliyya, Shari’a, belief, and unbelief. Perhaps God is a partial exception, since His essence is unknowable and His freedom to produce miracles may break the regularities on which human essentialism depends. A major aspect of this essentialism is the dichotomously “Manichean” way in which he treats good and evil. As mentioned above, there is no mid-term between guidance and error, faith and unbelief, tawhid and shirk, or between Islam and jahiliyya or Shari’a and human legislation. Although the interpretation of the Shari’a may require human effort (ijtihad), and its application may vary with circumstances, the difference in principle between divinely sourced and humanly sourced is stark.

This combination of aesthetics, essentialism, and “Manicheism,” while very much open to criticism from scientists and philosophers, is undoubtedly one of the keys to the power of his ideology. The strong contrast between good and evil, the sense that evil is currently in charge in the world though good is in ultimate control, and the conviction that something can be done and must be done at any cost to change this situation have characterized and driven many a revolutionary ideology.

14. References and Further Reading

a. Primary Sources

Qutb, Sayyid, In the Shade of the Qur’an (Fi zilal al-Qur’an), 18 vols, trans. and ed. M.A. Salahi and A.A. Shamis. Leicester, UK: The Islamic Foundation, 1999-2005.
- Qutb’s massive and popular commentary on the Qur’an. Much of it was written before his most radical period but the first 13 (of 30) parts were revised during that period.
Qutb, Sayyid, The Islamic Concept and its Characteristics (Khasa’is al-tasawwur al-islami wa-muqawwimatuhu), trans. Mohammed M. Siddiqui. Indianapolis: American Trust Publications, 1991.
- The most “philosophical” of Qutb’s late works, used considerably for this article. It is the first of two volumes on the subject; the second has not been translated into English.
Qutb, Sayyid, Basic Principles of the Islamic Worldview, trans. Rami David. North Haledon, N.J.: Islamic Publications International IPI, 2006.
- A later translation of the same work as above.
Qutb, Sayyid, Islam: The Religion of the Future (Al-mustaqbal li-hadha al-din), translator not given. Beirut and Damascus: The Holy Koran Publishing House, n.d.
- A shorter book stating the main points and emphasizing humanity’s need for Islam. Comments on quotes from Alexis Carrel and John Foster Dulles.
Qutb, Sayyid, Milestones (Ma‘alim fi al-tariq), trans. S. Badrul Hasan [?]. Kuwait: International Islamic Federation of Student Organizations, 1978. Also, Lahore: Kazi Publications, nd. The title is sometimes translated “Signposts”.
- Qutb’s best known radical work, a handbook for Islamic revolution.
Qutb, Sayyid, Milestones, “revised translation”, translator not given. Indianapolis: American Trust Publications, 1990.
- Claims to provide “a fresh editing and rereading” but I cannot confirm that it does so from what I have read of it.
Qutb, Sayyid, This Religion of Islam (Hadha al-din), translator not given. Kuwait: International Islamic Federation of Student Organizations, 1972.
- Summarizes the characteristics of the Islamic manhaj and its positive effect on the world in the past. Relatively optimistic.
[Qutb, Sayyid] Sayyid Qutb and Islamic Activism: A translation and critical analysis of Social Justice in Islam (Al-‘adala al-ijtima‘iyya fi al-islam). By William Shepard, Leiden: Brill, 1996.
- Last edition of Qutb’s major work on Islamic social and political teachings. Comparisons are made with earlier editions.
The Sayyid Qutb Reader, ed. Albert J. Bergesen. Routledge, 2007.
- Includes an introduction to Qutb’s career and ideas, and selections mainly from the radical parts of In the Shade of the Qur’an, along with some from Milestones, Social Justice in Islam, and A Child from the Village (autobiographical account of his childhood village, written before he became Islamist).

b. Secondary Sources

Abu-Rabi‘, Ibrahim, Intellectual Origins of Islamic Resurgence in the Modern Arab World. Albany: SUNY Press, 1996.
- Chapter 3 deals with the Muslim Brothers and chapters 4 to 6 cover Qutb’s pre-Islamist, early Islamist and later Islamist thinking.
Calvert, John, Sayyid Qutb and the Origins of Radical Islamism. New York: Columbia University Press, 2010.
- Excellent study of Qutb’s activities and writings during both his secularist and Islamist periods, with helpful information on the social and political background and a survey of later “Qutbists”.
Carré, Olivier, Mysticism and Politics: A Critical Reading of Fî Zilal al-Qur’an by Sayyid Qutb (1906-1966), Leiden, Boston: Brill, 2003.
- An in-depth study of Qutb’s Qur’an commentary. Includes selections from the text.
Haddad, Yvonne Y., ‘Sayyid Qutb: Ideologue of Islamic Revival’, ch. 4 in Voices of Resurgent Islam, ed. J. Esposito. New York and Oxford: Oxford U. P., 1983.
- Includes a discussion of Qutb’s main concepts.
Kepel, Gilles, Muslim Extremism in Egypt: The Prophet and the Pharaoh. Berkeley & Los Angeles, 1986 and Berkeley: University of California Press, 2003.
- Chapters 1 and 2 discuss Qutb’s last years and Milestones. The rest of the book deals with later radical groups in Egypt.
Musallam, Adnan, From Secularism to Jihad: Sayyid Qutb and the Foundations of Radical Islamism. Praeger, 2005.
- Thoughtful account of the whole of Qutb’s life, career and writings, especially good on the earlier years. Also deals with Qutb’s influence on later radicals.
Shepard, William, “Sayyid Qutb’s doctrine of Jahiliyya”, International Journal of Middle East Studies 35/4 (Nov. 2003): 521-545.
- Discusses the background to and components of this doctrine.
Shepard, W., “Islam as a ‘System’ in the Later Writings of Sayyid Qutb”, Middle Eastern Studies 25/1 (January 1989): 31-50.
- Discusses key terms such as tasawwur and manhaj.
Toth, James. Sayyid Qutb: The Life and Legacy of a Radical Islamic Intellectual. Oxford: Oxford UP, 2013.
- A good study of Qutb’s life and ideas with a lot of interesting information.

Author Information

William E. Shepard
Email: w.shepard@snap.net.nz
University of Canterbury
New Zealand

Set Theory

Set Theory is a branch of mathematics that investigates sets and their properties. The basic concepts of set theory are fairly easy to understand and appear to be self-evident. However, despite its apparent simplicity, set theory turns out to be a very sophisticated subject. In particular, mathematicians have shown that virtually all mathematical concepts and results can be formalized within the theory of sets. This is considered to be one of the greatest achievements of modern mathematics. Given this achievement, one can claim that set theory provides a foundation for mathematics.

The foundational role of set theory and its mathematical development have raised many philosophical questions that have been debated since its inception in the late nineteenth century. For example, here are three: Does infinity exist, and if so, are there different kinds of infinity? Is there a mathematical universe? Are all mathematical problems solvable?

Before pursuing the philosophical issues concerning set theory, one should be familiar with a standard mathematical development of set theory. This article presents such a development.

In the late nineteenth century, the mathematician Georg Cantor (1845–1918) created and developed a mathematical theory of sets. This theory emerged from his proof of an important theorem in real analysis. In this proof, Cantor introduced a process for forming sets of real numbers that involved an infinite iteration of the limit operation. Cantor’s novel proof led him to a deeper investigation of sets of real numbers and to his theory of abstract sets. Cantor’s creation now pervades all of mathematics and offers a versatile tool for exploring concepts that were once considered to be ineffable, such as infinity and infinite sets.

Sections 1 and 2 below describe the “naïve” principles of set theory that were used and developed by Cantor. Then, Section 3 describes a more sophisticated (axiomatic) approach to set theory that arose from the discovery of Russell’s paradox. After identifying the Zermelo-Fraenkel axioms of set theory, Section 4 discusses Cantor’s well-ordering principle and examines how Cantor used the well-ordering principle to develop the ordinal and cardinal numbers. Section 5 considers controversies concerning the well-ordering principle and its equivalent, the axiom of choice. This is followed by introducing the cumulative hierarchy of sets, Kurt Gödel’s universe of constructible sets, and Paul Cohen’s method of forcing in Sections 6, 7, and 8, respectively. The latter two topics, explored in Sections 7 and 8, can be used to show that certain questions are unresolvable when assuming the Zermelo-Fraenkel axioms (with or without the axiom of choice). The next two sections address further developments in set theory that are intended to settle these and other unresolved questions; namely, Section 9 discusses large cardinal axioms, and Section 10 investigates the axiom of determinacy.

1. On the Origins
2. Cantor’s Development of Set Theory
a. Russell’s Paradox
3. The Zermelo-Fraenkel Axioms
a. The Axioms
b. Classes
4. Cantor’s Well-Ordering Principle
a. Ordinal Numbers
b. Cardinal Numbers
5. The Axiom of Choice
a. On Zermelo’s Proof of the Well-Ordering Principle
b. Banach-Tarski Paradox
6. The Cumulative Hierarchy
7. Gödel’s Constructible Universe
8. Cohen’s Forcing Technique
9. Large Cardinal Axioms
10. The Axiom of Determinacy
11. Concluding Remarks
12. References and Further Reading

1. On the Origins

Let us first discuss a few basic concepts of set theory. A set is a well-defined collection of objects. The items in such a collection are called the elements or members of the set. The symbol “$\in$” is used to indicate membership in a set. Thus, if $A$ is a set, we write $x \in A$ to say that “$x$ is an element of $A$,” or “$x$ is in $A$,” or “$x$ is a member of $A$.” We also write $x \notin A$ to say that $x$ is not in $A$. In mathematics, a set is usually a collection of mathematical objects, for example, numbers, functions, or other sets.

Sometimes a set is identified by enclosing a list of its elements by curly brackets; for example, a set of natural numbers $A$ can be identified by the notation

$A = \{1,2,3,4,5,6,7,8,9\}$.

More typically, one forms a set by enclosing a particular expression within curly brackets, where the expression identifies the elements of the set. To illustrate this method of identifying a set, we can form a set $B$ of even natural numbers, using the above set $A$, as follows:

$B = \{n \in A : n \text{ is even}\}$,

which can be read as “the set of $n \in A$ such that $n$ is even.” Of course,

$\{n \in A : n \text{ is even}\} = \{2,4,6,8\}$.

It is difficult to identify the genesis of the set concept. Yet, the idea of a finite collection of objects has existed for as long as the concept of counting. Indeed, mathematicians have been investigating finite sets and methods for measuring the size of finite sets since the beginning of mathematics. For example, the above two sets

$A=\{1,2,3,4,5,6,7,8,9\}$

$B=\{2,4,6,8\}$

are finite sets. As every element in $B$ is an element in $A$, the set $B$ is said to be a subset of $A$, denoted by $B \subseteq A$. Since there are elements in $A$ that are not in $B$, we say that $B$ is a proper subset of $A$. Moreover, the number of elements in $B$ is strictly smaller than the number of elements in $A$. Thus, one can say, “the whole $A$ is greater in size than its proper part $B$.”
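The finite sets above can be checked mechanically. The following is a minimal sketch using Python's built-in `set` type; the variable names `is_subset`, `is_proper_subset`, and `sizes` are chosen here for illustration and do not come from the text.

```python
# The finite set A from the text, written as a Python set.
A = {1, 2, 3, 4, 5, 6, 7, 8, 9}

# Set-builder notation {n in A : n is even} as a set comprehension.
B = {n for n in A if n % 2 == 0}

# B is a subset of A, and a proper one because A has elements outside B.
is_subset = B <= A
is_proper_subset = B < A

# For finite sets, size is simply the number of elements.
sizes = (len(A), len(B))
```

Here `B` evaluates to the set containing 2, 4, 6, and 8, and the whole `A` has more elements than its proper part `B`, as the text states for finite sets.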

Infinite sets lead to an apparent contradiction. Consider the infinite sets:

$C=\{0,1,2,3,\ldots \}$

$D=\{1,3,5,7, \ldots \}$.

We view the sets $C$ and $D$ as existing entities that both contain infinitely many elements. Thus, $C$ and $D$ are “completed infinities.” Observe that every element in $D$ is in $C$, and that $D$ is a proper subset of $C$. However, if, as many mathematicians once believed, “infinity cannot be greater than infinity,” then the whole $C$ is not greater in size than its proper part $D$. This counterintuitive result was viewed by many early prominent mathematicians as being contradictory, as it appeared to conflict with the well-understood behavior of finite sets. These mathematicians thus concluded that the concept of a “completed infinity” should not be allowed in mathematics.

For this reason, before Cantor, a majority of mathematicians considered infinite collections to be mathematically illicit objects. Cantor was the first mathematician to view infinite sets as being legitimate mathematical objects that can coexist with finite sets. Clearly, the size of a finite set can be measured simply by counting the number of elements in the set. Cantor was the first to investigate the following question:

Can the concept of “size” be extended to infinite sets?

Cantor addressed this question in the affirmative by using the concept of a function to measure and compare the sizes of infinite sets. Functions are widely used in science and mathematics. For sets $A$ and $B$, we say that $f$ is a function from $A$ to $B$, denoted by $f$: $A \rightarrow B$, if and only if $f$ is a relation (operation) that assigns to each element $x$ in $A$, a single element $f(x)$ in $B$. There are three important properties that a function might possess:

$f$: $A \rightarrow B$ is an injection if and only if for each $y$ in $B$ there is at most one $x$ in $A$ such that $f(x)=y$.
$f$: $A \rightarrow B$ is a surjection if and only if for each $y$ in $B$ there is at least one $x$ in $A$ such that $f(x)=y$.
$f$: $A \rightarrow B$ is a bijection if and only if for each $y$ in $B$ there is exactly one $x$ in $A$ such that $f(x)=y$.

Observe that $f$: $A \rightarrow B$ is an injection if and only if distinct elements in $A$ are assigned to distinct elements in $B$; that is, for all $x$ and $a$ in $A$, if $x \neq a$, then $f(x) \neq f(a)$. Also note that $f$: $A \rightarrow B$ is a bijection if and only if $f$: $A\rightarrow B$ is an injection and a surjection.
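For finite sets, the three definitions can be tested directly by counting preimages. The sketch below assumes a function is represented as a Python dictionary from elements of $A$ to elements of $B$; the helper names `preimage_count`, `is_injection`, `is_surjection`, and `is_bijection`, and the sample functions `f` and `g`, are illustrative choices rather than anything taken from the text.

```python
def preimage_count(f, A, y):
    """Number of x in A with f(x) = y."""
    return sum(1 for x in A if f[x] == y)

def is_injection(f, A, B):
    # Each y in B has at most one x in A with f(x) = y.
    return all(preimage_count(f, A, y) <= 1 for y in B)

def is_surjection(f, A, B):
    # Each y in B has at least one x in A with f(x) = y.
    return all(preimage_count(f, A, y) >= 1 for y in B)

def is_bijection(f, A, B):
    # Each y in B has exactly one x in A with f(x) = y.
    return all(preimage_count(f, A, y) == 1 for y in B)

A = {1, 2, 3}
B = {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}   # matches every element exactly once
g = {1: "a", 2: "a", 3: "b"}   # repeats "a" and misses "c"
```

With these definitions, `f` passes all three checks, while `g` fails both the injection and the surjection check. In each case `is_bijection` agrees with the conjunction of the other two, which is the observation made in the paragraph above.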

Cantor observed that two sets $A$ and $B$ have the same size if and only if there is a one-to-one correspondence between $A$ and $B$, that is, there is a way of evenly matching the elements in $A$ with the elements in $B$. In other words, Cantor noted that the sets $A$ and $B$ have the same size if and only if there is a bijection $f$: $A \rightarrow B$. In this case, Cantor said that $A$ and $B$ have the same cardinality. For an illustration, let $\mathbb{N} = \{0, 1, 2, 3, 4, \ldots \}$ be the set of natural numbers and let $E = \{0,2,4,6,8,\ldots\}$ be the set of even natural numbers. Now let $f$: $\mathbb{N} \rightarrow E$ be defined by $f(n)=2n$. One can verify that $f$: $\mathbb{N} \rightarrow E$ is a bijection and, thus, we obtain the following one-to-one correspondence between the set $\mathbb{N}$ of natural numbers and the set $E$ of even natural numbers:

$\begin{array}{cccccccc} 0 & 1 & 2 & 3 & 4 & \cdots & n & \cdots \\ \updownarrow & \updownarrow & \updownarrow & \updownarrow & \updownarrow & & \updownarrow & \\ 0 & 2 & 4 & 6 & 8 & \cdots & 2n & \cdots \end{array}$

Hence, each natural number $n$ corresponds to the even number $2n$, and each even natural number $2i$ is thereby matched with $i \in \mathbb{N}$. The bijection $f$: $\mathbb{N} \rightarrow E$ specifies a one-to-one match-up between the elements in $\mathbb{N}$ and the elements in $E$. Cantor concluded that the sets $\mathbb{N}$ and $E$ have the same cardinality.
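The match-up $n \mapsto 2n$ can be tabulated for an initial segment of the natural numbers. The sketch below is only a finite illustration under an arbitrary cutoff `k`, which is an assumption made here; it does not prove the statement about the infinite sets. The name `f_inverse` is likewise introduced for illustration.

```python
def f(n):
    """The map from the text: n is sent to the even number 2n."""
    return 2 * n

def f_inverse(m):
    """Sends an even natural number 2i back to i."""
    return m // 2

k = 10  # arbitrary cutoff chosen for illustration
N_prefix = list(range(k))               # 0, 1, ..., k-1
E_prefix = [2 * i for i in range(k)]    # first k even natural numbers

# Pair each n with f(n), as in the one-to-one correspondence.
pairs = [(n, f(n)) for n in N_prefix]
image = [f(n) for n in N_prefix]

# No two inputs share an output, and every listed even number is hit.
no_collisions = len(set(image)) == len(N_prefix)
hits_every_even = image == E_prefix
round_trip = all(f_inverse(f(n)) == n for n in N_prefix)
```

On this finite prefix, distinct inputs receive distinct outputs and each of the first `k` even numbers is matched with exactly one natural number.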

Cantor also defined what it means for a set $C$ to be smaller, in size, than a set $D$. Specifically, he said that $C$ has smaller cardinality (smaller size) than $D$ if and only if there is an injection $f$: $C \rightarrow D$ but there is no bijection $g$: $C \rightarrow D$. Cantor then proved that there is no one-to-one correspondence between the set of real numbers and the set of natural numbers. Cantor’s proof showed that the set of real numbers has larger cardinality than the set of natural numbers (Cantor 1874). This stunning result is the basis upon which set theory became a branch of mathematics.

The natural numbers $0, 1, 2, 3, \ldots$ are the whole numbers that are typically used for counting. The real numbers are those numbers that appear on the number line. For example, the natural number $2$, the integer $-3$, the fraction $6/5$, and all of the other rational numbers are real numbers. The irrational numbers, such as $\sqrt{2}$ and $\pi$, are also real numbers. Again, let $\mathbb{N} = \{0, 1, 2, 3, \ldots \}$ be the set of natural numbers, and let $\mathbb{R}$ be the set of real numbers. If a set is either finite or has the same cardinality as the set of natural numbers, then Cantor said that it is countable. Since the set of real numbers $\mathbb{R}$ is larger, in size, than the set of natural numbers $\mathbb{N}$, Cantor referred to the set $\mathbb{R}$ as being uncountable.

After proving that the set of real numbers is uncountable, Cantor was able to prove that there is an increasing sequence of larger and larger infinite sets. In other words, Cantor showed that there are “infinitely many different infinities,” a result with clear philosophical and mathematical significance.

After his introduction of uncountable sets, in 1878, Cantor announced his Continuum Hypothesis (CH), which states that every infinite set of real numbers is either the same size as the set of natural numbers or the same size as the entire set of real numbers. There is no intermediate size. Cantor struggled, without success, for most of his career to resolve the Continuum Hypothesis. The problem persisted and became one of the most important unsolved problems of the twentieth century. After Cantor’s death, most set theorists came to believe that the Continuum Hypothesis is unresolvable.

Cantor’s profound results on the theory of infinite sets were counterintuitive to many of his contemporaries. Moreover, Cantor’s set theory violated the prevailing dogma that the notion of a “completed infinity” should not be allowed in mathematics. Thus, the outcry of opposition persisted. Influential mathematicians continued to argue that Cantor’s work was subversive to the true nature of mathematics. These mathematicians believed that infinite sets were dangerous fictional creations of Cantor’s imagination and that Cantor’s fictions needed to be eradicated from mathematics (Dauben 1979, p. 1; Dunham 1990, pp. 278-280). Nevertheless, Cantor’s theory of sets soon became a crucial tool used in the discovery and establishment of new mathematical results, for example, in measure theory and the theory of functions (Kanamori 2012). Mathematicians slowly began to see the utility of set theory to traditional mathematics. Accordingly, attitudes started to change and Cantor’s ideas began to gain acceptance in the mathematical community (Dauben 1979, pp. 247-248). The significance of Cantor’s mathematical research was eventually recognized. David Hilbert, a prominent twentieth-century mathematician, described Cantor’s work as being

the finest product of mathematical genius and one of the supreme achievements of purely intellectual human activity. (Hilbert 1923)

Ultimately, Cantor’s theory of abstract sets would dramatically change the course of mathematics.

2. Cantor’s Development of Set Theory

In his development of set theory, Cantor identified a single fundamental principle, called the Comprehension Principle, under which one can form a set. Cantor’s principle states that, given any specific property $\varphi(x)$ concerning a variable $x$, the collection $\{x : \varphi(x)\}$ is a set, where $\{x : \varphi(x)\}$ is the set of all objects $x$ that satisfy the property $\varphi(x)$. For example, let $\psi(x)$ be the property that “$x$ is an odd natural number.” The Comprehension Principle implies that

$S = \{ x : \psi (x)\} = \{1,3,5,7,\ldots \}$

is a set. Employing the Comprehension Principle, one can form the intersection of two sets $A$ and $B$ using the property “$x \in A$ and $x \in B$”; thus, the intersection of $A$ and $B$ is the set

$A \cap B = \{x : x \in A \text{ and } x \in B\}$.

One can also form the set

$A \cup B = \{x : x \in A \text{ or } x \in B\}$,

which is called the union of $A$ and $B$. Recall that one writes $X \subseteq A$ to mean that $X$ is a subset of $A$, that is, every element of $X$ is also an element of $A$. Using the Comprehension Principle, one can form the power set of $A$, which is the set whose elements are all of the subsets of $A$, that is,

$\wp(A) = \{ X : X \subseteq A\}.$

Thus, if $A$ is a set and $X \subseteq A$, then $X \in \wp(A)$. So, if $A = \{1,2,3\}$ and $B = \{3,4,5\}$, then

$A \cap B = \{3\}$,

$A \cup B = \{1,2,3,4,5\}$, and

$\wp(A) = \{\varnothing,\{1\},\{2\},\{3\},\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}\}$,

where $\varnothing$ denotes the empty set, that is, the set that contains no elements. The Comprehension Principle was an essential tool that allowed Cantor to form many important sets. Cantor’s approach to set theory is often referred to as naïve set theory.
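The three constructions above can be reproduced for the example sets. This is a minimal sketch, not Cantor's principle itself: a Python comprehension always draws its candidates from an existing collection, so a small finite `universe` is assumed here. The names `universe` and `power_set` are illustrative, and subsets are stored as `frozenset` values so that they can be elements of another set.

```python
from itertools import combinations

def power_set(A):
    """All subsets of the finite set A, each stored as a frozenset."""
    elems = list(A)
    return {
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    }

# Assumed finite pool of candidate objects x (not part of the text).
universe = range(10)

A = {1, 2, 3}
B = {3, 4, 5}

# {x : x in A and x in B} and {x : x in A or x in B}
intersection = {x for x in universe if x in A and x in B}
union = {x for x in universe if x in A or x in B}

subsets_of_A = power_set(A)
```

For these inputs the intersection contains only 3, the union contains 1 through 5, and `subsets_of_A` has eight members, including the empty set and $A$ itself.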

Cantor’s set theory soon became a very powerful tool in mathematics. In the early 1900s, the mathematicians Émile Borel, René-Louis Baire, and Henri Lebesgue used Cantor’s set theoretic concepts to develop modern measure theory and function theory (Kanamori 2012). This work clearly demonstrated the great mathematical utility of set theory.

a. Russell’s Paradox

The philosopher and mathematician Bertrand Russell was interested in Cantor’s work and, in particular, Cantor’s proof of the following theorem, which implies that the cardinality of the power set of a set is larger than the cardinality of the set. First, recall that a function $g$: $A \rightarrow B$ is a surjection (or is onto $B$) if for all $y \in B$, there is an $x \in A$ such that $g(x)=y$.

Cantor’s Theorem. Let $A$ be a set. Then there is no surjection $f$: $A \rightarrow \wp(A)$.

Proof. Suppose, for the sake of obtaining a contradiction, that there exists a surjection $f$: $A \rightarrow \wp(A)$. Observe that, for all $z \in A$, $f(z) \subseteq A$. By the Comprehension Principle, let $X$ be the set

$X = \{x : x \in A \text{ and } x \notin f(x)\}$.

Clearly, $X \subseteq A$. Thus, $X \in \wp (A)$. As $f$ is onto $\wp(A)$, there is an $a \in A$ such that $f(a) = X$. There are two cases to consider: either $a \in X$ or $a \notin X$. If $a \in X$, then the definition of $X$ implies that $a \notin f(a)$. Since $f(a) = X$, we have that $a \notin X$, which is a contradiction. On the other hand, if $a \notin X$, then the definition of $X$ implies that $a \in f(a)$. Since $f(a) = X$, we see that $a \in X$, a contradiction. Thus, there is no surjection $f$: $A \rightarrow \wp(A)$. This completes the proof.
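For a small finite set, the diagonal construction in the proof can be checked exhaustively. The sketch below enumerates every function from a three-element set $A$ to $\wp(A)$ and confirms that the set $X$ is never among the values of $f$. The helper names `power_set`, `all_functions`, and `diagonal_set` are assumptions made for illustration, and a finite check of this kind illustrates the argument rather than replacing the proof.

```python
from itertools import combinations, product

def power_set(A):
    """All subsets of the finite set A, as a list of frozensets."""
    elems = sorted(A)
    return [
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    ]

def all_functions(A, targets):
    """Yield every function from A to targets, encoded as a dict."""
    elems = sorted(A)
    for values in product(targets, repeat=len(elems)):
        yield dict(zip(elems, values))

def diagonal_set(f, A):
    """The set X = {x in A : x not in f(x)} from the proof."""
    return frozenset(x for x in A if x not in f[x])

A = {0, 1, 2}
subsets = power_set(A)

# For every f: A -> P(A), the diagonal set X is not a value of f.
function_count = sum(1 for _ in all_functions(A, subsets))
missed_every_time = all(
    diagonal_set(f, A) not in f.values()
    for f in all_functions(A, subsets)
)

# One concrete instance.
example = {0: frozenset(), 1: frozenset({0, 1}), 2: frozenset({0})}
example_X = diagonal_set(example, A)
```

In the concrete instance, 0 and 2 are not members of their own images while 1 is, so `example_X` consists of 0 and 2, and that set is not a value of `example`.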

In 1901, after reading Cantor’s proof of the above theorem, which was published in 1891, Bertrand Russell discovered a devastating contradiction that follows from the Comprehension Principle. This contradiction is known as Russell’s Paradox. Consider the property “$x \notin x$”, where $x$ represents an arbitrary set. By the Comprehension Principle, we conclude that

$A = \{x : x \notin x\}$

is a set. The set $A$ consists of all the sets $x$ that satisfy $x \notin x$. Clearly, either $A \in A$ or $A \notin A$. Suppose $A \in A$. Then, the definition of the set $A$ implies that $A$ must satisfy the property $A \notin A$, which contradicts our supposition. Suppose $A \notin A$. Since $A$ satisfies $A \notin A$, we infer, from the definition of $A$, that $A \in A$, which is also a contradiction.

There were similar paradoxes discovered by others, including Cantor (Dauben 1979), but Russell’s paradox is the easiest to understand. These paradoxes appeared to threaten Cantor’s fundamental principle that he used to develop set theory. Nevertheless, Cantor did not believe that these paradoxes actually refuted his development of set theory. He knew that the construction of certain collections can lead to a contradiction. Cantor referred to these collections as “inconsistent multiplicities.” Today, such collections are called proper classes, and the paradoxes can be used to prove that they are not sets.

3. The Zermelo-Fraenkel Axioms

Over time, it became clear that, to resolve the paradoxes in Cantor’s set theory, the Comprehension Principle needed to be modified. Thus, the following question needed to be addressed:

How can one correctly construct a set?

Ernst Zermelo (1871–1953) observed that to eliminate the paradoxes, the Comprehension Principle could be restricted as follows: Given any set $A$ and any property $\psi (x)$, one can form the set $\{x \in A : \psi (x)\}$; that is, the collection of all elements $x \in A$ that satisfy $\psi (x)$ is a set. Zermelo’s approach differs from Cantor’s method of forming a set. Cantor declared that for every property one can form a set of all the objects that satisfy the property. Zermelo adopted a different approach: To form a set, one must use a property together with a set.

Zermelo also realized that in order to more fully develop Cantor’s set theory, one would need additional methods for forming sets. Moreover, these additional methods would need to avoid the paradoxes. In 1908, Zermelo published an axiomatic system for set theory that, to the best of our knowledge, avoids the difficulties faced by Cantor’s development of set theory. In 1930, after receiving some proposed revisions from Abraham Fraenkel, Zermelo presented his final axiomatization of set theory, now known as the Zermelo-Fraenkel axioms and denoted by ZF. These axioms have become the accepted formulation of Cantor’s ideas about the nature of sets.

a. The Axioms

As noted by Zermelo, to avoid paradoxes, the Comprehension Principle can be replaced with the principle: Given a set $A$ and a property $\varphi (x)$ with a variable $x$, the collection $\{x \in A : \varphi (x)\}$ is a set. However, this raises a new question: What is a property? The most favored way to address this question is to express the axioms of set theory in the formal language of first-order logic, and then declare that its formulas designate properties. This language involves variables and the logical connectives $\wedge$ (and), $\vee$ (or), $\neg$ (not), $\rightarrow$ (if … then …), and $\leftrightarrow$ (if and only if), together with the quantifier symbols $\forall$ (for all) and $\exists$ (there exists). In addition, this language uses the relation symbols $=$ and $\in$ (as well as $\neq$ and $\notin$). In this language, the variables and quantifiers range over sets and only sets. A formula constructed in this formal language is referred to as a formula in the language of set theory. Such formulas are used to give meaning to the notion of “property.”

We now illustrate the expressive power of this set theoretic language. The formula $\exists x(x \in A)$ asserts that “the set $A$ is nonempty,” and $\forall x(x \notin A)$ states that “$A$ has no elements.” Moreover $\neg \exists x \forall y(y \in x)$ states that “it is not the case that there is a set that contains all sets as elements.” In addition, one can translate English statements, which concern sets, into the language of set theory. For example, the English sentence “the set $A$ contains at least two elements” can be translated into the language of set theory by $\exists x \exists y((x \in A \wedge y \in A) \wedge x \neq y)$.
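When $A$ is a finite set, formulas whose quantified variables only need to range over the elements of $A$ can be evaluated directly. The sketch below reads $\exists$ as Python's `any` and $\forall$ as `all`, with the quantifiers restricted to the elements of `A`; this restriction and the function names are assumptions made for illustration, since in the language of set theory the quantifiers range over all sets.

```python
def is_nonempty(A):
    # exists x (x in A)
    return any(True for x in A)

def has_no_elements(A):
    # forall x (x not in A): no candidate x is ever found in A
    return all(False for x in A)

def has_at_least_two_elements(A):
    # exists x exists y ((x in A and y in A) and x != y)
    return any(x != y for x in A for y in A)
```

For example, `has_at_least_two_elements({7})` is false because the only available pair has $x = y$, while `has_at_least_two_elements({7, 8})` is true.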

There is another quantifier, called the uniqueness quantifier, that is sometimes used. This quantifier is written as $\exists ! x \varphi (x)$ and it means that “there exists a unique $x$ satisfying $\varphi (x)$.” This is in contrast with $\exists x \varphi(x)$, which simply states that “at least one $x$ satisfies $\varphi (x)$.” The uniqueness quantifier is used as a convenience, as the assertion $\exists !x \varphi (x)$ can be expressed in terms of the other quantifiers $\exists$ and $\forall$; namely, it is equivalent to the formula

$\exists x \varphi (x) \wedge \forall x \forall y ((\varphi (x) \wedge \varphi (y)) \rightarrow x=y)$.

The above formula is equivalent to $\exists!x \varphi (x)$ because it asserts that “there is an $x$ such that $\varphi(x)$ holds, and any sets $x$ and $y$ that satisfy $\varphi (x)$ and $\varphi(y)$ must be the same set.”
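The equivalence can be illustrated over a finite domain. In the sketch below, `exists_unique` counts witnesses directly, while `exists_unique_expanded` follows the two conjuncts of the displayed formula; both function names and the finite `domain` are assumptions introduced here, and agreement on a few sample properties is an illustration rather than a proof.

```python
def exists_unique(phi, domain):
    """Direct reading: exactly one x in the domain satisfies phi."""
    return sum(1 for x in domain if phi(x)) == 1

def exists_unique_expanded(phi, domain):
    """Reading via the expansion: existence, plus any two witnesses are equal."""
    exists = any(phi(x) for x in domain)
    at_most_one = all(
        x == y
        for x in domain
        for y in domain
        if phi(x) and phi(y)
    )
    return exists and at_most_one

domain = range(10)  # assumed finite domain for illustration

def square_is_nine(x):
    return x * x == 9       # one witness in the domain

def is_even(x):
    return x % 2 == 0       # several witnesses

def exceeds_hundred(x):
    return x > 100          # no witness
```

Only `square_is_nine` has a unique witness in this domain, and the two readings agree on all three sample properties.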

The Zermelo-Fraenkel axioms are listed below. Each axiom is first stated in English and then written in logical form. After each logical form, there is a discussion of the axiom and some of its consequences. When reading these axioms, keep in mind that, in Zermelo-Fraenkel set theory, everything is a set, including the elements of a set. Also, the notation $\vartheta (x, \ldots, z)$ means that $x, \ldots, z$ are free variables in the formula $\vartheta$ and that $\vartheta$ is allowed to contain parameters (free variables other than $x, \ldots, z$) that represent arbitrary sets.

Extensionality Axiom. Two sets are equal if and only if they have the same elements.

$\forall A \forall B ( A = B \leftrightarrow \forall x ( x \in A \leftrightarrow x \in B))$.

The extensionality axiom is essentially a “definition” that states that two sets are equal if and only if they have exactly the same elements.

Empty Set Axiom. There is a set with no elements.

$\exists A \forall x ( x \notin A)$.

The empty set axiom states that there is a set which has no elements. Since the extensionality axiom implies that this set is unique, we let $\varnothing$ denote the empty set.

Subset Axiom. Let $\varphi(x)$ be a formula. For every set $A$, there is a set $S$ that consists of all the elements $x \in A$ such that $\varphi(x)$ holds.

$\forall A \exists S \forall x ( x \in S \leftrightarrow ( x \in A \wedge \varphi (x)))$.

(The variable $S$ is assumed not to appear in the formula $\varphi (x)$.) The subset axiom, also known as the axiom of separation, asserts that any definable sub-collection of a set is itself a set; that is, for any formula $\varphi(x)$ and any set $A$, the collection $\{x \in A : \varphi(x)\}$ is a set. Clearly, the subset axiom is a limited form of the Comprehension Principle. Yet, it does not lead to the contradictions that result from the Comprehension Principle. The subset axiom is, in fact, an axiom schema, since it yields infinitely many axioms, one for each formula $\varphi$.
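
For finite sets, the subset axiom corresponds to filtering an existing set by a predicate. The sketch below is only an illustration under that assumption; the name `separation` and the sample predicate are hypothetical.

```python
def separation(A, phi):
    """Return {x in A : phi(x)}; the result is always a subset of A."""
    return frozenset(x for x in A if phi(x))

A = frozenset(range(10))
S = separation(A, lambda x: x % 3 == 0)  # phi(x): "x is divisible by 3"
print(sorted(S))  # [0, 3, 6, 9]
```

The function can only select elements from a set $A$ that is already given. This is the restriction that separates the subset axiom from the unrestricted Comprehension Principle.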

Pairing Axiom. For every $u$ and $v$, there is a set that consists of just $u$ and $v$.

$\forall u \forall v \exists P \forall x ( x \in P \leftrightarrow ( x =u \vee x = v))$.

The pairing axiom states that, for any two sets $u$ and $v$, the set $\{u, v\}$ exists. Thus, by the extensionality axiom, the set $\{u, u\} = \{u\}$ exists.

Union Axiom. For every set $F$, there exists a set $U$ that consists of all the elements that belong to at least one set in $F$.

$\forall F \exists U \forall x ( x \in U \leftrightarrow \exists C (C \in F \wedge x \in C))$.

The union axiom states that, for any set $F$, there is a set $U$ whose elements are precisely those elements that belong to an element of $F$, that is, $x \in U$ if and only if $x \in A$ for some $A \in F$. The extensionality axiom implies that the set $U$ is unique; it is often denoted by $\bigcup F$. For example, consider the set $\{A,B\}$. Then

$\bigcup \{A,B\} = \{x : x$ belongs to a member of $\{A,B\}\} = \{x : x \in A$ or $x \in B\} = A \cup B$.

For another example, let $F = \{ \{a,b,c\},\{e,f\},\{e,c,d\} \}$. Then $\bigcup F = \{a,b,c,d,e,f\}$.
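
The same computation can be sketched for finite families of sets. Here the name `big_union` is an illustrative assumption, and Python's `frozenset` stands in for a set whose elements are themselves sets.

```python
def big_union(F):
    """Return the set of all x that belong to at least one member of F."""
    return frozenset(x for C in F for x in C)

F = frozenset({
    frozenset({"a", "b", "c"}),
    frozenset({"e", "f"}),
    frozenset({"e", "c", "d"}),
})
U = big_union(F)
print(sorted(U))  # ['a', 'b', 'c', 'd', 'e', 'f']
```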

Power Set Axiom. For every set $A$, there exists a set $P$ that consists of all the sets that are subsets of $A$.

$\forall A \exists P \forall x ( x \in P \leftrightarrow \forall y( y \in x \rightarrow y \in A))$.

The power set axiom states that, for any set $A$, there is a set, which we denote by $\wp(A)$, such that for any set $B$, $B \in \wp(A)$ if and only if $B \subseteq A$.
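
For a finite set, the power set can be listed explicitly. The following sketch assumes a helper named `power_set` (an illustrative name) and checks the defining property $B \in \wp(A)$ if and only if $B \subseteq A$ on a three-element set.

```python
from itertools import chain, combinations

def power_set(A):
    """Return the set of all subsets of the finite set A, each as a frozenset."""
    elements = list(A)
    subsets = chain.from_iterable(
        combinations(elements, k) for k in range(len(elements) + 1)
    )
    return frozenset(frozenset(s) for s in subsets)

A = frozenset({1, 2, 3})
P = power_set(A)
print(len(P))                  # 8
print(all(B <= A for B in P))  # True: every member of P is a subset of A
```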

Infinity Axiom. There is a set $I$ that contains the empty set as an element and whenever $x \in I$, then $x \cup \{x\} \in I$.

$\exists I ( \varnothing \in I \wedge \forall x (x \in I \rightarrow x \cup \{ x \} \in I))$.

The infinity axiom ensures the existence of at least one infinite set. For any set $x$, the successor of $x$ is defined to be the set $x^{+} = x \cup \{x\}$. Thus, the axiom of infinity asserts that there is a set $I$ such that $\varnothing \in I$ and if $x \in I$, then $x^{+} \in I$. Note that $\varnothing^{+} = \{\varnothing\}$, and that $\{\varnothing\}^{+} = \{\varnothing,\{\varnothing\}\}$. It follows that the set $I$ contains each of the sets

$\varnothing; \{\varnothing\}; \{\varnothing, \{\varnothing \}\}; \{\varnothing, \{\varnothing \}, \{\varnothing, \{\varnothing \}\}\}; \ldots$.

One can show that any two of the sets in the above list (separated by a semi-colon) are distinct. Hence, the set $I$ contains an infinite number of elements; in other words, $I$ is an infinite set. So, the infinity axiom simply states that infinite sets exist and are legitimate mathematical objects. The infinity axiom is a key tool that is used to develop the set of natural numbers $\mathbb{N}$ and to prove that $\mathbb{N}$ is well-ordered, that is, every nonempty set of natural numbers has a least element.
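
The first few successors can be generated mechanically. The sketch below is an illustration only: the names `successor` and `show` are hypothetical, and the computation covers finitely many stages, whereas the set $I$ itself cannot be built this way.

```python
def successor(x):
    """Return the successor x+ = x ∪ {x}."""
    return x | frozenset({x})

def show(x):
    """Render a hereditarily finite set using ∅ and braces."""
    if not x:
        return "∅"
    parts = sorted((show(y) for y in x), key=lambda s: (len(s), s))
    return "{" + ", ".join(parts) + "}"

stages = [frozenset()]  # start with the empty set
for _ in range(4):
    stages.append(successor(stages[-1]))

for s in stages:
    print(show(s))

# Any two of the generated sets are distinct.
print(len(set(stages)) == len(stages))  # True
```

Each application of `successor` adds exactly one new element, namely the previous set itself, so the stage reached after $n$ steps has exactly $n$ elements.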

Replacement Axiom. Let $\psi (x, y)$ be a formula. For every set $A$, if for each $x \in A$ there is a unique $y$ such that $\psi (x, y)$, then there is a set $S$ that consists of all of the elements $y$ such that $\psi (x, y)$ for some $x \in A$. (Below, $\exists$! is the uniqueness quantifier.)

$\forall A (\forall x ( x \in A \rightarrow \exists ! y \psi (x,y)) \rightarrow \exists S \forall y( y \in S \leftrightarrow \exists x (x \in A \wedge \psi(x, y))))$.

(The variable $S$ is assumed not to appear in the formula $\psi (x, y)$.) The replacement axiom states that for every set $A$, if for each $x \in A$ there is a unique $y$ such that $\psi(x,y)$, then the collection $\{y : \exists x (x \in A \wedge \psi(x,y))\}$ is a set; that is, “the functional image of a set is a set.” The replacement axiom is a special form of Cantor’s Comprehension Principle that plays a critical role in modern set theory. However, the replacement axiom does not lead to the contradictions that follow from the Comprehension Principle. Like the subset axiom, the replacement axiom is an axiom schema. Accordingly, there are infinitely many Zermelo-Fraenkel axioms.
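
A finite analogue of the replacement axiom can be sketched as follows. The function name `replacement`, the relation `psi`, and the finite pool `candidates` in which witnesses are searched are all assumptions made for the illustration. The axiom itself places no such bound on where the values $y$ may lie.

```python
def replacement(A, psi, candidates):
    """Return {y : psi(x, y) for some x in A}, after checking that psi is
    functional on A. Witnesses are searched in the finite pool `candidates`."""
    image = set()
    for x in A:
        witnesses = [y for y in candidates if psi(x, y)]
        if len(witnesses) != 1:  # the hypothesis "there is a unique y" fails
            raise ValueError(f"psi is not functional at x = {x!r}")
        image.add(witnesses[0])
    return frozenset(image)

A = frozenset({1, 2, 3})
psi = lambda x, y: y == x * x  # psi(x, y): "y is the square of x"
S = replacement(A, psi, range(100))
print(sorted(S))  # [1, 4, 9]
```

The check inside the loop mirrors the hypothesis of the axiom. When $\psi$ fails to be functional on $A$, the sketch raises an error rather than returning a set.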

Regularity Axiom. Each nonempty set $A$ contains an element that is disjoint from $A$.

$\forall A ( A \neq \varnothing \rightarrow \exists x ( x \in A \wedge \neg \exists y ( y \in x \wedge y \in A)))$.

The regularity axiom, also known as the axiom of foundation, states that, for any nonempty set $A$, there is a set $x \in A$ such that $A \cap x = \varnothing$. The regularity axiom rules out the possibility of a set belonging to itself. In standard mathematics, there are no sets that are members of themselves. For example, the set of natural numbers is not a natural number. The regularity axiom eliminates collections that are not relevant for standard mathematics. The regularity and pairing axioms imply that if $a \in b$, then $b \notin a$. To see this, suppose that $a \in b$. By the pairing axiom, the set $\{a,b\}$ exists, and by regularity, one of its elements is disjoint from it. That element cannot be $b$, because $a \in b \cap \{a,b\}$. Hence $a \cap \{a,b\} = \varnothing$. So $b \notin a$.
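
The argument can be traced on small hereditarily finite sets. In the sketch below, the name `disjoint_member` and the particular sets are illustrative assumptions. Note also that a Python `frozenset` cannot contain itself, so the structures modeled here are well-founded by construction and the search never fails on a nonempty input.

```python
def disjoint_member(A):
    """Return some x in A with x ∩ A = ∅, or None if no such x exists."""
    for x in A:
        if not (x & A):
            return x
    return None

zero = frozenset()           # ∅
a = frozenset({zero})        # {∅}
b = frozenset({zero, a})     # {∅, {∅}}, so a ∈ b

pair = frozenset({a, b})     # the set {a, b} given by the pairing axiom
witness = disjoint_member(pair)
print(witness == a)          # True: a is disjoint from {a, b}
print(b in a)                # False: b ∉ a
```

Here $b$ meets $\{a,b\}$ in the element $a$, so the only element of $\{a,b\}$ that can be disjoint from it is $a$.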

The Zermelo-Fraenkel axioms are now the most widely accepted answer to the question: How can one correctly construct a set? Of course, these axioms are more restrictive than Cantor’s Comprehension Principle; however, no one, in over 100 years, has been able to derive a contradiction from these axioms. Moreover, all of the classic results (excluding the paradoxes) that were derived using Cantor’s naïve set theory can be derived from the Zermelo-Fraenkel axioms.

It is a remarkable fact that essentially all mathematical objects can be defined as sets within Zermelo-Fraenkel set theory. For example, functions, relations, the natural numbers, and the real numbers can be defined within Zermelo-Fraenkel set theory. Hence, effectively all theorems of mathematics can be considered as statements about sets and proven from the Zermelo-Fraenkel axioms.

b. Classes

The argument used in Russell’s Paradox can be applied to prove, in ZF, that there is no set that contains all sets (as elements). As every set is equal to itself, the collection $\{x$ : $x = x\}$ contains every set, but this collection is not a set. Thus, given a formula $\varphi(x)$, one cannot necessarily conclude that the collection $\{x : \varphi(x)\}$ is a set. However, in set theory, it is convenient to be able to discuss such collections. They cannot be called sets. Instead, a collection of the form $\{x$ : $\varphi(x)\}$ is called a class. The collection $\{x : x = x\}$ is a class that is not a set; for this reason, it is called a proper class.

When can one prove that a class is a set? Let us say that a class $\{x : \varphi(x)\}$ is bounded if and only if there is a set $A$ such that for all $x$, if $\varphi(x)$, then $x \in A$. Using the subset axiom, one can prove that a bounded class is a set. It follows that the class $\{x : x = x\}$ is not bounded.

In the Zermelo-Fraenkel axioms, there is no explicit mention of classes. However, there are alternative axiomatizations of set theory that extend ZF by including classes as objects in the language, that is, these axiom systems give classes a formal state of existence. The most common such axiomatic treatment of classes is denoted by NBG (von Neumann–Bernays–Gödel). The NBG system uses a formal language that has two different types of variables: capital letters denote classes and lowercase letters denote sets. In addition, classes can contain only sets as elements. So, a class that is not a set cannot belong to a class. Thus, a class $X$ is a set if and only if $\exists Y (X \in Y)$. In the NBG system, sets satisfy all of the ZF axioms, and the intersection of a class with a set is a set, that is, $X \cap y$ is a set. The NBG system also has the class comprehension axiom:

$\exists X \forall y (y \in X \leftrightarrow \varphi (y))$

where the formula $\varphi(y)$ can contain set parameters and/or class parameters (with other restrictions). Thus, the class comprehension axiom asserts that $\{x : \varphi(x)\}$ is a class.

The NBG system is a conservative extension of ZF; that is, a sentence with only lowercase (set) variables is provable in NBG if and only if it is provable in ZF. The Zermelo-Fraenkel system has a clear advantage over NBG, namely, the simplicity of working with only one type of object (sets) rather than two types of objects (sets and classes). The Zermelo-Fraenkel axiomatic system is the standard system of axioms for modern set theory.

4. Cantor’s Well-Ordering Principle

As proposed by Cantor, two sets $A$ and $B$ have the same cardinality if and only if there is a bijection $f$: $A \rightarrow B$. When $A$ is a finite set, there is a unique natural number, denoted by |$A$|, that identifies the number of elements in $A$. In this case, we say that |$A$| is the cardinality of $A$. For example, if $A = \{3,5,7,2\}$, then |$A$| $= 4$. Clearly, the cardinality of a finite set identifies the number of elements that are in the set. Moreover, if $A$ and $B$ are both finite sets, then one can prove that

|$A$| = |$B$| if and only if there exists a bijection $f$: $A \rightarrow B$.

($\Delta$)

With this understanding, Cantor asked the following question:

Are there values that can represent the size of infinite sets and satisfy ($\Delta$)?

In other words, given two infinite sets $A$ and $B$, can one assign values |$A$| and |$B$| such that

|$A$| = |$B$| if and only if there exists a bijection $f$: $A \rightarrow B$?

Cantor answered this question, in the affirmative, by developing the transfinite ordinal numbers, which are “infinite numbers” in the sense that they are larger than all of the natural numbers, and are well-ordered just like the natural numbers. Cantor believed that each infinite set can be assigned a specific ordinal number and that this ordinal number would measure the size of the set. Cantor realized that, in order to successfully apply his theory of ordinal numbers, he needed an additional principle. In 1883, he proposed the following principle.

Well-Ordering Principle: It is always possible to bring any well-defined set into the form of a well-ordered set.

A relation $\leq$ on a set $X$ is a well-ordering of $X$ if and only if it is a total ordering in which every non-empty subset of $X$ has a least element, where it is assumed that the relation $\leq$ does not apply to any elements that are not in $X$. If a set can be well-ordered, then one can generalize the concepts of induction and recursion, similar to mathematical induction, on the elements of the set. Given any infinite set, Cantor used the well-ordering principle to identify an ordinal number that measures the size of the set. Such an ordinal is called a cardinal number.
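
For a finite set, the definition of a well-ordering can be checked by brute force. The following sketch uses a hypothetical function `is_well_ordering` and tests the conditions of a total ordering together with the least-element condition on every nonempty subset.

```python
from itertools import chain, combinations

def is_well_ordering(X, leq):
    """Check that leq is a total ordering of the finite set X in which every
    nonempty subset of X has a least element."""
    X = list(X)
    reflexive = all(leq(a, a) for a in X)
    antisymmetric = all(
        a == b or not (leq(a, b) and leq(b, a)) for a in X for b in X
    )
    transitive = all(
        leq(a, c) or not (leq(a, b) and leq(b, c))
        for a in X for b in X for c in X
    )
    total = all(leq(a, b) or leq(b, a) for a in X for b in X)
    subsets = chain.from_iterable(
        combinations(X, k) for k in range(1, len(X) + 1)
    )
    least = all(
        any(all(leq(m, s) for s in sub) for m in sub) for sub in subsets
    )
    return reflexive and antisymmetric and transitive and total and least

print(is_well_ordering({3, 5, 7, 2}, lambda a, b: a <= b))    # True
print(is_well_ordering({2, 3, 4}, lambda a, b: b % a == 0))   # False
```

The second relation is divisibility, which fails to be total because neither of $2$ and $3$ divides the other. On a finite set every total ordering is a well-ordering, so the least-element condition only becomes an additional requirement for infinite sets. For instance, the usual ordering of the integers is total but the set of all integers has no least element.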

a. Ordinal Numbers

The natural numbers are often used for two purposes: to indicate the position of an element in a sequence and to identify the size of a finite set. In other words, a natural number can be used to identify a position (first, second, third, …) and it can be used to identify a size (one, two, three, …). Cantor extended the natural numbers by introducing the concepts of transfinite position and transfinite size. Suppose that we want to count the number of real numbers. As noted in Section 1, Cantor proved that the set of real numbers is uncountable. Thus, if we attempted to assign each real number to exactly one of the natural numbers $0, 1, 2, 3, \ldots,$ then we would not have enough natural numbers to complete this task. However, suppose that we add some new numbers, called transfinite ordinals, to our stock of numbers. Clearly, we need an ordinal that will identify the first position that occurs after all of the natural numbers. Cantor denoted this ordinal by the Greek letter $\omega$. That is, Cantor proposed the following “position” sequence

$0, 1, 2, 3, 4, \ldots, \omega$.

(1)

Observe the following:

By starting with $0$ and repeatedly adding $1$, we obtain all of the natural numbers.
Every natural number greater than $0$ has an immediate predecessor; for example, $5$ has $4$ as its immediate predecessor.

By contrast, the ordinal number $\omega$ cannot be obtained by repeatedly adding $1$ to $0$ and it does not have an immediate predecessor. For these reasons, we say that $\omega$ is a limit ordinal.

We can continue the sequence (1) by repeatedly adding $1$ to $\omega$. By doing so, we obtain the following position sequence:

$0, 1, 2, 3, 4, \ldots, \omega, \omega+1, \omega+2, \omega+3, \ldots$

(2)

The process for constructing (1) and (2) can be repeated endlessly. In this way, we obtain the ordered sequence of all of the ordinals:

$0, 1, 2, 3, 4, \ldots, \omega, \omega+1, \omega+2, \ldots ,\omega+\omega,(\omega+\omega)+1,(\omega+\omega)+2, \ldots$

(3)

where $\omega+\omega$ is a limit ordinal which is usually represented by $2 \cdot \omega$. An ordinal of the form $\alpha+1$ is called a successor ordinal. An ordinal $\delta$ > $0$ that is not a successor ordinal is called a limit ordinal. Cantor used the ordinals to measure the “length” of a well-ordered set.

The natural numbers $0, 1, 2, 3, 4, \ldots$ are sometimes called finite ordinals. Every nonempty subset of the natural numbers has a least element. Similarly, every nonempty set of ordinals has a least element with respect to the ordering in (3). The ordinal numbers are a generalized extension of the natural numbers. One can define the operations of addition, multiplication, and exponentiation on the ordinal numbers. These operations satisfy some (but not all) of the arithmetic properties that hold on the natural numbers, for example, addition is associative (Cunningham 2016).

The set of predecessors of an ordinal is the set of all of the ordinals that come before it in the list (3); for example, the set of predecessors of $\omega$ and $\omega+1$ are the respective sets

$\mathbb{N} = \{0, 1, 2, 3, 4, \ldots\}$, $N’ = \{0, 1, 2, 3, 4, \ldots , \omega \}$.

(4)

The ordinals $\omega$ and $\omega+1$ represent different positions in the list (3), but the sets $\mathbb{N}$ and $N’$ in (4) have the same cardinality. Note that the cardinality of $\mathbb{N}$ is larger than that of any finite set; that is, for any natural number $n$, the set $\mathbb{N}$ has cardinality larger than the set $\{0, 1, 2, \ldots, n\}$. For this reason, we say that $\omega$ is a cardinal number.

For any two ordinals $\alpha$ and $\beta$, we say that $\alpha$ < $\beta$ if and only if $\alpha$ appears before $\beta$ in the list (3). For each ordinal $\gamma$, let Pred($\gamma$) = $\{\alpha : \alpha$ < $\gamma\}$ be the set of predecessors of $\gamma$. One can prove, in ZF, that Pred($\gamma$) is a set. In contemporary set theory one usually defines the ordinals so that, for each ordinal $\gamma$, $\gamma$ = Pred$(\gamma)$; that is, each ordinal is defined to be the set of its predecessors. Specifically, a set $\gamma$ is said to be an ordinal if and only if $\gamma$ is well-ordered by the membership relation and is transitive, that is, every element in $\gamma$ is a subset of $\gamma$. Thus, if $\alpha$ < $\beta$, then $\alpha \in \beta$ and $\alpha \subseteq \beta$. For example, $\omega = \{0, 1, 2, 3, 4, \ldots\}$ is an ordinal if the natural numbers (the finite ordinals) are defined as follows:

$0 = \varnothing$,
$1 = \{0\}$,
$2 = \{0,1\}$,
$3 = \{0,1,2\}$,
$4 = \{0,1,2,3\}$.

This approach is due to von Neumann (Kunen 2009), and such ordinals can be called von Neumann ordinals. The collection of all ordinals is a proper class (see Cunningham 2016).
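
The finite ordinals defined in this way can be built and inspected directly. In the following sketch, the names `von_neumann` and `is_transitive` are illustrative assumptions, and only finite ordinals are covered.

```python
def von_neumann(n):
    """Return the finite ordinal n as the set {0, 1, ..., n-1} of its predecessors."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | frozenset({ordinal})
    return ordinal

def is_transitive(s):
    """A set is transitive when every element of it is also a subset of it."""
    return all(x <= s for x in s)

three, five = von_neumann(3), von_neumann(5)
print(three in five, three <= five)  # True True: 3 < 5 gives 3 ∈ 5 and 3 ⊆ 5
print(is_transitive(five))           # True
print(five == frozenset(von_neumann(k) for k in range(5)))  # True: 5 = Pred(5)
```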

b. Cardinal Numbers

An ordinal number $\kappa$ is said to be a cardinal if and only if, for all $\alpha$ < $\kappa$, the set Pred($\alpha$) has smaller cardinality than Pred($\kappa$). It follows that the natural numbers are all cardinals. As noted above, $\omega$ is the first transfinite cardinal, which is often denoted by $\aleph_{0}$. The next transfinite cardinal, after $\aleph_{0}$, is designated by $\aleph_{1}$. This process can be continued to produce the following sequence of finite and transfinite cardinals:

$0, 1, 2, 3, 4, \ldots, \aleph_{0}, \aleph_{1}, \ldots, \aleph_{\omega}, \aleph_{\omega+1}, \ldots, \aleph_{2 \cdot \omega}, \ldots, \aleph_{\omega \cdot \omega}, \ldots$

(5)

where the transfinite cardinal numbers in (5) are indexed by the ordinal numbers. Thus, the collection of all the cardinal numbers is a proper class. A cardinal $\aleph_{\beta}$ is called a successor cardinal if and only if $\beta$ is a successor ordinal; otherwise, it is called a limit cardinal. One can prove, in ZF, that, for every cardinal $\kappa$, there is an ordinal $\alpha$ such that $\kappa = \aleph_{\alpha}$ (Cunningham 2016). Thus, every cardinal appears on the list (5). One can define the operations of addition, multiplication, and exponentiation on the cardinals (exponentiation requires the well-ordering principle). These particular operations are not the same as the corresponding operations on the ordinal numbers (Cunningham 2016).

Cantor used the cardinal numbers to measure the “size” of sets. The well-ordering principle implies that every set $A$ can be assigned a (unique) cardinal number that measures its size. This cardinal number is usually denoted by |$A$|, and is called the cardinality of $A$. Cantor’s Theorem implies that, for any set $A$, |$A$| < |$\wp(A)$|. The operation of cardinal exponentiation allowed Cantor to prove that the cardinality of $\mathbb{R}$, the set of real numbers, is equal to $2^{\aleph_{0} }$, that is, |$\mathbb{R}$| = $2^{\aleph_{0}}$. Since $\aleph_{1}$ is the first cardinal greater than $\aleph_{0}$, Cantor was able to express the Continuum Hypothesis in terms of the equation $2^{\aleph_{0}} = \aleph_{1}$. Moreover, assuming the well-ordering principle, one can conclude that a set $A$ is countable if and only if |$A$| $\leq \aleph_{0}$ and that a set $B$ is uncountable if and only if $\aleph_{1} \leq$ |$B$|.

Infinite cardinals come in two distinct forms: regular or singular. An infinite cardinal $\kappa$ is said to be a regular cardinal if and only if $\kappa$ is not the union of a set consisting of less than $\kappa$ many smaller cardinals. Thus, if $\kappa$ is a regular cardinal, $S$ is a set of cardinals smaller than $\kappa$, and |$S$| < $\kappa$, then $\kappa \neq \bigcup S$. Assuming the well-ordering principle, it follows that each successor cardinal is a regular cardinal. When a cardinal is not regular, it is called a singular cardinal. One can show that an infinite cardinal $\kappa$ is singular if and only if there exists an ordinal $\beta$ < $\kappa$ and a function $f$: Pred($\beta$) → Pred($\kappa$) such that for all $\gamma$ < $\kappa$ there is an ordinal $\alpha$ < $\beta$ such that $\gamma$ < $f(\alpha)$. It follows that $\aleph_{\omega}$ is a singular cardinal.
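
As a worked instance of this characterization (a sketch, using the standard fact that $\aleph_{\omega}$ is the least upper bound of the cardinals $\aleph_{n}$ for $n$ < $\omega$), take $\beta = \omega$ and define the function $f$: Pred($\omega$) → Pred($\aleph_{\omega}$) by

$f(n) = \aleph_{n}$, for each $n$ < $\omega$.

Since $\aleph_{\omega} = \bigcup \{\aleph_{n} : n < \omega\}$, every ordinal $\gamma$ < $\aleph_{\omega}$ belongs to some $\aleph_{n}$; that is, $\gamma$ < $\aleph_{n} = f(n)$. As $\omega$ < $\aleph_{\omega}$, the function $f$ witnesses that $\aleph_{\omega}$ is singular. Equivalently, $\aleph_{\omega}$ is the union of only $\aleph_{0}$ many smaller cardinals.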

5. The Axiom of Choice

At the third International Congress of Mathematicians at Heidelberg in 1904, Julius König submitted a proof that the well-ordering principle is false; in particular, he presented an argument showing the set of real numbers cannot be well-ordered. On the next day, Ernst Zermelo identified an error in König’s purported proof. Shortly after the Heidelberg congress, Zermelo (Moore 2012) discovered a proof of the following theorem, which implies that the error found in König’s proof cannot be removed.

Well-Ordering Theorem: Every set can be well-ordered.

In his clever proof of the well-ordering theorem, Zermelo formulated and applied the following principle, which he was the first to identify.

Axiom of Choice (AC). Let $T$ be a set of nonempty sets. Then there is a function $F$ such that, for each set $A$ in $T$, $F(A) \in A$.

The function $F$ mentioned in AC is called a choice function for the set $T$. Informally, the axiom of choice asserts that, for any collection of nonempty sets, it is possible to uniformly choose exactly one element from each set in the collection. When $T$ is a finite set, one can prove, in ZF, that there exists a choice function. Today, mathematicians use the axiom of choice when the set $T$ is infinite and it is not clear how to define or construct a desired choice function.
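
When an explicit selection rule is available, a choice function can be written down without any appeal to the axiom. The sketch below illustrates this for a finite family of finite sets of numbers; the name `choice_function` and the rule "take the least element" are assumptions made for the example.

```python
def choice_function(T, rule=min):
    """Return a dict F with F[A] in A for every set A in the family T."""
    if any(len(A) == 0 for A in T):
        raise ValueError("every member of T must be nonempty")
    return {A: rule(A) for A in T}

T = [frozenset({3, 1, 2}), frozenset({7}), frozenset({10, 4})]
F = choice_function(T)
print(all(F[A] in A for A in T))  # True
```

The axiom of choice is needed precisely when no uniform rule such as `min` is known for the sets in $T$.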

Zermelo applied the axiom of choice to establish the well-ordering theorem. The well-ordering theorem validates both Cantor’s well-ordering principle and that every set can be assigned a cardinal number that measures its size.

a. On Zermelo’s Proof of the Well-Ordering Principle

Zermelo’s proof of the well-ordering theorem is the first mathematical argument that explicitly invokes the axiom of choice. As a result, the proof can be viewed as an important moment in the development of modern set theory. For this reason, we now present a summary of this proof. Let $A$ be a nonempty set and let $T$ be the set of all nonempty subsets of $A$; that is, let

$T = \{ X \in \wp (A)$ : $X \neq \varnothing \}$.

Let $\gamma$ be a choice function for $T$. Call a set $X \in T$ a $\gamma$-set if and only if there is a well-ordering $\leq$ of $X$ such that, for each $a \in X$,

$\gamma(\{z \in A : z \nless a\}) = a$.

Thus, each element $a \in X$ is the element that the choice function $\gamma$ selects from the set of all elements in $A$ that do not (strictly) precede $a$ in the ordering $\leq$. For example, if $w = \gamma(A)$, then one can show that $\{w\}$ is a $\gamma$-set. Thus, $\gamma$-sets exist. Let $X$ be a $\gamma$-set with well-ordering $\leq$ and let $Y$ be a $\gamma$-set with well-ordering $\leq’$. In his proof, Zermelo showed that either $X \subseteq Y$ and $\leq’$ continues $\leq$ or $Y \subseteq X$ and $\leq$ continues $\leq’$, where we say that $\leq’$ continues $\leq$ when the order $\leq’$ only adds new elements that are greater than all of the elements ordered by $\leq$. Zermelo also showed that the union of all of the $\gamma$-sets is a $\gamma$-set and that this union equals $A$. Therefore, $A$ can be well-ordered.

Essentially, the axiom of choice states that one can make infinitely many arbitrary choices. As noted above, Cantor’s acceptance of infinite sets led to a dispute among some of Cantor’s contemporaries. Similarly, Zermelo’s axiom of choice incited further controversy concerning the infinite. The main objection to the axiom of choice was the obvious one: How can the existence of a choice function be justified when such a function cannot be defined or explicitly constructed? Surprisingly, many of the axiom’s severest critics had unwittingly applied the axiom in their own work. In the decades following its introduction, the axiom of choice gained acceptance among most mathematicians; in part, this was because the axiom of choice is a very useful principle whose deductive strength is required to prove many important mathematical theorems (Moore 2012). Moreover, the axiom of choice is equivalent to a number of seemingly unrelated principles in mathematics. For example, in ZF, the axiom of choice is equivalent to Zorn’s lemma, the well-ordering theorem, and the comparability theorem (see Cunningham 2016).

The Zermelo-Fraenkel system of axioms is denoted by ZF and the axiom of choice is abbreviated by AC. The axiom of choice is not one of the axioms in ZF. The result of adding the axiom of choice to the system ZF is denoted by ZFC.

There were many unsuccessful attempts to prove the axiom of choice assuming only the axioms in ZF. As a result, mathematicians began to doubt the possibility of proving the axiom of choice from the axioms in ZF and, eventually, it was shown that such a proof does not exist. The combined work of Kurt Gödel, in 1940, and Paul Cohen, in 1963, confirmed that the axiom of choice is independent of the Zermelo-Fraenkel axioms, that is, AC cannot be proven or refuted using just the axioms in ZF. Nevertheless, the axiom of choice is a powerful tool in mathematics and there are many significant theorems that cannot be established without it. Consequently, mathematicians typically assume the axiom of choice and often cite it when they use it in a proof.

b. Banach-Tarski Paradox

Set theory frequently deals with infinite sets. Moreover, as we have seen, there are times when infinite sets have properties that are unlike those of finite sets. Such properties of infinite sets can appear to be counter-intuitive or paradoxical, because they conflict with the behavior of finite sets or with our limited intuition. Cantor proved a theorem that illustrates this fact. Let $I$ denote the unit interval $\lbrack 0,1 \rbrack$, that is, the set of all real numbers $x$ such that $0 \leq x \leq 1$. Let $S$ denote the unit square in the plane, that is, the set of all ordered pairs $(x,y)$ such that $0 \leq x \leq 1$ and $0 \leq y \leq 1$. The sets $I$ and $S$ appear in the following figure:

Cantor initially believed that the set of points in the two-dimensional square $S$ must have cardinality much larger than the set of points in the one-dimensional interval $I$. Then he discovered a proof showing that his initial intuition was wrong. Cantor’s theorem below, which can be proven without the axiom of choice, shows the sets $I$ and $S$ have the same cardinality.

Theorem (Cantor). There exists a bijection $f$: $I \rightarrow S$.

One can use the bijection $f$: $I \rightarrow S$ to proclaim that one can, theoretically, disassemble all of the points in the interval $I$ and then reassemble these points to obtain the unit square $S$. This, of course, is counter-intuitive, as we know that one cannot cut-up a 1-foot piece of thread and then put the pieces together to obtain a square-foot piece of fabric. Thus, there are infinite abstract objects that do not behave in the same way as finite concrete objects.

We now present a theorem due to Stefan Banach and Alfred Tarski (1924). The proof of this theorem uses the axiom of choice, in an essential manner, to prove another counter-intuitive result. Some have claimed that this theorem thus refutes the axiom of choice. First, we identify some terminology. In three-dimensional space, a unit ball is a set of points of distance less than or equal to $1$ from a fixed central point.

Theorem (Banach, Tarski). A unit ball in three-dimensional space can be split into five pieces that can be rigidly moved, rotated, and put back together to form two unit balls.

The Banach–Tarski Theorem is often referred to as a paradox because it is counter-intuitive; for example, the theorem implies that, theoretically, one can split a solid glass ball into five pieces and then use the pieces to create two new glass balls of the same size as the original. However, in the proof of the theorem, the five pieces that are formed are not solids that have a measurable volume; they are five complex infinite sets of points. We repeat: there are infinite abstract objects that do not behave in the same way as finite concrete objects.

The conclusion of the Banach–Tarski Theorem does not refute the axiom of choice, and Cantor’s above theorem does not render the axioms of set theory false. Ever since the ancient Greeks, there have been results in mathematics that were once viewed as being counter-intuitive. Such results eventually become better understood and, as a result, become more intuitive themselves.

6. The Cumulative Hierarchy

Zermelo’s 1904 proof of the well-ordering theorem resembles von Neumann’s 1923 proof of the transfinite recursion theorem, a powerful tool in set theory. A formula $\varphi(g,u)$ is said to be functional if and only if $\forall g \exists ! u \varphi (g,u)$; that is, for all $g$, there is a unique $u$ such that $\varphi(g,u)$. Given a functional formula, $\varphi(g,u)$, consider the class of ordered pairs

$F = \{(g,u)$ ∶ $\varphi(g,u)\}$.

Since $\varphi(g,u)$ is functional, one can view $F$ as a class function (that is, a functional class), and thus, $F(x)$ is a set whenever $x$ is a set. Let $F$|$A$ denote the function obtained by restricting the domain of $F$ to the set $A$. The replacement axiom implies that $F$|$A$ is a set whenever $A$ is a set.

Transfinite Recursion Theorem: Let $\varphi(g,u)$ be a functional formula. Then there is a class function $H$ such that, for all ordinals $\beta$, $\varphi(H$|$\beta,H(\beta))$.

The transfinite recursion theorem is used to define what is commonly known as the cumulative hierarchy of sets and usually denoted by $\{V_{\beta} : \beta \text{ is an ordinal}\}$, which satisfies (see figure below)

$V_{0} = \varnothing$,
$V_{\gamma + 1} = \wp (V_{\gamma})$, for any ordinal $\gamma$,
$V_{\beta} = \bigcup \{V_{\alpha}$ : $\alpha $ < $\beta\}$, for any limit ordinal $\beta$.

One obtains $\{V_{\beta} : \beta \text{ is an ordinal}\}$ by repeatedly applying the power set operation at successor ordinals and by taking the union of all the previous sets at limit ordinals. In particular, $V_{0} = \varnothing$ and

$V_{1} = \wp (V_{0})= \{ \varnothing \}, V_{2} = \wp (V_{1})= \{ \varnothing,\{ \varnothing \} \}, \ldots , V_{\omega} = \bigcup \{ V_{n}$ : $n$ < $\omega\}, \ldots$

The regularity axiom implies that for every set $x$, there exists an ordinal $\alpha$ such that $x \in V_{\alpha}$. For this reason, the proper class $V = \bigcup \{V_{\beta} : \beta \text{ is an ordinal}\}$ is called the universe of sets. It follows that each set $V_{\beta}$ is in $V$ and that all of the axioms in ZF are true in $V$. In addition, as one ascends the “ordinal spine,” one obtains sets $V_{\gamma}$ of ever greater complexity that become better and better approximations to $V$ (see above figure). This is confirmed by the reflection principle (see below) which, in essence, asserts that any statement that is true in $V$, is also true in some set $V_{\beta}$.
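
The finite stages of the cumulative hierarchy are small enough to compute. The following sketch is an illustration only: the names `power_set` and `finite_stages` are hypothetical, and the computation stops at $V_{4}$ because each stage is exponentially larger than the one before it.

```python
from itertools import chain, combinations

def power_set(A):
    """Return the set of all subsets of the finite set A, each as a frozenset."""
    elements = list(A)
    subsets = chain.from_iterable(
        combinations(elements, k) for k in range(len(elements) + 1)
    )
    return frozenset(frozenset(s) for s in subsets)

def finite_stages(n):
    """Return the list [V_0, V_1, ..., V_n], using V_0 = ∅ and V_{k+1} = ℘(V_k)."""
    stages = [frozenset()]
    for _ in range(n):
        stages.append(power_set(stages[-1]))
    return stages

V = finite_stages(4)
print([len(stage) for stage in V])                   # [0, 1, 2, 4, 16]
print(all(V[k] <= V[k + 1] for k in range(4)))       # True: the stages are cumulative
```

The second line of output reflects the fact that each stage is a subset of the next, which is why the hierarchy is called cumulative.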

Let $\varphi (v_{1}, \ldots , v_{n})$ be a formula in the language of set theory with free variables $v_{1}, \ldots , v_{n}$. For any ordinal $\alpha$ and $x_{1}, \ldots , x_{n} \in V_{\alpha}$, we write

$(V_{\alpha}, \in) \vDash \varphi (x_{1}, \ldots , x_{n})$

to mean that $\varphi(x_{1}, \ldots ,x_{n})$ is true in $V_{\alpha}$. The following theorem of ZF, due to Azriel Levy (Levy 1960) and Richard Montague (Montague 1961), implies that any specific truth that holds in $V$ likewise holds in some initial segment $V_{\beta}$ of $V$; in fact, it holds in unboundedly many initial segments.

Reflection Principle: Let $\varphi(v_{1}, \ldots, v_{n})$ be a formula and let $\alpha$ be an ordinal. Then there is an ordinal $\beta $ > $\alpha$ such that, for all $x_{1}, \ldots , x_{n} \in V_{\beta}, \varphi (x_{1}, \ldots ,x_{n})$ is true in $V$ if and only if $(V_{\beta}, \in) \vDash \varphi (x_{1}, \ldots, x_{n})$.

As a corollary, for any finite number of formulas that hold in $V$, the reflection principle implies that all of these formulas also hold in some $V_{\beta}$. As noted before, there are an infinite number of axioms in ZF. Montague (Montague 1961) used the reflection principle to conclude that if ZF is consistent, then ZF is not finitely axiomatizable. Hence, ZF is not equivalent to any finite number of the axioms in ZF. This follows from Gödel’s second incompleteness theorem (see Kunen 2011, page 8), which implies that, if ZF is consistent, then one cannot prove, in ZF, the existence of a set model of ZF, that is, a set $M$ such that $(M,\in) \vDash \varphi$, for every axiom $\varphi$ in ZF.

7. Gödel’s Constructible Universe

As we have seen, the cumulative hierarchy of sets is constructed in stages. At successor stages, one adds all possible subsets of the previous stage and, at limit stages, one takes the union of all of the previously produced sets. To prove that the axiom of choice and the Continuum Hypothesis are consistent with ZF, Kurt Gödel (1938) constructed the “inner model” $L$ of $V$ commonly known as the universe of constructible sets. As we will see, $L$ is a subclass of $V$. The idea behind Gödel’s construction of $L$ is to modify the cumulative hierarchy structure so that the end result will produce a (smaller) class that satisfies ZF. For any set $X$, define $D(X)$ by

$D(X) = \{A \subseteq X: A$ is definable over $(X,\in)\}$

where $A$ is definable over $(X,\in)$ means that there are $x_{1},\ldots,x_{n}$ in $X$ and a formula $\varphi(v,x_{1},\ldots,x_{n})$ such that, for all $a$ in $X$,

$a \in A$ if and only if $(X,\in) \vDash \varphi (a,x_{1},\ldots,x_{n})$.

One can show, in ZF, that $D$ is a class function (Moschovakis 2009, 8D). Using the transfinite recursion theorem and the “definable subset” operation $D$, Gödel defined the class $\{L_{\beta} : \beta \text{ is an ordinal}\}$ by applying the operation $D$ at successor ordinals and by taking the union of all of the previous sets at limit ordinals. The class $\{L_{\beta} : \beta\text{ is an ordinal}\}$ satisfies the following (see figure below):

$L_{0} = \varnothing$,
$L_{\gamma + 1} = D(L_{\gamma})$, for any ordinal $\gamma$,
$L_{\beta} = \bigcup \{L_{\alpha} : \alpha < \beta\}$, for any limit ordinal $\beta$.

Consequently, at each successor stage of the construction, one extracts only the definable subsets of the previous stage. The proper class $L = \bigcup\{L_{\beta} : \beta\text{ is an ordinal}\}$ is called the universe of constructible sets.
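The first few stages of this recursion can be computed directly. The following Python sketch is an illustration under a simplifying assumption: for a finite set $X$, every subset is definable over $(X,\in)$ with parameters (list its elements in a disjunction), so $D(X)$ coincides with the power set of $X$. The function names are hypothetical, and the sketch says nothing about infinite stages, where $D(X)$ can be strictly smaller than the power set.

```python
from itertools import combinations


def definable_subsets(X):
    """D(X) for a FINITE set X.

    Assumption: X is finite, so every subset is definable with
    parameters and D(X) is simply the power set of X.
    """
    elems = list(X)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def constructible_levels(n):
    """Return the list [L_0, L_1, ..., L_n] for a finite n."""
    levels = [frozenset()]            # L_0 is the empty set
    for _ in range(n):                # L_{k+1} = D(L_k)
        levels.append(definable_subsets(levels[-1]))
    return levels


levels = constructible_levels(4)
sizes = [len(level) for level in levels]
print(sizes)  # [0, 1, 2, 4, 16]
```

At these finite stages the levels agree with the corresponding stages of the cumulative hierarchy; the two hierarchies can only come apart once infinite sets appear.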

Assuming ZF, Gödel proved that $L$ satisfies ZF, the axiom of choice, and the Continuum Hypothesis (Gödel 1990). Thus, if ZF is consistent, then so is the theory ZF+AC+CH. This result does not prove that the axiom of choice and the Continuum Hypothesis are true in $V$, but it does show that one cannot prove, in ZF, that either AC or CH is false.

The proper class $L$ (with the $\in$ relation restricted to $L$) is called an inner model, because it is a transitive class (a class that includes all of the elements of its elements), contains all of the ordinals, and satisfies all of the axioms in ZF.

Gödel’s notion of a constructible set has led to interesting and fruitful discoveries in set theory. By generalizing Gödel’s definition of $L$, contemporary set theorists have defined a variety of inner models that have been used to establish new consistency results (Kanamori 2003, pp. 34-35). Each of these inner models contains $L$ as a subclass, and to understand the structure of these inner models, one must be familiar with the above definition of Gödel’s constructible sets. Moreover, a penetrating investigation into the structure of $L$ has led researchers to discover many fascinating results about $L$ and its relationship to the universe of sets $V$ (Jech 2003).

8. Cohen’s Forcing Technique

In 1963, the mathematician Paul Cohen introduced an extremely powerful method, called forcing, for the construction of models of Zermelo-Fraenkel set theory. A model $M$ of set theory is a transitive collection of sets in which the ZF (ZFC) axioms are all true, denoted by $M \vDash$ ZF ($M \vDash$ ZFC).

As discussed in section 7, Gödel showed that one cannot prove, in ZF, that either AC or CH is false. Cohen used his forcing technique to construct a model of ZFC in which the Continuum Hypothesis is false. Hence, one cannot prove, in ZFC, that CH is true. Thus, if ZFC is consistent, then CH is undecidable in ZFC. Cohen (1963) also showed that his technique of forcing can be used to produce a model of set theory in which ZF holds and the axiom of choice is false. Thus, AC is not provable in ZF. So, if ZF is consistent, then AC is undecidable in ZF.

Cohen’s idea was to start with a given set model $M$ of ZFC (the ground model) and extend it by adjoining a “generic” set $G$ to $M$ where $G \notin M$. The resulting model $M[G]$ (a generic extension of $M$) includes $M$, contains $G$, and satisfies ZFC. Cohen showed how to find a set $G$ so that CH fails in $M[G]$. In a similar manner, Cohen was able to add a new set $G$ to $M$ such that there is an inner model of $M[G]$ in which ZF holds and the axiom of choice is false. For his work, Cohen was awarded the Fields Medal in 1966. This award is considered to be the “Nobel Prize” of mathematics. Gödel stated that Cohen’s forcing method was “the greatest advance in the foundations of set theory since its axiomatization” (Kanamori 2003, page 32).

The discussion in the previous paragraph about $M$ is neither complete nor entirely correct. In order to prove that the desired generic set $G$ exists, Cohen, in fact, had to assume that $M$ is a countable transitive set model of ZFC. Let us do the same. A partial order is a pair $(P,\leq)$ such that $P \neq \varnothing$ and $\leq$ is a relation on $P$ which is reflexive, antisymmetric, and transitive. By varying $(P,\leq)$, one can obtain generic extensions that satisfy a wide variety of statements that are consistent with ZFC. Let $(P,\leq) \in M$ be a partial order that is definable in $M$, and suppose that, in $M$, the definition of $(P,\leq)$ and its properties are based only on the fact that $M \vDash$ ZF. Since $M$ is countable, there exists a generic set $G \subseteq P$ (Kunen 2011, Lemma IV.2.3). Let us presume that $(P,\leq)$ has the properties required to ensure that $M[G] \vDash \varphi$, where $\varphi$ is a sentence in the language of set theory; for example, $\varphi$ could be “not CH.” Hence, $M[G] \vDash$ ZFC $+~\varphi$. Thus,

if $M$ is a countable transitive set model of ZFC, then ZFC $+~\varphi$ is consistent.

(6)
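The reason countability of $M$ yields a generic set can be illustrated with a small Python sketch. It is a toy under stated assumptions, not Cohen's construction itself: the conditions are finite 0/1 sequences ordered by extension (a partial order chosen here only for illustration), each dense set is represented by a hypothetical function that extends a given condition into that set, and only finitely many dense sets are handled, whereas the actual argument enumerates the countably many dense sets that belong to $M$.

```python
def extends(p, q):
    """p <= q in the forcing order: p is a finite sequence extending q."""
    return len(p) >= len(q) and p[:len(q)] == q


def meet_length(n):
    """Dense set {p : len(p) > n}, given as a function that moves q into it."""
    def meet(q):
        return q + (0,) * max(0, n + 1 - len(q))
    return meet


def meet_differs_from(r):
    """Dense set {p : p(i) != r(i) for some i < len(p)}.

    r is a function from natural numbers to {0, 1} (an assumed 'old' real).
    """
    def meet(q):
        if any(q[i] != r(i) for i in range(len(q))):
            return q                      # q already differs from r
        return q + (1 - r(len(q)),)       # add one bit that disagrees with r
    return meet


def generic_chain(dense_sets, start=()):
    """Build a descending chain of conditions meeting each dense set in turn."""
    chain = [start]
    for meet in dense_sets:
        p = meet(chain[-1])
        assert extends(p, chain[-1])      # each step only strengthens
        chain.append(p)
    return chain


chain = generic_chain([
    meet_length(2),
    meet_differs_from(lambda i: 0),
    meet_differs_from(lambda i: i % 2),
])
print(chain[-1])  # (0, 0, 0, 1)
```

Meeting the dense set associated with a given sequence forces the sequence being built to differ from it, which is the mechanism behind the requirement $G \notin M$.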

To conclude that ZFC $+~\varphi$ is consistent, it appears that one must first show that there exists a countable transitive set model of ZFC. However, by Gödel’s second incompleteness theorem, one cannot prove, in ZFC, that such a set model exists (unless ZFC is inconsistent). Is there a way around this difficulty? Note that there are finitely many axioms in ZFC such that if just these axioms hold in $M$, then one can still prove that $M[G] \vDash \varphi$ (Kunen 2011).

We now discuss how the above argument used to establish (6) can be modified to correctly conclude that ZFC $+~\varphi$ is consistent. Let $T$ be a finite set of axioms in ZFC. Using the reflection principle, one can prove, in ZFC, that

there is a countable transitive set model $M$ in which the axioms in $T$ are true.

(7)

For any finite set $S$ of axioms in ZFC, the forcing method shows that there is a finite set $T$ of axioms in ZFC such that $S \subseteq T$ and

if $M$ is a countable transitive set model in which the axioms in $T$ hold, then there is a generic extension $M[G]$ in which $\varphi$ and the axioms in $S$ hold.

(8)

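Since $T$ is a finite set of axioms, we conclude from (7) that there is a countable transitive set model $M$ that satisfies all of the axioms in $T$. Therefore, by (8), there is a generic extension $M[G]$ that satisfies $\varphi$ and all of the axioms in $S$. Since proofs are finite, any proof of $\neg \varphi$ in ZFC would use only the axioms in some finite set $S$; but then the corresponding generic extension $M[G]$ would satisfy both $\varphi$ and $\neg \varphi$, which is impossible. We conclude that, in ZFC, one cannot prove $\neg \varphi$. Hence, ZFC $+~\varphi$ is consistent, assuming that ZFC is consistent.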

Cohen’s forcing technique is very versatile and has been used to show that there are many statements, both in set theory and in mathematics, that are undecidable (or unprovable) in ZF and ZFC. For example, in mathematics, the Hahn–Banach theorem is a crucial tool used in functional analysis. The proof of this theorem uses the axiom of choice. The forcing method has been used to show that the Hahn–Banach theorem is not provable in ZF alone (Jech 1973). Moreover, using forcing results and the universe of constructible sets, Saharon Shelah (1974) has shown that a famous open problem in abelian group theory (Whitehead’s Problem) is undecidable in ZFC.

As suggested earlier, since essentially all mathematical concepts can be formalized in the language of set theory, set theory offers a unifying theory for mathematics. Thus, the theorems of mathematics can be viewed as assertions about sets. Moreover, these theorems can also be proven from ZFC, the Zermelo-Fraenkel axioms together with the axiom of choice. Cohen’s forcing method clearly shows that ZFC is an incomplete theory, as there are statements that cannot be resolved in it. This motivates the following question:

What path should be taken to try to settle the Continuum Hypothesis and other undecided statements in mathematics?

In contemporary set theory, the most common answer to this question is called Gödel’s Program:

Search for new axioms, which, when added to ZFC, will determine the truth or falsity of unresolved statements.

This program was inspired by an article of Gödel’s in which he discusses the mathematical and philosophical aspects of mathematical statements that are independent of ZFC (Gödel 1947). Sections 9 and 10 will discuss two directions that this program has taken: large cardinal axioms and determinacy axioms.

9. Large Cardinal Axioms

Roughly, a large cardinal axiom is a set-theoretic statement that asserts the existence of an uncountable cardinal $\kappa$ that satisfies a particular property that implies that there is a set $M$ such that $(M,\in)$ is a model of ZFC; such a $\kappa$ is called a large cardinal. Gödel’s second incompleteness theorem implies that, in ZFC, one cannot prove the existence of large cardinals. Thus, a large cardinal axiom is a “new axiom.” Most modern set theorists believe that the standard large cardinal axioms are consistent with ZFC.

Assuming ZFC, let us say that a cardinal $\kappa$ is a strong limit cardinal if and only if, for every cardinal $\lambda$, if $\lambda < \kappa$, then $2^{\lambda} < \kappa$. A cardinal $\kappa$ is said to be inaccessible if and only if $\kappa$ is uncountable, regular, and a strong limit cardinal. Recall that a cardinal $\kappa$ is regular if $\kappa$ is not the union of fewer than $\kappa$ many sets, each of size less than $\kappa$. If $\kappa$ is an inaccessible cardinal, then, in ZFC, one can prove that $(V_{\kappa},\in)$ is a model of ZFC (Kanamori 2003). Hence, such a $\kappa$ is an example of a large cardinal and so, the statement “there exists an inaccessible cardinal” is a large cardinal axiom.

There are other large cardinal axioms. The description of these large cardinal axioms usually involves the concept of an elementary embedding of the universe, that is, a nontrivial truth preserving transformation from $(V,\in)$ into $(M,\in)$ where $M$ is a transitive subclass of $V$. A theorem of Kenneth Kunen (Jech 2003) shows that there is no nontrivial elementary embedding of the universe $V$ into itself. Thus, for any nontrivial truth preserving transformation from $(V,\in)$ into $(M,\in)$ where $M$ is a transitive subclass of $V$, $M \neq V$. More specifically, a large cardinal axiom can be expressed as asserting that there exists a nontrivial (class) function

$j: V \rightarrow M$

such that for each formula $\varphi(v_{1},v_{2},\ldots,v_{n})$ (in the language of set theory) and for all elements $x_{1},\ldots,x_{n}$ in $V$,

$(V,\in) \vDash \varphi(x_{1},\ldots,x_{n})$ if and only if $(M,\in) \vDash \varphi(j(x_{1}),\ldots,j(x_{n}))$.

Since the embedding $j$ is not the identity, there must be a least ordinal $\kappa$ such that $\kappa < j(\kappa)$. This ordinal is called the critical point of $j$ and is denoted by $\kappa = \text{crit}(j)$. It follows that $\kappa$ is a cardinal; indeed, $\kappa$ is the large cardinal that is confirmed by the existence of the embedding $j$.

A cardinal $\kappa$ is said to be measurable if and only if there exists a nontrivial elementary embedding $j: V \rightarrow M$ such that $\kappa$ is the critical point of $j$. In this case, one can prove that $V_{\kappa+1} \subseteq M$. Therefore, there is some resemblance between $M$ and $V$. Increasingly stronger large cardinal axioms demand a greater agreement between $M$ and $V$. For example, if one requires that $V_{\kappa+2} \subseteq M$, then one obtains a stronger large cardinal axiom. For another example, a cardinal $\kappa$ is said to be superstrong if and only if there is a transitive class $M$ and a nontrivial elementary embedding $j: V \rightarrow M$ such that $\kappa = \text{crit}(j)$ and $V_{j(\kappa)} \subseteq M$. Even stronger large cardinal axioms are obtained by requiring greater and greater resemblance between $M$ and $V$ (Woodin 2011).

Large cardinal axioms are statements that assert the existence of large cardinals. These axioms are widely viewed as being very promising new axioms for set theory. Large cardinal axioms do not resolve the Continuum Hypothesis but they have led mathematicians to formulate conditions under which Cantor’s hypothesis is false (Woodin 2001, p. 688). As already mentioned, one cannot prove, in ZFC, that large cardinals exist. Yet, there is very strong evidence that their existence cannot be refuted in ZFC (Maddy 1988).

10. The Axiom of Determinacy

Descriptive set theory has its origins, in the early 20th century, with the theory of real-valued functions and sets of real numbers developed by Borel, Baire, and Lebesgue. These analysts, respectively, introduced

the hierarchy of Borel sets of real numbers,
the Baire hierarchy of real-valued functions,
Lebesgue measurable sets of real numbers.

Descriptive set theory extends the work of these mathematicians (Moschovakis 2009). Recall that $\omega = \{0,1,2,3,4,\ldots\}$ is the set of natural numbers. Let $^{\omega}\omega$ be the set of all functions from $\omega$ to $\omega$. The set $^{\omega}\omega$ is denoted by $\mathbb{R}$ and is called Baire Space. $\mathbb{R}$ is often referred to as the set of reals; and if $x \in \mathbb{R}$, then $x$ is called a real. $\mathbb{R}$ is regarded as a topological space by giving it the product topology, using the discrete topology on $\omega$. The space $\mathbb{R}$ is homeomorphic to the set of irrational numbers, which is a subspace of the set of real numbers (Moschovakis 2009).

Descriptive set theory is a branch of set theory that uses set theoretic tools to investigate the structure of definable sets and functions over $\mathbb{R}$. One can identify the level of complexity of such definable sets of reals (Moschovakis 2009). Thus, there is a natural hierarchy on the definable subsets of $\mathbb{R}$, which, in increasing order of complexity, is called the projective hierarchy.

As a result of Gödel’s and Cohen’s work, it has been shown that many questions in descriptive set theory are not decidable in axiomatic set theory. For example, in 1938, Gödel showed that in $L$, the universe of constructible sets, there are projective sets of reals that are not Lebesgue measurable. In 1970, using the method of forcing, Robert Solovay showed that if there is an inaccessible cardinal, then ZFC is consistent with the statement that every projective set is Lebesgue measurable. Thus, one can neither prove nor disprove, in ZFC, the Lebesgue measurability of projective sets. Hence, in ZFC, the theory of projective sets is incomplete. For this reason, modern descriptive set theory focuses on new axioms; one such axiom concerns infinite games.

Gale and Stewart (1953) introduced the general concept of an infinite game of perfect information and began the study of these games. Other mathematicians then pursued this subject and discovered that it can be used to resolve problems in descriptive set theory.

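We now turn to a description of infinite games and strategies. For each $A \subseteq \mathbb{R}$, we associate a two-person infinite game on $\omega$ with payoff $A$, denoted by $G_{A}$, where players I and II alternately choose natural numbers $a_{i}$, with player I moving first, in the order given in the diagram:

$\begin{array}{c|ccccc} \text{I} & a_{0} & & a_{2} & & \cdots \\ \hline \text{II} & & a_{1} & & a_{3} & \cdots \end{array}$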

After completing an infinite number of moves, the players produce the real

$x = \langle a_{0},a_{1},a_{2},\ldots \rangle$.

Player I is said to win if $x \in A$, otherwise player II is said to win. As each player is aware of all the previous moves before making a next move, the game is called a game of perfect information. The game $G_{A}$ is said to be determined if and only if either player has a “winning strategy,” that is, a function that ensures the player will win the game regardless of how the other player makes his or her moves. The Axiom of Determinacy (AD) is a regularity hypothesis about such games that states: For all $A \subseteq \mathbb{R}$, the game $G_{A}$ is determined.
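The notion of a winning strategy is easiest to see in a finite version of the game. The Python sketch below is an illustration only: it assumes a finite set of moves and a fixed finite number of rounds, and it decides by backward induction which player has a winning strategy. Finite games of this kind are always determined, whereas AD concerns the infinite game $G_{A}$, which this code does not model. The function and parameter names are hypothetical.

```python
from functools import lru_cache
from itertools import product


def winner(payoff, moves, length):
    """Return 'I' or 'II', the player with a winning strategy.

    payoff: set of tuples of the given length (the plays that player I wins)
    moves:  the finite collection of moves available at every turn
    length: total number of moves in a play (assumed finite)
    """
    @lru_cache(maxsize=None)
    def first_player_wins(position):
        if len(position) == length:          # the play is complete
            return position in payoff
        outcomes = [first_player_wins(position + (m,)) for m in moves]
        if len(position) % 2 == 0:           # player I moves at even positions
            return any(outcomes)             # I needs one good move
        return all(outcomes)                 # II must be unable to escape

    return 'I' if first_player_wins(()) else 'II'


plays = list(product((0, 1), repeat=4))
even_sum = {p for p in plays if sum(p) % 2 == 0}
starts_with_one = {p for p in plays if p[0] == 1}

print(winner(even_sum, (0, 1), 4))         # II: the last mover controls parity
print(winner(starts_with_one, (0, 1), 4))  # I: the first move decides the play
```

In the first game player II makes the final move and can always make the sum odd; in the second, player I wins by opening with 1.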

In the theory ZF+AD, one can resolve many open questions about the sets of real numbers. For example, one can prove Cantor’s original form of the continuum hypothesis: Every uncountable set of real numbers has the same cardinality as the full set of real numbers.

Moreover, it has been shown that the axiom of choice implies that AD is false; that is, using the axiom of choice, one can construct a set of reals $A$ such that the game $G_{A}$ is not determined. Thus, the axiom of determinacy is incompatible with the axiom of choice. However, it is not clear that one can establish, without the axiom of choice, the existence of a set of reals $A$ such that the game $G_{A}$ is not determined (Moschovakis 2009). Moreover, there are weaker versions of AD that are compatible with ZF together with a weaker choice principle called the axiom of dependent choices.

Axiom of Dependent Choices (DC). Let $R$ be a relation on a nonempty set $A$. Suppose that for all $x \in A$ there is a $y \in A$ such that $R(x,y)$. Then there exists a function $f: \omega \rightarrow A$ such that, for all $n \in \omega$, $R(f(n),f(n+1))$.
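The shape of the function $f$ asserted by DC can be sketched in Python for a finite set $A$. This is an illustration under a strong simplifying assumption: on a finite set of comparable elements one can always pick the least witness $y$, so no choice principle is needed; DC matters precisely when no such canonical selection is available. The function name and parameters are hypothetical.

```python
def dependent_sequence(A, R, n, start=None):
    """Return [f(0), ..., f(n-1)] with R(f(k), f(k+1)) for every k (n >= 1).

    Assumptions: A is a finite nonempty set of sortable elements, and for
    every x in A there is some y in A with R(x, y).
    """
    elems = sorted(A)
    if not elems:
        raise ValueError("A must be nonempty")
    x = elems[0] if start is None else start
    seq = [x]
    while len(seq) < n:
        # Pick the least y related to the current element; one exists
        # by the hypothesis of DC.
        x = next(y for y in elems if R(x, y))
        seq.append(x)
    return seq


A = set(range(5))
R = lambda x, y: y == (x + 2) % 5

seq = dependent_sequence(A, R, 6)
print(seq)  # [0, 2, 4, 1, 3, 0]
```

Each term depends on the one chosen before it, which is the sense in which the choices are "dependent."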

Many mathematicians working in descriptive set theory operate within the background theory ZF+DC and the following determinacy axiom: For every projective set $A$, the game $G_{A}$ is determined. This axiom is denoted by PD (projective determinacy). Under the theory ZF+DC+PD, the classic open questions about projective sets have been successfully addressed (Moschovakis 2009). In particular, this theory implies that all projective sets are Lebesgue measurable.

Generalizing the construction of the inner model $L$, one can construct the inner model $L(\mathbb{R})$, the smallest inner model that contains all the ordinals and all the reals. The set $\wp(\mathbb{R}) \cap L(\mathbb{R})$ can be viewed as a natural extension of the projective sets. The determinacy hypothesis, denoted by AD$^{L(\mathbb{R})}$, asserts that AD holds in $L(\mathbb{R})$. Since the inner model $L(\mathbb{R})$ contains all of the projective sets, the assumption AD$^{L(\mathbb{R})}$ implies PD.

There are very deep results that connect determinacy hypotheses and large cardinal axioms. In 1988, Martin and Steel, working in ZFC, identified a large cardinal axiom that implies PD. By assuming a stronger large cardinal axiom, Woodin, within ZFC, was able to prove that AD$^{L(\mathbb{R})}$ holds and so, $L(\mathbb{R})$ satisfies ZF+AD. Moreover, PD and AD$^{L(\mathbb{R})}$, individually, imply the consistency of certain large cardinal axioms (Kanamori 2003). Investigating the relationships between determinacy hypotheses and large cardinals has become an important component of modern set theory.

11. Concluding Remarks

Set Theory is a rich and beautiful branch of mathematics whose fundamental concepts permeate all branches of mathematics. It is a most extraordinary fact that all standard mathematical objects can be defined as sets. For example, the natural numbers and the real numbers can be constructed within set theory. In addition, algebraic structures, functional spaces, vector spaces, and topological spaces can be viewed as sets in the universe of sets $V$. Consequently, mathematical theorems can be regarded as statements about sets. These theorems can also be proven from ZFC, the axioms of set theory. Thus, mathematics can be embedded into set theory.

Since all of conventional mathematics can be developed within set theory, one can view certain results in set theory as being part of metamathematics, the field of study within mathematics that uses mathematical tools to investigate the nature and power of mathematics. For example, using the forcing technique and inner models, it has been shown that there are mathematical statements that cannot be proven or disproven in ZFC. Thus, when a particular mathematical statement is unresolved, set theory can sometimes show that there is neither a proof nor a refutation of the statement in ZFC. As noted above, this situation has inspired the search for new set theoretic axioms.

Of course, the fact that set theory offers a foundation for mathematics indicates that set theory is a very important branch of mathematics. However, the concepts and techniques developed within set theory demonstrate that, in itself, set theory is a deep and exciting branch of mathematics with significant applications to other areas of mathematics. This success has inspired some philosophers of mathematics to direct their attention to the philosophy of set theory and the search for new axioms (Maddy 1988a, 1988b, 2011).

12. References and Further Reading

a. Primary Sources

Banach, S. and Tarski, A. 1924. “Sur la décomposition des ensembles de points en parties respectivement congruentes,” Fund. Math., 6, pp. 244–277.
Cantor, Georg. 1874. “Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen,” Journal für die reine und angewandte Mathematik (Crelle). 77, 258–262.
Cohen, Paul J. 1963. The independence of the axiom of choice. Mimeographed.
Cohen, Paul J. 1963a. “The independence of the continuum hypothesis I.” Proceedings of the U.S. National Academy of Sciences 50, 1143-48.
Cohen, Paul J. 1964. “The independence of the continuum hypothesis II.” Proceedings of the U.S. National Academy of Sciences 51, 105-110.
Cohen, Paul J. 1966. Set Theory and the Continuum Hypothesis, New York: Benjamin.
Cunningham, Daniel W. 2016. Set Theory: A First Course, New York: Cambridge University Press.
Dauben, Joseph W. 1979. Georg Cantor: his mathematics and philosophy of the infinite, Cambridge, Mass., Harvard University Press; reprinted: Princeton, Princeton University Press, 1990.
Dunham, William. 1990. Journey Through Genius: The Great Theorems of Mathematics (1st ed.). John Wiley and Sons.
Gale, D. and Stewart, F.M. 1953. “Infinite games with perfect information,” Annals of Math. Studies, vol. 28, pp. 245–266.
Gödel, Kurt. 1947. “What is Cantor’s Continuum Problem?,” American Mathematical Monthly, vol. 54, pp. 515-525.
Gödel, Kurt. 1986. Collected Works, Volume I: Publications 1929–1936, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
Gödel, Kurt. 1990. Collected Works, Volume II: Publications 1938–1974, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
Gödel, Kurt. 1995. Collected Works, Volume III: Unpublished Essays and Lectures, (Solomon Feferman, editor-in-chief), Oxford University Press, New York.
Hilbert, David. 1923. On the infinite. Reprinted in the Philosophy of Mathematics: Selected Readings, 1983, edited by Paul Benacerraf and Hilary Putnam, pp. 83-201.
Jech, Thomas. 2003. Set theory. Third Edition, New York: Springer.
Jech, Thomas. 1973. The Axiom of Choice, North-Holland Publishing Company, Studies in logic and the foundations of mathematics, vol. 75, Amsterdam.
Kanamori A. 2003. The Higher Infinite. Perspectives in Mathematical Logic. Second edition. Berlin: Springer.
Kanamori A. 2012. Set theory from Cantor to Cohen, a book chapter in: Handbook of the History of Logic: Sets and Extensions in the Twentieth Century. Volume editor: Akihiro Kanamori. General editors: Dov M. Gabbay, Paul Thagard and John Woods. Elsevier BV.
Kunen, Kenneth. 2009. The Foundations of Mathematics. Studies in Logic, vol. 19. London: College Publications.
Kunen, Kenneth. 2011. Set Theory. Studies in Logic, vol. 34. London: College Publications.
Lévy, Azriel. 1960. “Axiom schemata of strong infinity in axiomatic set theory,” Pacific Journal of Mathematics, 10, pp. 223–238.
Maddy, Penelope H. 1988a. “Believing the axioms I.” The Journal of Symbolic Logic, 53(2), 481–511.
Maddy, Penelope H. 1988b. “Believing the axioms II.” The Journal of Symbolic Logic, 53(3), 736–764.
Maddy, Penelope H. 2011. Defending the axioms. On the philosophical foundations of set theory. Oxford: Oxford University Press.
Montague, Richard M. 1961. Fraenkel’s addition to the axioms of Zermelo. Essays on the foundations of mathematics, dedicated to A. A. Fraenkel on his seventieth anniversary, edited by Y. Bar-Hillel, E. I. J. Poznanski, M. O. Rabin, and A. Robinson for The Hebrew University of Jerusalem, Magnes Press, Jerusalem, and North-Holland Publishing Company, Amsterdam, pp. 91–114.
Moore, Gregory H. 2012. Zermelo’s Axiom of Choice: Its Origins, Development, and Influence. Mineola, NY: Dover Publications. Reprint of the 1982 original published by Springer.
Moschovakis, Yiannis. 2009. Descriptive Set Theory, 2nd edition, vol. 155 of Mathematical Surveys and Monographs, American Mathematical Society, Providence.
Solovay, Robert. 1970. “A model of set theory in which every set is Lebesgue measurable.” Annals of Mathematics, vol. 92, 1–56.
Shelah, Saharon. 1974. “Infinite abelian groups, Whitehead problem and some constructions.” Israel J. Math, vol. 18, 243–256.
Woodin, Hugh. 2001. “The Continuum Hypothesis, Part II.” Notices of the American Mathematical Society, vol. 48, no. 7.
Woodin, Hugh. 2011. Infinity, a book chapter in: Infinity: New Research Frontiers. Cambridge: Cambridge University Press.
Zermelo, Ernst. 2010. Collected Works. Gesammelte Werke. Volume I: Set Theory, Miscellanea. Mengenlehre, Varia, edited by H.-D. Ebbinghaus and A. Kanamori, Springer, Berlin and Heidelberg, xxiv + 654 pp.

b. Secondary Sources

Ebbinghaus, Heinz-Dieter. 2007. Ernst Zermelo. An Approach to His Life and Work. Berlin: Springer. In cooperation with Volker Peckhaus.
Enderton, Herbert B. 1977. Elements of Set Theory. New York: Academic Press.
Enderton, Herbert B. 2001. A Mathematical Introduction to Logic. 2nd edn. Burlington, MA: Harcourt/Academic Press.
Feferman, Solomon, Parsons, Charles and Simpson, Steven G. (Eds.). 2010. Kurt Gödel: essays for his centennial. Cambridge: Cambridge University Press.
Halmos, Paul R. 1974. Naïve Set Theory. New York: Springer. Reprint of the 1960 edition published by Van Nostrand.
Hauser, Kai. 2006. “Gödel’s Program Revisited Part I: The Turn to Phenomenology.” Bulletin of Symbolic Logic, 12(4), 529–590.
Heller, Michael and Woodin, Hugh. (Eds.). 2011. Infinity: New Research Frontiers. Cambridge: Cambridge University Press.
Kanamori, Akihiro. 2012. “In praise of replacement.” Bulletin of Symbolic Logic, 18(1), 46–90.
Levy, Azriel. 2002. Basic Set Theory. Mineola, NY: Dover Publications. Reprint of the 1979 original published by Springer.
Moschovakis, Yiannis. 2006. Notes on Set Theory. 2nd edition. Undergraduate Texts in Mathematics. New York: Springer.
Potter, Michael. 2004. Set theory and Its Philosophy. New York: Oxford University Press.

Author Information

Daniel Cunningham
Email: cunnindw@buffalostate.edu
State University of New York Buffalo State
U. S. A.

Future Contingents

The riddle of the future bewilders human beings. On the one hand, we are inclined to think that future events are real in some sense, because we ask questions and make assertions about them. On the other hand, we are inclined to think that future events may depend on our choices, because we conceive of ourselves as free agents. These two inclinations seem to clash. If an event belongs to the future, then it is a fact that it will occur, and we cannot prevent it from occurring. Inversely, if we can prevent an event from occurring, then it cannot be a fact that it will occur. This apparent conflict is at the core of the debate on future contingents, a philosophical dispute that goes back to antiquity. Future contingents are sentences that concern future events that can occur or not occur. The question that started the debate—whether future contingents are true or false—is a question that has no clear answer, given that one may have different views about the truth and falsity of a sentence about the future. Yet an answer must be provided, and it cannot be just any answer. The constraints that define the problem of future contingents determine a restricted set of admissible answers, each of which gives rise to doubts, troubles, and complications.

The Problem
Three Logical Options
Three Metaphysical Views
The Open Future
References and Further Reading

1. The Problem

a. Speaking about the Future

Tomorrow many things will happen. Some of them are things of which it seems correct to assert that they will happen; others are things of which it does not seem correct to assert that they will happen. For example, it seems correct to assert that the sun will rise. By contrast, it does not seem correct to assert that exactly 3,245 pigeons will walk in Piazza San Marco.

The reason why in certain cases it seems correct to assert that things will go a certain way is that in those cases we take it to be true that things will go that way. As far as we know, the sun will rise tomorrow. Of course, we are not absolutely certain that it will. We might be wrong, due to unforeseen circumstances. However, the evidence that supports our prediction is solid.

Similarly, the reason why in certain cases it does not seem correct to assert that things will go a certain way is that in those cases we do not know whether things will go that way; that is, it may easily be false that things will go that way. We are not in a position to tell whether exactly 3,245 pigeons will walk in Piazza San Marco. As far as we know, the number of pigeons that will walk in Piazza San Marco may easily be bigger or smaller.

In this respect, assertions about the future resemble assertions about the past. The cases in which it seems correct to assert that things went a certain way are cases in which we take it to be true that things went that way. For example, it seems correct to assert that dinosaurs disappeared a long time ago. Conversely, the cases in which it does not seem correct to assert that things went a certain way are cases in which we do not know whether things went that way. For example, it does not seem correct to assert that Caesar was annoyed by a mosquito while crossing the Rubicon.

More generally, the ordinary use of language suggests that assertions about the future, just like assertions about the past, can be correct or incorrect. Therefore, this suggests that future-tense sentences, like past-tense sentences, can be true or false. For example, “The sun will rise tomorrow” seems true. Conversely, “The sun will not rise tomorrow” seems false. Note that “The sun will rise tomorrow” does not express a necessary truth, that is, it is not a sentence such as “2+2=4.” Although unlikely, it is possible that it is false. Similarly, “The sun will not rise tomorrow” does not express a necessary falsity, that is, it is not a sentence such as “2+2=5.”

The problem that this article addresses concerns future contingents, that is, sentences about future events that can occur or not occur. According to a line of thought that goes back to Aristotle, these sentences cannot be true or false. On this view, the linguistic analogy just considered is misleading: Assertions about the future are not like assertions about the past.

b. The Sea Battle

In chapter 9 of De Interpretatione, Aristotle asks whether it makes sense to say that a sentence about a future event that can occur or not occur is true or false. His answer is that it does not make sense, for if the sentence were true or false, then the event would be necessary or impossible:

Let us take, for example, a sea battle. It is requisite on our hypothesis that it should neither take place nor fail to take place tomorrow. These and other strange consequences follow, provided we assume in the case of a pair of contradictory opposites having universals for subjects and being themselves universal or having an individual subject, that one must be true, the other false, that there can be no contingency and that all things that are or take place come about in the world by necessity. (Aristotle, De interpretatione 18b23 ff)

Aristotle’s reasoning seems to be the following. Consider the sentences (1) and (2) as uttered today:

(1) There will be a sea battle tomorrow.

(2) There will not be a sea battle tomorrow.

If (1) were true, and (2) were false, then it would be settled today that there will be a sea battle tomorrow, so the sea battle would be necessary. Similarly, if (2) were true, and (1) were false, then it would be settled today that there will not be a sea battle tomorrow, so the sea battle would be impossible. Since the sea battle is contingent, that is, it is neither necessary nor impossible, this shows that (1) and (2) are neither true nor false.

For Aristotle, the claim that (1) and (2) are neither true nor false is consistent with the plausible assumption that the disjunction formed by (1) and (2) is true:

(3) Either there will be a sea battle tomorrow or there will not.

Aristotle seems to think that (3) expresses a necessary truth, although the same does not hold for (1) and (2) taken separately:

That every thing is or is not is necessary, and also that it will be or it will not be; however, certainly not that, taken separately, one or the other is necessary. I say for example that it is necessary that either there will be a sea battle tomorrow or there will not be a sea battle tomorrow, but it is neither necessary that a sea battle will occur tomorrow nor that it will not occur. Rather, it is necessary that it will occur or not. (Aristotle, De Interpretatione, 19a25-30)

Another aspect of Aristotle’s point is that the claim that (1) and (2) are neither true nor false does not reduce to the observation that we do not know whether there will be a sea battle tomorrow. Of course, we do not know whether there will be a sea battle tomorrow. The absence of truth or falsity that Aristotle ascribes to (1) and (2), however, is independent of our epistemic condition. The problem of future contingents concerns truth rather than knowledge. Compare (1) with “There was a sea battle yesterday.” We can easily imagine a situation in which one does not know whether a sea battle occurred the day before. Despite this, independently of whether one knows it or not, it seems right to say that “There was a sea battle yesterday” is either true or false. Its truth or falsity depends on what happened the day before. Aristotle suggests that (1) differs in this respect, because there is nothing that can make it true or false.

c. Bivalence, Excluded Middle, Fatalism

The problem of future contingents stems from the combination of three ingredients. Two of them are fundamental logical principles, namely, bivalence and excluded middle. The third is a controversial metaphysical doctrine, namely, fatalism.

Bivalence is the principle according to which truth and falsity are mutually exclusive and jointly exhaustive values. Classical logic relies on bivalence, in that it assumes that every sentence is true or false. If the letter p is used as a schematic expression that stands for any sentence, this assumption can be stated as follows:

(B) Either “p” is true or “p” is false.

For example, “p” can be replaced with “Snow is white,” “Snow is green,” or any other sentence.

Here, “any other sentence” includes not only simple sentences, such as those just considered, but also complex sentences, such as “Snow is not white,” “If snow is green, then it is not white,” and “Either snow is white or it is green.” The last three sentences are respectively a negation, a conditional, and a disjunction, in that they are formed by means of the connectives “not,” “if/then,” and “or.” In classical logic, complex sentences formed in this way are treated as truth functions of their constituents, which means that their truth or falsity is determined by the truth and falsity of their constituents. More precisely, the negation of a sentence is true if and only if the sentence is false, a conditional is true if and only if it is not the case that its antecedent is true and its consequent is false, and a disjunction is true if and only if at least one of its disjuncts is true. Thus, bivalence is consistent with the assumption that some connectives—such as “not,” “if/then,” and “or”—are truth-functional, that is, that the complex sentences formed by means of these connectives are truth functions of their constituents.

Excluded middle is the principle according to which every disjunction formed by a sentence and its negation is true. Schematically:

(E) Either p or not-p

Classical logic justifies (E) in that it assumes that negation and disjunction are defined in the way explained. From that definition, it turns out that, no matter whether it is the case that p, one of the disjuncts of (E) must be true.
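The point can be checked mechanically. The following is a minimal Python sketch, not part of the original discussion, that encodes the classical definitions of negation and disjunction given above and confirms that (E) comes out true for both values of p. The function names are illustrative assumptions.

```python
# Classical truth functions for "not" and "or".
# The names neg, disj and excluded_middle are illustrative, not standard.

def neg(p: bool) -> bool:
    # A negation is true if and only if the negated sentence is false.
    return not p

def disj(p: bool, q: bool) -> bool:
    # A disjunction is true if and only if at least one disjunct is true.
    return p or q

def excluded_middle(p: bool) -> bool:
    # The schema (E): either p or not-p.
    return disj(p, neg(p))

# Under bivalence there are only two cases to check.
assert all(excluded_middle(p) for p in (True, False))
```

The check is exhaustive only because classical logic admits exactly two values, which is why the result depends on the definitions of the connectives together with bivalence.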

Finally, fatalism is the doctrine according to which nothing is contingent, that is, everything is either necessary or impossible:

(F) Either it is necessary that p or it is impossible that p

From (F) we get that if p, then it is necessary that p, and if not-p, then it is impossible that p. Suppose that p. Then the second disjunct of (F) is false, and hence the first must be true. Suppose that not-p. Then the first disjunct of (F) is false, and hence the second must be true. Note that here “necessary” and “impossible” are understood as “necessary given our past and our present” and “impossible given our past and our present,” that is, without taking into account what could happen if our past and our present were different. The problem of future contingents concerns future possibilities. It does not concern past or present possibilities.

The thesis that nothing is contingent is sometimes called “necessitarianism,” and the term “fatalism” often expresses the view that no one has free will, understood as the ability to do otherwise than what one actually does. However, even when a distinction is drawn between necessitarianism and fatalism, it is usually taken for granted that there is a close connection between them: If we are unable to do otherwise than we actually do, it is because what we do is necessary. In any case, independently of what “fatalism” means, (F) is controversial because it is at odds with free will. If nothing is contingent, then it is hard to see how one can be free to choose one course of action rather than another.

d. Two Arguments

The reasoning that emerges from the first quote in section 1.b suggests that bivalence entails fatalism. Suppose that (1) is either true or false. Assuming that the truth of (1) makes the sea battle necessary, and that the falsity of (1) makes the sea battle impossible, it follows that either it is necessary or it is impossible that there will be a sea battle. The argument may be phrased in schematic form as follows:

[BF]

(B) Either “p” is true or “p” is false.

(A1) If “p” is true, then it is necessary that p.

(A2) If “p” is false, then it is impossible that p.

So, (F) Either it is necessary that p or it is impossible that p.

[BF] is valid, in that its conclusion follows from its premises. Suppose that (B), (A1), and (A2) are true. Then one of the disjuncts of (B) is true. This means that either the antecedent of (A1) or the antecedent of (A2) is true, hence that either the consequent of (A1) or the consequent of (A2) is true. So (F) must be true. If one accepts the premises of a valid argument, one is compelled to accept its conclusion. Therefore, one cannot accept (B), (A1), and (A2) without accepting (F). By contraposition, if one takes (F) to be false, one must think that there is something wrong in the premises of [BF]. Aristotle thinks that the mistake lies in (B), as he takes (A1) and (A2) to be true.

Since (B) and (E) are distinct logical principles, rejecting (B) does not amount to rejecting (E). Aristotle is clearly aware of this fact, as shown by the second quote in section 1.b. However, there is another fact that he does not take into account, namely, that if one grants two apparently innocuous assumptions about truth and falsity, one can get bivalence from excluded middle. The argument is the following:

[EB]

(E) Either p or not-p.

(A3) If p, then “p” is true.

(A4) If not-p, then “p” is false.

So, (B) Either “p” is true or “p” is false.

[EB] is valid, as is [BF]. Here, again, the first premise is a disjunction, the second and third premises are conditionals in which the two disjuncts occur as antecedents, and the conclusion is a disjunction formed by the two consequents. This means that if (E), (A3), and (A4) are true, then (B) must be true.
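The validity of this shared form can be verified by brute force. The sketch below is an illustration under a simplifying assumption: the two disjuncts and the two consequents are treated as four unanalyzed sentences a, b, c, d, and the conditionals are read as classical material conditionals. It searches all sixteen assignments for a case in which the premises are true and the conclusion false.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    # Classical conditional: false only when the antecedent is true
    # and the consequent is false.
    return (not a) or b

def counterexamples():
    # a, b: the two disjuncts of the first premise.
    # c, d: the consequents of the second and third premises.
    # These letters are placeholders introduced for this sketch.
    found = []
    for a, b, c, d in product((True, False), repeat=4):
        premises = (a or b) and implies(a, c) and implies(b, d)
        conclusion = c or d
        if premises and not conclusion:
            found.append((a, b, c, d))
    return found

# No assignment makes the premises true and the conclusion false.
assert counterexamples() == []
```

Because the check concerns form only, it says nothing about whether (A1)-(A4) are true. It shows only that anyone who accepts the premises of [BF] or [EB] is committed to the conclusion.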

Now the problem of future contingents becomes evident. According to [BF], bivalence entails fatalism. According to [EB], excluded middle entails bivalence. Therefore, from the combination of [EB] and [BF] we get that excluded middle entails fatalism. Since fatalism is unacceptable, or so Aristotle and many others after him assume, there must be something wrong with at least one of the premises of [BF] and [EB]. The problem is to determine which one. Questions arise as to whether bivalence and excluded middle are sound logical principles, whether bivalence really entails fatalism, and whether excluded middle really entails bivalence. To solve the problem of future contingents is to provide satisfactory answers to these questions.

2. Three Logical Options

a. Neither Bivalence nor Excluded Middle

Now we will consider three distinct theses about bivalence and excluded middle, which constitute the main logical options available to solve the problem of future contingents. These three theses share two basic assumptions: One is that fatalism is wrong, and the other is that [BF] and [EB] are valid. Thus, they agree that (E) and (A1)-(A4) are not all true. If (E) and (A1)-(A4) were all true, on the second assumption it would follow that (F) is true, contrary to the first assumption.

The first option, option 1, is to deny both bivalence and excluded middle. According to this option, bivalence does not hold: since (A1) and (A2) are true, if (B) were true, then (F) would be true. Excluded middle does not hold either, for (A3) and (A4) are just as true as (A1) and (A2), so if (E) were true, then (B) would be true as well. In other terms, [BF] and [EB] are alike in that their first premise is false.

In the debate over future contingents, the theory that best expresses option 1 is Lukasiewicz’s three-valued logic (Lukasiewicz 1970). This theory, which intends to provide a coherent interpretation of Aristotle, shares with classical logic the tenet of truth-functionality; that is, it takes for granted that the value of a complex sentence is determined by the values of its constituents. However, it differs from classical logic in that it contemplates three values instead of two: truth, falsity, and indeterminacy.

Lukasiewicz rejects bivalence because he thinks that some sentences are indeterminate. A sentence is indeterminate when the way things are does not make it true and does not make it false. For example, (1) is indeterminate, because no fact or event today can make it true or false.

Lukasiewicz also rejects excluded middle. In his logic, the negation of an indeterminate sentence is itself indeterminate. For example, (2) is indeterminate, for its truth would amount to the falsity of (1), and its falsity would amount to the truth of (1). Moreover, a disjunction is indeterminate if both its disjuncts are indeterminate. So (3) is indeterminate. In general, every disjunction formed by an indeterminate sentence and its negation turns out indeterminate.
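These definitions can be illustrated with a short Python sketch. It uses the conventional numerical encoding of the three values as 1, 1/2, and 0, with negation as 1 minus the value and disjunction as the maximum. The encoding and the names are assumptions made for illustration. They agree with the clauses stated above, but the text itself does not give them in this form.

```python
from fractions import Fraction

# Conventional encoding of the three values (an assumption of this sketch).
TRUE, INDET, FALSE = Fraction(1), Fraction(1, 2), Fraction(0)

def neg(x: Fraction) -> Fraction:
    # The negation of an indeterminate sentence is itself indeterminate.
    return 1 - x

def disj(x: Fraction, y: Fraction) -> Fraction:
    # A disjunction takes the higher of the values of its disjuncts.
    return max(x, y)

s1 = INDET          # (1) There will be a sea battle tomorrow.
s2 = neg(s1)        # (2) There will not be a sea battle tomorrow.
s3 = disj(s1, s2)   # (3) Either there will be a sea battle or there will not.

assert s2 == INDET
assert s3 == INDET  # excluded middle fails for future contingents

# On the two classical values the connectives behave classically.
assert disj(TRUE, neg(TRUE)) == TRUE
assert disj(FALSE, neg(FALSE)) == TRUE
```

The last two assertions show that the failure of excluded middle is confined to sentences that take the third value.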

The rejection of bivalence is an essential feature of any three-valued logic, for what defines such a logic is just the hypothesis that there are three values instead of two. The rejection of excluded middle, instead, is not essential in this sense. Assuming that there are three values, and that some connectives are truth-functional, there is no unique way to define those connectives. In particular, negation and disjunction could be so defined as to validate excluded middle.

However, it seems that there are no independent reasons for changing the definitions of negation and disjunction proposed by Lukasiewicz. First, it would make little sense to stipulate that the negation of an indeterminate sentence is true rather than indeterminate. Since (1) and (2) are about the same event, it is hard to see how (2) can be true if (1) is indeterminate. Second, it would make little sense to stipulate that a disjunction formed by two indeterminate sentences is true rather than indeterminate, because in that case, “Either there will be a sea battle tomorrow or it will rain tomorrow” would be true, which seems unreasonable.

On the other hand, from the perspective of a three-valued logic it would be impermissible to claim that some negations of indeterminate sentences are indeterminate while others are true, or that some disjunctions formed by indeterminate sentences are indeterminate while others are true. This would amount to giving up truth-functionality, which is essential to any such logic. To assume that “not” and “or” are truth functional is to assume that the value of a negation or a disjunction—no matter whether truth, falsity, or indeterminacy—solely depends on the value of its constituents.

Thus, although Lukasiewicz’s logic is not the only three-valued logic that we can imagine, it is reasonable to think that no other three-valued logic can provide a better account of future contingents. Accordingly, we assume that three-valued logic invalidates both bivalence and excluded middle.

One merit of option 1 is that it accepts [EB]. This is plausible, given that [EB] is valid and that (A3) and (A4) express principles about truth and falsity that seem evident. According to [EB], if one accepts (E), one must also accept (B). So, by contraposition, if one rejects (B), one must also reject (E).

The rejection of excluded middle, however, constitutes a flaw of option 1, for it is hard to believe that a disjunction formed by a sentence and its negation, such as (3), is not true. Even though we do not know what will happen tomorrow, it seems certain that either there will be a sea battle tomorrow or there will not.

Another problem that affects option 1—the assertion problem—derives from the rejection of bivalence. As we have seen in section 1.a, the ordinary use of language suggests that some assertions about the future are correct, and hence that some future contingents are true. For example, “The sun will rise tomorrow” seems true. If all future contingents are indeterminate, however, this sentence cannot be true, so it is not clear why one should assert it. Those who adopt option 1 must explain how we can make apparently correct assertions by using future contingents.

b. Excluded Middle without Bivalence

The second option—option 2—is to deny bivalence but accept excluded middle. According to this option, bivalence entails fatalism, but excluded middle does not entail fatalism, because excluded middle does not entail bivalence. In other words, the argument that does not work is [EB], for one can accept (E) without accepting (B). This is the most plausible reading of Aristotle, advocated by Boethius, Peter Auriol, and many other scholars.

To justify option 2, one must explain why [EB] does not work. That is, one must explain why (A3) and (A4) are not true. Supervaluationism, a theory elaborated by Thomason (1984) on the basis of ideas expressed by Prior (1967) and Van Fraassen (1966), provides one coherent explanation. Supervaluationism rests on the assumption that future-tense sentences can be evaluated as true or false relative to possible futures. For example, in some possible futures there will be a sea battle tomorrow, while in others there will be peace. (1) is true in a future of the first kind, while it is false in one of the second kind. According to supervaluationism, to ask whether a future-tense sentence is true or false is to ask whether it is true in all possible futures or false in all possible futures. This idea can be phrased in a precise way if we define a “history” as a whole possible course of events, that is, a course of events that includes a possible future, and we assume that, for any future contingent “p,” uttered at a moment m, there is a set of accessible histories such that in each of them “p” is either true or false at m. Truth in the non-relative sense, truth simpliciter, is defined in terms of truth relative to histories: “p” is true at m if and only if it is true at m in all the histories of the set. Similarly, “p” is false at m if and only if it is false at m in all the histories of the set. The name of the theory comes from this idea. If we call “valuation” each attribution of value to a sentence relative to a history, we can call “supervaluation” an attribution of value to the sentence that takes into account all the valuations.

Supervaluationism draws a principled distinction between bivalence and excluded middle. Consider (1). Since (1) is true today in some histories and false today in other histories, (1) is neither true nor false today. The same goes for (2). In general, future contingents are neither true nor false, because they are true in some histories and false in others. Therefore, bivalence does not hold. Now consider (3). In every history, either the first disjunct is true today, or the second disjunct is true today. Consequently, (3) is true today. In general, a disjunction formed by a sentence and its negation is always true. Therefore, excluded middle holds.
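The contrast can be made concrete with a toy model. The following Python sketch assumes, purely for illustration, that there are just two accessible histories and that a history can be represented as a dictionary recording whether a sea battle occurs tomorrow. The names are hypothetical.

```python
# Two accessible histories (a toy assumption): one with a sea battle
# tomorrow, one without.
histories = [{"sea_battle": True}, {"sea_battle": False}]

def s1(h):
    # (1) There will be a sea battle tomorrow, evaluated in history h.
    return h["sea_battle"]

def s2(h):
    # (2) There will not be a sea battle tomorrow, evaluated in history h.
    return not h["sea_battle"]

def s3(h):
    # (3) The disjunction of (1) and (2), evaluated in history h.
    return s1(h) or s2(h)

def supervalue(sentence, hs):
    # Truth simpliciter is truth in all histories, falsity simpliciter is
    # falsity in all histories. The set hs is assumed to be non-empty.
    values = [sentence(h) for h in hs]
    if all(values):
        return "true"
    if not any(values):
        return "false"
    return "neither"

assert supervalue(s1, histories) == "neither"
assert supervalue(s2, histories) == "neither"
assert supervalue(s3, histories) == "true"
```

Within each history the disjunction is computed classically from its disjuncts, yet at the level of the supervaluation (3) is true although neither disjunct is.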

Note that this account of excluded middle involves an essential duality with respect to truth-functionality. There is a sense in which (3) is a truth function of its constituents, the sense in which, for any history h, (3) is true in h if and only if one of its disjuncts is true in h. There is also a sense in which (3) is not a truth function of its constituents, the sense in which (3) is true simpliciter even though neither of its disjuncts is true simpliciter. Truth-functionality holds at the level of truth relative to histories, but not at the level of truth simpliciter. This makes supervaluationism a partially non-classical theory.

Now let us go back to (A3) and (A4). Supervaluationism provides a motivation for rejecting (A3). Suppose that “p” is a future contingent that is true at m in h. Then the antecedent of (A3) is true at m in h. Its consequent, however, is not true at m in h, because in order to be true at m in h, “p” should be true at m in all histories. Therefore, (A3) is not true at m in h. It follows that (A3) is not true at m. A similar reasoning motivates the rejection of (A4). Suppose that “not-p” is a future contingent that is true at m in h. Then the antecedent of (A4) is true at m in h. Its consequent, however, is not true at m in h, because “p” is not false at m in all histories. So (A4) is not true at m in h. It follows that (A4) is not true at m.

Although this explanation is consistent with the supervaluationist definition of truth, it is not entirely satisfactory, or so one might argue. The rejection of (A3) and (A4) speaks against supervaluationism, for (A3) and (A4) are very plausible assumptions. It seems trivial that “Snow is white” is true if snow is white, and that “Snow is white” is false if snow is not white. Precisely because it seems trivial, it should turn out true.

Independently of (A3) and (A4), the supervaluationist definition of truth may cause some perplexity. Some might contend that this definition mistakenly identifies truth with necessity. To say that “p” is true is not the same thing as to say that it is necessary that p, or so it appears. Imagine that Bob and Rob are at the racecourse and that Bob bets on Frisco. Bob and Rob are indeterminists, so they believe that it is possible that Frisco will win and that it is possible that Frisco will not win. In the middle of the race, Rob says to Bob: “Don’t worry, Frisco will win,” to which Bob replies, “I really hope that’s true.” Presumably, what Bob hopes is not that his philosophical convictions are false; that is, he does not hope that Frisco’s victory is necessary. To hope that Frisco will win is not the same thing as to hope that it is necessary that Frisco will win. It is consistent to hope that Frisco will win and think that it is possible that Frisco will not win. It thus seems that the truth of the sentence uttered by Rob does not amount to its truth in all histories.

The intuitive difference between the claim that “p” is true and the claim that it is necessary that p becomes even clearer when we consider retrospective attributions of truth. Suppose that Frisco really wins and that at the end of the race Bob exults: “You were right! It was true!” What Bob wants to say is that the sentence uttered by Rob during the race was true. However, the supervaluationist definition of truth entails that that sentence was neither true nor false, as it was true in some histories and false in others. This seems wrong, because the truth that Bob retrospectively attributes to the sentence uttered by Rob does not rule out its possible falsity. It is consistent to think that what Rob said was true and that, in the moment in which he said it, it was possible that Frisco would not win. Again, it seems that the truth of the sentence uttered by Rob does not amount to its truth in all histories.

Supervaluationism is not the only theory in line with option 2. Another theory, advocated by Belnap and others (Belnap, Perloff, and Xu 2001), implies that there is no such thing as truth simpliciter. Future contingents are true or false only relative to histories, because it is only relative to histories that they express a determinate content. Suppose that (1) is uttered today. Since at the moment of the utterance different futures are possible, each of which includes a different tomorrow, the word “tomorrow” in (1) does not denote a determinate moment, which means that (1) does not express a determinate content. Therefore, it makes no sense to ask whether (1) is true or false today. The only meaningful question that can be asked is whether (1) is true or false relative to a given history. This theory shares with supervaluationism the assumption that future contingents can be evaluated as true or false relative to possible futures, but does not identify truth simpliciter with truth in all histories, because it rejects the very idea of truth simpliciter.

MacFarlane (2003, 2008) has proposed a third theory. Just like Belnap and others, MacFarlane claims that there is no such thing as truth simpliciter. In this case, the motivation provided is that a parameter of evaluation other than the history has to be taken into account. According to MacFarlane, the value of a future contingent uttered at a given moment can vary depending on the context of assessment, that is, on the moment in which it is evaluated. Suppose that (1) is uttered today and that tomorrow there is a sea battle. Today, at the moment of the utterance, (1) is neither true nor false. Tomorrow, however, in the middle of the sea battle, (1) is true. Consequently, the same sentence, as uttered at a given moment, can have different values in different contexts of assessment.

Both theories reject bivalence: Future contingents are not true or false, because they are not true or false in some absolute sense. Moreover, they both preserve excluded middle, because they make it valid in a relative sense. For example, (3) is always true today, in that it is true today in every history or in any context of assessment. These two theories thus have much in common with supervaluationism.

Leaving specific problems aside, both theories considered run into the assertion problem, as they reject bivalence. If one claims that “The sun will rise tomorrow” is neither true nor false, independently of the motivation adopted, one has to explain why it seems correct to assert this sentence.

To conclude, option 2 differs from option 1 in that it saves excluded middle, which is a merit. Its main flaws are essentially two. One is that it must provide a plausible definition of truth that—among other things—enables us to explain what is wrong with [EB]. The other is that it must address the assertion problem, which it shares with option 1.

c. Both Bivalence and Excluded Middle

The third option—option 3—is to accept both bivalence and excluded middle. According to this option, excluded middle entails bivalence, but bivalence does not entail fatalism. In other terms, the argument that does not work is [BF], for one can accept (B) without accepting (F).

To justify option 3, one must explain why [BF] does not work, that is, why (A1) and (A2) are not true. One way to do so is to endorse Ockham’s idea that one of the possible futures is the actual future, that is, the way things will actually go. In his Tractatus de praedestinatione et de praescientia Dei respectu futurorum contingentium, which aims to explain how divine foreknowledge is compatible with the contingency of events, Ockham draws a distinction between truth and determinate truth. The former is understood as truth in the actual future, the latter as truth in all possible futures. According to Ockham, future contingents are true or false, even though they are not determinately true or determinately false (1978).

The distinction between truth and determinate truth, which has been defended by Von Wright (1984), Lewis (1986), and Horwich (1987), among others, can be illustrated by means of the two examples considered in section 2.b. Suppose, as before, that Rob says to Bob, “Don’t worry, Frisco will win!” and that Bob replies, “I really hope that’s true.” As we have seen, it seems that Bob’s hope is not that Frisco’s victory is necessary. One obvious candidate for what he does hope is that Frisco will actually win, namely, that the possible future that will become reality is a future in which Frisco wins. Now, suppose that Frisco really wins and that Bob says to Rob: “You were right! It was true!” As we have seen, it seems correct to say that the sentence uttered by Rob was true, even though it was possible that Frisco would not win. If the truth of that sentence does not amount to its truth in all possible futures, one may wonder what it does amount to. Again, one obvious answer is that it amounts to the fact that Frisco actually won. Thus, a sentence can be true without being determinately true, if it is true in the actual future but false in some other future.

The theory that we will call Ockhamism is inspired by Ockham in that it defines truth in terms of the actual future. Ockhamism, just like the theories considered in section 2.b, adopts a relative notion of truth: A future contingent “p,” uttered at a moment m, can be evaluated as true or false relative to each history in a set of accessible histories. Truth in the non-relative sense, truth simpliciter, is defined in terms of this notion: “p” is true at m if and only if “p” is true at m in the actual history. Similarly, “p” is false at m if and only if “p” is false at m in the actual history (Øhrstrøm 2009; Rosenkranz 2012; Iacona 2013, 2014; Wawer 2014; Malpass and Wawer 2018).

If truth is defined in terms of the actual history, then truth does not entail determinate truth. This is why Ockhamism rejects (A1) and (A2). Suppose that “p” is a future contingent that is true at m in the actual history. In this case, the antecedent of (A1) is true at m, while its consequent is false at m, because “p” is false at m in some other history. Similarly, suppose that “p” is a future contingent that is false at m in the actual history. In this case, the antecedent of (A2) is true at m, while its consequent is false at m, because “p” is true at m in some other history.
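The failure of (A1) on this definition can be shown in a toy model. The Python sketch below assumes two accessible histories, one of which is stipulated to be the actual one. Both the representation of histories as dictionaries and the choice of the actual history are assumptions made only for illustration.

```python
# Two accessible histories (a toy assumption).
battle = {"sea_battle": True}
peace = {"sea_battle": False}
histories = [battle, peace]
actual = battle  # stipulated for the example, not derived from anything

def s1(h):
    # (1) There will be a sea battle tomorrow, evaluated in history h.
    return h["sea_battle"]

def true_simpliciter(sentence, actual_history):
    # Ockhamist truth: truth in the actual history.
    return sentence(actual_history)

def determinately_true(sentence, hs):
    # Determinate truth (necessity): truth in all accessible histories.
    return all(sentence(h) for h in hs)

# Antecedent of (A1): "p" is true.
assert true_simpliciter(s1, actual)
# Consequent of (A1): it is necessary that p. This fails.
assert not determinately_true(s1, histories)
```

Since the actual history assigns a classical value to every sentence, each sentence is either true or false simpliciter in this model, which is how bivalence is preserved without (F).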

This prompts the question of whether it makes sense to say that one of the possible futures is the actual future. The very idea of a unique actual future may easily raise doubts and misgivings. If one among the many possible futures is the actual future, it is unclear how the other futures can be equally possible, given that they will not become real. In other words, it seems impossible that what will happen is not predetermined. In order to adequately justify the distinction between truth and determinate truth, some convincing responses to these questions must be provided.

In sum, option 3 rescues bivalence and excluded middle, in accordance with classical logic. Moreover, it does not run into the assertion problem, because it implies that some future contingents are true, so it can explain the apparent correctness of some assertions about the future. The most problematic aspect of this option is the very idea of the actual future.

d. Further Considerations

The three logical options considered so far define the main positions within the debate on future contingents. Since these options do not exhaust the logical space of possibilities, this section dwells briefly on the only combination this article has not considered, namely, bivalence without excluded middle.

One way to give substance to this option, which comes from Peirce as interpreted by Prior, is the following: Future contingents are all false, because they describe future events as inevitable. For example, (1) and (2) are both false, because (1) says that there will necessarily be a sea battle tomorrow, while (2) says that there cannot be a sea battle tomorrow. Therefore, excluded middle does not hold: (3) is false, for both its disjuncts are false. Yet bivalence holds, because every sentence, including future contingents, is either true or false (Øhrstrøm and Hasle 1995; Prior 1967; Todd 2016).

The same problems that affect option 1 affect this position. First, the rejection of excluded middle is difficult to accept. (3) seems true, not false. Second, the assertion problem is still there. If all future contingents are false, then “The sun will rise tomorrow” cannot be true, in spite of the fact that it seems correct to assert it.

Independently of these two problems, the idea that all future contingents are false gives rise to further troubles. Consider (1) and (2). On the assumption that (2) is the negation of (1), as its syntactic structure suggests, it is unreasonable to think that (1) and (2) are both false. So, the most plausible way to claim that (1) and (2) are both false is to say that (2)—contrary to what its syntactic structure suggests—is not the negation of (1). The negation of (1) would rather be “It is not the case that there will be a sea battle tomorrow.” On the hypothesis that (2) and “It is not the case that there will be a sea battle tomorrow” express different contents, it is consistent to say that the former is false while the latter is true. Note, however, that this way, “Either there will be a sea battle tomorrow or it is not the case that there will be a sea battle tomorrow” turns out true. Thus, there is a clear sense in which excluded middle holds: If “It is not the case that there will be a sea battle tomorrow” is the negation of (1), the sentence that instantiates (E) is “Either there will be a sea battle tomorrow or it is not the case that there will be a sea battle tomorrow,” not (3). Moreover, we still need an explanation of why (2) and “It is not the case that there will be a sea battle tomorrow” express different contents, given that they seem to say exactly the same thing.
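The difference between (2) and the external negation of (1) can be sketched in a toy model. The Python fragment below assumes two accessible histories and reads “will” as truth in every accessible history, which is one way of rendering the idea that future contingents describe events as inevitable. The representation and the names are illustrative assumptions.

```python
# Two accessible histories (a toy assumption).
histories = [{"sea_battle": True}, {"sea_battle": False}]

def will(condition, hs):
    # "Will" read as inevitability: the condition holds in every history.
    return all(condition(h) for h in hs)

def battle(h):
    return h["sea_battle"]

def no_battle(h):
    return not h["sea_battle"]

s1 = will(battle, histories)      # (1) There will be a sea battle tomorrow.
s2 = will(no_battle, histories)   # (2) There will not be a sea battle tomorrow.
s3 = s1 or s2                     # (3) The disjunction of (1) and (2).
not_s1 = not s1                   # It is not the case that there will be a sea battle.

assert s1 is False and s2 is False
assert s3 is False                 # (3) is false on this reading
assert not_s1 is True              # the external negation of (1) is true
assert (s1 or not_s1) is True      # so this instance of (E) holds
```

In this model every sentence receives a classical value, and the disjunction of (1) with its external negation is true, which matches the sense in which excluded middle still holds.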

These troubles explain the scarce popularity of the option just considered. The debate on future contingents almost never sees the acceptance of bivalence combined with the rejection of excluded middle, because most thinkers take it for granted that bivalence is at least as controversial as excluded middle.

3. Three Metaphysical Views

a. Past, Present, and Future Entities

So far, we have considered three logical options that differ with respect to bivalence and excluded middle. Now we will address the key metaphysical issue that underlies the problem of future contingents: what there is in front of us.

Let us first introduce four basic ontological conceptions of time, that is, four conceptions of the existence of past, present, and future entities. Past entities and future entities resemble present entities in some respects but not in others. On the one hand, there is a sense in which Caesar is like us and unlike the Abominable Snowman: Caesar was a real person, while the Abominable Snowman has never existed. The same goes for future children, who will be real persons just like us. On the other hand, there is a sense in which Caesar is not like us: We are here, while he is no longer here. Similarly, future children are not here yet. The four conceptions considered in this article weigh these similarities and differences in different ways.

Presentism is the conception according to which only present entities exist. We exist, but Caesar and future children do not exist. Existing and being present are the same thing. Imagine an incredibly big and incredibly thin slice of salami. The slice is the present, and we are in it. Behind us there is nothing, because the past does not exist, and ahead of us there is nothing, because the future does not exist. This conception—which is defended by Prior (1970), Bigelow (1996), and Bourne (2006), among others—is represented in figure 1.

Figure 1: Presentism

The growing block theory, alternatively, is the conception according to which past and present entities exist, but future entities do not exist. Caesar exists, we exist, but future children do not exist. This conception—defended by Broad (1923), Tooley (1997), and Correia and Rosenkranz (2018), among others—describes reality as a totality that constantly increases as time passes. In figure 2, the slice of salami that represents the present is attached to the portion of salami that precedes it, the past.

Figure 2: Growing block

A third conception that is purportedly opposite to the growing block theory is the shrinking block theory. According to this theory, which is not widely accepted (though see, for example, Casati and Torrengo 2011), present and future entities exist, but past entities do not exist. We exist, future children exist, but Caesar does not exist. Reality is what is left, so to speak, and the future is constantly eroded as time passes. In figure 3, the slice of salami that represents the present is attached to the portion of salami that follows it, the future.

Figure 3: Shrinking block

Finally, eternalism is the view according to which past, present, and future entities exist. We exist, and the same goes for Caesar and future children. This conception is defended by Williams (1951), Taylor (1955), Smart (1963), Putnam (1967), Mellor (1998), and Sider (2001), among others. In figure 4, the slice of salami that represents the present is part of a whole salami, a history, which may be conceived of as a sequence of moments.

Figure 4: Eternalism

While the first three conceptions are essentially dynamic, in that they imply that the passage of time is metaphysically real, eternalism may be understood either dynamically, assuming that the present really moves along the line of time, or statically, assuming that the experience of the passage of time is merely illusory. On both interpretations, the idea that underlies eternalism is that temporal relations are somehow similar to spatial relations. For example, Turin, Milan, and Venice are located on three points ordered along the west-east axis. Although each of these three cities offers a distinct perspective on the other two, the spatial relations among them—the order in which they are located along the west-east axis—do not vary with the point of observation. According to eternalism, the same goes for temporal relations. Being present is like being in Milan. There is no ontological difference between Caesar, us, and future children, just as there is no ontological difference between Turin, Milan, and Venice (see Dowden 2018).

The classification just presented will help with understanding the three metaphysical views considered in the next three sections. As these sections show, these three views can be associated with options 1-3, although there is no necessary connection between them. Each view provides a distinct answer to the question of what there is ahead of us.

b. No Future

The first view—the no-future view—says that there is absolutely nothing ahead of us: The future does not exist. Certainly, many things will happen, and it makes perfect sense to talk about such things. However, what will happen will exist only when it happens; it does not exist now. When it happens, it will no longer be future.

Presentism and the growing block theory entail the no-future view. Although these two conceptions differ with respect to the question of whether the past exists, they agree on the non-existence of the future. By contrast, the shrinking block theory and eternalism contradict the no-future view. Although these two conceptions differ with respect to the question of whether the past exists, they agree on the existence of the future. Therefore, the no-future view can be maintained either in a presentist perspective or in a growing-block perspective.

Of the three logical options considered in section 2, the one that best suits the no-future view is option 1. If the future does not exist, there is nothing that can make future-tense sentences true or false. For example, there is nothing that can make (1) and (2) true or false. It is thus sensible to claim that future-tense sentences violate bivalence. This is probably what Lukasiewicz had in mind, although he did not explicitly address the distinction between presentism and growing block theory.

Perhaps it is also sensible to claim that future-tense sentences violate excluded middle. If nothing can make (1) or (2) true, the same goes for (3). The “perhaps” is due to the fact that the inference from the absence of truth of (1) and (2) to the absence of truth of (3) requires a further constraint that plays a crucial role in three-valued logic, namely, truth-functionality. Assuming that a disjunction is true only if one of its disjuncts is true, from the absence of truth of (1) and (2) we can infer the absence of truth of (3). Without that assumption, the inference is not legitimate. As we have seen in section 2.b, supervaluationism differs from three-valued logic precisely in that it gives up truth-functionality to save excluded middle.
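The role of truth-functionality can be illustrated with a small sketch. It assumes the usual numerical presentation of three-valued logic (values 0, 1/2, and 1, negation as 1 − v, disjunction as the maximum of the two values), and it assumes that (2) is read as the negation of (1) and (3) as their disjunction. The names used in the code are illustrative only.

```python
from fractions import Fraction

# Truth values in the numerical presentation of three-valued logic:
# 0 = false, 1/2 = indeterminate, 1 = true.
FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)

def neg(v):
    """Negation: swaps true and false, leaves the third value fixed."""
    return 1 - v

def disj(v, w):
    """Truth-functional disjunction: the greater of the two values."""
    return max(v, w)

# Assumption: (1) is a future contingent, so it receives the third value.
v1 = HALF
v2 = neg(v1)       # (2), read as the negation of (1)
v3 = disj(v1, v2)  # (3), the disjunction of (1) and (2)

print(v1, v2, v3)  # 1/2 1/2 1/2
```

Because the value of the disjunction is fixed by the values of its disjuncts, (3) cannot come out true when neither (1) nor (2) is true.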

The no-future view—especially in the growing block version—provides a metaphysical substratum for the idea that future-tense sentences are sui generis from the logical point of view. The difference at the logical level can be explained by a difference at the metaphysical level: The past and the present exist, whereas the future does not exist. This is not to say that, strictly speaking, the no-future view entails that idea. For example, Correia and Rosenkranz (2018) argue that the growing block theory is consistent with bivalence.

c. Many Futures

The second and the third view differ from the first in that they entail the existence of future entities. Although this makes them compatible both with the shrinking block theory and with eternalism, they are usually framed in an eternalist perspective. In such a perspective, the contingency of a future event cannot be conceived of in terms of absence, as in the no-future view, because an event cannot be future without existing. Rather, it will be conceived of in terms of presence in some but not in all possible futures. This is why the second and the third view contemplate a plurality of histories. A history is a possible world, that is, a totality of past, present, and future entities that is completely defined in its spatial and temporal properties.

The second view—the many-futures view—says that there are many futures ahead of us, that is, many possible continuations of the present. These continuations are like branches that depart from the same trunk, and they are metaphysically on a par, that is, they all exist and they are all actual (or none of them is). Figure 5 illustrates the many-futures view by recalling the salami analogy. The slice is the present, as in the previous figures, but there are two portions of salami on the right, that is, two possible continuations of the present. Each of these two portions, together with the left portion, forms a whole salami. Therefore, the slice belongs to two distinct salami.

Figure 5: Branching

The idea illustrated in figure 5 can be represented in a more abstract way by using simple lines. In figure 6, h₁ and h₂ are histories, while m₀, m₁, and m₂ are moments. m₀ belongs both to h₁ and to h₂. Instead, m₁ belongs only to h₁, and m₂ belongs only to h₂. While m₀ precedes both m₁ and m₂, m₁ and m₂ are unrelated, in that neither of them precedes the other. Diagrams of this kind, introduced by Kripke and Prior, are often employed in temporal logic to represent the set of future possibilities (Prior 1967).

Figure 6: One past, one present, two futures

The case of the sea battle can be described in terms of this figure. Suppose that m₀ is today, that is, the moment at which (1) and (2) are uttered. h₁ and h₂ are histories that lead to different tomorrows: m₁ is a peaceful tomorrow, while m₂ is a tomorrow in which there is a sea battle. h₁ and h₂ have a part in common, that is, our past until today. The two portions of h₁ and h₂ that follow m₀ are distinct possible futures. The contingency of the sea battle consists precisely in the existence of these futures.

Note that figure 6 shows two distinct tomorrows instead of one. Each of these two tomorrows belongs only to one history. However, this does not mean that it makes no sense to describe m₁ and m₂ as simultaneous. On the contrary, assuming that there is an absolute temporal axis, that is, that time can be measured from a point of view that is external to the histories, we can say that m₁ and m₂ are located at the same point along that axis. If we call instant an absolute temporal unit, definable as a set of equivalent moments, we can say that two moments that belong to different histories are in the same instant. In figure 7, i₀ is the present instant, that is, the instant that includes m₀, and i₁ is the instant that includes m₁ and m₂.

Figure 7: The sea battle

The many-futures view is clearly in line with option 2. In the framework just sketched, future contingents can be evaluated as true or false at moments relative to histories. For example, (1) is true at m₀ in h₂ but false at m₀ in h₁. Similarly, (2) is true at m₀ in h₁ but false at m₀ in h₂. According to the supervaluationist definition of truth, this entails that (1) and (2) are neither true nor false at m₀, so that bivalence does not hold. Instead, excluded middle holds. (3) is true at m₀, for it is true at m₀ both in h₁, given that (2) is true at m₀ in h₁, and in h₂, given that (1) is true at m₀ in h₂. The two further theories considered in section 2.b fit the many-futures view equally well, in that they employ the same notion of truth relative to histories.
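The supervaluationist evaluation just described can be sketched in a few lines of code. This is only an illustration under stated assumptions: the dictionary encoding of the two histories, the function names, and the reading of (3) as the disjunction of (1) and (2) are all introduced here for the example.

```python
# Hypothetical encoding of figure 7: histories h1 and h2 share the moment m0.
# The value records whether a sea battle takes place at the history's tomorrow.
SEA_BATTLE_TOMORROW = {"h1": False, "h2": True}
HISTORIES_THROUGH_M0 = ["h1", "h2"]

def s1(h):
    """(1): there will be a sea battle tomorrow, evaluated at m0 in h."""
    return SEA_BATTLE_TOMORROW[h]

def s2(h):
    """(2): there will not be a sea battle tomorrow, evaluated at m0 in h."""
    return not SEA_BATTLE_TOMORROW[h]

def s3(h):
    """(3): the disjunction of (1) and (2), evaluated at m0 in h."""
    return s1(h) or s2(h)

def supervaluate(sentence, histories):
    """'true' if true in every history, 'false' if false in every history,
    'neither' otherwise."""
    values = {sentence(h) for h in histories}
    if values == {True}:
        return "true"
    if values == {False}:
        return "false"
    return "neither"

for name, sentence in [("(1)", s1), ("(2)", s2), ("(3)", s3)]:
    print(name, supervaluate(sentence, HISTORIES_THROUGH_M0))
```

On this encoding, (1) and (2) each come out as neither true nor false at m₀, while (3) comes out true, which is the combination of failed bivalence and preserved excluded middle that characterizes option 2.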

d. One Future

The third view—the one-future view—says that there is one future ahead of us, our future. This view has two versions. According to one of them—the thin red line—many possible futures depart from our present, but these futures are not metaphysically on a par because only one of them is actual. According to the other—divergence—we have a single future because we belong to a single history, the actual history, although there are other histories that are exactly like our history up to the present but have a different future. The key difference between the two versions concerns the possibility of overlap. To endorse the thin red line is to think that two histories can overlap, that is, that they can have some part in common. To endorse divergence, instead, is to conceive histories as entirely disconnected totalities. Here we will focus on divergence, although what will be said applies, mutatis mutandis, to the thin red line.

Figure 8 illustrates divergence. Imagine that we are in the salami below, and that the left portion of the salami above—the portion that precedes the slice—is identical to the left portion of our salami, but that the right portion of the salami above—the portion that follows the slice—differs from the right portion of our salami. In this case the two salami are divergent histories.

Note that figure 8 shows two presents, each of which belongs to a single history. This is not to say that it makes no sense to describe such moments as simultaneous. As in the many-futures view, simultaneity can be defined in terms of instants. Figure 9 represents the two histories considered above as horizontal lines, h₁ and h₂, and represents the instant that the two presents have in common as a vertical line that intersects h₁ and h₂. Our present, m₀, is in h₁ and differs from m₁, which is in h₂. However, m₀ and m₁ are simultaneous in the sense that they belong to the same instant i₀.

Figure 8: Divergence

Figure 9: Two pasts, two presents, two futures.

The question is who the individuals in the other history are, given that they are exactly like us up to now. Lewis, who defends divergence, calls such individuals counterparts. If we are in h₁, then in h₂ there are other individuals who are our counterparts. Just as we have a future, the right portion of h₁, our counterparts have their own future, the right portion of h₂ (Lewis 1986).

Now let us go back to the sea battle. Figure 10 represents two histories h₁ and h₂ that are exactly alike up to i₀ but then differ. m₀ and m₁ are two distinct but qualitatively identical todays, each of which has its own tomorrow: m₂ is a peaceful tomorrow, while m₃ is a tomorrow in which there is a sea battle. Therefore, (1) is true at m₁, while it is false at m₀. Since m₁ and m₀ belong respectively to h₂ and h₁, this is to say that (1) is true in h₂, while it is false in h₁. Whether (1) is true or false simpliciter depends on which of the two histories is the actual history. If we are in m₀ we will have peace, whereas if we are in m₁ we will find ourselves in the middle of a sea battle.

Figure 10: The sea battle

It is important to note that being in a given history does not mean being in a position to discern that history from other histories. Suppose that we are in h₁. Since m₀ is qualitatively identical to m₁, and the same goes for any moment that precedes m₀, for us h₁ is indistinguishable from h₂. So at i₀ we are not in a position to know whether we are in h₁ or in h₂. Consequently, we are not in a position to know whether our future includes m₂ or m₃. In a way, we do not know what will happen tomorrow because we do not know where we are.

The one-future view suits option 3. The framework just sketched preserves bivalence. Suppose, as before, that (1) is true at m₁ and false at m₀. Then, no matter which of the two histories is the actual history, (1) is either true or false. This is not to say that (1) is determinately true or determinately false. Assuming that determinate truth at a moment amounts to truth at all moments in the same instant, and that determinate falsity at a moment amounts to falsity at all moments in the same instant, (1) is neither determinately true at m₁ nor determinately false at m₀. Excluded middle is preserved as well. (3) is true both at m₀ and at m₁. Therefore, it is determinately true.
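The difference between plain truth and determinate truth in the divergence framework can be made concrete with a short sketch. The dictionaries below are a hypothetical encoding of figure 10, and the function names are assumptions introduced for the example, not part of any established formalism.

```python
# Hypothetical encoding of figure 10 (divergence): each moment belongs to
# exactly one history, and m0 and m1 lie in the same instant i0.
HISTORY_OF = {"m0": "h1", "m1": "h2"}
INSTANT_OF = {"m0": "i0", "m1": "i0"}
# Whether the tomorrow of each history contains a sea battle
# (m2, in h1, is peaceful; m3, in h2, is not).
SEA_BATTLE_TOMORROW = {"h1": False, "h2": True}

def s1(m):
    """(1): there will be a sea battle tomorrow, evaluated at moment m."""
    return SEA_BATTLE_TOMORROW[HISTORY_OF[m]]

def s3(m):
    """(3): either there will be a sea battle tomorrow or there will not."""
    return s1(m) or not s1(m)

def same_instant(m):
    """All moments that share an instant with m, including m itself."""
    return [n for n in INSTANT_OF if INSTANT_OF[n] == INSTANT_OF[m]]

def determinately_true(sentence, m):
    """True at every moment in the same instant as m."""
    return all(sentence(n) for n in same_instant(m))

def determinately_false(sentence, m):
    """False at every moment in the same instant as m."""
    return all(not sentence(n) for n in same_instant(m))

print(s1("m0"), s1("m1"))              # False True: bivalence holds at each moment
print(determinately_true(s1, "m1"))    # False
print(determinately_false(s1, "m0"))   # False
print(determinately_true(s3, "m0"))    # True
```

Each moment assigns (1) exactly one truth value, so bivalence is preserved, yet (1) is neither determinately true nor determinately false because the two moments in i₀ disagree.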

4. The Open Future

a. Alternative Possibilities

Most discussions on future contingents take for granted that fatalism is wrong. Despite this, it is not obvious what the right view is. The thought that underlies the rejection of fatalism is often expressed by saying that the future is open. The contemporary literature on future contingents widely employs the metaphor of openness to characterize the view that the future is unsettled. Yet it is possible to understand openness in more than one way. This last section provides some clarifications about the claim that the future is open.

A simple and straightforward way to interpret the claim that the future is open is to define openness in terms of the existence of alternative possibilities: To say that the future is open is to say that, for some “p,” it is possible that p and it is possible that not-p. This interpretation is simple and straightforward because it equates the claim that the future is open with the pure negation of fatalism. As emerges from section 1.c, fatalism is the claim that, for every “p,” either it is necessary that p or it is impossible that p. Consequently, its negation is the claim that, for some “p,” it is neither necessary nor impossible that p, that is, it is possible that p and it is possible that not-p.
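The step from fatalism to its negation can be spelled out in standard modal notation. This is a sketch that assumes the usual duality between necessity and possibility (not necessarily p is equivalent to possibly not-p); the box and diamond symbols are introduced here only for illustration.

```latex
\begin{align*}
\text{Fatalism:}\quad & \forall p\,(\Box p \lor \neg\Diamond p) \\
\text{Negation:}\quad & \exists p\,\neg(\Box p \lor \neg\Diamond p) \\
\equiv\quad & \exists p\,(\neg\Box p \land \Diamond p) \\
\equiv\quad & \exists p\,(\Diamond\neg p \land \Diamond p)
\end{align*}
```

The last line is the claim that, for some p, it is possible that p and it is possible that not-p.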

If the openness of the future is understood in terms of the existence of alternative possibilities, then it is consistent with the three metaphysical views outlined in section 3. If one endorses the no-future view, one can say that, although there is presently nothing ahead of us, it is possible that what will exist is such that p and it is possible that what will exist is such that not-p. If one endorses the many-futures view, one can say that there are possible futures in which p and possible futures in which not-p. The same goes for the one-future view, even though in the case of divergence the possible futures have distinct pasts and distinct presents.

b. Indetermination

Another way to interpret the claim that the future is open is to define openness in terms of indetermination, understood as absence of determination: To say that the future is open is to say that nothing determines the future. This can mean two things: either that the future is not determined by some divine entity, or that the future is not determined by the laws of nature. Here we focus on the second reading, which became widespread by the early 21st century, although these considerations apply to the first as well.

The idea that every event is determined by the laws of nature goes back to antiquity and has been widely discussed in modern and contemporary philosophy. According to this idea, every event follows as an effect from some cause in accordance with the laws of nature. Determination may be defined as a relation between states, understood as global conditions in which the universe can be at an instant. Given a state S that obtains at i₀ and given a state S′ that obtains at i₁, S determines S′ if and only if the obtaining of S at i₀, together with the laws of nature, entails that S′ obtains at i₁. Determinism is the view that, for every instant, the state that obtains at that instant is determined by the states that obtained at previous instants (Hoefer 2003).
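The definition of determination can be written compactly as follows. The notation is an assumption made for illustration: S and S' stand for the earlier and the later state, L stands for the conjunction of the laws of nature, and the double turnstile stands for entailment.

```latex
\[
S \text{ determines } S'
\iff
\bigl(S \text{ obtains at } i_0\bigr) \land L \;\models\; \bigl(S' \text{ obtains at } i_1\bigr)
\]
```

On this rendering, indetermination holds when the left-hand conjunction is compatible with more than one state obtaining at the later instant.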

None of the three metaphysical views outlined in section 3 entails determinism. Suppose that i₀ is the present instant and that S is the state of the universe at i₀. According to the no-future view, given an instant i₁ later than i₀, nothing exists in i₁, even though, when we are in i₁, another state S′ will obtain. The no-future view says nothing about the relation between S and S′, so it is consistent with the hypothesis that S does not determine S′. Now consider the many-futures view. Suppose, as in figure 7, that m₀ is the present moment and that m₁ and m₂ are future moments that belong to i₁. If S is the state that obtains at m₀, while S′ and S″ are the states that obtain respectively at m₁ and m₂, then S determines neither S′ nor S″, for it is compatible both with S′ and with S″. Finally, consider the one-future view. Suppose, as in figure 10, that m₀ and m₁ are in i₀, and that m₂ and m₃ are in i₁. If S is the state that obtains at both m₀ and m₁ (given that h₁ and h₂ are identical up to i₀), while S′ and S″ are the states that obtain respectively at m₂ and m₃, then S determines neither S′ nor S″, for it is compatible both with S′ and with S″.

It is important to note that indetermination is not the same thing as indeterminateness, understood as absence of determinateness. If determinateness is the property that a possible future has when it is completely defined in its spatial and temporal properties, then indetermination does not entail indeterminateness. It is consistent to claim, as in the case of branching or divergence, that indetermination holds because there are many possible futures, each of which is completely defined in its spatial and temporal properties. Indetermination and indeterminateness are independent properties.

c. Causal Power

A third way to interpret the claim that the future is open is to define openness in terms of causal power: To say that the future is open is to say that we can affect the future, in that our present actions have future effects. For example, if tonight we set the alarm on our phone to 7 a.m., the sound that the phone will emit tomorrow at 7 a.m. is an effect of the movements that we perform tonight.

The idea that our present actions have future effects is obviously consistent with the three metaphysical views outlined in section 3. In each of the three cases, it makes perfect sense to say that an event which occurs at a given time causes another event that occurs at a later time.

Note that the past does not depend on us in the same sense, because our present actions do not have past effects. This asymmetry can be described in terms of counterfactual dependence, as Lewis has suggested. The future counterfactually depends on the present, because it would be different if the present were different. Suppose that tonight we set the alarm on our phone to 7 a.m. It is correct to say that, if the alarm were not set, the phone would not emit any sound tomorrow at 7 a.m. Instead, the past does not counterfactually depend on the present, because it would not be different if the present were different. If the alarm were not set, what happened yesterday would remain exactly the same (Lewis 1979).

The claim that we can affect the future must not be confused with the claim that we can change the future, that is, that we can replace the future with another future. It is one thing to say that a future event, such as the sound that the phone will emit tomorrow at 7 a.m., is caused by a present event; it is quite another thing to say that a future event can be replaced by a different future event. The claim that we can change the future is hardly intelligible, or so it appears to most philosophers (an exception is Todd 2016). In any case, this claim seems incompatible with the three metaphysical views outlined in section 3. If the no-future view is true, then the future does not exist, so nothing can be changed. If the many-futures view is true, then there are many possible futures, so it makes no sense to say that we can change “the” future. And in any case, each of the possible futures is essentially identical to itself. Finally, if the one-future view is true, then there is a unique future, which cannot be changed.

d. Other Definitions

As emerges from sections 4.a-4.c, there are three plausible interpretations of the claim that the future is open: The first is that, for some “p,” it is possible that p and it is possible that not-p; the second is that the future is not determined; and the third is that we can affect the future. Each of these interpretations is consistent with the three metaphysical views outlined in section 3: No matter whether one endorses the no-future view, the many-futures view, or the one-future view, one can coherently claim that the future is open. Since options 1-3 accord, respectively, with the no-future view, the many-futures view, and the one-future view, this suggests that the claim that the future is open, on the three interpretations considered, is compatible with any solution to the problem of future contingents.

Of course, the three interpretations considered are not the only admissible interpretations. Other interpretations are possible. Nothing prevents us from defining openness in terms of some specific logical option or metaphysical view. The question then arises of whether the future is really open in the sense defined. Merely stipulating that openness amounts to this or that condition does not provide any reason to think that the stipulation captures some pre-theoretical intuition.

Some philosophers have suggested that the openness of the future amounts to the failure of bivalence for future-tense sentences (as in Markosian 1995). On this interpretation, the claim that the future is open yields substantive consequences, for it licenses options 1 and 2 while it rules out option 3. However, as some have observed (Barnes and Cameron 2009; Besson and Hattiangadi 2014), it is controversial whether the future is open in this sense. Aristotle needed an argument to show that bivalence does not hold for future contingents.

Other philosophers have suggested that the openness of the future amounts to the many-futures view: To say that the future is open is to say that there are multiple branching futures which are metaphysically on a par (as in MacFarlane 2003). On this interpretation, again, the claim that the future is open yields substantive consequences, for it rules out both the no-future view and the one-future view. However, it is controversial whether the future is open in this sense.

The controversy emerges clearly in the dialectic between branching and divergence. According to the advocates of the many-futures view, divergence does not preserve openness. Suppose that Betty wonders whether she can become an internationally acclaimed photographer. As far as divergence is concerned, the answer is affirmative even if Betty will become a door-to-door cosmetics seller, provided that there is a history in which another individual very similar to Betty—call her Betty*—will become an internationally acclaimed photographer. The fact, however, is that what Betty wonders—what concerns her—is whether she, Betty, can become an internationally acclaimed photographer, not whether another person has that opportunity. It does not seem that Betty’s future is open if it only includes the sale of cosmetics. The openness of the future seems to imply not only that the alternative possibilities exist, but that they exist for the same individuals.

To this objection it might be replied that divergence does not deny that one and the same individual has alternative possibilities. Let us assume that “Betty can become an internationally acclaimed photographer” is true. Insofar as divergence explains the truth of this sentence in terms of the existence of a history in which Betty* becomes an internationally acclaimed photographer, the individual to whom it is correct to attribute the modal property of possibly becoming an internationally acclaimed photographer is Betty, not Betty*. Certainly, this explanation cannot be understood as a description of what Betty has in mind when she wonders whether she can become an internationally acclaimed photographer. However, the same holds for any other explanation of the same fact. Just as Betty does not think about Betty*, she does not think that she inhabits two histories that share a common segment and branch towards the future.

It is difficult to judge who is right. The objection against divergence stems from a line of thought that goes back to Kripke and that is antithetical to the theory of counterparts defended by Lewis. According to this line of thought, the truth or falsity of a sentence that attributes a modal property to an individual depends on what happens to the same individual in possible worlds other than the actual world. For example, Kripke claims that the sentence, “It might have been the case that Aristotle was not a philosopher,” is true because there are possible worlds in which Aristotle, the same Aristotle, was not a philosopher. The question of which of these two positions is preferable concerns possible worlds in general, and cannot be settled simply by appealing to intuitions.

5. References and Further Reading

Barnes, E. and Cameron, R. 2009. The Open Future: Bivalence, Determinism and Ontology. Philosophical Studies, 146:291–309.
Besson, C. and Hattiangadi, A. 2014. The Open Future, Bivalence and Assertion. Philosophical Studies, 162:251–271.
Bigelow, J. 1996. Presentism and Properties. Philosophical Perspectives, 10:35–52.
Bourne, C. 2006. A Future for Presentism. Oxford: Oxford University Press.
Broad, C. D. 1923. Scientific Thought. London: Routledge.
Casati, R. and Torrengo, G. 2011. The Not So Incredible Shrinking Future. Analysis, 71:240–244.
Correia, F. and Rosenkranz, S. 2018. Nothing To Come: A Defence of the Growing Block Theory of Time. Cham, Switzerland: Springer.
Dowden, B. 2018. Time. Internet Encyclopedia of Philosophy. https://www.iep.utm.edu/time/.
Hoefer, C. 2003. Causal Determinism. Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/entries/determinism-causal/
Horwich, P. 1987. Asymmetries in Time. Cambridge (MA): MIT Press.
Iacona, A. 2013. Timeless Truth. In Around the Tree: Semantic and Metaphysical Issues Concerning Branching and the Open Future, edited by F. Correia and A. Iacona, 29–45. Cham, Switzerland: Springer.
Iacona, A. 2014. Ockhamism without Thin Red Lines. Synthese, 191:2633–2652.
Lewis, D. 1979. Counterfactual Dependence and Time’s Arrow. Noûs, 13:455–476.
Lewis, D. 1986. On the Plurality of Worlds. Oxford: Blackwell.
Lukasiewicz, J. 1970. On Three-Valued Logic. In Selected Works, edited by L. Borkowski, 87–88. Amsterdam: North-Holland.
MacFarlane, J. 2003. Future Contingents and Relative Truth. Philosophical Quarterly, 53:321–336.
MacFarlane, J. 2008. Relative Truth. In Truth in the Garden of Forking Paths, edited by M. Garcia-Carpintero and M. Kölbel, 81–102. Oxford: Oxford University Press.
Malpass, A. and Wawer, J. 2018. Back to the Actual Future. Synthese.
Markosian, N. 1995. The Open Past. Philosophical Studies, 79:95–105.
Mellor, H. 1998. Real Time II. London: Routledge.
Ockham, W. 1978. Tractatus de praedestinatione et de praescientia dei respectu futurorum contingentibus. In Opera philosophica et theologica, volume II. St. Bonaventure, New York: The Franciscan Institute.
Øhrstrøm, P. 2009. In Defence of the Thin Red Line: A Case for Ockhamism. Humana Mente, 8:17–32.
Øhrstrøm, P. and Hasle, P. F. V. 1995. Temporal Logic. From Ancient Ideas to Artificial Intelligence. Dordrecht: Kluwer.
Perloff, M., Belnap, N., and Xu, M. 2001. Facing the Future. Oxford: Oxford University Press.
Prior, A. N. 1967. Past, Present and Future. Oxford: Clarendon Press.
Prior, A. N. 1970. The Notion of the Present. Studium Generale, 23:245–248.
Putnam, H. 1967. Time and Physical Geometry. Journal of Philosophy, 64:240–247.
Rosenkranz, S. 2012. In Defence of Ockhamism. Philosophia, 40:617–631.
Sider, T. 2001. Four Dimensionalism. Oxford: Oxford University Press.
Smart, J. J. C. 1963. Philosophy and Scientific Realism. Humanities Press.
Taylor, R. 1955. Spatial and Temporal Analogies and the Concept of Identity. Journal of Philosophy, 52:599–612.
Thomason, R. H. 1984. Combinations of Tense and Modality. In Handbook of Philosophical Logic, volume 2, edited by D. Gabbay and G. Guenthner, 135–165. Dordrecht: Reidel.
Todd, P. 2016. On Behalf of a Mutable Future. Synthese, 193:2077–2095.
Tooley, M. 1997. Time, Tense, and Causation. Oxford: Oxford University Press.
van Fraassen, B. 1966. Singular Terms, Truth-Value Gaps, and Free Logic. Journal of Philosophy, 63:481–495.
von Wright, G. H. 1984. Determinism and Future Truth. In Truth, Knowledge, and Modality, 1–13. Oxford: Blackwell.
Wawer, J. 2014. The Truth about the Future. Erkenntnis, 79:365–401.
Williams, D. C. 1951. The Myth of the Passage. Journal of Philosophy, 48:457–472.

Author Information

Andrea Iacona
Email: andrea.iacona@unito.it
University of Turin
Italy

Metaphysics of Science

Metaphysics of Science is the philosophical study of key concepts that figure prominently in science and that, prima facie, stand in need of clarification. It is also concerned with the phenomena that correspond to these concepts. Exemplary topics within Metaphysics of Science include laws of nature, causation, dispositions, natural kinds, possibility and necessity, explanation, reduction, emergence, grounding, and space and time.

Metaphysics of Science is a subfield of both metaphysics and the philosophy of science—that is, it can be allocated to either, but it exhausts neither. Unlike metaphysics simpliciter, Metaphysics of Science is not primarily concerned with metaphysical questions that may already arise from everyday phenomena such as what makes a thing (a chair, a desk) the very thing it is, what its identity criteria are, out of which parts it is composed, whether it remains the same if we exchange a couple of its parts, and so forth. Nor is it concerned with the concrete entities (superstrings, molecules, genes, and so forth) postulated by specific sciences; these issues are the subject matter of the special philosophies of science (for example, of physics, of chemistry, of biology).

Metaphysics of Science is concerned with more abstract and general concepts that inform all of these sciences. Many of these concepts are interwoven with each other. For example, metaphysicians of science inquire whether dispositionality, lawhood, and causation can be accounted for in nonmodal terms; whether laws of nature presuppose the existence of natural kinds; and whether the properties of macrolevel objects supervene on dispositional or nondispositional properties.

This article surveys the scope (section 1), historical origin (section 2), exemplary subject matters (section 4), and methodology (section 5) of Metaphysics of Science, as well as the motivation that drives it (section 3).

What Is Metaphysics of Science?
Metaphysics of Science in the 20th (and Early 21st) Century
Why Do We Need Metaphysics of Science?
Sample Topics in Metaphysics of Science
The Methodology of Metaphysics of Science
References and Further Reading

1. What Is Metaphysics of Science?

Metaphysics of Science is a subdiscipline of philosophy concerned with philosophical questions that arise at the intersection of science, metaphysics, and the philosophy of science. The term “Metaphysics of Science,” which combines the names of these disciplines, is of 20th-century coinage. To fully understand what Metaphysics of Science is, it is helpful to clarify how it differs from both metaphysics simpliciter and philosophy of science.

a. Metaphysics and Metaphysics of Science

Metaphysics simpliciter seeks to answer questions about the existence, nature, and interrelations of different kinds of entities—that is, of existents or things in the broadest sense of the term. It enquires into the fundamental structure of the world. For example, it asks what properties are, how they are connected to the entities which have them, and how the similarity of objects can be explained in terms of their properties. The subject matter of metaphysics is somewhat heterogeneous: topics include the composition of complex entities (such as tables, turtles, and angry mobs), the identity and persistence of objects, problematic kinds of entities (that is, entities about which it is unclear whether or in what sense they exist at all, like numbers and fictional objects such as unicorns), and many more. Metaphysics is usually understood as working at an abstract and general level: it is not concerned with concrete individual things or particular relations but rather with kinds of things and kinds of relations.

Metaphysics of Science is not completely disjoint from metaphysics simpliciter. Not only does it draw on the pool of methodological tools employed in metaphysics, but there is also substantial overlap regarding subject matter. Metaphysicians have their own reasons, independently of science, to investigate causation, modality, and dispositional properties, for example. Like space and time, these concepts pertain also to everyday phenomena. Although Metaphysics of Science, too, is usually attentive to our everyday intuitions and opinions about such phenomena, it engages in a specific investigation of the roles these concepts play in scientific contexts.

Metaphysicians of science often take scientific realism for granted—that is, they hold the philosophical stance that the sciences are apt to find out what the world is really like, that they track the truth, and that the entities they postulate exist. Antirealism about science, on the other hand, often coincides with a skeptical or agnostic attitude towards metaphysics. In the context of some broader metaphysical inquiries, scientific endeavors might well be seen as but one way to the truth. A mainly science-guided metaphysics might even be seen as mistaken (as, for example, in phenomenological approaches (compare Husserl 1936; 1970)).

Moreover, metaphysicians of science demand of themselves that they pay attention to discourses within the sciences. For example, some physicists like Richard Feynman (1967) speak of fundamental symmetry principles and conservation laws as being constraints on other, less fundamental laws of nature (they are the laws of laws, so to speak), rather than being laws about what is going on in the world. Metaphysicians working to develop a philosophical theory of nomicity (lawhood), therefore, should allow for the possibility of there being laws of nature as well as laws of laws.

In short, Metaphysics of Science is that part of metaphysics that enquires into the existence, nature, and interrelations of general kinds of phenomena that figure most prominently in science. Also, Metaphysics of Science grants the sciences authority in their categorization of the world and in their empirical findings.

In terms of content, the transition between Metaphysics of Science and science might well be smooth with no clear border, so the distinction might be one that can only be made sociologically, regarding the departmental structure of universities or focusing on the practitioners and their methods of inquiry. Whereas many physicists (although perhaps not all: see theoretical physics) engage in experimental work, metaphysicians are happy merely to consult the findings of their empirically working colleagues from the science departments.

b. Philosophy of Science and Metaphysics of Science

On the other hand, Metaphysics of Science may just as well be called a part of the philosophy of science. Philosophy of science consists of the philosophical reflection on the preconditions, practices, and results of science in general and of the particular sciences (such as physics, biology, mathematics, sociology, and so forth). Many philosophers of science are engaged in debates surrounding science as a (putative) source of knowledge: what makes scientific results especially reliable? That is, what distinguishes science from non- or pseudoscience, everyday knowledge, and philosophy? Which kinds of methods do and should scientists employ? What is scientific progress? Are scientific theories true (despite being fallible)? Are we ever justified in advocating a particular scientific theory, given that most scientific theories of the past have been replaced by others (as, for example, Newtonian mechanics was replaced by relativistic mechanics)? Can the sciences be unified into one big Theory of Everything? Together, these questions constitute the epistemology of science, that part of the philosophy of science which studies scientific knowledge.

Metaphysics of Science complements the epistemology of science. Whereas the latter asks questions of the sort, “How do we know of x?” Metaphysics of Science enquires, “What is the nature of x?” where “x” is a placeholder for some (kind of) entity, state of affairs, or fact discovered or postulated by science.

The task of Metaphysics of Science is not simply to list these entities or facts. Rather, it operates at a higher level of abstraction. For example, whereas the particular sciences inquire into specific causal relations—or, differently put, into some particular relation that holds between two particular measurable quantities, like the concentration of a drug and the soothing effect it has on headaches—Metaphysics of Science attempts to say what causation is in general. That is, it asks exactly which features a relation must have in order to count as a causal relation (like regular occurrence or modal force), and what the respective relata are. In short, Metaphysics of Science enquires into the key concepts of science not at the empirical but at a more abstract and general level.

c. Explication

Philosophers disagree about which key concepts constitute the subject matter of Metaphysics of Science. Some (like Mumford and Tugby 2013, 6) argue for a narrow interpretation of the term and claim that Metaphysics of Science is primarily concerned with concepts which are relevant to all branches of science, because without these central concepts, science would not be possible. For example, they suggest (16) that kindhood, lawhood, and causation are concepts of this kind. Others, for example the Society for the Metaphysics of Science, are more permissive: they also include in the domain of Metaphysics of Science issues that arise in only some branches of science, such as problems regarding species (biology), intentionality and consciousness (psychology), and social kinds (social science). Probably due to the emphasis that 20th-century philosophy of science placed on physics, the larger part of debates within Metaphysics of Science revolves around topics that occur most prominently within the realm of physics, but which figure in or bear connections to the other sciences as well:

laws of nature, causation, and dispositions
necessity, possibility, and probability
(natural) kinds and essences
reduction, emergence, and grounding
space and time.

Regardless of whether philosophers defend a narrow or a more permissive notion of Metaphysics of Science, they agree that the concepts in question are in need of explanation. At the very least, such an explanation must show how the concepts cohere. Some metaphysicians take one or more of the concepts they discuss (alongside their related phenomena) to be primitive, meaning that these concepts cannot be analyzed in terms of other concepts and their related phenomena cannot be subsumed under other phenomena. Typically, they then proceed to show that other concepts (alongside their related phenomena) can be explicated in terms of these primitive concepts. (For an exemplary account of some potentially primitive concepts and how they cohere, see parts a through d in section 4.)

As a discipline in its own right, Metaphysics of Science is still relatively young, especially when compared to other areas of philosophy (such as epistemology and ethics). Its topics, however, are not. For as long as science has existed, there has been metaphysical reflection on central scientific concepts. Metaphysics of Science of the 21st century differs from natural philosophy of the past in that the aspiration of natural philosophy was to speculatively describe the world as it is, whereas Metaphysics of Science is more concerned with what the world would be like if our best scientific theories were to turn out true (compare Carrier 2007, 42).

2. Metaphysics of Science in the 20th (and Early 21st) Century

a. The Logical Empiricist Critique of Metaphysics

Of the many historical roots of modern philosophy of science, Logical Empiricism (often interchangeably called “Logical Positivism”) stands out. The Logical Empiricists and their sympathizers (especially Rudolf Carnap, Moritz Schlick, Otto Neurath, Hans Reichenbach, Alfred Ayer, and Carl Gustav Hempel) were the progenitors of a new kind of philosophy, one that relates directly to the philosophical work of Gottlob Frege, Bertrand Russell, and Ludwig Wittgenstein and that later came to be known as “analytic philosophy.” They influenced many of the most prominent philosophers of the late 20th century (among them Karl Popper and Willard Van Orman Quine). In a sense, it is with them and their themes (laws of nature, causation, counterfactuals) that modern Metaphysics of Science begins, although they would have rejected much that currently goes by that name. Their ideas sparked many of the debates central to Metaphysics of Science.

In the 1930s, the Logical Empiricists proposed an empiricist, positivist program. They held that experience is our only source of nondefinitional knowledge (hence the “Empiricism” in Logical Empiricism) and that the task of philosophy is logical analysis; that is, analysis of the logical features of and relations between sentences (hence the “Logical” in Logical Empiricism). According to the Logical Empiricists, all the empirical propositions we believe can be reduced to so-called protocol sentences, which are direct renderings of our perceptual experience, or “the given.” Only if we know how a sentence could in principle be verified (that is, which possible observations would result in our accepting it as true) can we say that the sentence is meaningful. This so-called verifiability criterion of meaning has one purpose in particular, namely, to exclude metaphysical speculation from the realm of meaningful discourse. For example, the metaphysical sentence “every thing has an immaterial substance” cannot be empirically verified; hence, according to the verifiability criterion of meaning, it is meaningless. A radical antimetaphysical stance was one of the key tenets of Logical Empiricism. Note that verificationism recasts the Empiricists’ epistemic doctrine that all factual knowledge comes from sense perception as a semantic doctrine. Indeed, if we believe that what we know is expressed (or at least expressible) in meaningful sentences, then the transition from Empiricist epistemology to semantics is straightforward: all factual knowledge is expressed in meaningful sentences and only those sentences for which we are able to give a method of verification in observation are meaningful.

It soon became apparent, however, that Logical Empiricism, and especially the verifiability criterion of meaning, houses some serious flaws. Two major blows came from Willard Van Orman Quine’s seminal paper, Two Dogmas of Empiricism (1951), which argued that two assumptions the principle of verification has to presuppose are untenable: the first is that there is a clear distinction between analytically true and synthetically true sentences. The second is that each meaningful sentence faces the tribunal of sense experience on its own for its verification or falsification (rather than holistically in concert with other sentences).

Logical Empiricism faces further problems. Clearly, the Logical Empiricists held the sciences in high esteem. Usually, it is taken for granted that the sciences aim to discover natural laws and that they research properties such as electro-conductivity of different materials, reactiveness of chemical compounds, and fertility of organisms. Prima facie, it seems that many laws of nature can be expressed as general statements, that is, as statements of the form “any particular thing x which has property F also has property G” (in logical notation: ∀x(Fx → Gx)). For example, we say that all samples of metal expand when heated. But universal generalizations of this kind cannot ever be proven true by actual empirical observations (because they have far more instances, maybe infinitely many, than could ever be observed and confirmed), so the verifiability criterion rules out (at least some) laws of nature as meaningless. Even if this consequence could be avoided, what the laws of nature say is often taken to be not merely accidentally true but to hold with modal force. Empirically, we cannot account for modality: we can only observe what is actually the case, not what else is possibly or necessarily true.
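The point about universal generalizations can be pictured with a small sketch. The Python fragment below is only an illustration and not part of the Logical Empiricists' own apparatus; the function name status_of_generalization and the sample observations are hypothetical. It treats "all F are G" as a claim about an open-ended domain: a finite run of observations can refute the claim with a single counterexample, but the best positive verdict it can ever return is "unrefuted so far," never "verified."

```python
from typing import Iterable, Tuple

# Each observation records whether a sample is F (say, a heated piece of metal)
# and whether it is G (it expanded). Both labels are illustrative assumptions.
Observation = Tuple[bool, bool]


def status_of_generalization(observations: Iterable[Observation]) -> str:
    """Assess 'all F are G' against a finite list of observations."""
    for is_f, is_g in observations:
        if is_f and not is_g:
            # One F that is not G suffices to refute the generalization.
            return "falsified"
    # No finite list exhausts an open-ended domain, so there is no
    # branch that returns "verified".
    return "unrefuted so far"


observed = [(True, True), (True, True), (False, False)]
print(status_of_generalization(observed))
print(status_of_generalization(observed + [(True, False)]))
```

The design choice worth noting is the missing "verified" branch: adding more confirming instances to the list never changes the verdict, which is the reason a strict verifiability criterion has trouble with law statements.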

Similarly, Logical Empiricism runs into problems regarding dispositional properties. Everyday properties such as solubility and scientific properties like conductivity cannot easily be reduced to the observable qualities of soluble or conductive objects. For example, a sugar cube is a somewhat solid object, much like a matchstick, but if we were to place the sugar cube in water, it would dissolve, whereas the matchstick would not. Its manifest properties such as solidity, color, and taste provide no clue as to what will happen to the sugar cube if placed in water. What is more, even if a particular sugar cube (or even all the sugar cubes in the world) were never placed in water at all (or if it were placed in water but the water was already supersaturated with sugar so that the sugar cube would not dissolve in that particular situation), it would nevertheless retain its dispositional property of being soluble, although there is nothing about it that we observe which hints at its solubility. An analogous case can be made regarding dispositional properties discussed in the sciences, like conductivity or chemical bonding propensity, and similarly, regarding science’s theoretically postulated, not directly observable, entities like quarks or superstrings. Because dispositional properties, theoretical entities, and universally generalized laws of nature appear to belong to the conceptual inventory of the sciences, Logical Empiricism, which fails to adequately account for them, quickly became an unattractive option. (For more on laws of nature and dispositions, see sections 4c and 4a.)

b. The Return to Metaphysics

The failure of Logical Empiricism to cope with some of the key concepts of science eventually led to the development of Metaphysics of Science. Philosophers realized that if concepts such as law of nature and necessity could not be eliminated by reduction to observation terms, it must then be legitimate to examine them thoroughly, by whatever means seem fit. The most likely candidate to fulfill this task is metaphysics. (For an overview of methods commonly applied in Metaphysics of Science, see section 5.)

The development of Metaphysics of Science occurred simultaneously with the revival of metaphysics in the analytic tradition of philosophy, a tradition that was rooted in Logical Empiricism (as well as in the linguistic turn, manifested by the ideal and ordinary language philosophies of the late 19th and mid-20th centuries). Analytic philosophers were initially hostile towards metaphysical questions. They rejected questions which transcended empirical observation or fell outside of the scope of the sciences. However, philosophers like Willard Van Orman Quine (most famously in his essay “On What There Is” (1948)) and Peter Strawson (especially in his monograph Individuals (1959)) soon realized that there is a supposedly innocent way of practicing metaphysics by describing human conceptual schemes rather than by speculatively conjuring up grand metaphysical edifices. Instead of laying claims to knowledge of the unobservable, they focused on finding out how humans in fact conceptualize reality, whether in their everyday language (Strawson) or in their scientific theories (Quine). If stronger authority is given to the sciences, the latter may revise the commitments of the former. Quineans favor such revision and are, hence, closer to the attitude of Metaphysics of Science, whereas Strawsonians also give considerable credence to the general metaphysical background assumptions of ordinary speakers.

Encouraged by the failure of Logical Empiricism and the fact that metaphysical questions were once again beginning to be the subject of philosophical discussion, philosophers developed a renewed interest in metaphysics. They gradually grew confident in talking not merely about observations, semantics, and language, but also about reality.

Another significant step towards the return to metaphysics was the development of modal logic. Begun by Carnap—for example, in his Meaning and Necessity (1947)—the logic of necessity, possibility, and counterfactuality was refined considerably by Ruth Barcan Marcus (1947), Saul Kripke (1963), and David Lewis (1973a). Later, with Kripke’s Naming and Necessity (1980) and Hilary Putnam’s “The Meaning of ‘Meaning’” (1975), the formalisms were given ontological interpretations and the belief in necessity in nature gained new justifications. Building on these developments further still, even (Aristotelian) essences saw their revival: see Kit Fine’s work (1994) and its application within Metaphysics of Science by, for example, Brian Ellis (2001) and Alexander Bird (2007).

The return to metaphysics in the 20th century was not merely a trailblazing event for the development of modern Metaphysics of Science; rather, the two evolved alongside each other. For example, when it became acceptable for metaphysicians to speak of necessities in nature and discuss statements like “Water is necessarily H₂O,” this paved the way for a realistic reading of other modalities, like nomological necessity or counterfactuality. These are, as we will see (in sections 4b and 4c), central notions in debates on the nature and status of laws of nature in Metaphysics of Science.

c. Naturalized Metaphysics and Inductive Metaphysics

In the early 21st century, some philosophers argued for a naturalization of metaphysics. Their argument typically rests on the fact that the sciences appear to surpass metaphysics in many respects. The sciences, they claim, have a shared stock of accepted theories, a pool of respected methods and institutionalized standards, and they have predictive and technological successes to show for themselves. In contrast, there is long-lasting disagreement over positions and methods in metaphysics that is rarely resolved, and it is unclear what would even count as criteria for metaphysical success. As some metaphysical questions (such as “What is the world ultimately made of?” and “What is life?”) also belong to the domain of the sciences (physics and biology, respectively), naturalists insist that we must draw upon scientific findings to properly answer them.

Naturalistic metaphysicians come in all shapes and sizes. Some naturalists wish to prohibit any metaphysics that is not scientifically evaluable (compare Ladyman and Ross 2007). Some suggest that we should take our clues from scientific practice. For example, Tim Maudlin (2007) argues that lawhood is primitive, as working scientists see no need to analyze the concept. (For more on Maudlin’s position, see section 4c.) Others still allow for the possibility of relevant questions which may not have straightforwardly scientific answers. For example, consider the question “What is it for a thing to persist through time?” Imagine we take a ship out to sea and, little by little, replace every single part of it until none of the original parts remain. Certainly, science can describe how the ship changes, but it will not tell us whether the ship we sail home is still the same as the ship that put out to sea. The latter becomes a pressing, genuinely metaphysical problem, especially when we ask an analogous question about a person’s change and persistence through time.

What is important to remember is that although a naturalized metaphysics may, in a sense, also be called a “Metaphysics of Science,” its proponents may have a very different sort of metaphysics in mind than that presented in section 4.

In the 21st century, some philosophers have stressed that Metaphysics of Science could well be an inductive/abductive enterprise that, just as the sciences do, generalizes empirical data and builds explanatory models on that basis (Paul 2012; Williamson 2016; Schurz 2016; the research group Inductive Metaphysics). (Interestingly, precursors of the idea of an inductive/abductive metaphysics developed simultaneously with Logical Empiricism (Scholz 2018).) If so, metaphysical hypotheses might turn out to be fallible, only approximately true, and contingent.

3. Why Do We Need Metaphysics of Science?

In section 1 it was said that Metaphysics of Science examines the key concepts of science. But why do philosophers even bother to argue over issues in Metaphysics of Science? Is it not relatively clear what the basic concepts in science are and what they mean? Surely scientists know very well what they mean to say when they talk about the solubility of sugar, the second law of thermodynamics, and the relativity of space-time?

What inspires Metaphysics of Science is, of course, the idea that there is more to know about these phenomena and the concepts involved than science can say. Think of causation, for example. The concept of causation is commonsensical: we encounter causal processes in everyday life, like when we hit a golf ball with a putter and the ball begins to move, or when we drop a glass and it shatters. We intuitively distinguish these causal processes from noncausal processes. For example, if somebody in the next room sneezes as you raise your arm, you just know that raising your arm was not the cause of the other person’s sneezing. Still, it is quite complicated to say what establishes a causal connection between two events and what exactly distinguishes the putter-and-golf-ball scenario from the raise-arm-and-sneeze incident. Science records measurements and reveals statistical correlations between phenomena. It also has apt intuitions about whether two events are indeed causally connected or whether they merely co-occur accidentally, albeit regularly. Yet science is rarely interested in a general overall theory (detached from particular, concrete cause-effect relations) of what exactly distinguishes causes from accidents. Concepts such as causation or laws of nature, although relevant for science, are rarely the subject matter of science itself.

Science and Metaphysics of Science have different but complementary approaches to reality: the scientist’s work in this respect is predominantly empirical and consists in finding instantiations—describing particular causal interactions, listing things which are disposed in certain ways, pinning down particular laws of nature, and so on—while the metaphysician’s focus is on understanding and clarifying general concepts or the corresponding phenomena (like causation, disposition, and law of nature).

Still, the critic may object that even if the metaphysician’s and the scientist’s approaches to reality are indeed complementary, we can do perfectly well without Metaphysics of Science. For example, if science manages to find out the different variables and constants that determine how things in the world hang together, why do we also need to know what the general characteristics of a law of nature are or how that notion can be analyzed in terms of other notions? Isn’t this superfluous information? Clearly, scientists do not need metaphysicians to tell them about causation or dispositions in order to perform their research. Nevertheless, metaphysicians of science believe that questions regarding the existence and nature of causation, natural kinds, and necessity are valuable in their own right. At the very least, they are pressing questions that cannot be ignored by those who yearn to thoroughly understand the world we live in. By way of example, consider the dispute between defenders of Humean supervenience and anti-Humeans, which revolves around the question of whether there are necessities in nature or not. (See section 4a for a brief account of the debate.) Clearly, this is not a question that can be answered by purely scientific methods, but it is one that metaphysicians will nevertheless take to be meaningful and profound.

Some of the issues discussed in Metaphysics of Science are also relevant for practical contexts. For example, failure to render assistance (in case of an accident, a medical emergency, or the like) can lead to prosecution or social repercussions due to immoral behavior. However, you can only be held legally and morally responsible for events you are also causally responsible for. Accordingly, both ethics and law require a concept of causality that accounts not just for positive but also for negative causation, that is, causation by the absence of an event or act. If you pass an unconscious person lying on train tracks and fail to alert the authorities or pull him off the tracks, then you are (partly) causally responsible for his death if he is later killed by a train. Thus, although many questions within Metaphysics of Science are primarily aimed at complementing science, its debates may have far-reaching consequences in other fields as well.

To more fully understand the difference between the scientific and the metaphysical approach to the key scientific concepts that constitute the subject matter of Metaphysics of Science, it is helpful to consider samples of actual work in Metaphysics of Science (section 4) and to take a closer look at the methodology employed (section 5).

4. Sample Topics in Metaphysics of Science

As Metaphysics of Science is the study of the key concepts of science, its subject matter depends directly on what the sciences study and which concepts they employ. Because there are many different branches of science, there are also many potential topics for metaphysicians to discuss. It is impossible to name them all in a survey article, much less discuss them in detail. However, it is practically impossible to fully grasp what Metaphysics of Science is from general definitions only. (The same is true of metaphysics in general. No layperson will understand what metaphysicians do from hearing that metaphysics is the study of the fundamental structure of reality.)

In order to give the reader an idea of both the scope of Metaphysics of Science and its practice, this section briefly and tentatively introduces seven debates which have preoccupied metaphysicians of science in the past: counterfactuals and necessities, dispositions, laws of nature, causation, natural kinds, reduction and related concepts, and space and time. (See the respective articles for more information on modal logic and modality, laws of nature, reductionism, emergence, and time.)

a. Dispositions

Some objects have dispositional properties. For example, sugar is soluble, matchsticks are inflammable, and porcelain vases are fragile. Properties like solubility or fragility are often conceived of as becoming manifest only under so-called “triggers” or “stimulus conditions,” which set off the manifestation of the dispositional property. For example, for a sugar cube to manifest its solubility by dissolving, it must be placed in water.

Not all properties are like that. So-called categorical properties need no stimulus; they are always manifest. Just think of the properties of being solid, having a certain molecular structure (for example, being H₂O), being rectangular, and so on. The distinction between categorical and dispositional properties is often drawn with the following three features in mind:

(i) Untriggered dispositions are not directly observable, whereas many categorical properties are. For example, from looking at some sort of powder, we cannot tell whether it is soluble or not. Looking at a football, we immediately see that it is round.

(ii) Because dispositional properties bestow objects with possibilities (of behaving in certain ways under certain circumstances), they are said to be modal properties: they imply, by their very nature, what can, might, or (given certain circumstances) must be the case. Categorical properties are not usually conceived of in this way.

(iii) Dispositional properties are often identified with productive powers. For example, striking a match is not enough for it to light up; the match’s inflammability, too, is causally responsible for the flame. Usually, no such productive, causal force is directly associated with categorical properties.

Dispositional properties are not just a phenomenon we encounter in everyday contexts, but in science as well. For example, the property of being charged appears to fit this profile: it is not directly observable, it determines how objects would behave under certain conditions, and an object’s charge can be a vital factor in causal processes. Dispositionality has hence been of interest to Metaphysics of Science since its very beginning. In fact, the failure of Logical Empiricism to properly account for dispositional properties played a seminal role in the emergence of the discipline (see section 2a).

Because of their shared belief that all of our knowledge ultimately reduces to observational experience, Logical Empiricists like Rudolf Carnap (1936) attempted to account for dispositional properties in terms of observational properties using a simple conditional to connect the trigger to the manifestation: to say that a sugar cube is soluble just means that if we put it in water, it will dissolve. This and similar attempts at reduction fail, however, as they do not account for the modal behavior of disposed objects. For example, they do not supply a basis on which to ascribe (or not to ascribe) solubility to objects which have never been placed in water. This strikes us as odd, as it does not correspond to our everyday practice.
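One standard way to see the difficulty is to read the "simple conditional" as the truth-functional material conditional. That reading is an assumption made here for illustration, and the function names below are hypothetical. On this reading, the analysis gives sensible verdicts only for objects that have actually been tested; for anything never placed in water, the conditional comes out true regardless of what the object is like, so the verdict carries no information about its solubility.

```python
def material_conditional(antecedent: bool, consequent: bool) -> bool:
    """Truth-functional 'if ... then ...': false only when the antecedent
    is true and the consequent is false."""
    return (not antecedent) or consequent


def soluble_on_simple_analysis(placed_in_water: bool, dissolves: bool) -> bool:
    """Illustrative rendering of 'x is soluble' as
    'if x is placed in water, then x dissolves'."""
    return material_conditional(placed_in_water, dissolves)


# Tested cases behave as expected.
print(soluble_on_simple_analysis(placed_in_water=True, dissolves=True))    # sugar cube: True
print(soluble_on_simple_analysis(placed_in_water=True, dissolves=False))   # matchstick: False

# Untested case: a matchstick that is never placed in water.
# The antecedent is false, so the conditional is vacuously true.
print(soluble_on_simple_analysis(placed_in_water=False, dissolves=False))  # True
```

The last line is the troubling one: the untested matchstick comes out "soluble" for the same reason an untested sugar cube does, namely that the antecedent is never satisfied.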

In order to adequately capture the modal nature of dispositions, philosophers soon suggested that we employ a counterfactual connective instead of the simple conditional. To say that some object has a disposition, they argued, means just that if the object were exposed to the trigger conditions, the disposition would manifest. This approach faces at least two problems. First, it requires a theory that specifies truth conditions for counterfactual conditionals (see section 4b). Second, there are some interesting counterexamples to the effect that under certain conditions we would intuitively ascribe dispositions to objects for which the proposed analysis fails (as in Charles Martin’s 1994 electro-fink example).

Although early attempts at reducing dispositions to categorical properties have failed, problems like the above have convinced some philosophers that we should strive for a reductive analysis after all. The philosophical position that holds that all properties are categorical and that supposedly dispositional properties can somehow be reduced to categorical properties is called “categoricalism.” For many categoricalists, a large part of their motivation comes not from Logical Empiricism but a fundamental insight of classical empiricism. David Hume famously observed that necessary connections, like those between causes and their effects, cannot be detected empirically. Hence, Hume concludes, we have no reason to assume that any sort of productive, necessary, or modal connection of events in nature exists. (This has come to be known as Hume’s Dictum.) Twenty-first century Humeans, too, claim that there are no necessary connections in nature. Consequently, they deny that there are irreducible, metaphysically fundamental dispositional properties that seem to imply some sort of necessary or modal connection between the trigger and the manifestation.

However, as reduction proves to be notoriously complicated, other philosophers opt for dispositionalism instead, which is, in its most radical form (pan-dispositionalism), the view that all properties are of a dispositional nature. Both categoricalism and pan-dispositionalism are monistic theories, as both claim that there is, at the fundamental level, only one type of property. It is also possible for philosophers to hold a neutral or dualistic view, according to which there are both categorical and dispositional properties at the fundamental level of reality.

The debate over dispositions has had substantial impact on other debates within Metaphysics of Science and vice versa. For example, some philosophers argue that laws of nature and causation are grounded in dispositional properties: a law of nature like “Like-charged objects repel each other” could well be true because of the dispositional nature of charge, and causal successions of events could be determined by the dispositional properties of objects involved (for example, wood paneling can be a partial cause of a house fire because it is inflammable). Other philosophers see the direction of dependence exactly the other way around: dispositions depend on laws of nature, because if the laws of nature were different, objects might have different dispositions. For example, if the laws of ionic bonding were different, salt might not dissolve in water. Similarly for causation: maybe salt has its disposition to dissolve because its ionic structure is a potential cause of dissolving. Hence, the debate over dispositions should not be viewed in isolation.

b. Counterfactuals and Necessities

We learned above that a central feature of dispositions is that they establish a modal relationship between the disposed object’s being in the trigger condition(s) and the disposition’s manifestation. A plausible candidate for understanding the nature of this modal relationship is counterfactual dependence. The standard notation for counterfactual dependence reads p □→ q: if p were the case, then it would be the case that q. If a sugar cube is soluble, then that means, at least in part, that if it were placed in water, it would dissolve.

The sentential connective □→ is an intensional connective, which means that the truth value of the entire conditional cannot simply be read off the truth values of the antecedent and the consequent. The reason is easily understood: counterfactual conditionals describe counterfactual situations, which means that both the antecedent and the consequent are usually not currently true. Yet some such counterfactuals with a (currently) false antecedent and a (currently) false consequent are true (the above one capturing solubility, for example) and some such counterfactuals are false (such as “If I were to say ‘abracadabra’ a rabbit would appear”). How then can we evaluate the truth of counterfactual conditionals, given that the truth or falsity of their components is not decisive?

An idea proposed by Nelson Goodman (1947, 1955) and Roderick Chisholm (1946) is to have the truth of a counterfactual conditional depend on both the laws of nature and the background conditions on which they operate. On this account, a counterfactual conditional p □→ q is true if and only if there are true laws of nature L and background conditions C which hold, such that p, L, and C communally imply q. (Some further conditions must be met, like that the background conditions must be logically compatible with p.) Obviously, if the laws of nature or the background conditions were different, p □→ q might turn out not to be the case.
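The truth condition just stated can be written schematically as follows. This is only a sketch of the account as summarized above: the symbols 𝓛 (a set of true law statements) and 𝓒 (a set of true statements of background conditions) are notation introduced here for illustration, and the "further conditions" are compressed into a single compatibility clause.

```latex
p \mathrel{\Box\!\!\rightarrow} q \ \text{is true}
\iff
\exists \mathcal{L}\, \exists \mathcal{C}\;
\bigl[\, \mathcal{L} \text{ and } \mathcal{C} \text{ are true},\quad
\mathcal{C} \cup \{p\} \nvdash \bot,\quad
\mathcal{L} \cup \mathcal{C} \cup \{p\} \vdash q \,\bigr]
```

The second clause says that the background conditions are logically compatible with p; the third says that p, the laws, and the background conditions jointly imply q.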

An alternative way of thinking about counterfactuals called “possible world semantics” was introduced by David K. Lewis (1973a). Lewis’s most important tool is the concept of a possible world. According to Lewis, our actual world is only one among a multitude of possible worlds. A possible world is best thought of as one way (of many) the actual world could have been: all other things being equal, the word “multitude” in the last sentence could have been misspelled, Lewis could never have been born, or atoms could have been made of chocolate. Robert Stalnaker (1968) proposed a similar account but without defending modal realism (that is, realism regarding possible worlds). To him, possible worlds are tools, and as such no more than descriptions of worlds that do not exist.

Some possible worlds are more similar to ours than others. For example, a world which is like ours in every respect except that “multitude” is misspelled in the preceding paragraph is more similar to the actual world than a world with chocolate atoms. In evaluating a counterfactual’s truth value, this fact plays a central role. Consider, for example, the sentence “If David had not overslept, he would not have been late for work.” In a world where all vehicles miraculously disappeared that morning, where the floor of David’s bedroom was covered in super strong instant glue, or where the laws of nature suddenly changed so that movement is no longer possible, he would not have made it into work in time, even if he had gotten up early. But these worlds do not interest us; this is clearly not what we mean by saying that had David not overslept, he would have made it in time. To judge whether the counterfactual conditional is true regarding our world, we need to consider only worlds where the laws of nature remain the same and everything else is rather normal (that is, similar to what actually did happen), except for the fact that David did not oversleep (and maybe some minor differences).

Lewis and Stalnaker suggest that an ordering of worlds with respect to similarity to our world is possible. Naturally, worlds where many facts are different from the facts of our world, and worlds with different laws of nature, count as particularly dissimilar. Counterfactual truth can then be determined as follows: of all the possible worlds where p is the case (for short, the p-worlds), some will be q-worlds and others non-q-worlds (that is, worlds where q is true or not true, respectively). To determine whether the counterfactual conditional p □→ q is true for our world, we need to check whether the p-worlds that are also q-worlds are more similar to our world than the p-worlds that are non-q-worlds. So to find out whether it is true that David would have gotten to work in time had he not overslept, we look at possible worlds where David did not oversleep and check whether the worlds where he makes it into work are more similar to the actual world than worlds where he does not (because, say, all buses disappear or the floor is sticky).

According to this analysis, the consequent need not be true in all possible worlds (but only in similar p-worlds) in order for a counterfactual to be true. For example, had David overslept in a world where objects can be transported via beaming, he might still have made it to work in time. But as it is doubtful whether this technology will ever be available in our world (as it is not clear whether it is compatible with our laws of nature), the world where beaming has been invented is not relevant for the evaluation of the counterfactual conditional.

Related to what has just been said, we can point out a welcome feature of counterfactual conditionals: it can be true both that if David had not overslept, he would not have been late for work; and that if David had not overslept, yet the bus had had an accident, he would (still) have been late for work. This is a feature that necessary conditionals and mere material implications cannot well accommodate (or only with the undesirable implication that it is impossible for David not to oversleep while the bus is involved in an accident).
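The similarity-based evaluation described above can be illustrated with a small sketch. It rests on several assumptions that are not part of the original accounts: the set of worlds is finite, similarity is represented by an invented integer distance, and the truth condition is simplified to "every closest p-world is a q-world." The names World, would, and the sample worlds are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class World(NamedTuple):
    name: str
    distance: int            # invented dissimilarity to the actual world (0 = actual)
    facts: Dict[str, bool]


Prop = Callable[[World], bool]


def would(worlds: List[World], p: Prop, q: Prop) -> bool:
    """Evaluate p □→ q: true iff every closest p-world is a q-world."""
    p_worlds = [w for w in worlds if p(w)]
    if not p_worlds:          # no p-world at all: vacuously true
        return True
    nearest = min(w.distance for w in p_worlds)
    return all(q(w) for w in p_worlds if w.distance == nearest)


# Toy model of the David example; the distances are assumptions.
WORLDS = [
    World("actual", 0, {"overslept": True, "accident": False, "late": True}),
    World("up early, normal commute", 1, {"overslept": False, "accident": False, "late": False}),
    World("up early, bus accident", 2, {"overslept": False, "accident": True, "late": True}),
    World("up early, vehicles vanish", 9, {"overslept": False, "accident": False, "late": True}),
]


def not_overslept(w: World) -> bool:
    return not w.facts["overslept"]


def not_overslept_and_accident(w: World) -> bool:
    return not w.facts["overslept"] and w.facts["accident"]


def late(w: World) -> bool:
    return w.facts["late"]


def on_time(w: World) -> bool:
    return not w.facts["late"]


# "Had David not overslept, he would not have been late."
print(would(WORLDS, not_overslept, on_time))
# "Had David not overslept, yet the bus had an accident, he would have been late."
print(would(WORLDS, not_overslept_and_accident, late))
```

Both calls print True. Strengthening the antecedent shifts attention to a different set of closest worlds, which is why the two conditionals can hold together; the far-off world where vehicles vanish never enters the evaluation.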

In addition to providing a way of understanding counterfactual conditionals, possible world semantics allows us to spell out the modal notions of necessity and possibility in terms of quantification over possible worlds. Thus, a sentence p is necessarily true (in logical notation: □p) if and only if it is true in all possible worlds. If p is necessarily true, there is no way that p could be false; that is, there is no possible world where p is false. Similarly, p is possibly true (in logical notation: ◊p) if and only if it is true in at least one possible world.

Necessity is thus expressed in terms of universal quantification over (all) possible worlds, whereas possibility is existential quantification over (all) possible worlds. Like the universal and existential quantifiers, necessity and possibility, too, are interdefinable: if p is necessary, then it is impossible that non-p, and if p is possible, then it is not necessarily the case that non-p.
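In symbols, the interdefinability of the two modal operators mirrors the familiar duality of the quantifiers. The block below merely restates the sentence above in standard notation (the \Diamond symbol assumes the amssymb package).

```latex
\Box p \;\leftrightarrow\; \neg \Diamond \neg p
\qquad
\Diamond p \;\leftrightarrow\; \neg \Box \neg p
\qquad \text{compare} \qquad
\forall x\, Fx \;\leftrightarrow\; \neg \exists x\, \neg Fx
\qquad
\exists x\, Fx \;\leftrightarrow\; \neg \forall x\, \neg Fx
```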

Note that there are different sorts of necessity which can be easily accounted for if we conceive of necessity and possibility in terms of quantification over possible worlds: Logical, metaphysical, and nomological necessity can be defined by restricting the scope of worlds over which we quantify. For nomological necessity, for example, we restrict quantification to all and only worlds where our laws of nature hold.
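A minimal sketch can show how restricting the domain of worlds yields different strengths of necessity. The three toy worlds and the keys our_laws and ftl_travel are invented for illustration; nothing here is meant as a serious model of modal space.

```python
from typing import Callable, Dict, List

World = Dict[str, bool]
Prop = Callable[[World], bool]


def necessarily(worlds: List[World], p: Prop) -> bool:
    """□p relative to a domain: p holds in all worlds of the domain."""
    return all(p(w) for w in worlds)


def possibly(worlds: List[World], p: Prop) -> bool:
    """◊p relative to a domain: p holds in at least one world of the domain."""
    return any(p(w) for w in worlds)


# Hypothetical toy domain of possible worlds.
ALL_WORLDS: List[World] = [
    {"our_laws": True, "ftl_travel": False},
    {"our_laws": True, "ftl_travel": False},
    {"our_laws": False, "ftl_travel": True},
]

# Nomological necessity: quantify only over worlds where our laws hold.
NOMOLOGICAL = [w for w in ALL_WORLDS if w["our_laws"]]


def no_ftl(w: World) -> bool:
    return not w["ftl_travel"]


def ftl(w: World) -> bool:
    return w["ftl_travel"]


print(necessarily(ALL_WORLDS, no_ftl))    # False: not necessary across all worlds
print(necessarily(NOMOLOGICAL, no_ftl))   # True: nomologically necessary
print(possibly(ALL_WORLDS, ftl))          # True: possible in the wider sense
```

In this toy domain, the absence of faster-than-light travel is nomologically necessary without being necessary in the unrestricted sense, because the restriction removes the one world with different laws.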

Possible world semantics faces several problems, however. For example, it is unclear just how we can know about what is or is not the case in other possible worlds. How do we gain access to possible worlds that are not our own? Nevertheless, possible world semantics is a valuable tool for understanding some of the most central issues in Metaphysics of Science, such as dispositions and causation. In addition, necessity is a crucial element in theories of laws of nature, essences, and properties. The modalities of necessity, possibility, and counterfactuality are also important in their own right: after all, knowing what would happen if something else were the case or what can or must happen is key to scientific understanding.

c. Laws of Nature

Here are some intuitions philosophers have about laws of nature: laws are true or idealized, objective, universal statements. Laws of nature support counterfactuals, are confirmable by induction, and are explanatorily valuable as well as essential for predictions and retrodictions. Laws have modal power in that they force certain events to happen or forbid them from occurring. Any analysis of the concept will attempt to account for at least some of these features. Roughly, there are five types of theories of laws of nature: regularity accounts, necessitation accounts, counterfactual accounts, dispositional essentialist accounts, and accounts which take laws to be ontological primitives.

The basic idea of early regularity accounts is that a law of nature is a true, lawlike universal generalization (usually of the form “All F are G,” or in formal notation: ∀x (Fx → Gx)). Whether a given generalization is true is, of course, an empirical matter and must be determined by the sciences, but what it means for a statement to be lawlike is left for metaphysics to define. Not all general statements are lawlike. For example, some general statements state logical truths which clearly are not laws of nature (like “All ravens are ravens”). The main challenge for regularity theories is figuring out what makes a universal statement lawlike without appealing to any sort of connection between events other than regularity.

The Best Systems Account (Lewis 1973a) is an example of a sophisticated regularity theory. It asks us to imagine that all facts about the world are known, such that you know of every space-time point what natural properties are instantiated at it. There are many different ways of systematizing this knowledge by using different sets of generalizations. These generalizations make up competing deductive systems. Defenders of the Best Systems Account hold that a (contingent) generalization is a law of nature if and only if it is a theorem within the best such system. Which system is the best is determined by appeal to certain criteria: simplicity, strength (or informational content), and fit.

The Best Systems Account has been criticized for not taking seriously the intuitions that laws of nature are objective, have explanatory value, and hold with modal force. The Best Systems Account yields regularities, but it does not explain why they obtain. Opponents of regularity theories stress that laws do not merely state what is the case, but enforce or produce what happens.

Necessitation accounts are alternatives to the Best Systems Account that endorse this idea. Such accounts have been proposed by David Armstrong (1983), Fred Dretske (1977), and Michael Tooley (1977). For Armstrong, a law of nature is a necessitation relation N between natural properties. (Armstrong speaks of universals.) For two natural properties to be related by necessitation means that one of them gives rise to and must be accompanied by the other (hence necessitation). To give a coarse-grained example: Coulomb’s law (which states, very roughly, that charges exert forces on other charges) is a true law statement if and only if necessitation holds between the properties of having a certain charge (C) and exerting a certain force (F): N(C, F).

Necessitation accounts have some advantages over regularity theories. For example, they can more easily allow for uninstantiated laws. But how exactly do we know which properties are related by the necessitation relation, and why should we even assume that it exists? Armstrong argues that necessitation can be experienced insofar as it manifests in causal processes. However, not all laws are causal laws. Defenders of necessitation accounts must work out these issues.

The counterfactuals account focuses on a feature related to necessity, namely, the fact that laws of nature are stable under counterfactual perturbations. For example, that nothing can be accelerated beyond the speed of light is a law of nature because it is a fact that no matter what fantastical interventions we were to devise, we still couldn’t travel faster than the speed of light. Versions of the counterfactuals account of laws of nature have been proposed by James Woodward (1992), John Roberts (2008), and Marc Lange (2009).

A bullet that counterfactual accounts have to bite is that the intuitive order of explanation regarding laws and counterfactuals is upside down: whereas the counterfactual theory of laws says that it is a law that all bodies fall down to earth because it is fundamentally true that “were some arbitrary massive body dropped it would fall,” we intuitively believe that “were we to drop this body it would fall” is true because the law of gravitation holds. In other words, it is more intuitive to hold that the laws of nature support counterfactuals rather than that counterfactuals support the laws.

Another prominent way to account for laws of nature is to appeal to dispositional essentialism. Dispositionalists, like Brian Ellis (2001), Alexander Bird (2007), or Mumford and Anjum (2011), believe that some or even all properties are essentially dispositional. For example, if an object has the property of being electrically charged, that just means that it has the dispositional property of being attracted or repelled by other charged objects nearby. In this sense, the property of being electrically charged is essentially dispositional, because no object is electrically charged unless it is disposed to be attracted or repelled in this way.

Now, if natural properties bestow on their bearers dispositions, then that means it is always true that if something has a given natural property (Px), it also has a certain disposition (Dx) and thus it will manifest in a certain way (Mx), given that the disposition’s corresponding trigger occurs (Sx). (In formal notation: □∀x((Px ∧ Sx) → Mx).) This is precisely what many metaphysicians ask of laws: that they bring about or make necessary what happens when something else is the case. Dispositional essentialists thus claim that dispositions ground nomological facts: laws arise from the dispositions things have.
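The step from properties to laws can be laid out as a two-premise inference. This is a simplified reconstruction, assuming that both links in the chain hold with necessity and that a triggered disposition always manifests; it is not offered as the exact formulation of any of the authors cited.

```latex
\begin{align*}
&(1)\quad \Box\,\forall x\,(Px \rightarrow Dx)
  && \text{the property essentially confers the disposition}\\
&(2)\quad \Box\,\forall x\,\bigl((Dx \wedge Sx) \rightarrow Mx\bigr)
  && \text{a triggered disposition manifests}\\
&(3)\quad \Box\,\forall x\,\bigl((Px \wedge Sx) \rightarrow Mx\bigr)
  && \text{from (1) and (2)}
\end{align*}
```

Premise (2) is the contestable one: it is exactly the kind of claim that counterexamples to conditional analyses of dispositions call into question.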

Obviously, the dispositional essentialist account of lawhood hinges on non-trivial premises, which must be evaluated in their own right—for example, the premise that dispositions are basic.

If analyzing lawhood is so complicated an affair that it requires elaborate theories and intricate tools, why not assume that lawhood is conceptually and ontologically primitive, that is, that the concept of lawhood cannot be defined in terms of other concepts, and that it cannot be reduced to underlying phenomena? Tim Maudlin (2007) argues that scientists do not seek to analyze laws, but rather accept their existence as a brute fact in their daily practice, and that philosophers should do likewise.

To Maudlin, a law of nature is that which governs a system’s evolution through time and determines what future states can be produced from the current state of the system. As lawhood is a primitive concept for Maudlin, he attempts to utilize it in defining other notions, like causation and counterfactual truth. Whether Maudlin’s approach is viable or not depends to a large extent on whether these definitions of causation and counterfactual dependence by means of laws of nature work out or not.

d. Causation

Causation is obviously intimately connected to the laws of nature, as we would expect at least some laws to govern some causal relationships. Causation, however, is not a straightforward notion. For example, philosophers disagree over which kinds of entities are the proper relata in causal relationships, some potential candidates being substances, properties, facts, or events. There are several approaches to understanding causation: regularity theories, counterfactual theories, transfer theories, and interventionist theories.

Regularity theories follow in the footsteps of David Hume’s treatment of causation. According to regularity theories, all that can be said about causation comes down to stating a regularity in the sequence of events. The motivation for regularity theories stems from the fact that instances of a regularity can be observed, unlike the production of one event by another or a necessary relation between events.

One of the most widely known regularity theories is John Mackie’s INUS account of causation (1965). According to Mackie, an event is a cause if it is an Insufficient but Necessary part of an Unnecessary yet Sufficient condition for the effect to occur. For example, a short circuit (C) alone is not sufficient for a house to burn down (E); there must also be inflammable materials nearby (A) and there must not be sprinklers which extinguish the fire (B). Call this a complex condition (ABC). As the absence of sprinklers and the presence of inflammable materials is not enough to cause a fire, the short circuit is necessary within this complex condition, which is then sufficient for the fire. But there may be other complex events (DFG, HIJ, and so on) which could also bring about the same effect. For example, a lit candle in a dried-up Christmas tree may also cause the house to burn down. As the short-circuit scenario (ABC) is only one of many potential causes of a fire, it is not necessary for the effect to occur, but if it occurs, it is sufficient to bring about the fire.
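The INUS structure can be made concrete with a small sketch. It assumes, purely for illustration, that the effect is governed by a known list of minimal sufficient conditions, each a set of factor names; the names SUFFICIENT, occurs, and is_inus are hypothetical, and "no_sprinklers" stands in for the absence of sprinklers.

```python
from typing import FrozenSet, List, Set

# Assumed toy model: each set is one complex condition sufficient for the fire.
SUFFICIENT: List[FrozenSet[str]] = [
    frozenset({"short_circuit", "flammables", "no_sprinklers"}),  # ABC
    frozenset({"lit_candle", "dry_tree"}),                        # another route
]


def occurs(factors: Set[str]) -> bool:
    """The effect occurs iff some sufficient condition is fully present."""
    return any(cond <= factors for cond in SUFFICIENT)


def is_inus(factor: str) -> bool:
    """Check whether `factor` is an INUS condition for the effect."""
    for cond in SUFFICIENT:
        if factor not in cond:
            continue
        insufficient = not occurs({factor})               # I: not enough alone
        necessary_part = not occurs(set(cond) - {factor}) # N: needed within cond
        sufficient = occurs(set(cond))                    # S: cond brings about effect
        unnecessary = any(                                # U: another route exists
            occurs(set(other)) and not cond <= other for other in SUFFICIENT
        )
        if insufficient and necessary_part and sufficient and unnecessary:
            return True
    return False


print(is_inus("short_circuit"))  # True
print(is_inus("lit_candle"))     # True
print(is_inus("rain"))           # False: part of no sufficient condition
```

The sketch takes the list of sufficient conditions as given; on a regularity theory, establishing that list from observed regularities is the substantive task.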

Like other regularity theories, Mackie’s INUS theory has the disadvantage of classifying as causal some regularly co-occurring coincidences that are, for all we know, not causally related. For illustration, consider a simpler type of regularity theory according to which causation is just regular succession. The problem is that if causation were nothing but regular succession, then we would be forced to say that the rise of consumer goods prices in the late 20th century causes the oceans’ water levels to rise. Obviously, these events coincided but are not causally related.

To forgo this problem, philosophers devised counterfactual theories of causation. The initial idea presented by David K. Lewis (1973b) is to equate causal dependence with counterfactual dependence. The idea seems plausible: had the cause not occurred, there would (all else being equal) not have been the effect. More precisely, for event e to (causally) depend on event c, whether e occurs or not must depend (counterfactually) on whether c occurs or not (that is, on whether both c □→ e and ¬c □→ ¬e are true, where ¬ is the negation operator). For example, if the short circuit is the cause of the fire, then the house would have burned down if the short circuit had occurred, and it would not have burned down if the short circuit had not occurred.

Lewis saw that this initial account is flawed as it yields intuitively incorrect results in so-called pre-emption scenarios. Imagine two people, Suzy and Billy, throwing stones at a bottle. Now picture a situation where if Suzy does not throw her rock, Billy will. Suppose Suzy throws her rock, hits, and the bottle shatters. The effect, namely the shattering of the bottle, is evidently caused by Suzy’s throwing the rock. However, the effect would have occurred even if Suzy had not thrown, because in that case Billy would have thrown his rock and shattered the bottle. In this scenario, we recognize Suzy’s throw as the cause of the shattering, but the latter does not counterfactually depend on the former (because it is incorrect that had Suzy not thrown, the bottle would not have been shattered).
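The pre-emption problem can be reproduced in a few lines. The sketch below makes a strong simplifying assumption: a counterfactual is evaluated by re-running a toy model with the cause switched on or off, which is a stand-in for, not an implementation of, possible world semantics. All function names are hypothetical.

```python
from typing import Callable


def bottle_shatters(suzy_throws: bool) -> bool:
    """Pre-emption scenario: Billy throws exactly when Suzy does not."""
    billy_throws = not suzy_throws
    return suzy_throws or billy_throws


def bottle_shatters_no_backup(suzy_throws: bool) -> bool:
    """Contrast case: Billy is absent, so only Suzy's throw matters."""
    return suzy_throws


def counterfactually_depends(effect: Callable[[bool], bool]) -> bool:
    """Simple test: the effect occurs with the cause and fails without it."""
    return effect(True) and not effect(False)


print(counterfactually_depends(bottle_shatters_no_backup))  # True
print(counterfactually_depends(bottle_shatters))            # False
```

With Billy standing by, the shattering no longer depends counterfactually on Suzy's throw, although her throw is intuitively its cause. This is the result the simple counterfactual analysis cannot accommodate.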

Although more sophisticated counterfactual theories are more successful in dealing with pre-emption and other problems, some philosophers choose to take a different approach. Proponents of transfer or conserved quantity theories like Salmon (1984, 1994), Phil Dowe (1992), and Max Kistler (2006) claim that causation is best understood as a transfer of a physical quantity from one event to another. For example, Suzy is causally responsible for shattering the bottle (and Billy is not) if it was her energy that set the stone in motion to physically interact with the bottle on impact and shatter it. Transferable quantities include energy, momentum, and charge, for example. These quantities are subject to conservation laws, which means that in any isolated system, the sum total of the remaining and the transferred amount of the quantity will always equal the initial amount.

Transfer theories face difficulties in accounting for negative causation. For instance, omitting to water plants may cause them to wither, but there is no transfer of a conserved quantity from anything to the withering. Other problems derive from examples where the supposed causal relationship is not obviously of a physical nature. For example, we may say that wild speculations at the stock market caused the economy to break down or that Suzy’s throwing Billy a kiss causes him to blush.

Of the fourth group of theories of causation, interventionist theories, James Woodward’s approach (2003) is a prime example. Woodward suggests that causation is best characterized by appeal to intervention. Consider the following example: Testing a drug for efficacy consists in finding out whether a group of people who are administered the drug are cured while a group that does not receive the drug remains uncured. In other words, drug testers intervene by giving the drug to some patients and a placebo to others. If the drug intervention leads to recovery while the placebo intervention does not, the drug is said to be causally relevant for the recovery.

Woodward places further constraints on interventions, one of which is that the intervention (of administering the drug or the placebo, respectively) must be performed in such a way that other potential influences are absent. For example, if the drug were given to healthy and young patients while only the elderly and frail receive the placebo, the test might falsely attribute causal efficacy to the drug.
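The effect of a badly designed intervention can be shown with a deterministic toy calculation. The recovery rule (young patients recover regardless; others recover only if given an effective drug) and the four-patient groups are assumptions made up for this sketch, not part of Woodward's account.

```python
from typing import List, Tuple

# Each patient is a pair (is_young, gets_drug).
Assignment = List[Tuple[bool, bool]]


def recovers(is_young: bool, gets_drug: bool, drug_effective: bool) -> bool:
    """Assumed toy rule: the young recover anyway; others need a working drug."""
    return is_young or (gets_drug and drug_effective)


def recovery_gap(assignment: Assignment, drug_effective: bool) -> float:
    """Recovery rate of the drug group minus recovery rate of the placebo group."""
    treated = [recovers(y, True, drug_effective) for y, d in assignment if d]
    control = [recovers(y, False, drug_effective) for y, d in assignment if not d]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Confounded design: only the young receive the drug.
CONFOUNDED: Assignment = [(True, True), (True, True), (False, False), (False, False)]
# Balanced design: age is distributed evenly across both groups.
BALANCED: Assignment = [(True, True), (True, False), (False, True), (False, False)]

print(recovery_gap(CONFOUNDED, drug_effective=False))  # 1.0: spurious effect
print(recovery_gap(BALANCED, drug_effective=False))    # 0.0: no effect detected
print(recovery_gap(BALANCED, drug_effective=True))     # 0.5: genuine effect
```

In the confounded design an ineffective drug appears fully effective, because age rather than the drug accounts for the difference between the groups. Balancing the groups removes that other influence.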

Even when these precautions are taken, Woodward’s theory is at risk of being circular: the analysis presupposes that we understand beforehand what it means to intervene on a system. Intervention, however, is itself a causal notion. Woodward has clarified that his theory is meant to explicate and enlighten our concept of causation, not to reduce causation to other phenomena.

It seems that all theories of causation face difficulties (either in the form of recalcitrant exemplary cases or in that they do not capture certain features of causation). One possible conclusion to draw from this is that causation is not one unified phenomenon but at least two and potentially many more. For example, Ned Hall (2004) argues that our intuitions characterize causation both as production and as counterfactual dependence, and that the problems of analyses of causation can be traced back to the attempt to squeeze these into one unified concept.

The debates over the nature of dispositions, modality, laws of nature, and causation are still ongoing. Many promising approaches have been proposed in their course and will continue to be explored in the future. (For a detailed account of the relation between the debates surrounding dispositions, counterfactuals, laws of nature, and causation in Metaphysics of Science, see Schrenk (2017).)

e. Natural Kinds

In everyday contexts we habitually classify objects or group them together. Some of these groupings seem more natural to us than others. Philosophers who believe that nature comes with her very own classifications speak of “natural kinds.” For example, samples of gold closely resemble each other, differ clearly from other chemical elements, and share a common microstructure, whereas sea life comprises organisms of very different sorts (including crustaceans, fish, and mammals). Terms like “sea life” and “tile-cleaning fluid” are convenient for human purposes such as thinking and talking about groups of things, but we do not expect them to reflect the structure of the natural world (which does not mean that the classifications they introduce are entirely arbitrary). Natural kinds, on the other hand, supposedly “carve nature at the joints” (Plato’s Phaedrus 265d–266a). They are also highly projectible: we can inductively infer from the behavior of one object to that of all objects of the same natural kind.

If natural kinds exist and contribute to the structuring of the world, then ideally we want the sciences to discover what natural kinds there are. A natural kind enthusiast may claim that physics tells us that electrons and quarks exist, chemistry says that there are chemical elements like gold (Au) and compounds like water (H₂O), and biology seems to suggest that organisms are ordered hierarchically along the lines of family, genus, and species. However, there are also conventionalists who believe that so-called natural kinds are not independent of the minds, theories, and ambitions of human beings, or that no way of dividing up the world is inherently better than any other. To illustrate their claims, they remind us that the concept of biological species used to be regarded as a prime example of natural kinds, but that, in the meantime, various paradigms (based on the morphology, interbreeding capacities, or shared ancestry of organisms) have been proposed, each leading to a different system of classifications.

If natural kinds exist in nature, then what are they? What makes a natural kind the kind it is? Different ideas have been proposed and have given rise to a multitude of questions: Do objects which belong to natural kinds share at least some properties? Are these special, “natural” properties? Are natural kinds determined by the roles they play in inductive inferences or laws of nature? Is there a hierarchy of natural kinds, such that some kinds are more fundamental than others?

A position that has been particularly influential in the 20th century is the view that natural kinds have essences. It supposedly follows from Hilary Putnam’s Twin Earth thought experiment (1975). Suppose there is a planet that is just like Earth, except that the liquid its inhabitants call “water,” although it resembles water in every other respect, has the microstructure XYZ rather than H₂O. Intuitively, Putnam claims, XYZ is not water, which leads him to assume that, unlike the superficial properties of being wet, potable, and so on, being H₂O is a necessary condition for being water. Similar conclusions can be drawn from Saul Kripke’s argument that if we were to find out that the color we have up to now associated with elementary gold is actually an illusion, we would all agree that gold remains gold so long as it has atomic number 79, no matter what color it is (1980). (Kripke and Putnam’s primary aim is to show that the meaning of the terms “water” and “gold” comes not from our concepts but is determined by the structure of the world. We must, hence, acquire it a posteriori.)

Linked to but distinct from the question of what natural kinds are is the question of whether natural kinds form an ontological category in their own right, or if they can be reduced to other existents like properties. Realists regarding natural kinds believe that talk of natural kinds and successful inferences presupposes the existence of natural kinds in nature. Reductionists, on the other hand, may argue that membership in natural kinds is not only determined by a number of shared properties, but also that it consists in nothing over and above having these properties.

Unsurprisingly, metaphysicians of science are especially interested in finding out which, if any, natural kinds are postulated or discovered by the various branches of science and whether they really qualify as natural kinds by the standards of contemporary metaphysical theories, or whether the theories of natural kinds need to be revised.

f. Reduction, Emergence, Supervenience, and Grounding

The world consists of many different things. Philosophers have always dreamed of rendering it more orderly by systematizing it in just the right way. An important step towards doing so seems to entail an analysis of the relationships and dependencies between things which belong to different strata or levels of reality. The world apparently comes structured in levels, with things on higher levels somehow depending on the things on lower levels. For example, a factory consists of machines, conveyor belts, and so forth; machines are made of various interacting cogs, levers, and wires (which, if left to themselves, cannot fulfill the functions they fulfill within the machine); the cogs are made out of molecules, the molecules are made of atoms, and the atoms are made of protons, neutrons, electrons, and so on. Dependencies like these are studied by the various special sciences. (Note that the idea that science suggests that the world comes structured in levels has been contested by some philosophers (Ladyman et al. 2007, 178).) It is clear, however, that a factory is not composed of machines, conveyor belts, and so on in the same way that an atom consists of particles. Surveying the whole of science, Metaphysics of Science strives to account for the various ways higher level objects depend on lower level entities. The aim is not just to establish what depends on what, but to also clarify and explicate the nature of the dependencies. The kinds of relations most fervently discussed in Metaphysics of Science include reduction, emergence, supervenience, and grounding.

Reduction is often conceived of as a two-place, asymmetrical relationship to the effect that one thing is somehow made of, accounted for, or explained in terms of another thing. Typically, the reduced thing is conceived of as somehow less fundamental or less real, or even considered to be eliminated. Two types of reduction are relevant to Metaphysics of Science. First, there is reduction of one theory to another. For example, is it possible to express some theories of chemistry in terms of physical theories? If so, can all chemical theories be thus reduced? What about biological, psychological, and sociological theories? Second, reduction is sought between different sorts of entities or ontological categories such as phenomena, events, processes, and so on. Potential candidates include reduction of macro-level objects to molecules, atoms, and subatomic particles, reduction of properties to sets of objects which resemble one another, reduction of states of affairs to entities and properties (including relations), and reduction of the mental to the physical. The latter especially has been widely discussed in metaphysics. (Note that the first and second kind of reduction cohere: if reduction of one theory to another succeeds, then ontological reduction of the entities postulated by the former to the entities mentioned in the latter may thereby also be achieved.) For Metaphysics of Science, claims of reduction pertaining to entities postulated by the sciences are of great interest, as are claims regarding reductive relationships between theories and their key concepts.

In a way, then, an armchair is reducible to its constituent parts: the fabric, upholstery, wood, and metal springs. However, an armchair is obviously not the same as a random pile of these materials. Unsurprisingly, philosophers disagree over whether, for particular cases, complete reduction can be achieved or not. For example, how could Bach’s Brandenburg Concerto No. 6 be reduced to its physical properties? Sure, a particular performance depends on the physical movements of the musicians and on how the created soundwaves causally impact on the hearers’ eardrums, but the Concerto is not identical to these physical properties apparent in any given performance of it, as it exists independently of them.

Those who argue that such reductions do not succeed often speak of the irreducible as emergent from the underlying basis. They want to point out that although there is a dependence of the higher on the lower level, the higher level adds something novel and thus cannot be completely reduced to the lower level. An emergent property or phenomenon cannot be accounted for by reduction, because it is believed not to be a property of any of the component parts, and it is not obviously caused solely by their interplay. For example, whether or not you find abortion morally reprehensible does not seem to depend on the physical facts. Given the same situation, somebody else might pass the opposite moral judgment. Whereas such moral considerations are of no great professional import to the metaphysician of science, emergent properties in the sciences are. For example, biology still struggles to explain why higher forms of life have certain properties like consciousness, aspirations, and phenomenal experiences, which are not obviously properties of the underlying matter.

Reduction and emergence are interlevel relations. The most innocent, weakest dependence relation that is compatible with both reduction and emergence is called supervenience. Some thing A (the so-called supervenience set) is said to supervene on some other thing B (the so-called supervenience base) if and only if there can be no difference in A without there also being a difference in B—or, for short, if there is no A-difference without a B-difference. For example, an oil painting’s macro-properties (A)—what it depicts and how it looks to us—supervene on its microphysical properties (B): unless the location, intensity, or color of the paint blotches are changed, the painting will always look the same to us. To better understand the world, metaphysicians of science research supposed supervenience relations in the sciences.
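The slogan "no A-difference without a B-difference" can be illustrated with a small check over a finite collection of cases. The following is only a minimal sketch under strong simplifying assumptions: supervenience is properly a modal claim about what can differ, whereas the code merely inspects a hypothetical list of toy "paintings". The names `supervenes`, `paintings`, `macro`, and `micro` are invented for illustration and do not come from the literature.

```python
from itertools import combinations


def supervenes(items, a_props, b_props):
    """Return True if no two items differ in A while agreeing in B.

    This only inspects the given finite list of items; it is a toy
    stand-in for the modal claim, not a test of it.
    """
    for x, y in combinations(items, 2):
        if b_props(x) == b_props(y) and a_props(x) != a_props(y):
            return False  # an A-difference without a B-difference
    return True


# Hypothetical toy data: "micro" is the arrangement of paint blotches,
# "macro" is how the painting looks to us.
paintings = [
    {"micro": ("red", "blue"), "macro": "sunset"},
    {"micro": ("red", "blue"), "macro": "sunset"},
    {"micro": ("crimson", "blue"), "macro": "sunset"},
    {"micro": ("green", "blue"), "macro": "meadow"},
]


def macro(p):
    return p["macro"]


def micro(p):
    return p["micro"]


print(supervenes(paintings, macro, micro))  # True
print(supervenes(paintings, micro, macro))  # False
```

In this toy data the macro-properties supervene on the micro-properties, but not the other way round: two paintings can look the same while differing in their blotches. This matches the idea that supervenience is a one-directional dependence.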

In the early 21st century, metaphysicians turned their attention to another sort of interlevel relation: grounding relations. Grounding relations are metaphysical relations which establish a special sort of (noncausal) priority of one relatum over the other. Of two propositions or facts which are related by a grounding relation, one is taken to ground, or account for, the other. Grounding is stronger than supervenience, as it amounts not just to the claim that some A-facts only vary when B-facts vary (which may occur coincidentally) but to the claim that A-facts vary because B-facts vary. Unlike some forms of reduction, grounding does not seek to eliminate the grounded fact; attributing full existence to both of them, it merely ascribes a more fundamental status to the grounding fact.

Debates over grounding revolve around a number of pivotal questions, such as whether instances of grounding are all of the same kind or whether they embody a number of different relations (which fall under the larger category of grounding relations), whether the grounding relation is primitive or can be analyzed in terms of other relations, and whether it is an irreflexive, asymmetric, and transitive relation or if other properties should be ascribed to it. The answers to these questions may also have an effect on how we should conceive of interlevel relations in the sciences, and the latter are of great interest to metaphysicians of science.
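The three formal properties mentioned above (irreflexivity, asymmetry, and transitivity) can be made concrete for a finite relation given as a set of ordered pairs. The sketch below is illustrative only. The relation `grounds` is a hypothetical toy example built from the levels mentioned earlier (atoms, molecules, cogs), and whether real grounding has these properties is exactly what is debated.

```python
def is_irreflexive(rel):
    """Nothing is related to itself."""
    return all(a != b for a, b in rel)


def is_asymmetric(rel):
    """If a is related to b, then b is not related to a."""
    return all((b, a) not in rel for a, b in rel)


def is_transitive(rel):
    """If a is related to b and b to c, then a is related to c."""
    return all(
        (a, d) in rel
        for a, b in rel
        for c, d in rel
        if b == c
    )


# Hypothetical toy relation: (x, y) is read as "x-facts ground y-facts".
grounds = {
    ("atoms", "molecules"),
    ("molecules", "cogs"),
    ("atoms", "cogs"),
}

print(is_irreflexive(grounds), is_asymmetric(grounds), is_transitive(grounds))
# True True True

# Dropping the "shortcut" pair breaks transitivity but nothing else.
chain_only = grounds - {("atoms", "cogs")}
print(is_transitive(chain_only))  # False
```

A relation with all three properties is a strict partial order. Someone who denies that grounding is transitive would be claiming, in these terms, that something like `chain_only` can be a complete description of what grounds what.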

g. Space and Time

To most philosophers interested in the field, Metaphysics of Science is not confined to discussing concepts that pervade the whole of science (as, arguably, law of nature and causation do). It is also concerned with metaphysical questions that arise with respect to the particular sciences, like “What is life?” (biology) or “What is the ontological status of cultures, governments, and money?” (sociology). The philosophy of physics, too, gives rise to many interesting metaphysical questions. Among them are questions regarding the nature of space and time, which have been debated since the early dawn of western philosophy and, in the light of modern-day physics, are still at issue in philosophical debates.

As humans, we perceive space and time as different phenomena with differing properties. Space, as we perceive it, extends in three dimensions, and we can (almost) freely move in any direction. Through the physical forces which act upon our bodies, we are capable of detecting some sorts of motion through space (like when we run or jump) but not others (like Earth’s rotation). Time, on the other hand, has a sort of directedness to it (commonly referred to as “the flow of time”). We cannot linger at a particular moment in time, and we cannot go back to previous times. Entities somehow change yet persist through time.

Metaphysicians of science are interested in these phenomena especially in the light of Albert Einstein’s theories of Special and General Relativity. These theories were proposed in order to make sense of the fact that the speed of light was measured to remain constant regardless of the motion of the light source, whereas the velocities of objects depend on the motion of the object relative to an observer. For example, the speed of a train measured by a stationary observer on the platform is greater than its relative speed with respect to another, slower train that moves in the same direction. The speed of light emitted by a lamp on the train, however, will be the same regardless of whether it is measured by a passenger or a bystander. In popular interpretations, Einstein’s theory of Special Relativity suggests that the problem can be solved by postulating that the three spatial dimensions and the one temporal dimension form a continuum by the name of space-time. An astonishing consequence could be this: Different observers are in motion with respect to different objects. Their perception of the present is determined by which information is accessible to them, which in turn is a matter of which light signals reach them at a given moment. Therefore, their individual present, past, and future differ according to their state of motion with respect to other objects. Thus, an objective, observer-independent order of points in time does not exist. This view is often referred to as the block universe view, because everything seems to simply exist conjointly, with no objective past or future. Some philosophers also suggest that, on this view, familiar material things are three-dimensional slices of four-dimensional objects (sometimes called “space-time worms”).
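The contrast between the two trains and the lamp can be illustrated with the standard velocity-composition rule of Special Relativity, which is textbook physics rather than part of the argument presented here. In the sketch below, the symbols are assumptions introduced for illustration: u is the velocity of an object as measured from the platform, v is the velocity of a second observer relative to the platform, u' is the velocity of the object as measured by that second observer, and c is the speed of light. All motion is taken to be along a single line.

```latex
% Relativistic composition of collinear velocities
u' = \frac{u - v}{1 - \dfrac{u v}{c^{2}}}

% Everyday speeds (u, v \ll c): the correction term is negligible
u' \approx u - v

% Light (u = c): every observer measures the same speed
u' = \frac{c - v}{1 - \dfrac{v}{c}} = \frac{c\,(c - v)}{c - v} = c
```

For a train moving at 30 m/s past a slower train moving at 20 m/s, the second line gives a relative speed of roughly 10 m/s, as everyday experience suggests. For the light from the lamp, the third line gives c no matter what v is.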

Some philosophers claim that the block universe view is incompatible with presentism (the philosophical position that holds that only what is present exists) and supports eternalism (the view that all events past, present, and future exist). Unfortunately, the latter seems not to correspond to our subjective experiences of time. This poses a genuine dilemma for metaphysicians: should we accept Einstein’s theories and dismiss our subjective experiences, or do we need to reinterpret the remarkably well corroborated theories to accommodate our everyday conceptions of space and time?

More such fascinating questions remain. How is the (perceived) directedness of time and its irreversibility (which manifests as increase of entropy) best explained? Are space and time finite or infinite? Do they exist fundamentally and independently of the objects in them, or does their existence hinge on the existence of those objects? Quite obviously, these are questions on which scientific theories have a bearing, and Metaphysics of Science works towards solutions that are both philosophically rewarding and scientifically tenable.

5. The Methodology of Metaphysics of Science

Although Metaphysics of Science is concerned with the key concepts that figure prominently in science, its methods are not predominately those of the sciences. Apart from referencing scientific results and practices, Metaphysics of Science has a number of argumentative tools at its disposal that do not usually play an explicit role in scientific methodology but are not entirely unscientific either. In science these forms of arguments are implicitly employed to establish hypotheses when the empirical evidence is insufficient (for example, because two theories are equally well supported by the available evidence). Unlike many scientific theories, metaphysical claims often cannot be tested experimentally at all—not because we lack the technological means to do so, but because the very nature of these claims defies empirical confirmation or falsification. Think, for example, of the claim that laws of nature hold across all possible worlds. This is why reference to theoretical virtues, Inferences to the Best Explanation, arguments from indispensability and serviceability, extensional adequacy, and the Canberra Plan method are of great argumentative importance in Metaphysics of Science.

Note that some philosophers—for example, proponents of naturalized metaphysics (as mentioned in section 2b)—may reject all or some of these methodological tools as transcendental or indefensibly a priori. However, the issue is not currently settled among philosophers, and the tools described below remain widely used in contemporary Metaphysics of Science.

a. Theoretical Virtues

In both science and metaphysics, we strive for internally consistent, comprehensive, unambiguous theories which cohere with our accepted beliefs, have an adequately large scope, and so on. Among the various desiderata, explanatory power and simplicity are often accorded a central role. To strive for an explanatorily powerful theory is to demand that a theory must explain a certain number of phenomena which stand in need of explanation, that it does so thoroughly and systematically, and that it is not ad hoc. The value of explanatory power is obvious: explanation (or at the very least, systematization) is the very purpose of any hypothesis. Not so with simplicity. There are many ways a theory can be simpler than its competitors; for example, it may contain fewer variables than another. Usually, the call for simplicity is understood in terms of parsimony. Occam’s Razor, a principle frequently appealed to in this context, says that entities must not be multiplied beyond necessity—that is, if faced with otherwise equally good theories (in terms of their explanatory power, for example), we are to prefer the one that postulates fewer (kinds of) entities. However, it is unclear whether simplicity and the other explanatory virtues are truth conducive or whether they are primarily pragmatic or aesthetic theoretical virtues (which means, for example, that simplicity is preferable because it is easier to work with simple theories or because they are somehow more agreeable).

Although theory choice criteria are certainly at work in everyday reasoning, philosophy, and science (remember that nobody wants a complicated, inconsistent, unclear, shallow, or incomprehensive theory), the application of such criteria is not straightforward: they must be measured and traded off against each other. Unfortunately, there are no shared standards or guidelines on how this should be done. How do we find out which of two theories is simpler or more consistent with the body of already accepted beliefs? How do we know which criterion trumps another? What is more, whereas in science theory choice criteria are interim solutions until a theory can be empirically tested, there is usually no such post hoc test in Metaphysics of Science. For all these reasons, justifying our appeal to theoretical virtues is not a trivial or easy task.

b. Inference to the Best Explanation

Once it has been determined through careful assessment of the theoretical virtues which available theory is the best explanation for a given phenomenon, we tend to infer that it must also be the correct explanation. In most cases, we will then also say that the entities (objects, fields, structures) postulated in the explanatory theory really exist. That is, we apply a so-called Inference to the Best Explanation (often referred to as “IBE”). For example, astronomers found that the best explanation for a divergence in the orbit of Uranus is the existence of another planet, Neptune, whose gravity interferes with Uranus’ trajectory. Thus, they inferred that Neptune must exist. This hypothesis was confirmed when Neptune was later discovered through telescopes. Similarly, many metaphysicians of science believe that IBE can be applied to metaphysical theories. For example, Nancy Cartwright believes that the best explanation for the fact that laboratory results produced in controlled, sterile settings can be applied to the messy circumstances of the outside world is the existence of underlying dispositions that are examined in the laboratory but also pervade the rest of the world, and she therefore accepts this view as true (Cartwright 1992, 47–8).

Quite obviously, IBEs are not deductively valid, and even the best explanations we have at our disposal can later turn out to be incorrect. For example, when astronomers sought to explain anomalies in the orbit of Mercury, they failed to find Vulcan, a planet postulated explicitly for this purpose, and the anomalies were later explained with the help of the General Theory of Relativity.

Note also that Occam’s Razor and IBEs sometimes pull in opposite directions: whereas IBEs often enrich, rather than reduce, our ontology, Occam’s Razor is set on eliminating as many entities as possible from our ontology. On the other hand, one of the marks of a good explanation is that it does not postulate more than is necessary; that is, it is parsimonious in the sense of Occam’s Razor. Either way, even if metaphysicians can agree on using theoretical virtues and IBEs as argumentative tools, there is still room for debate.

c. Indispensability and Serviceability Arguments

In addition to IBEs, metaphysicians appeal to further inferential arguments to the effect that we should accept certain hypotheses as true. More specifically, indispensability and serviceability arguments basically consist in claiming that if X plays a crucial role with respect to Y, and if Y is either uncontroversial or relates to some postulate that we are unwilling to let go, then the existence of X can (or must) be asserted—that is, we should believe that X exists for the sake of Y.

One reason for accepting the existence of an entity X may be that its existence is indispensable for the existence of Y; that is, Y cannot be the case unless X exists. For example, some metaphysicians argue that the existence of mathematical entities is indispensable for science, and as science is important and probably at least approximately true, we have every reason to believe in the existence of numbers (as Platonic objects, say). Very roughly put, indispensability arguments infer from the premise that X is indispensable for Y and the premise that Y is the case to the conclusion that X exists.

(An older variant of the argument from indispensability is the so-called transcendental argument, which usually runs like this: if X is a necessary condition for the possibility of Y, and if we believe that Y is the case, we should also hold that X exists.)

Serviceability arguments are weaker than indispensability arguments. They advise us to accept the existence of a (kind of) entity X if X is serviceable towards end Y. For example, David K. Lewis argues that the assumption that possible worlds are concrete objects (just as our actual world) is highly serviceable (1986, 3): among other things, it provides us with the means to spell out the semantics of counterfactual conditionals. However, there may be other ways of accounting for the truth conditions of counterfactuals (for example, by referring to complete descriptions of fictitious possible worlds instead). Whereas indispensability offers a strong argument for the existence of some sort of entity, serviceability allows for contenders. Different kinds of entities may serve equally well to implement a goal, and serviceability arguments alone may not suffice to determine which of these entities we should believe in.

The evaluation of indispensability and serviceability arguments depends on what you already believe and what goals you pursue (as represented by variable Y). At best, they yield conditional existence claims: if you believe that science is successful and that science would not be successful if it were not for the existence of mathematical entities, then you had better believe in the existence of mathematical entities. If you do not believe that science is successful, then the argument is moot. Awareness of the occurrences of these kinds of arguments within debates in Metaphysics of Science will certainly help you understand your opponent, but it will seldom suffice to settle the issue.

d. Extensional Adequacy and the Canberra Plan

One particularly useful tool in evaluating metaphysical hypotheses is the test for extensional adequacy. To test a theory for extensional adequacy means to examine cases that, according to pretheoretical, intuitive judgment, fall under a concept the theory aims to explicate and to check whether the theory indeed subsumes these cases as instances of the concept. In addition, the theory may be tested with regard to scenarios in which its concepts should intuitively not apply; if the theory (wrongly) applies, it may have to be corrected. For example, suppose someone proposes a metaphysical theory as to what a law of nature is in claiming that a law of nature is nothing but a general statement of the form “All things which have property F also have property G.” This theory will quickly be challenged: “All pigs can fly” is a general statement, but, intuitively, it is not a law of nature, because it is clearly false. Whereas the sentence matches the alleged criterion for lawhood, it is intuitively not a law and thus a counterexample to the proposed analysis of lawhood.
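The procedure just described resembles running a proposed definition against a list of test cases. The following minimal sketch shows that analogy and nothing more. The class `Case`, the functions `naive_theory` and `counterexamples`, and the sample verdicts stored in `intuitively_law` are all hypothetical and introduced only for illustration. In particular, the pretheoretical verdicts are simply stipulated here, whereas in practice they are what is contested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    statement: str
    is_universal: bool     # has the form "All F are G"
    intuitively_law: bool  # stipulated pretheoretical verdict (an assumption)


def naive_theory(case):
    """The proposed analysis: a law is any general statement 'All F are G'."""
    return case.is_universal


def counterexamples(theory, cases):
    """Cases where the theory's verdict disagrees with the intuitive one."""
    return [c for c in cases if theory(c) != c.intuitively_law]


cases = [
    Case("All pigs can fly", True, False),
    Case("All electrons are negatively charged", True, True),
    Case("This pig is pink", False, False),
]

bad = counterexamples(naive_theory, cases)
print([c.statement for c in bad])  # ['All pigs can fly']
```

A theory with an empty list of counterexamples over the chosen cases is extensionally adequate with respect to those cases only. Different stipulated verdicts, or a different selection of cases, may yield a different result.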

Tests for extensional adequacy presuppose judgments regarding the extension of the concept in question; that is, they presuppose strong intuitions about which entities or phenomena fall under it or are denoted by it. Preconceptions and intuitions as to what a concept denotes can diverge, however. They may be products of the culture we live in or the way we speak, and professional philosophers’ intuitions may well differ from the preconceptions of the folk.

Understanding a concept is not merely a matter of knowing what it denotes. Usually, concepts also carry meanings, or intensions. The so-called Canberra Plan is a complex two-step method for clarifying both the correct extension and intension of concepts. In other words, the Canberra Plan first seeks to fix the meaning of a concept (intension) by describing the role that instances of the concept have to fulfill and then, second, strives to identify its actual fulfillers (extension). It was proposed by philosophers associated with the Research School of Social Sciences in Canberra (most notably Frank Jackson and David K. Lewis). First, a concept’s use in everyday, scientific, and philosophical contexts is analyzed by collecting all sorts of platitudes about it. A platitude can be anything we say or believe about the concept. For example, regarding causation, we might believe that causes always precede their effects, that nothing causes itself, and so on. By systematizing the platitudes, the Canberra Planners determine which roles the referents of the concept are usually expected to fulfill. In the second step, they then search for referents, that is, entities or phenomena in the world that match these roles. For our example of causation, the transfer of energy could be proposed as such a role player. Because scientific theories are elaborate attempts at describing the world and because Canberra Planners are generally inclined to believe that scientific theories are at least approximately true (that is, they are scientific realists), particular attention is given to the postulates of the sciences. Depending on whether the second step is successful, we may find out the real extension of the concept in question, or we may have to concede that it has no basis in reality and should be discarded. However, note that there are multiple ways of systematizing platitudes and evaluating scientific theories, and hence the outcome may vary.

Apparently, whichever method(s) we employ, there will always be ways to question our claims in Metaphysics of Science (and in philosophy generally). Apart from the proponents of a radical naturalization of metaphysics, philosophers tend to see this not as a fatal flaw but simply as a characteristic feature which is grounded in the very nature of the discipline. The fact that Metaphysics of Science knows no ultimately decisive method but draws on many different tools that may result in different outcomes is not necessarily a bad thing: these tools may just be the best we have to answer questions that we cannot avoid asking, and there may nonetheless be progress in the form of ever more precise, extensionally adequate theories. At the very least, they allow us to map the field of possible views within Metaphysics of Science.

6. References and Further Reading

- Armstrong, D. M. 1983. What Is a Law of Nature? Cambridge: Cambridge University Press.
  - Argues that laws of nature are necessitation relations between universals.
- Barcan Marcus, R. 1946. “A Functional Calculus of First Order Based on Strict Implication.” Journal of Symbolic Logic 11: 1–16.
- Barcan Marcus, R. 1967. “Essentialism in Modal Logic.” Noûs 1: 91–96.
  - Both seminal texts by Barcan Marcus lay the groundwork for formal modal logic and afford later developments like Kripke’s and Putnam’s ideas on direct designation, rigid designation, and essence.
- Bird, Alexander. 2007. Nature’s Metaphysics. Oxford: Oxford University Press.
  - Develops a dispositional essentialist account of laws of nature according to which laws are grounded in dispositions and turn out to be metaphysically necessary.
- Carnap, R. 1936. “Testability and Meaning.” Philosophy of Science 3: 419–471 and 4: 1–40.
  - Discusses the simple conditional analysis and proposes the reduction sentences analysis of dispositionality.
- Carnap, R. 1947. Meaning and Necessity. Chicago: University of Chicago Press.
  - Historically relevant work on the semantics of natural and formal languages which lays the foundations for modal logic.
- Carrier, M. 2007. “Wege der Wissenschaftsphilosophie im 20. Jahrhundert.” In Wissenschaftstheorie: Ein Studienbuch, edited by A. Bartels and M. Stöckler, 15–44. Paderborn: Mentis.
  - Brief historical introduction to 20th century philosophy of science (in German).
- Cartwright, N. 1992. “Aristotelian Natures and the Modern Experimental Method.” In Inference, Explanation, and other Frustrations, edited by J. Earman, 44–70. Berkeley: University of California Press.
  - Argues that one cannot make sense of modern experimental method unless one assumes that laws are basically about capacities/dispositions.
- Chisholm, R. 1946. “The Contrary-to-Fact Conditional.” Mind 55: 289–307.
  - An early attempt at analyzing counterfactual conditionals.
- Cooper, J. M., ed. 1997. Plato: Complete Works. Indianapolis: Hackett.
  - Collection of English translations of works ascribed to Plato with helpful footnotes and introductory information.
- Dowe, P. 1992. “Wesley Salmon’s Process Theory of Causality and the Conserved Quantity Theory.” Philosophy of Science 59: 195–216.
  - Criticizes Salmon’s process theory of causality and suggests that a causal theory based on conserved physical quantities should replace it.
- Dretske, F. 1977. “Laws of Nature.” Philosophy of Science 44: 248–268.
  - Argues that laws of nature are relations between universals.
- Ellis, Brian. 2001. Scientific Essentialism. Cambridge: Cambridge University Press.
  - Defends the view that the fundamental laws of nature depend on the essential properties of the things on which they are said to operate and that they are metaphysically necessary.
- Feynman, R. 1967. The Character of Physical Law. Cambridge: MIT Press.
  - A series of lectures discussing several physical laws and analysing their common features, with a focus on mathematical features.
- Fine, K. 1994. “Essence and Modality.” Philosophical Perspectives 8: 1–16.
  - Criticizes the idea that essence is a special case of metaphysical necessity (and argues that it actually is the other way around) and discusses the relationship between essence and definition.
- Göhner, J. F., K. Engelhard, and M. Schrenk. 2018. Special Issue: Metaphysics: New Perspectives on Analytic and Naturalised Metaphysics of Science. Journal for General Philosophy of Science 49: 159–241.
  - Addresses various aspects regarding the relationship between metaphysics and science, with a focus on the questions which metaphysical lessons we should learn from linguistics and the social sciences and whether mainstream metaphysical research programmes can have any positive impact on science.
- Goodman, N. 1947. “The Problem of Counterfactual Conditionals.” Journal of Philosophy 44: 113–128.
  - Examines the problems that face analyses of counterfactual conditionals and attempts a partial definition of counterfactual truth.
- Goodman, N. 1955. Fact, Fiction, and Forecast. Cambridge: Harvard University Press.
  - Introduces the “new riddle of induction” (grue-problem) and explores the concepts of counterfactual truth and lawhood in order to develop a theory of projection which resolves it.
- Hall, N. 2004. “Two Concepts of Causation.” In Causation and Counterfactuals, edited by J. Collins, N. Hall, and L. A. Paul, 225–276. Cambridge: MIT Press.
  - Argues that there are two distinct concepts of causation, one of which is best analyzed in terms of dependence, the other in terms of production.
- Husserl, E. 1970. The Crisis of European Sciences and Transcendental Phenomenology. Evanston: Northwestern University Press.
  - Unfinished classical text in phenomenology originally published in German in 1936, which bemoans the fact that modern science is oblivious to the life-world of humans.
- Kistler, M. 2006. Causation and Laws of Nature. Oxford: Routledge.
  - Develops and applies a transfer theory of causation.
- Kripke, S. 1963. “Semantical Considerations on Modal Logic.” Acta Philosophica Fennica 16: 83–94.
  - Gives an exposition of some features of a semantical theory of modal logics.
- Kripke, S. 1980. Naming and Necessity. Oxford: Blackwell.
  - Argues that the meaning of names is not determined by descriptions and that natural kind terms rigidly designate (that is, that they designate the same natural kind across all possible worlds), thus allowing for a posteriori necessities.
- Ladyman, J., D. Ross, D. Spurrett, and J. Collier. 2007. Every Thing Must Go: Metaphysics Naturalized. Oxford: Oxford University Press.
  - Argues for a naturalization of metaphysics by criticizing contemporary analytic metaphysics and develops a scientifically informed structuralist realist metaphysics.
- Lange, M. 2009. Laws and Lawmakers: Science, Metaphysics, and the Laws of Nature. Oxford: Oxford University Press.
  - Instead of saying that laws support counterfactuals, Lange proposes to reverse the order and say that laws are those generalities that are stable or invariant under counterfactual perturbations.
- Lewis, D. K. 1973a. Counterfactuals. Oxford: Blackwell.
  - An account of counterfactual conditionals in terms of modal realism. Introduces the Best Systems Account of laws of nature.
- Lewis, D. K. 1973b. “Causation.” Journal of Philosophy 70: 556–567.
  - Proposes and modifies the counterfactual account of causation in terms of counterfactual dependence.
- Lewis, D. K. 1986. On the Plurality of Worlds. Oxford: Blackwell.
  - Defends modal realism, which is the view that the actual world is only one of many possible worlds all of which exist, on the basis that it is highly serviceable in solving longstanding philosophical problems.
- Mackie, J. L. 1965. “Causes and Conditions.” American Philosophical Quarterly 2: 245–264.
  - Proposes the INUS account of causation.
- Martin, C. B. 1994. “Dispositions and Conditionals.” The Philosophical Quarterly 44: 1–8.
  - Introduces finkish dispositions as a problem for counterfactual analyses of dispositions.
- Maudlin, T. 2007. The Metaphysics within Physics. Oxford: Oxford University Press.
  - Argues that lawhood is irreducible but can account for causation, counterfactuals, and dispositionality.
- Mumford, S. and R. L. Anjum. 2011. Getting Causes from Powers. Oxford: Oxford University Press.
  - The authors not only develop a theory of causation based on powers but also offer a detailed analysis of causal powers themselves.
- Mumford, S. and M. Tugby. 2013. “What is the Metaphysics of Science?” In Metaphysics and Science, edited by S. Mumford and M. Tugby, 3–26. Oxford: Oxford University Press.
  - Introduction to a collection of state-of-the-art papers on core issues in Metaphysics of Science.
- Paul, L. A. 2012. “Metaphysics as Modeling: The Handmaiden’s Tale.” Philosophical Studies 160: 1–29.
  - Claims that science and metaphysics of science differ with respect to their respective subject matter, but that there is no categorical difference in method, as both construct theories by building models.
- Putnam, H. 1975. “The Meaning of ‘Meaning.’” Minnesota Studies in the Philosophy of Science 7: 131–193.
  - Argues for semantic externalism (the claim that the meaning of a term does not determine its extension, which means that the meanings of a word are not determined by the psychological state the speaker is in, but by external factors) using the Twin Earth thought experiment.
- Quine, W. V. O. 1948. “On What There Is.” In From A Logical Point of View, 1953, 1–19. Cambridge: Harvard University Press.
  - Proposes that ontological commitments can be read off statements or scientific theories by formalizing them in predicate logic and identifying bound variables.
- Quine, W. V. O. 1951. “Two Dogmas of Empiricism.” In From A Logical Point of View, 1953, 20–46. Cambridge: Harvard University Press.
  - The two dogmas Quine argues against are: (i) that there is a clear distinction between analytically true and synthetically true sentences, and, (ii), that each meaningful sentence faces the tribunal of sense experience on its own for its verification or falsification (rather than holistically in concert with other sentences).
- Roberts, J. 2008. The Law-Governed Universe. Oxford: Oxford University Press.
  - Introduces the measurability account of laws of nature, which states that lawhood is a role that propositions play rather than a property of facts and that laws guarantee the reliability of methods of measuring natural quantities.
- Salmon, W. 1984. Scientific Explanation and the Causal Structure of the World. Princeton: Princeton University Press.
  - Develops a causal/mechanical account of explanation which incorporates the idea that causation is best considered a process.
- Salmon, W. 1994. “Causality without Counterfactuals.” Philosophy of Science 61: 297–312.
  - Agrees with Dowe’s improvement of Salmon’s 1984 theory and also proposes a transfer or conserved quantity theory of causation.
- Scholz, Oliver R. 2018. “Induktive Metaphysik – Ein vergessenes Kapitel der Metaphysikgeschichte.” In Philosophische Sprache zwischen Tradition und Innovation, edited by D. Hommen and D. Sölch. Frankfurt am Main: Peter Lang.
  - Describes and analyses the historical programme of inductive metaphysics which developed simultaneously with Logical Empiricism.
- Schrenk, M. 2017. Metaphysics of Science: A Systematic and Historical Introduction. London: Routledge.
  - Comprehensive, easily accessible systematic and historical introduction to Metaphysics of Science including the topics of dispositions, counterfactuals, laws of nature, causation, and dispositional essentialism, as well as information on the origins and methodology of Metaphysics of Science.
- Schurz, G. 2016. “Patterns of Abductive Inference.” In Springer Handbook of Model-Based Science, edited by L. Magnani and T. Bertolotti, 151–174. New York: Springer.
  - Analyses the structure of abductive inferences and recommends that metaphysics should make use of such inferences.
- Stalnaker, R. 1968. “A Theory of Conditionals.” American Philosophical Quarterly 2: 98–112.
  - Uses possible worlds semantics to analyze counterfactual conditionals without a commitment to possible worlds realism.
- Strawson, P.F. 1959. Individuals: An Essay in Descriptive Metaphysics. New York: Routledge.
  - Distinguishes between descriptive and revisionary metaphysics and examines the relationship between our language and our habit of conceiving of the world in terms of individuals (particulars and persons).
- Tahko, T.E. 2015. An Introduction to Metametaphysics. Cambridge: Cambridge University Press.
  - Comprehensive and easily accessible introduction to 20th century and current debates about the methodology and epistemology of metaphysics.
- Tooley, M. 1977. “The Nature of Laws.” Canadian Journal of Philosophy 7: 667–698.
  - Argues that the relations between universals are truth-makers for laws of nature.
- Williamson, Timothy. 2016. “Abductive Philosophy.” Philosophical Forum 47 (3–4): 263–280.
  - Recommends both ampliative inferences such as abductions (or, nearly synonymous, inferences to the best explanation) and model-building as valuable methodologies not only for the sciences but also for philosophy and metaphysics.
- Woodward, J. 1992. “Realism about Laws.” Erkenntnis 36: 181–218.
  - Defends the view that the notion of lawfulness is linked to the notion of invariance rather than the notion of necessary connection.
- Woodward, J. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.
  - Proposes an interventionist theory of causation that analyses causation by appealing to the notion of intervention or manipulation.

Author Information

Julia F. Göhner
Heinrich Heine University
Dusseldorf, Germany

and

Markus Schrenk
Email: markus.schrenk@phil.uni-duesseldorf.de
Heinrich Heine University
Dusseldorf, Germany

Language in Classical Chinese Philosophy

At first glance, early Chinese thought as expressed in Warring States period (475–221 BCE) texts does not seem to focus on the kinds of questions about language that one might expect from philosophers working on “the philosophy of language.” This does not mean, however, that language is philosophically insignificant to early Chinese thinkers. Rather, it shows that discussions of language in these texts are part of early Chinese authors’ engagement with a larger set of philosophical problems, particularly the problem of self-cultivation. Here, “self-cultivation” means a set of generalized practices directed toward the goal of moral action, focusing on the development of a set of virtues and norms as they relate to the individual as well as progressively higher units of social organization. Although positions on self-cultivation differ widely across strands of early Chinese thought, a common goal of all competing traditions is the rehabilitation of human conduct. Discourse about appropriate “models” (fa 法) for such rehabilitation – whether they be concrete tools, exemplary individuals or abstract ideas – is found in all early Chinese philosophical texts. This, then, raises the issue of language: how does the sage (shengren 聖人), as one who has successfully mastered exercises of self-cultivation and thus furnishes us with the requisite fa, speak? Or, as some traditions ask, does the sage speak at all? Do words promote or impede an individual’s development, and is the sage’s insight an ineffable experience or is it one that can, and should, be articulated for the benefit of others? Thus, the problem of self-cultivation functions as a stage for various other intersecting inquiries into human nature, the relation between human feelings and thought or judgment, the ideal social and political organization, and the relation between the human subject and the larger processes of nature and the cosmos, among other topics.
Discussions of the linguistic dimensions of sagehood then generate other questions about language: How do words relate to psychological states? Is language a constitutive element of human nature, or is it a conventional practice that stands in a particular orientation to a naturally given state? Is language inherently tied to the incidence of social and political chaos, or is it a technology that can be used to institute order? This entry offers a brief overview of how inquiries concerning language are developed in classical Confucian, Mohist, and Daoist writings.

Key Terms and Problems
Speech (yan 言) as Virtuous Conduct (xing 行) in the Analects
Language and Self-Cultivation in the Mencius
Zhengming 正名 in the Xunzi
The Mohist Canons
‘Not Speaking’ in the Daodejing
‘Goblet Words’ in the Zhuangzi
Additional Trends
References and Further Reading

1. Key Terms and Problems

Contemporary debates on language in Chinese philosophy, in the analytic tradition, have been determined to a large extent by the research of Graham (1989, 1978) and Hansen (1983) on the linguistic models displayed in the Mohist Canons. Harbsmeier (1989b, 1991), Mou (1999), Fraser (2007) and Robins (2000) represent a selection of scholars who have extended the inquiry into the grammatical and syntactical structures in the Canons by further developing some of the central theses put forward by Graham and Hansen, such as those concerning the use of word-types (like mass-nouns) and structures of predication. An enduring premise in this approach is the clear distinction between language (variously construed as speech/yan 言 and names/ming 名) and the reality (shi 實, literally, ‘objects’ or ‘solids’) with which it shares a formal, representational relationship.

Another trend in inquiries concerning language involves a less formal approach, replacing the focus on referential structures with an analysis that identifies language as part of an embodied, empirical model of experience. Geaney (2010, 2002), for instance, argues that conceptions of language in early China cannot be grasped without appreciating the larger perceptual index of sight and sound of which ‘names’ and ‘speech’ are a constitutive element. Wagner (2003) similarly underscores how conceptions of ming in early linguistic models (like that of Wang Bi) define speech in terms of aurality, with ‘names’ being understood as meaningful units of sound. Lewis (1999) calls for situating language somewhere between a purely oral, and thus aural, dimension and a written technology that serves as a more robust medium for recording and articulating judgments.

Alternate directions in the literature display a different set of concerns, foregrounding the socio-political applications of a theory of language. In this latter approach, conceptions of language are often perceived as being coextensive with a conception of culture. We find, as a result, numerous schools attempting to furnish an account of how culture is to be distinguished from a natural state, and how ‘names’ or ‘speech’ fit in relative to this distinction. Multiple accounts of this distinction (as oppositional, as a continuum, or as unconnected) lead to diverse possibilities for conceiving language as a spectrum that displays a naturalist bias at one extreme and a social normative agenda at the other.

Whether we choose to capture the discussions of language in classical Chinese philosophy with a referential model that focuses on predicate logic, a perception-based model of the senses, or a more expansive understanding of language as a socio-political technology, a basic vocabulary emerges across a wide selection of texts that ties the question of language to the larger problem of how one can know the world and provide an articulate judgment of one’s experience in it. Early Chinese accounts of language are intimately bound up with how one discriminates (bian 辨) one thing from another, categorizing the world accordingly in terms of what ‘is so’ (shi 是) and what is ‘not so’ (fei 非). This dialectical capacity for division separates things on both the descriptive and the normative registers, and thus built into the ascription of something as ‘so’ is the clear sense that it ought to be so. Chris Fraser describes these dual senses of the distinction between shi and fei as follows:

They [shi and fei] apply both to the descriptive, empirical question of whether or not something is a certain kind of thing and the normative question of whether some action or practice is morally right or wrong. In effect, shi and fei refer to a very basic, general normative status that does not distinguish between the different flavors of correctness and error implicated in describing, commanding, recommending, permitting, or choosing . . . Because of their normative use, they are seen as inherently evaluative terms with action-guiding force. In ethical contexts, this feature is obvious, as shi-fei distinctions articulate values. Even in nonethical contexts, however, the attitude of deeming something shi or fei is regarded as action guiding.

A recurrent theme that we accordingly encounter in pre-Han texts concerns the relation of names (ming) to how one discriminates and orders one’s categories. What is a name (ming) in relation to that which is so (shi)? Is the negation of a thing by pointing to what it is not (fei) the opposite of a given name in that context? And how does the normative dimension of the model of bian affect the use of names along distinctions between shi and fei? As we see later in the article, these are all problems concerning language and epistemology that emerge as points of contention between the various competing schools of classical China.

2. Speech (yan 言) as Virtuous Conduct (xing 行) in the Analects

Concerns with language in Confucius’ Analects come to rest squarely within the text’s overarching composition of a program of self-cultivation. Names (ming 名) and the activity of speaking (yan 言), broadly construed in both a nominal and verbal sense, therefore do not present the reader with the kind of problematic that requires establishing a logical relation between mental content (as determining the ‘meaning’ of a word) and the world as a given, objective correlate. Rather, the salient question the text repeatedly poses is how to use words and speak in general such that one’s linguistic comportment can coincide with one’s character as a virtuous person. A direct consequence of aligning the question of language along these lines is to be seen in frequent discussions in the Analects where both the style of a person’s speech (its elocutionary attributes, such as tempo and diction) and its content emerge as useful measures of moral development. The Master is thus concerned with whether one’s words are sincere (xin 信) and unequivocally identifies “clever or cunning speech” (qiao yan 巧言) (Analects, 1.3) with the absence of ren or virtue, as it is broadly construed in this text.

There exists in the Analects, then, no sense of the inherent value of words as signifiers of an external reality. Rather, language is analyzed as a philosophical problem only in relation to the viability of a virtue-based ethics, and its efficacy is to be judged in its successful subordination to, and implementation of, a model of virtuous conduct (xing 行) (see Analects, 9.24). At one end of this spectrum, the Master invokes the rhetorically powerful example of the Ancients, who remain silent out of fear that their actions will not match their words (see 4.22). But of more use is the model of the ‘nobleperson’ or junzi 君子, who displays a flawless calibration of words to action. Scattered across the text, the majority of discussions regarding the nature and use of words comes to settle on the need to emulate the linguistic perspicuity exhibited by this ideal type. The junzi speak with sincerity (xin 信) (see 1.7) and their use of language is repeatedly described as careful (see 1.14), slow (12.3) and always bound by the larger concerns with virtuous conduct (2.13, 4.24).

The capacity to undermine the Confucian art of self-cultivation through a gross misuse of language emerges as a necessary corollary to the conceptual bind the text forges between one’s speech (yan 言) and conduct (xing 行). While all who are virtuous speak in accordance with their character, it is not the case that all who speak are necessarily virtuous (see 14.4). Language can then serve equally as a marker of both moral health and moral decrepitude. It is this basic observation that underlies a central Confucian conviction that the health of a society, and its apex political institutions, can be achieved through the practice of zhengming 正名, or ‘correcting names.’ While this is an overt concern and stated objective in the Xunzi, the Analects underscores the important role that zhengming plays in a famous passage that links socio-political disorder ultimately with a state of linguistic disorder (see 13.3). If names (ming 名) in their specific designation refer not to discrete objective correlates (‘son,’ ‘father’ as neutral, discrete units) but rather to how one must act in relation to the roles associated with such names (to ‘be a son,’ to ‘be a father’), then a state of linguistic disorder is one in which the designation of behavioral norms implied in the use of names no longer works or implies a failure of these norms. Where the performative designations of our names are not properly understood, socio-political chaos must necessarily reign. The Analects thus points in the direction of a prescriptive theory of language in its brief formulation of a program of zhengming, which involves the rehabilitation of such a compromised language and its social and political ill-effects.

3. Language and Self-Cultivation in the Mencius

In the Mencius, the Confucian program of self-cultivation is given further conceptual depth to the extent that a more robust metaphysics of human nature (ren xing 人性) anchors the entire project. The text organizes its discussions of language with particular attention to its overriding concerns with the nature and development of the heart (xin 心) and the attainment of a kind of moral animation in the human subject, which it describes in Mencius 2A2 as having a “flood-like qi” (hao ran zhi qi 浩然之氣). In other words, the imperative in the Mencius is not simply to secure a complementary organization of language (yan 言) and virtuous conduct (xing 行), as we have seen in the Analects. The text adds depth to this generalized formulation of language by integrating the question of how to use words with its more intricate moral psychology of the heart and human nature. One appreciates the implications of this move in the naturalized status that extends to language itself. For instance, Mencius 4A15 establishes a parity between certain basic physical attributes, like the pupils of a person’s eyes, and the kind of language that person speaks. Crucially, these attributes (one anatomical, another linguistic) function as potent markers of a more fundamental moral signature of human nature. Thus, if the inherently moral capacities of being human are to be realized, the text points to both one’s pupils and one’s words as the natural markers of moral development.

The position the Mencius takes on the status and role of language is, however, not so straightforward if we consider two basic paradigms in the text that bring everything into moral orientation. The first of these models is that of the ‘nobleperson’ or junzi 君子, who is able to grow the “four (moral) sprouts” (si duan 四端) of the heart and successfully master the virtuous conducts of benevolence (ren 仁), ritual propriety (li 禮), righteousness (yi 義) and knowledge (zhi 知). Such a perfected moral state, while it manifests in the junzi’s physical comportment, remains wordless (bu yan 不言, Mencius 7A21). The second is found at the cosmological level, where the text is emphatic about the silence of Heaven (tian 天), whose commandments, which remain unarticulated, can be gleaned only from the evidence of the King’s conduct and the people’s acceptance (see Mencius 5A5).

However, it is between the poles of silence and grandiose speech that the Mencius affirms the efficacy and value of language. While it describes the junzi as effecting a wordless practice, the text simultaneously upholds speech that is simple and concise (compare Mencius 7B32, 4B15). The overarching framework of ren xing, furthermore, supplies the authors with a standard for truth or genuineness such that speech that complements the natural development of virtuous conduct is positively upheld as corresponding with the reality (shi 實) of things (Mencius 4B17). A corollary to a genuine/natural language is the potentially false modality of speech, and the Mencius explicitly participates in this arbitration between truth and falsity by rejecting what it terms “one-sided” and “perverse” speech (see Mencius 2A2, 3B9). Here we are presented with an important dimension to the linguistic philosophy of the Mencius in its thematization of the activity of disputation, or bian 辯, a dialectical framework of language characterized by the eristic exchanges between various parties to a debate. Words in this context admit of being either true or false, and the text explicitly stakes its claims by rendering the principles of competing schools, like those of Yang Zhu and Mo Di, as “one-sided” and “perverse.” Yet, measures of truth and falsity in the Mencius, it bears repeating, do not function in relation to an objective, neutral external world. Rather, the performative dimension of self-cultivation remains the basic conceptual frame. To speak truly and genuinely, in a way that corresponds to the reality of things, implies that such words are distinguished primarily by their virtuous quality. The perversity of the speech of adversaries, like Yang Zhu and Mo Di, is a problem precisely because of the potential of such misguided language to draw society down into a bestial condition, where the genuine principles of benevolence and righteousness are nowhere to be seen (Mencius 3B9).

4. Zhengming 正名 in the Xunzi

Xunzi’s philosophy revolves around the central premise that one’s humanity can be successfully shaped only through concerted effort within the institutional frameworks of education and ritual. A conceptual locus in the text is accordingly represented by the concept of wei 偽, ‘deliberate effort,’ a model of virtuous conduct that involves the concerted implementation of institutionally mandated practices. Xunzi’s often cited constructivism is thus to be distinguished from the Mencian belief in the continuity between nature (xing 性) and institutions, the latter being mechanisms by means of which natural dispositions, as positive traits already present in an individual, can be fully actualized. Nature and nurture for Xunzi are not complementary as they are for Mencius, and the former’s claim that “human nature is evil” (xing e 性惡) implies that the work of nurture is a focused undoing or rectification of a naturally undesirable configuration of elements in an individual. The notion of wei 偽 therefore implies a concerted level of intervention in natural processes and patterns, denoting an activity that is distinguished by its levels of artifice rather than spontaneity.

Xunzi’s concern with establishing right order, then, does not extend to achieving a harmonious state prescribed in nature, but instead refers to appropriately functioning conventions of society and politics. It is within this overall context of assumptions regarding nature and the institutions that are necessary for a society’s ordered existence that the question of language proves to be of pivotal importance in the text. Names (ming 名) in the Xunzi are a technology through which the undesirable traits of human nature can be both expressed and curtailed. As the text states, names have neither “innate appropriateness” (gu yi 固宜), nor do they admit to any “intrinsic reality” (gu shi 固實). Yet, there are those which are “intrinsically good” ([ming you] gu shan 名有固善). Xunzi thus frees language from any problematic tie with nature since words share no constitutive bond with xing 性, a state that, in turn, is described as “evil,” e 惡. At the same time, however, they are potential markers of virtuous conduct, and it is successfully utilizing this potential of language to rehabilitate society that constitutes a central aim of the text.

The chapter entitled Zheng Ming 正名, “Correcting Names,” details the Xunzi’s intricate treatment of language in both its calamitous and its remedial versions. The text begins by attributing a significant source of disorder in society to a particular linguistic condition, which it associates with a series of flawed acts like “splitting names,” “making up new names,” and “throwing into disorder established names.” What comes in for censure here is, in essence, the relativism of standards provoked by the competing theories of the Mohists and other camps like the School of Names (Ming jia 名家). The text diagnoses as deplorable a situation in which each school articulates a ‘name’ for itself, evaluating and discriminating reality on the basis of a set of purely subjective observations. One’s ability to understand and negotiate reality (shi 實), according to the Xunzi, depends on the quality of the names or ming 名 (broadly construed to include categories and distinctions) that one makes in language. Where numerous distinctions crowd around the same reality (be it an object, a relation, a character, a role, and so forth), the designation between ming 名 and shi 實 breaks down to result in chaos and confusion.

How, then, does one go about “correcting names”? The text upholds its Confucian commitment to tradition, adapting its conservatism, however, to the specific task of rehabilitating the linguistic standards perfected and fixed by the previous generations of kings. These are the “common names” (san ming 散名), which exhibit a clarity of designation between ‘names’ and ‘reality’ that must be modeled if the disorder that prevails in society is to be corrected. The Xunzi elaborates a nuanced framework to explain this positive linguistic model, explaining the origin of ‘correct’ names in relation to other aspects of an individual’s physical, psychological and epistemic experience, and, in this respect, arguably makes its most significant contribution regarding questions of language. What the sage, like the true kings of the past, is able to successfully identify is the evolution of a given experience through its various stages of development: starting with the elemental origins in the senses; the psychological shaping of such sensory stimuli in feelings/dispositions or qing 情; and the overall understanding or knowledge (zhi 知) of the heart that is able to make sense of and correctly judge the entire process as it unfolds. Sages display a mastery over this entire psycho-physical complex, and their acute zhi 知 enables them to identify which things involve a similar sensory experience and evoke corresponding, similar dispositions, and which things must be accordingly distinguished as generating divergent stimuli and responses. This perspicacity leads to the correct designations in language, where each set of names exhibits a careful sorting of accumulated sensory and psychological data with the constant inflow of new experiences. It is this sorting activity at the level of names that constitutes, in the most rudimentary sense, the deliberate effort (wei 偽) that the Xunzi praises in the work of sages and the larger institutional frameworks of education and ritual. 
The implementation of zheng ming obviates the proliferation of multiple standards and classes of things by which people can judge their reality. To ‘correct names,’ then, is, first and foremost, to safeguard a society from the scourge of relativism. The text accordingly recommends that the king regulate definitions of names so that his citizens clearly understand the meanings and referents of words that are in use. Ming 名 and shi 實 are thereby harmonized, such that the relations between words and their referents are made plainly manifest and are agreed upon in the social and political conventions through which language is put to use. Zhengming is thus primarily about the social and political benefits to be gained from using language in a particular mode. As the text affirms in its advice to kings, correcting names equips the people with a unified intention and enables them, ultimately, to follow the law. This is the only path to good and successful governance.

5. The Mohist Canons

The short tracts of text that comprise the Mohist Canons as well as the longer work of the Mozi offer a series of dense statements on the nature of language. The Canons in particular put forward a theoretical framework that establishes standards for making true statements and engaging in clear and effective communication. As scholars have often suggested, the Canons are remarkable for the technical nature of their discussions on names (ming), on the relation between names and the reality of objects (shi), and on the epistemic status of our language. Yet, there is an unmistakable sense that the text remains bound to the narrow objective of establishing a sound theory of language for the purposes of defining the basic tenets of Mohist doctrine. A general frame for these inquiries into the nature and the proper use of names is therefore the model of ‘debate’ or bian, which is explicitly thematized in the Canons as the guiding activity through which the proper dao (as envisioned by the Mohists) can be codified and defended. The text defines bian as “contending over claims which are the converse of each other” and continues to state that “winning in disputation is fitting the fact.” Claims which are, in a bian-type exchange, the “converse” of each other are, as we have already seen, the dichotomy of claiming one thing to be so (shi) and another to be not-so (fei). The Mohist is emphatic on the factual nature of this distinction, explicitly marking out the categories of shi and fei as either fitting with reality or not, and the Canons equip the practitioner with the requisite tools and knowledge with which to master this art of discrimination and to articulate the true and correct picture of Mohist doctrine.

We should thus read the Canons as, first and foremost, a text that expounds a dialectical model equipping a speaker to clearly distinguish what is so or right (shi) from what is not so or wrong (fei). As a manual of argumentation or debate (bian), it accordingly inquires into the fundamental laws governing names (ming 名) and their referencing of objects/reality (shi), and discusses more complex problems surrounding the nature of evidence in arguments, the relation between sentences and a speaker’s thoughts, the uses of analogy, and the methods of illustrating, matching, adducing, and inferring (to name but a few of the themes covered).

At the heart of the diverse discussions on language in the Canons lies what Angus Graham has called a “radically nominalist approach to naming.” Such a model does not admit a premise of essences at work in language, whereby a name for a thing might be understood as referencing a core, defining idea that transcends all particular instantiations. To categorize something as ‘this’ or as ‘so’ (shi), and to extend that category to a ming or ‘name,’ is to simply pick out one thing among others and identify it as what it is called. “[T]here is no ‘essence’,” as Graham suggests, “merely the existence (you 有) of the thing with all its properties.”

The nominalism of the Canons does not, however, commit the Mohist to a relativistic view on truth or to a skepticism regarding the epistemic status of names. A central objective of the text in this respect is the identification of the correct procedures for relating names to objects so that language can be used consistently and correctly. The Canons thus articulate a larger epistemological framework by presenting specific sources of knowledge and identifying specific objects of knowledge that allow for a more structured and nuanced discussion of how names are engendered and the various orders of meaning they convey. Knowledge (zhi 知) can be obtained “by hearsay [report], by explanation, and by personal experience [observation]” and its specific objects are “names (ming), objects (shi), how to relate [an object to a name], and how to act.” We find here a basic set of premises shared by the Confucians—namely, that distinguishing between objects using names, and being able to successfully apply the correct names (that is, relate names to objects) produces knowledge and has the effect of guiding one’s actions. Yet, while the Confucian paradigm, as we have seen it on display in Analects 13.3 and in the Xunzi, sets about rectifying the reality of behavior and conduct so as to rehabilitate the correct norms codified in language, the Mohist Canons are emphatic on the need to grasp the act of naming itself. The name (ming), in other words, functions as a definition of the thing (shi), and in doing so denotes its reality.

At the heart of the Canons, then, lies a basic set of premises regarding how to discriminate between the names for various things based on more subtle distinctions between the various kinds or classes of names and referents. Thus, for example, Canon A78 identifies three classes of ming that align with the kinds of referents they point to:

Names. Unrestricted; Classifying; Private.

‘Thing’ (wu 物) is ‘unrestricted’; any object necessarily requires this name. Naming something ‘horse’ is ‘classifying’; for ‘like the object’ we necessarily use this name. Naming someone ‘Jack’ is ‘private’; this name stays confined in this object.

Unrestricted (da 達) names, covering the largest class or kind, have a general scope of designation (like the name, thing/wu 物); then there are class (lei 類) names, which refer to particular kinds/classes of things and are thus limited in scope (like horse/ma 馬); finally, there are personal or private (si 私) names, which are singular in reference (like a proper noun, Jack). That this typology functions on the basis of an underlying ontology of sameness and difference is evident in the logic which drives us from using one type of name to another. Between the word ‘thing’ and ‘horse,’ we have separated out members and distinguished one kind of ‘thing’ from others with which it does not share defining traits. A horse is not a hammer, and thus can be distinguished by a name that marks both its difference from other things (hammers) and its sameness with others (other horses). The Canons appear to take for granted the idea that the reality of objects (shi 實) is divided along such natural classes of sameness and difference, and names, as definitions of this reality, correspond to and express these divisions of classes as given facts that are observable in one’s experience.

The act of speaking (yan 言), then, is a dynamic composite of naming, where a directed intention on the part of the speaker to convey some idea or thought (yi 意) leads to an explicit choice of naming in relation to reality. This act of referring (ju 舉) is an integral moment of the speech act, which the Canons define as “picking out an object from among others by means of its name.” To refer, furthermore, “is to present the analogue for the object” and every reference therefore is an act of setting up an “archetype” (ni 擬) which the chosen name evokes as a meaningful standard (fa 法). Speaking (yan 言) is described as an “emergence of references” (chu ju 出舉), a linking up of various names that evoke models or archetypes that all speakers possess. Thus, in addition to the premise that there are different kinds of names (based upon sameness and difference, for example), the Canons also appear to assume the role that convention plays through mutually agreed upon standards or archetypical referents for the names shared among a linguistic community.

6. ‘Not Speaking’ in the Daodejing

The canonical texts of early Daoism also question the role and status of language in relation to an ideal of self-cultivation that is set up as a prime objective to be achieved. However, in sharp contrast to the constructivist tendencies of Confucian discourses, texts like the Daodejing and Zhuangzi explicitly reject the idea that language can be optimally regulated in and through institutional frameworks and conventional practices. There is, moreover, a thoroughgoing suspicion that pervades these texts regarding the value of language in general, and we repeatedly encounter the claim that linguistic expression, in its very constitution, is ridden with epistemic poverty (insofar as words do not attain any true standards for knowledge). This leads to a more extreme position, often cited by scholars in both the Daodejing and Zhuangzi, that rejects language, as such, as a medium of expression. Harmonization with dao, the focus of self-cultivation, is thus understood to be a distinctly extra-linguistic experience.

The Daodejing makes its case for the ineffable quality of a practice of self-cultivation by describing the sage repeatedly as one who does not speak. Daodejing 56 emphasizes in this regard the inversely proportional relation between knowledge and speaking, where “one who understands [dao] does not speak” and one who has no understanding whatsoever has much to say (zhi zhe bu yan, yan zhe bu zhi 知者不言，言者不知). As a categorical rebuke of the Confucian faith in institutional practice and of the conceptual locus established by the notion of deliberate effort (wei 偽) in texts like the Xunzi, the Daodejing extols the model of “non (or non-coerced) action” (wu wei 無為). Sages, in other words, must abandon the strictures that come down by way of conventional standards, habits, cultures of education, and other institutionalized patterns of behavior and conduct. Acting without acting, then, is to divest oneself of the social mores that, in a Confucian practice, are pivotal to the successful implementation of a program of self-cultivation. The text appears to suggest that such sagacity entails a termination of speech, as we learn in Daodejing 2, which describes how sages who excel in the affairs of non-action “practice the teaching that is without words” (xing bu yan zhi jiao 行不言之教).

And yet, the irony, if not the outright contradiction, of an argument that claims the inadequacy of language that is itself put in words is not lost on the authors of the Daodejing. To use language to extol a condition that appears, on the face of it, to be extra-linguistic therefore suggests a more nuanced perspective that these authors hold. We find, for instance, an additional set of claims in the text that uphold a certain kind of speech, and which positively describe words of the sage that mirror the spontaneous patterns of the dao. The ontology captured by the character ziran 自然, the ‘self-so-ing’ essence of dao that manifests in diverse cycles of change and natural progression, finds expression in a particular modality of speech in which words match the fluidity of nature. Rather than a state of complete and total aphasia (the speechlessness that, for example, defines the Pyrrhonian skeptic), the art of wuwei involves a perspicuous and measured operation of language. The Daodejing does in fact describe positive linguistic traits to be modeled, like words that are “trustworthy” (信, Daodejing 8) and that are “lacking in that which can be blamed” ([善言]無瑕讁, Daodejing 27). The text even identifies certain standards by which the reliability of speech can be judged, stating in Daodejing 81, for instance, that “trustworthy words are not beautiful” (信言不美). The sage who acts without acting, then, also speaks without speaking. As a linguistic complement to its model of wuwei, the Daodejing, rather than eliding language completely from its agenda, recommends a certain modulation of speech whereby the errors in how we utilize language might be removed and its potential to express the patterns of dao might be affirmed.

7. ‘Goblet Words’ in the Zhuangzi

While it retains the core themes of the Daodejing, the Zhuangzi elevates its criticism of Confucian and Mohist discourse and dismantles, in a spectacular fashion, the fundamental structures of dialectical speech that underlie both philosophical positions. The authors of the Inner Chapters (Neipian 內篇) build, in this respect, an elaborate critique of argumentation [or disputation] (bian 辯), a genre of thinking and speaking that is defined by eristic speech, which, as we have seen, pivots on the choice of arguing for one alternative over another. The Qiwulun, the second of the Inner Chapters, evaluates the tenability of the basic kind of dialectical exchange that it associates with the debates of the Confucians and Mohists, where each party argues for its set of claims as true and as constituting a body of knowledge, and correspondingly associates the opposing party’s claims with falsity. The linguistic structure underpinning all such eristic speech is represented by the clear distinction between a positive ascription of what is the case (conveyed by the character shi 是) and a negative attribution using the character fei 非 to reference all that is not. In sharp disagreement with the linguistic models of texts like the Mozi and the Mohist Canons, the Zhuangzi associates this dichotomy of shi-fei claims (of what is and is not so, of what is right and wrong) with a vocabulary of artifice and inflexibility.

夫道未始有封，言未始有常，為是而有畛也。

The way has never had borders; speech has never had any regularity. Make claims about what is so, or what is right, and there are boundaries.

The method of defining what is so, as we read here, consists literally in a making of a definition (conveyed by the characters wei shi 為是), where the artifice of a fixed category stands in direct contrast to the processual nature of experience that is dao. Furthermore, dividing language in terms of strict labels, standards or categories continually eludes the reality of dao and only serves to delude an individual with false standards for knowledge. Bian 辯, owing to the very nature of sophistical speech, therefore endlessly carries on and, as per the diagnosis of the Zhuangzi, serves only to wear out the heart-mind (xin 心).

Yet, in analogous fashion to the Daodejing, the Zhuangzi does not recommend an indiscriminate abandoning of all speech. The exemplary model of the sage not only speaks, but does so in a language that, in fact, occasionally spills into the genre of dialectics.

物無非彼，物無非是。自彼則不見，自知則知之。故曰：彼出於是，是亦因彼。

Of things, there are none that are not ‘that’ (bi 彼); of things there are none that are not ‘this’ (shi 是); One cannot see a thing if one approaches it as ‘that,’ one knows it as ‘this’ only as it is known to oneself. Thus it is said: ‘That’ emerges from ‘this,’ ‘this’ follows from ‘that.’

. . . 為是不用而寓諸庸…因是已。已而不知其然，謂之道。

[The sage] does not use a [fixed] definition of what is the case (wei shi 為是) but instead lodges it in the usual . . . This is to judge what is so on a given basis (yin shi 因是) and stop. Stopping without knowing (bu zhi 不知) it to be so, this is called dao.

Unlike the rhetorical ploys and logic-chopping inherent to the activity of bian 辯, the generation of categories in the sage’s dialectic is fluid and perpetually under revision. A key insight in the Zhuangzi thus relates to the inescapability of linguistic expression and the corresponding need to constantly modulate our categories so they can adapt to shifting perspectives and contexts.

The text articulates this positively appraised framework of language using the metaphor of “goblet words” (zhiyan 卮言), a class of speech that is set apart from the ordinary use of language. While the latter functions through a stable matrix of ascriptions and designations between words and reality, the image of the goblet serves the purpose here of emphasizing a thorough dynamism in the way that words can be deployed. Like a goblet that continually overflows only to be filled again with water, the Zhuangzi conceives of a transformative speech that similarly ‘overflows’ each act of categorization or definition. Language, in such a figuration, enables a speaker to express multiple possibilities of experience, and it takes on a varied and rich descriptive quality that, as the text states, “harmonizes with the natural” (he yi tian ni 和以天倪). In sharp contrast to the Confucian agenda of zhengming, which strives toward instituting a catalog of names deemed to be singular and fixed in their denotations, the goblet language of the Zhuangzi is forever under revision, accumulating ever more shades and textures to our names so they may correspond to the self-so-ing (ziran 自然) ontology of dao.

8. Additional Trends

There are of course additional texts and trends, both in pre-Han Chinese literature and in later literary traditions, that further illuminate the line of inquiry that has been introduced here. One body of work that offers ample opportunity for further research is the corpus of excavated materials that has yet to receive an in-depth treatment focusing on the themes and problems of language. Two texts, the Tai Yi Sheng Shui 《太一生水》 and Heng Xian 《恆先》, for example, identify a set of positions on names (ming) as part of larger cosmogonic models. In the case of the Tai Yi Sheng Shui text, the problem of naming is specifically related to a cosmogonic account in which an underlying structure of binary pairings governs the nature and use of names. The text articulates the question of language, in other words, in relation to an account of genesis, and the potential of names (ming 名) is rendered in their ability to either maintain or upset a generative structure that is understood to subtend all things. This imbrication of cosmogony and language, moreover, points explicitly to the role of cultivation that we have identified as deeply connected to the question of language in classical Chinese accounts. The regenerative logic of the cosmogonic account, when it is replicated at the level of language, endows the speaker with the ability to bring harmony to the realm of human endeavors and to aid in the cultivation of one’s person. The Tai Yi Sheng Shui resorts to the familiar model of sages, and presents them as figures who utilize cosmogonic principles of regeneration and rebirth by appropriately wielding the ‘name’ of dao. In doing so, the text explicitly praises them for achieving the completion of affairs (shi 事) and the cultivation of their persons (shen 身).

The Heng Xian seems to offer an alternative account in which the organizing conceptual frame is the ontological division between being or presence (you 有) and non-being or absence (wu 無). ‘Names,’ in this binary account, are endowed with a mediating role between a conscious, coercive activity and a complete absence of the same. The text articulates this middle ground through the creative notion of names and accompanying “endeavors” (shi 事) that “become (or happen) of themselves” (zi wei 自為).

This article has offered but one perspective on the treatment of language in classical Chinese texts, foregrounding the intersection of concepts of language and the larger concern with cultivation practices. Numerous possibilities for thinking about the nature of language emerge along a spectrum where speech is rendered, at one end, as a natural disposition, or, at the other, as an artificial construct that must be calibrated to achieve a desired state at the individual as well as communal levels. Irrespective of a bias toward naturalism or constructivism, a recurring theme emerges in the figure of the sage or shengren who supplies each of the schools with a model or fa 法 for how language should ideally be deployed. The excavated literature adds further diversity to this conversation, offering another iteration of the sage who appears to borrow from both the Confucian as well as Daoist theories of language and their corresponding models of sagacity.

9. References and Further Reading

Allan, Sarah. 2003. “The Great One, Water, and the Laozi: New Light from Guodian.” T’oung Pao 89 (4/5):237–285.
Boltz, William. 1985. “Desultory Notes on Language and Semantics in Ancient China.” Journal of the American Oriental Society 105 (2):309–313.
Brindley, Erica F. 2013. “The Cosmos as Creative Mind: Spontaneous Arising, Generating, and Creating in the Heng Xian.” Dao 12 (2):189–206.
Fraser, Chris. 2007. “Language and ontology in early Chinese thought.” Philosophy East and West 57 (4):420–456.
Fraser, Chris. 2016. The Philosophy of the Mòzĭ: The First Consequentialists: Columbia University Press.
Geaney, Jane. 2002. On the Epistemology of the Senses in Early Chinese Thought: University of Hawaii Press.
Geaney, Jane. 2010. “Grounding “language” in the senses: What the eyes and ears reveal about Ming 名 (names) in early chinese texts.” Philosophy East and West 60 (2):251–293.
Graham, Angus C. 1978. Later Mohist Logic, Ethics, and Science: Chinese University Press.
Graham, Angus C. 1989. Disputers of the Tao: Philosophical argument in ancient China: Open Court La Salle, Ill.
Hall, David L., and Roger T. Ames. 1987. Thinking Through Confucius: State University of New York Press.
Hansen, Chad. 1983. Language and Logic in Ancient China: University of Michigan Press.
Harbsmeier, Christoph. 1989a. “The Classical Chinese Modal Particle Yi.” In Proceedings of the Second International Conference on Sinology, 471–503. Academia Sinica.
Harbsmeier, Christoph. 1989b. “Marginalia Sino-Logica.” In Understanding the Chinese Mind: The Philosophical Roots, edited by Robert E. Allinson, 59–83. Oxford.
Harbsmeier, Christoph. 1991. “The mass noun hypothesis and the part-whole analysis of the White Horse Dialogue.” In Chinese Texts and Philosophical Contexts: Essays Dedicated to Angus C. Graham, 49–66. Open Court.
Hutton, E.L. 2014. Xunzi: The Complete Text: Princeton University Press.
Kjellberg, Paul. 2007. “Dao and Skepticism.” Dao 6 (3):281–299.
Lewis, Mark E. 1999. Writing and Authority in Early China: State University of New York Press.
Loy, Hui-chieh. 2003. “Analects 13.3 and the Doctrine of “Correcting Names”.” Monumenta Serica 51:19–36.
Mou, Bo. 1999. “The structure of the Chinese language and ontological insights: a collective-noun hypothesis.” Philosophy East and West:45–62.
Perkins, F. 2014. Heaven and Earth Are Not Humane: The Problem of Evil in Classical Chinese Philosophy: Indiana University Press.
Robins, Dan. 2000. “Mass nouns and count nouns in classical Chinese.” Early China 25:147–184.
Wagner, R. G. 2003. Language, Ontology, and Political Philosophy in China: Wang Bi’s Scholarly Exploration of the Dark (Xuanxue): State University of New York Press.
Yearley, Lee H. 2005. “Daoist Presentation and Persuasion: Wandering among Zhuangzi’s Kinds of Language.” Journal of Religious Ethics 33 (3):503–535.
Zhuangzi. 1956. Zhuangzi Yinde (A Concordance to Chuang Tzu), Harvard-Yenching Institute Sinological Index Series. Cambridge MA: Harvard University Press.

Author Information

Rohan Sikri
Email: rsikri@uga.edu
University of Georgia
U. S. A.

Plato: Meno

Plato’s Meno introduces aspects of Socratic ethics and Platonic epistemology in a fictional dialogue that is set among important political events and cultural concerns in the last years of Socrates’ life. It begins as an abrupt, prepackaged debater’s challenge from Meno about whether virtue can be taught, and quickly becomes an open and inconclusive search for the essence of this elusive “virtue,” or human goodness in general. This inquiry exhibits typical features of the Socratic method of elenchus, or refutation by cross-examination, and it employs typical criteria for the notoriously difficult goal of Socratic definitions. But then a distinctive objection to the possibility of learning anything at all by such inquiry prompts the introduction of characteristically Platonic themes of immortality, mathematics, and a “recollection” of knowledge not learned by experience in this life. A model geometry lesson with an uneducated slave is supposed to illustrate the importance of being aware of our own ignorance, the nature of proper education, the difference between knowledge and true belief, and the possibility of learning things without being taught. When the conversation returns to Meno’s initial question of whether virtue can be taught, Socrates introduces another manner of investigation, a method of “hypotheses,” by which he argues that virtue must be some kind of knowledge, and so it must be something that’s taught. But then Socrates also argues to the contrary that since virtue is never actually taught, it seems not to be knowledge after all.

This dialogue portrays aspects of Socratic ignorance and Socratic irony while it enacts his twofold mission of exposing common arrogant pretensions and pursuing a philosophical knowledge of virtue that no one ever seems to have. It is pervaded with typical Socratic and Platonic criticisms of how, in spite of people’s constant talk of virtue, they value things like wealth and power more than wisdom and justice. And it includes a tense confrontation with one of the men who will bring Socrates to trial on charges of corrupting young minds with dangerous teachings about morality and religion. The dialogue closes with the surprising suggestion that virtue as practiced in our world both depends on true belief rather than knowledge and is received as some kind of divine gift.

Overview of the Dialogue
Major Themes of the Dialogue
Relations of the Meno to Other Platonic Dialogues
References and Further Reading

1. Overview of the Dialogue

a. Dramatic Setting

The Meno is a philosophical fiction, based on real people who took part in important historical events. Plato wrote it probably about 385 B.C.E., and placed it dramatically in 402 B.C.E. Socrates was then about sixty-seven years old, and had long been famous for his difficult questions about virtue and knowledge. In just a few years, he would be convicted and executed for the crime of corrupting the youth of Athens. This dialogue probably takes place in one of Athens’ gymnasia, where men and boys of leisure gathered not just for exercise, but also for education and socializing. Socrates often conducted his distinctive philosophical conversations in places like that, and ambitious young men like Meno, who studied public speaking and the hot intellectual topics of the times, wanted to hear what Socrates had to say. Some wanted to try refuting him in public.

The larger setting is the political and social crisis at the end of the long Peloponnesian War. After finally being defeated by Sparta, Athens has narrowly escaped total destruction, and is now ruled by a Spartan-backed oligarchy. The questions in the Meno about teaching virtue are directly related to longstanding tensions between oligarchic and democratic factions. For generations, Athens had been an intellectual, economic, and military leader, especially after her crucial role, together with Sparta, in repelling the Persian invasions of Greece in 490 B.C.E. and 480 B.C.E. Athens’ radically democratic form of government was distinctive but influential in typically oligarchic Greece, and influential largely because she presided over the Delian League of nearly 200 city-states, which became an Athenian empire. After those Persian invasions, many independent cities had asked Athens to replace Sparta in leading a united defense and reprisal against the Persian empire. But eventually most were just supplying mandated funds to Athens, basically for the continuation of Athens’ war against Sparta’s Peloponnesian League. Through many reversals of fortune, Athens both suffered greatly and flourished culturally, using some of that tribute for her own development and adornment. Much of the best Greek art still familiar to us today (the sculpture and architecture, the tragedy and comedy) comes from the Athens of that time. Artists and intellectuals flocked to Athens, including the new kind of traveling teachers, called “sophists,” who are so disparaged in the last part of the Meno. These teachers were independent entrepreneurs, competing with each other and providing an early form of higher education. Much of their influence came through their expensive courses in public speaking, which in Athens prepared young men of old aristocratic families for success in democratic politics. But various sophists also taught various other subjects, from mathematics to anthropology to literary criticism.

Shortly before this dialogue takes place, some leading Spartans and allies considered killing all the Athenian men and enslaving the women and children. But they decided instead to support a takeover by a brutal, narrow oligarchy, led by thirty members of aristocratic Athenian families who were unhappy with the democracy. Their executions, expropriations, and expulsions earned them the hatred of most Athenians; later “the Thirty” became known as “the Thirty Tyrants.” The extremists among them first purged their more obvious enemies, then turned to the moderates who resisted their cruelty and wanted a broader oligarchy or restricted democracy that included the thousands in the middle class. Thousands of Athenians were killed or fled the city, and many who stayed acquiesced in fear for their lives. But supporters of a return to democracy soon rallied outside the city, defeating the Thirty’s army in May 403 B.C.E. The conversation in the Meno takes place in late January or early February 402 B.C.E. (after Anytus’ return from exile in 403 B.C.E., before Meno’s departure for Persia by early 401 B.C.E., and shortly before annual rites of initiation to the religious Mysteries, which are mentioned at Meno 76e). Democratic and oligarchic factions might then still have been negotiating terms of reconciliation in order to prevent further civil war. The resulting agreement included a general amnesty for crimes committed up to that time, excluding only the Thirty and a few other officials. But the last of the extreme oligarchs would soon massacre the nearby town of Eleusis and take power there, and then attempt another takeover at Athens in 401 B.C.E., before finally being put down for good.

As Meno and Socrates discuss the nature of virtue and how it might be acquired, the Athenian success story is not over. The democracy would continue for most of the next century, and even a semblance of the empire would be revived. But for now, the recently restored democracy is anxious about continuing class conflict, and fearful of renewed civil war. Some democrats were suspicious of Socrates, and may have believed that he had sided with the extreme oligarchs, because of his prior relationships with some of them. The general amnesty did not allow prosecuting such allegations. But after the war, Socrates continued his uniquely nondemocratic yet anti-elitist, unconventional yet anti-sophistic interrogations. Many Athenians thought that he was undermining traditional morality and piety, and thereby corrupting the young minds of a vulnerable community. Those were the formal charges that led to Socrates’ execution in 399 B.C.E.

b. Characters

i. Socrates

About the historical Socrates, much of what we think we know is drawn from what Plato wrote about him. Socrates published nothing himself, but, probably soon after his death, the Socratic dialogue was born as a new genre of literature. He was portrayed with different emphases by different authors, including Xenophon, Aeschines, Antisthenes, Phaedo, Euclides, and others. But what interests most people about Socrates today comes from Plato’s philosophical portraits. Even these Platonic portraits vary somewhat across his many dialogues, but all are similar in one way or another to what we see in the Meno. Generally, Plato’s Socrates focuses his inquiries on moral subjects, and he will discuss them with anyone who is interested. He claims not to know the answers to his questions, and he interrogates others who do claim to know those answers. He seeks definitions of virtues like courage, moderation, justice, and piety, and often he suggests that each virtue, or virtue as a whole, is really some kind of knowledge.

As Plato depicts Socrates, it was not easy to understand his position in either the politics or the controversial new teachings of the time. Many of his contemporaries, like Meno and Anytus in this dialogue, probably could not distinguish his kinds of questions from other “arts of words” practiced by other intellectuals or “sophists.” But Plato often has Socrates criticizing sophists for claiming to teach more than they knew, and he emphasizes that, by contrast, Socrates never claimed to be a teacher, never accepted fees for his conversations, never sought wealth or political power, and always pursued subjects related to seeking the real nature of virtue.

To make matters more confusing, a few of the Thirty Tyrants or their extremist supporters, like Critias and Charmides, had earlier been associates of Socrates. But again, Socrates’ position in the conflict is not obvious. While he criticized democracy generally for putting power in the hands of an unwise and fickle majority, he never advocated rule by the wealthy either, and certainly not any of the Thirty’s cruel deeds. Plato emphasizes that Socrates respected common citizens more than the famous and powerful (Apology 21b-22e), and that he disobeyed direct orders from the Thirty, at risk to his own life (32cd). Socrates generally advocates humility and justice above all (for example, Apology 20cff, 29dff, Crito 49aff), and he specifically refutes and chastises Charmides and Critias in Plato’s Charmides.

ii. Meno

Meno is apparently visiting the newly restored Athenian government to request aid for his family, one of the ruling aristocracies in Thessaly, in northern Greece, that was currently facing new power struggles there. Meno’s family had previously been such help to Athens against Sparta that his grandfather (also named Meno) was granted Athenian citizenship. We do not know what resulted from Meno’s mission to Athens, but we do know that he soon left Greece to serve as a commander of mercenary troops for Cyrus of Persia—in what turned out to be Cyrus’ attempt to overthrow his brother, King Artaxerxes II.

Meno was young for such a position, about twenty years old, but he was a favorite of the powerful Aristippus, a fellow aristocrat who had borrowed thousands of troops from Cyrus for those power struggles in Thessaly, and was now returning many of them. The contemporary historian Xenophon (who also wrote Socratic dialogues) survived Cyrus’ failed campaign, and he wrote an account whose description of Meno resonates with Plato’s portrait here: ambitious yet lazy for the hard work of doing things properly, and motivated by desire for wealth and power while easily forgetting friendship and justice. But Xenophon paints Meno as a thoroughly selfish and unscrupulous schemer, while Plato sketches him as a potentially dangerous, overly confident young man who has begun to tread the path of arrogance. His natural talents and his privileged but unphilosophical education are not guided by wisdom or even patience, and he prefers “good things” like money over genuine understanding and moral virtue. In this dialogue, Plato imagines Meno encountering Socrates shortly before that disastrous Persian adventure, when he has not yet proved himself to be the “scoundrel” and “tyrant” that Socrates suspects and Xenophon later confirms. According to Xenophon, when Cyrus was killed and his other commanders were quickly beheaded by the King’s men, Meno was separated and tortured at length before being killed, because of his special treachery (see Xenophon’s Anabasis II, 6).

iii. Anytus

Anytus is a prominent Athenian politician and Meno’s host in Athens. He too was wealthy, not in Meno’s old aristocratic way, but as heir to the successful tannery of a self-made businessman. Anytus is passionately opposed to those sophists who thrived in Athens’ democracy and claimed to teach virtue along with so many other things. He prefers the more traditional assumption that good gentlemen learn goodness not from professional teachers but by association with the previous generation of good gentlemen. (That was a traditional aristocratic notion, but it has a democratic shape at Meno 92e, Apology 24d ff., and Protagoras 325c ff.) Although Plato was not a fan of most sophists either, he portrays Anytus’ attitude as clearly prejudicial. And though Socrates is no professional teacher, Anytus considers him just as bad, or worse. Anytus is one of three men who will bring Socrates to trial in 399 B.C.E.

Anytus had himself been prosecuted in 409 B.C.E., for failure as a general in the war against Sparta, and allegedly he escaped punishment by bribing the jury. Later, he supported the moderate faction among the Thirty Tyrants, and was banished by the extremists. Then he was a general for the democratic forces in the fight to overthrow the Thirty in 403 B.C.E., and he quickly became a leading politician in the restored democracy. In the Meno, Socrates presses Anytus about why so many of Athens’ leading statesmen have failed to teach even their own sons to be good, and Anytus could probably see that these questions apply to himself. Xenophon’s Apology of Socrates, which is rather different from Plato’s, suggests that Anytus had a personal grudge against Socrates, since Socrates had criticized Anytus’ education of his own son, and predicted that he would turn out to be no good. But Anytus may well have sincerely believed that Socrates corrupted young men like Critias and Charmides by teaching them to question good traditions. At any rate, Socrates’ questions about education in the Meno upset Anytus enough to warn Socrates to desist, or risk getting hurt—thus foreshadowing Anytus’ role in Socrates’ trial. (Compare Meno 94e f. and 99e f. with Apology 23a-24a and 30cd.)

c. Summary of Arguments, in Three Main Stages

There are three main parts to this dialogue, which are three main stages in the argumentation that leads to the tentative conclusion about how virtue is acquired.

The dialogue opens with Meno’s challenge to Socrates about how “virtue” (aretê) is achieved. Is it something that is taught, or acquired through training, or possessed by nature? Socrates quickly turns the discussion into an investigation of something more basic, namely, what such virtue is. Since Socrates denies knowing the nature of virtue, while Meno confidently claims to know all about it, Socrates gets Meno to try defining it. Most of this third of the dialogue is then an extended series of arguments against Meno’s three attempts to define virtue. We see the famous “Socratic Method,” in which Socrates refutes someone’s claim to knowledge by revealing that one of their claims is contradicted by others that they also believe to be true. For example, Meno’s initial claim that there are irreducibly different virtues for different kinds of people (71e) is incompatible with his implicit belief (elicited by Socrates) that virtues cannot be different insofar as they are virtues. And Meno’s definition of virtue as the ability to rule over others (73d) is incompatible with his agreements that a successful definition of virtue must apply to all cases of virtue (so including those of children and slaves) and only to cases of virtue (so excluding cases of unjust rule). In each case, since Meno accepts these claims that contradict his proposed definitions, he is shown not to know what he thought he knew about virtue. As Socrates three times exposes the inadequacies of Meno’s attempted definitions, giving examples and guidelines for further practice, Meno’s enthusiasm gives way to reluctance and frustration. Eventually, Meno blames Socrates for his trouble, and insults Socrates by comparing him with the ugly, numbing stingray. Then he makes a momentous objection to conducting such an inquiry at all.

The second stage of the dialogue begins with that momentous, twofold objection: if someone does not already know what virtue is, how could he even look for it, and how could he even recognize it if he were to happen upon it? Socrates replies by reformulating that objection as a paradoxical dilemma, then arguing that the dilemma is based on a false dichotomy. The dilemma is that we cannot learn either what we know or what we do not know, because there is no need to learn what we already know, and we cannot recognize what we do not yet know. Socrates tries to expose the false dichotomy by identifying states of cognition between complete knowledge and pure ignorance. First, he introduces a notion that the human soul has learned in previous lives, and suggests that learning is therefore possible by remembering what has been known but forgotten. (Forgotten-but-capable-of-being-remembered is a state of cognition between complete knowledge and pure ignorance.) Then he tries to illustrate this “theory of recollection” with the example of a geometry lesson, in which Socrates refutes a slave’s incorrect answers much as he had refuted Meno, and then leads him to recognize that the correct answer is implied by his own prior true beliefs. (Implicit true belief is another state of cognition between complete knowledge and pure ignorance.) After the geometry lesson, Socrates briefly reinterprets the alleged “recollection” in a way that can be taken as the discovery of some kind of innate knowledge, or innate ideas or beliefs. Meno finds Socrates’ explanation somehow compelling, but puzzling. Socrates says he will not vouch for the details, but recommends it as encouraging us to work hard at learning what we do not now know. He asks Meno to join him again in a search for the definition of virtue.

But in the third stage of the dialogue, Meno nonetheless resists, and asks Socrates instead to answer his initial question: is virtue something that is taught, or is it acquired in some other way? Socrates criticizes Meno for still wanting to know how virtue is acquired without first understanding what it is. But he agrees, reluctantly, to examine whether virtue is something that is taught by way of “hypotheses” about what sorts of things are taught, and about what sorts of things are good. Here Socrates leads Meno to two opposed conclusions. First, he argues, on the hypothesis that virtue is necessarily good, that it must be some kind of knowledge, and therefore must be something that is taught. But then he argues, from the fact that no one does seem to teach virtue, that virtue is not after all something that is taught, and therefore must not be knowledge. This is where Anytus arrives and enters the discussion: he too objects to the sophists who claim to teach virtue for pay, and asserts that any good gentleman can teach young men to be good in the normal course of life. But then Anytus cannot explain Socrates’ long list of counterexamples: famous Athenians who were widely considered virtuous, but who did not teach their virtue even to their own sons. When Anytus withdraws from the conversation in anger, Socrates reminds Meno that sometimes people’s actions are guided not by knowledge but by mere true belief, which has not been “tied down by working out the reason.” He provisionally concludes that when people act virtuously, it is not by knowledge but by true belief, which they receive not by teaching but by some kind of divine gift. But then Socrates warns again that they will not really learn how virtue is acquired until they first figure out what virtue itself is.

2. Major Themes of the Dialogue

a. Virtue and Knowledge

In this whole inconclusive conversation, the most important Socratic proposal is that “virtue” (aretê in Greek) must be some kind of knowledge. But a crucial fact about the dialogue is that this central subject matter, while obviously very important, remains elusive from beginning to end. When Meno asks how aretê is acquired, Socrates denies knowing what aretê really is. Meno thinks he knows what aretê is, but he is soon surprised to find that he cannot define it. As they work at the definition, alleged examples of aretê range from political power to good taste and from justice to getting lots of money. At first, Meno wants to deny that all aretai share some common nature, but he quickly becomes ambivalent about that. Eventually, Socrates seems to persuade him that the essence of aretê must be some kind of knowledge, but then this provisional conclusion gives way under the observation that what they are looking for is apparently never actually taught. In closing, Socrates reminds Meno that their confusion about whether aretê is taught is a result of their confusion about the nature of aretê itself.

So what sort of thing is this aretê that they are trying to understand? Much of ancient Greek literature shows that aretê was a central ideal and basic motivator throughout the culture. The stylized heroes of Homer’s legendary Trojan war and the real soldiers of their own contemporary campaigns, the athletes at the Olympic games and the orators in political debates: all of these, whether they fought for survival or retribution or the common good, were also seeking honor from their peers for aretê. Both the importance and the vagueness of the term are expressed in Socrates’ question to Anytus:

Meno has been telling me for some time, Anytus, that he desires the kind of wisdom and aretê by which people manage their households and cities well, and take care of their parents, and know how to receive and send off fellow-citizens and foreign guests as a good man should. To whom should we send him for this aretê? (91a)

The standard English translations of aretê are “excellence” and “virtue.” “Excellence” reminds us that the ancient concept applies to all of the above and even to some admirable qualities in nonhuman things, like the speed of a good horse, the sharpness of a good knife, and the fertility of good farmland. But “virtue” too is sometimes still used that way, when we speak of the virtues of the plan or the brand that we prefer. And “excellence” is rather weak and abstract for the focus of these Socratic dialogues, which is something people spent a lot of time thinking and worrying about. Intellectuals debated how it is acquired; politicians knew they had to speak persuasively about it; and Socrates himself considered it the most important thing in life. In our dialogue, Meno keeps thinking of aretê in terms of ruling others and acquiring honor or wealth, while Socrates keeps reminding him that aretê must also include things like justice and moderation (73a, d, 78d), industriousness (81d, 86b), and self-control: “rule yourself,” he says, “so that you may be free” (86d). In this connection, it is often said that Greek ethical thinking evolved from a focus on competitive virtues like courage and strength to a greater appreciation of cooperative virtues like justice and fairness. But this could be at most a shift of emphasis, since even Homer’s epics of war and adventure celebrate pity and humility, justice and self-control. So it may help to think of our dialogue as asking how we can acquire “virtue” in the very general sense of human goodness or human greatness. Like Meno, most of us think we already know what “being a good person” or “being a great person” is like, but we would be stumped if we had to define it. The whole range of examples used in this dialogue would be relevant. And Socrates’ basic suggestion, that “being good and great” requires some important kind of knowledge, would seem both attractive and puzzling.

A further reason for the inconclusiveness of the Meno is the inherent difficulty of providing the kind of definition that Socrates seeks. He was notorious for always seeking and always failing to identify the essences of things like justice, piety, courage, and moderation. A successful definition in Socrates’ sense does not just state how a given word is used, or identify examples, or stipulate a special meaning for a given context. A Socratic definition is supposed to reveal the essence of a unitary concept or a type of real thing. Such a definition would specify not just any qualities that are common to that kind of thing, but the qualities that make them be the kind of thing they are. Other characters in Plato’s dialogues usually have difficulty understanding what Socrates is asking for; in fact, the historical Socrates may have been the first person to be rigorous about such definitions. The task is more difficult than it first seems, even for things like shape and color (see 75b-76e); it is even harder to accomplish for something like virtue. The first third of our dialogue takes the time to show that Meno’s list of examples will not do, because it does not reveal what is common to them all and makes them be virtue while other things are not (72a ff.); and that this kind of explanation must apply to all relevant cases (73d) and only to relevant cases (78d-e); and that something cannot be so explained in terms of itself or related terms that are still matters of dispute (79a-e). At the beginning of the dialogue, Meno did not know even how to begin looking for the one essence of all virtue that would enable us to understand things like how it is achieved. Socrates shows him these guidelines, and tries to get him to practice. But while Socrates clearly knows more than Meno about how to investigate the essence of virtue, he has not been able to discover exactly what it is.

Socrates is drawn to the idea that the essence of all virtue is some kind of knowledge. In the last third of the dialogue, when Meno will not try again to define virtue, Socrates introduces and explores his own suspicion in terms of the following “hypothesis”: if virtue is taught then it is knowledge, and if it is knowledge then it is taught, but not otherwise. This line is pursued with the further “firm hypothesis” that virtue must always be a good thing. Socrates argues that only knowledge is necessarily good, and the goodness or badness of everything else depends on whether it is directed by knowledge. The conclusion of this hypothetical investigation would be that virtue is taught because it is some kind of knowledge—and the argument to that effect requires the rejection of Meno’s constant preference for “good things” like wealth and power (78c-d, 87e-89a). But what kind of knowledge? Or what kind of wisdom? In this discussion, Socrates uses a variety of Greek knowledge-terms, combining epistêmê, phronêsis, and nous as if they were interchangeable. The cumulative meaning ranges from knowledge and intelligence to understanding and wisdom. Clearly, what Socrates is looking for would be not just theoretical knowledge but some kind of practical wisdom, a knowledge that can properly direct our behavior and our use of material things. But this dialogue gets no further than arguing that virtue is some sort of wisdom, “in whole or in part” (89a). And then Socrates introduces a reason for reconsidering even that: it seems that such wisdom is never taught.

b. Recollection and Innate Ideas

A surprising interpretation of knowledge occurs in the middle third of the Meno, when Socrates suggests that real learning is a special kind of remembering. Meno’s frustration in trying to define virtue had led him to object:

But in what way will you look for it, Socrates, this thing that you don’t know at all what it is? What sort of thing, among the things you don’t know, will you propose to look for? Or even if you should meet right up against it, how will you know that this is the thing you didn’t know? (80d)

Is Meno here honestly identifying a practical difficulty with this particular kind of inquiry, where the participants now seem not to know even what they are looking for? Or is he just throwing up an abstract, defensive obstacle, so that he does not have to keep trying? Socrates interprets Meno’s objection in the obstructionist way, and reformulates it as a paradoxical theoretical dilemma:

Do you see what a contentious debater’s argument you’re bringing up—that it seems impossible for a person to seek either what he knows or what he doesn’t know? He couldn’t seek what he knows, because he knows it, and there’s no need for him to seek it. Nor could he seek what he doesn’t know, because he doesn’t know what to look for. (80e)

This reformulation of Meno’s objection has come to be known as “Meno’s Paradox.” It is Plato’s first occasion for introducing his notorious “theory of recollection,” which is an early example of what would later be called a theory of innate ideas.

The notion that learning is recollection is supposed to show that learning is possible in spite of Meno’s objection: we can learn by inquiry, because we can begin in a state of neither complete knowledge nor pure ignorance. To understand what Plato intends with his sketchy theory, we should compare the initial statement of the idea (81a-e), the alleged illustration of it (82a-85b), and the restatement of it after the illustration (85b-86b). According to the initial statement, all souls have already learned everything in many former lives, and learning in this life is therefore a matter of remembering what was once known but is now forgotten. But this is apparently an attention-grabber, dubiously citing unnamed priests and poets, who are just the kind of people Socrates later criticizes for having intermittent true beliefs rather than stable knowledge about their subjects (99c-d). Meno is in fact intrigued, and when he asks for a demonstration, Socrates illustrates by cleverly leading an uneducated slave to the correct answer to a geometrical problem—and doing so by “only asking questions” and eliciting the correct answer from the slave himself. Here, Socrates clearly asks “leading questions,” and eventually even shows the slave the answer in the form of a question (84e). But more important is the fact that he legitimately helps the slave to work out the reasoning, and thereby see the way in which the unexpected answer was implied by other true beliefs that he already had. So the geometry lesson successfully demonstrates some of the beauty of Socratic education, and the power of deductive reasoning in learning. That is enough to refute Meno’s Paradox, which inferred the impossibility of learning from a false dichotomy between complete knowledge and pure ignorance.

But the geometry lesson with the slave clearly does not demonstrate the reminding of something that was learned in a previous life. So it is important to notice that Socrates partly restates the “theory of recollection” after the geometry lesson. This time he concludes not that the slave has remembered some geometrical knowledge from what his mind had learned from experiences in previous lives, but instead that the slave has discovered the relevant true beliefs in his mind, which is somehow “always in a state of having learned” (86a). In the context, that “always” does seem to include many lifetimes, though it could in principle refer just to however long the mind has existed, perhaps since some point of development in the womb. In any case, the phrase “always in a state of having learned” is unusual and striking. If a mind could always be in a state of having learned something, then there would be no point at which it learned that thing. This paradoxical phrasing turns the initial statement of the theory of recollection, which stretched a common-sense notion of learning from experience over a number of successive lifetimes, into the beginnings of a theory of innate ideas, because the geometrical beliefs or concepts somehow belong to the mind at all times. Near this point in the dialogue, Socrates also states that after employing such ideas to elicit the relevant true beliefs, more work is still required for converting them to knowledge (85c-d). Later in the conversation, Socrates even seems to identify “recollection” with this latter part of the process (98a).

Some philosophers and experimental psychologists today agree that basic mathematical concepts, and the beliefs implicit in them (along with many others), are innate—not as an eternal possession of an immortal soul, but as a universal and specialized human capacity determined in part by biological evolution. So in a sense, Socrates’ conclusion that something of “the truth about reality” is “always in our minds” (86b) is even roughly compatible with modern science. The Meno does not end up specifying just what kind of innate resources enable genuine learning about geometry or virtue: Socrates infers from the geometry lesson both that the slave had innate knowledge (85d), and that he had innate beliefs that can be converted to knowledge (85c, 86a), but the dialogue ends with an agreement that “men have neither of these by nature, neither knowledge nor true belief” (98c-d). In fact, while Plato seems quite serious about the idea that genuine learning requires discovering knowledge for ourselves on the basis of our innate resources, he has Socrates disclaim confidence about any details of the theory in this dialogue (86b-c).

c. Teaching and Learning

According to Socrates, the practical purpose of the theory of recollection is to make Meno eager to learn without a teacher (81e-82a, 86b-c). It seems that Meno is used to thinking of learning as just hearing and remembering what others say, and he objects to continuing the inquiry into the nature of virtue with Socrates precisely because neither of them already knows what it is (80d). The geometry lesson shows that we can learn things we do not yet know (at least what we do not yet consciously and explicitly know) if they are entailed by other things that we know or correctly believe. And Socrates emphatically alleges that when the slave becomes aware of his own ignorance, he properly desires to overcome it by learning; this too is supposed to be an object lesson for Meno (84a-d). But Meno does not learn this lesson. Instead of desiring to inquire into the real nature of virtue, he asks to hear Socrates’ answer to his initial question about how virtue is acquired. He asks again whether virtue is something that is taught, and once again he wants to be taught about this just by being told (86c-d; compare 70a, 75b, 76a-b, 76d).

This time Socrates apparently relents, but he warns that the rest of their discussion will be compromised by a flawed approach. At least he gets Meno to follow him in a self-consciously “hypothetical” approach—a kind of method that he claims to borrow from mathematicians, who use it when they cannot prove more securely what they want to prove. He illustrates with a geometrical hypothesis that is notoriously obscure, but the corresponding hypothesis about virtue seems to be this: if virtue is something that is taught, then it is a kind of knowledge, and if it is a kind of knowledge, then it is something that is taught (87b-c). Next, Socrates offers an independent argument (based on a different hypothesis) that virtue must in fact be some kind of knowledge, because virtue is necessarily good and beneficial, and only knowledge could be necessarily good and beneficial. Together with the hypothesis that knowledge and only knowledge is taught, Socrates would have proved that virtue is something that is taught.

But there is something wrong with the hypothesis that all and only knowledge is taught. Surely much of what is taught is just opinion, and surely some knowledge is learned on one’s own, without a teacher. In fact, one main point of the theory of recollection and the geometry lesson was that real learning requires active inquiry and discovery from one’s own resources, which include some form of innate knowledge. Even if Socrates did “teach” the geometry lesson in a Socratic way, by leading the slave to the answer with the right questions, nonetheless he showed that while he could in some sense just show the slave the answer, he could not successfully give him knowledge or understanding. That requires working out the explanation for oneself (82d, 83d, 84b-c, 85c-d; compare 98a). This whole lesson was conducted in order to encourage Meno to try learning what virtue is, when he does not have a teacher to tell him what it is (81e-82a, 86c).

So why would Socrates use the faulty hypothesis that knowledge and only knowledge is taught, when it contradicts his notion of recollection and his model geometry lesson? Perhaps because, in effect, it is really Meno’s own hypothesis, as his opening questions and his behavior throughout the dialogue persistently imply. Meno’s opening set of questions substitutes “learned” for “taught” as if they were the same thing (Is virtue taught? Or is it trained? Or is it neither learned nor trained…). And then he just wants to hear Socrates’ answers, and keeps resisting the hard work of definition that Socrates keeps encouraging. When Meno resists yet again after the theory of recollection and the geometry lesson (86c), Socrates cleverly investigates this hypothesis, implicit in Meno’s behavior, to redirect Meno’s attention from his question about how virtue is acquired (Is it taught?) back to the unanswered question of what virtue is (Is it knowledge?). So Socrates could be quite serious in his lengthy argument that virtue must be some kind of knowledge (87c-89a), while reluctantly making use of the unsupported hypothesis that knowledge must be taught because, in effect, Meno insists upon it. Meno refuses to pursue knowledge of virtue the hard way, and he thinks that what he hears about virtue the easy way is knowledge.

After persuading Meno to take seriously his own favorite notion—that virtue is achieved through some kind of knowledge, rather than through wealth and political power—Socrates endeavors to convince Meno that learning just by hearing from others does not provide real knowledge or real virtue. Meno’s host Anytus now arrives at just the right moment, since Anytus is passionately opposed to the sophists who claim to teach wisdom and virtue with their traveling lectures and verbal displays. Anytus believes that virtue can be learned instead by spending time with any good gentleman of Athens, but Socrates shows that this view is superficial, too. He gathers well-known examples of allegedly virtuous men who did not teach their virtue even to their own children, which indicates that virtue is not something that is taught. Anytus departs in annoyance at Socrates’ seemingly dismissive treatment of Athens’ political heroes, so Socrates continues the issue with Meno. He reminds Meno that even professional teachers and good men themselves disagree about whether virtue can be taught. The closing pages argue that if their earlier hypothesis was true, and “people are taught nothing but knowledge,” then since virtue is not taught, virtue would not be knowledge. Socrates suggests that perhaps it could be correct belief instead. Correct belief can direct our behavior well, too, though not nearly as reliably as knowledge.

In this final portion of the dialogue, Socrates twice again asks Meno whether “if there are no teachers, there are no learners.” And Meno keeps affirming it, though no longer with full confidence: “I think … So it seems … if we have examined this correctly” (96c-d). Meno’s challenge to Socrates in the opening lines of the dialogue had used the terms “learned” and “taught” interchangeably. In the meantime, Socrates’ notion of learning as “recollection” indicates that knowledge requires much more than verbal instruction. As Socrates says to Anytus:

For some time we have been examining … whether virtue is something that’s taught. To that end we are asking whether good men past or present know how to bestow on another this virtue which makes them good, or whether it just isn’t something a man can give or receive from another. (93a-b)

Meno’s assumption that knowledge must be taught, and taught by mere verbal instruction, prevents a fuller investigation in this dialogue of Socrates’ hope that virtue is a kind of knowledge.

d. Theory and Practice

And what about Socrates: does he teach virtue in the Meno? He offers a theory that “there is no teaching but recollection” (82a). But what about his practice? Isn’t Socrates trying to teach Meno, by leading him to a correct definition of virtue, as he led Meno’s slave to the correct answer in the geometry lesson?

Rather, Socrates’ practice in the geometry lesson actually goes pretty well with his theory that there is no teaching, because his leading questions there require that the slave think through the deduction of the answer from what he already knew. And Socrates finishes by emphasizing that real knowledge of the answer requires working out the explanation for oneself. So even if a “teacher” can show the answer, he cannot give the understanding. The understanding requires active inquiry and discovery for oneself, based on innate mental resources and a genuine desire to learn. Whatever else might prove true or false about the notion that learning is a kind of recollection, these practical implications are what Socrates insists upon.

On behalf of the rest of the theory, I wouldn’t much insist. But we’ll be better men, braver and less lazy, if we believe that we must search for the things we don’t know, rather than if we believe that it’s not possible to find out what we don’t know, and that we must not search for it—this I would fight for very much, so long as I’m able, both in theory and in practice. (86b-c)

The practical side of learning as recollection applies no less in Socrates’ interactions with Meno. Socrates tries leading Meno to desire real knowledge of what virtue is rather than just collecting others’ opinions about how it is acquired, and tries to get him to practice active inquiry and discovery of the truth for himself, starting from his own basic and sincere beliefs about virtue. Meno’s moral education would call for all of that even if Socrates could tell him what the essence of virtue is, which he claims he cannot do.

Active Socratic inquiry requires humble hard work on the part of all learners: practice in the sense of the personal effort and training that properly develops natural ability. Socrates’ efforts to guide Meno throughout the dialogue indicate that achieving the wisdom that is virtue would require both the right kind of natural abilities and the right kind of training or practice—so that teaching can help if it is not mere verbal instruction but discussions that help a learner to discover the knowledge for himself. That could be the whole dialogue’s answer to Meno’s opening challenge, which specifies three options:

Tell me if you can, Socrates: Is virtue something that’s taught? Or is it not taught, but trained? Or is it neither trained nor learned, but people get it by nature, or in some other way? (70a)

Some have argued that Plato mentions training in the opening lines only because it was one of the traditional options debated in his day. It seems to be tacitly dropped from the rest of the dialogue, and when Meno later revisits his opening challenge, he omits the option about training (86c-d). But if Meno forgets or deliberately avoids it, Socrates does not. When Meno starts to recognize his difficulties, Socrates encourages him to practice with definitions about shape (75a) and gives him a series of paradigms or examples to practice with (73e-77a); later, he criticizes Meno for refusing to do so (79a). At a number of points, Socrates draws attention to the kind of training and habits Meno has already received (70b, 76d, 82a). The geometry lesson, which is supposed to exhibit successful persistent inquiry in the face of previous failures, concludes with advice about the need to work through problems “many times in many ways” (85c) and with a repeated warning about intellectual laziness (86b). While the theory that learning is recollection suggests that an essential basis for wisdom and virtue is innate, Socrates also reminds Meno that any such basis in nature would still require development through experience (89b). When Anytus enters the discussion, his father is praised as a man who, unlike Anytus himself, did not receive his prosperity as a gift from his father, but earned it “by his own skill and hard work” (90a). And the combination of quotations from Theognis near the end of the dialogue suggests that virtue is learned not through verbal teaching alone, but through some kind of character-apprenticeship under the guidance of others who are already accomplished in virtue (95d ff.).

Socrates’ persistence in encouraging Meno to practice active inquiry points in the same direction as the sketchy theory of recollection: while the kind of wisdom that could be real virtue would require understanding the nature of virtue itself, it would not be achieved by being told the definition. And it would not be a theoretical understanding divorced from the practice of virtue. In fact, our dialogue as a whole shows that Meno will not acquire the wisdom that is virtue until after he already practices some measure of virtue: at least the kind of humility, courage, and industriousness that are necessary for genuine learning.

3. Relations of the Meno to Other Platonic Dialogues

We cannot be precise or certain about much in Plato’s writing career. The Meno seems to be philosophically transitional between rough groupings of dialogues that are often associated in allegedly chronological terms, though these groupings have been qualified and questioned in various ways. It is commonly thought that in the Meno we see Plato transitioning from (a) a presumably earlier group of especially “Socratic” dialogues, which defend Socrates’ ways of refuting unwarranted claims to knowledge and promoting intellectual humility, and so are largely inconclusive concerning virtue and knowledge, to (b) a presumably “middle” group of more constructively theoretical dialogues, which involve Plato’s famous metaphysics and epistemology of transcendent “Forms,” such as Justice itself, Equality itself, and Beauty or Goodness itself. (However, that second group of dialogues remains rather tentative and exploratory in its theories, and there is also (c) a presumably “late” group of dialogues that seems critical of the middle-period metaphysics, adopting somewhat different logical and linguistic methods in treating similar philosophical issues.) So the Meno begins with a typically unsuccessful Socratic search for a definition, providing some lessons about good definitions and exposing someone’s arrogance in thinking that he knows much more than he really knows. All of that resembles what we see in early dialogues like the Euthyphro, Laches, Charmides, and Lysis. But the style and substance of the Meno change somewhat with the formulation of Meno’s Paradox about the possibility of learning anything with such inquiries, which prompts Socrates to introduce the notions that the human soul is immortal, that genuine learning requires some form of innate knowledge, and that progress can be made with a kind of hypothetical method that is related to mathematical sciences. This cluster of Platonic concerns is variously developed in the Phaedo, Symposium, Republic, and Phaedrus, but in those dialogues, these concerns are combined with arguments concerning imperceptible, immaterial Forms, which are never mentioned in the Meno. Accordingly, many scholars believe that the Meno was written between those groups of dialogues, and probably about 385 B.C.E. That would be about seventeen years after the dramatic date of the dialogue, about fourteen years after the trial and execution of Socrates, and about the time that Plato founded his own school at the gymnasium called the Academy.

More specifically, significant relations of the Meno to other Platonic dialogues include the following.

The Meno is related by its dramatic setting to the famous series of dialogues that center on the historical indictment, trial, imprisonment, and death of Socrates (Euthyphro, Apology, Crito, and Phaedo). Anytus in the Meno will be one of the three men who prosecute Socrates, which is specifically foreshadowed in the Meno at 94e.

The failed attempt to define virtue as a whole in the Meno is much like the failed attempts in other dialogues to define particular virtues: piety in the Euthyphro, courage in the Laches, moderation in the Charmides, and justice in the first book of the Republic. (And two other dialogues attempt and fail to define terms that are related to virtue: friendship in the Lysis and beautiful/good/fine (to kalon) in the Hippias Major.) Those dialogues emphasize some of the same criteria for successful definitions as the Meno, including that it must apply to all and only relevant cases, and that it must identify the nature or essence of what is being defined. The Meno adds another criterion: that something may not be defined in terms of itself, or in related terms that are still subject to dispute.

One of Socrates’ arguments late in the Meno, that virtue probably cannot be taught because men who are widely considered virtuous have not taught it even to their own sons, is also used near the beginning of Plato’s Protagoras. But there it is countered by a long explanation from the sophist Protagoras of how virtue is in fact taught to everyone by everyone, not with definitions or by mere verbal instruction, but in a life-long training of human nature through imitation, storytelling, and rewards and punishments of many kinds. Socrates does not object to this theory of moral education (instead he objects to other parts of Protagoras’ account), and elements of it are included in the system of education outlined by Socrates in Plato’s Republic. But while Plato’s treatment of Protagoras’ theory of education in the Protagoras is fairly sympathetic, the Meno’s general disparagement of sophistic teaching is explored at length in Socrates’ debates with individual sophists in Plato’s Euthydemus, Gorgias, Hippias Minor, and Hippias Major.

The Meno’s geometry lesson with the slave, where success in learning some geometry is supposed to encourage serious inquiry about virtue, is one indication of Plato’s interest in relations between mathematical and moral education. In the Gorgias (named after a sophist or orator who is mentioned early in the Meno as one of Meno’s teachers), Socrates debates an ambitious young orator-politician who is drawn to a crass hedonism, and claims that his soul lacks good order because he neglects geometry, and so does not appreciate the ratios or proportions exhibited in the good order of nature. Book VII of the Republic describes a system of higher education designed for ideal rulers, which uses a graduated series of mathematical studies to prepare such rulers for philosophical dialectic and for eventually understanding the Form of Goodness itself. In this connection, Socrates’ introduction of a “hypothetical” method of inquiry, adopted from mathematics, is developed somewhat in the Phaedo and in Republic Book VI.

The notion of learning as recollection is revisited most conspicuously in Plato’s Phaedo (72e-76e) and Phaedrus (246a ff.), both of which associate it closely with theories of human immortality and eternal, transcendent Forms. The passage about recollection in the Phaedo even begins by alluding to the one in the Meno, but then it discusses recollection not of specific beliefs or propositions (like the theorem about doubling the square in the Meno), but of basic general concepts like Equality and Beauty, which Socrates argues cannot be learned from our experiences in this life. In the Phaedrus, recollection of such Forms is not argued for but asserted, in a rather suggestive and playful manner, as part of a myth-based story about the human soul’s journeys with gods, which is meant to convey the power of love in philosophical learning. Plato also explores other models of innate knowledge elsewhere, such as an innate mental pregnancy in the Symposium (206c-212b; compare Phaedrus 251a ff.) and an innate intellectual vision in the Republic (507a-509c, 518b ff.).

Author Information

Glenn Rawson
Email: grawson@ric.edu
Rhode Island College
U. S. A.

Liar Paradox

The Liar Paradox is an argument that arrives at a contradiction by reasoning about a Liar Sentence. The Classical Liar Sentence is the self-referential sentence:

This sentence is false.

It leads to the same difficulties as the sentence, I am lying. Experts in the field of philosophical logic have never agreed on the way out of the trouble despite 2,300 years of attention. Here is the trouble. It is a sketch of the Paradox, the argument that reveals the contradiction:

Let L be the Classical Liar Sentence. If L is true, then L is false. But the converse also can be established, as follows. Assume L is false. Because the Liar Sentence is just the sentence ‘L is false’, the Liar Sentence is therefore true, so L is true. What has now been shown is that L is true if, and only if, it is false. Since L must be one or the other, it is both.

That contradictory result apparently throws us into the lion’s den of semantic incoherence. The incoherence is due to the fact that, according to the rules of classical logic, anything follows from a contradiction, even 1 + 1 = 3. This article explores the details and implications of the principal ways out of the Paradox, that is, the ways of preserving or restoring semantic coherence.

Most people, when first encountering the Liar Paradox, react in one of two ways. One reaction is not to take the Paradox seriously and say they will not reason any more about it. This reaction is a weak one because it provides no useful diagnosis of the original problem of semantic incoherence. The second and more popular reaction is to say the Liar Sentence must be meaningless. This reaction is weak if it can answer the question, “Why is the Classical Liar Sentence meaningless?” only with the ad hoc remark that otherwise we get a paradox. An adequate solution should offer a more systematic treatment. For example, the sentence “This sentence is not in Italian” is very similar in structure to the Classical Liar Sentence. Is it meaningless, too? Apparently not. So, what feature of the Liar Sentence makes it meaningless while “This sentence is not in Italian” is not meaningless?

Is the Liar Paradox importantly different if one considers it to be about statements or propositions rather than sentences? The classical view of propositions is that a proposition is what a person uses a sentence to say, and that a proposition has its truth value independently of the sentence used to express it. So, one issue is whether it is important to start the Liar Paradox argument with this liar sentence:

What this sentence says is false.

instead of this one:

This sentence is false.

The questions about the Liar Paradox continue, and an adequate solution should address the questions formally or at least systematically.

History of the Paradox
Overview of Ways Out of the Paradox
Assessing the Five Ways Out
Conclusion
References and Further Reading

1. History of the Paradox

Zeno’s Paradoxes were discovered in the 5th century B.C.E., and the Liar Paradox was discovered later, in the middle of the 4th century B.C.E. Both were discovered in ancient Greece. The oldest attribution of the Liar Paradox is to Eubulides of Miletus, a contemporary of Aristotle, who included it among a list of seven puzzles. He said, “A man says that he is lying. Is what he says true or false?” Eubulides’ actual commentary on the Liar has not been found. An ancient gravestone on the Greek Island of Kos was reported by Athenaeus to contain this poem, which might be about the difficulty of solving the Paradox:

O Stranger: Philetas of Kos am I,

‘Twas the Liar who made me die,

And the bad nights caused thereby.

Aristotle first clearly described the principle that no sentence can be contradictory; see his Metaphysics Book IV, Chapter 3, 1005b, lines 6-34. Theophrastus, Aristotle’s successor, wrote three papyrus rolls about the Liar Paradox, and the Stoic philosopher Chrysippus wrote six, but their contents are lost in the sands of time. Despite various comments on how to solve the Paradox, no Greek suggested that the Greek language itself was inconsistent; it was the reasoning within Greek that was considered to be inconsistent.

In the eleventh century, St. Peter Damian of Italy asserted that even an omnipotent God could not make a contradiction be true.

In the Late Medieval period in Europe, the French philosopher Jean Buridan put the Liar Paradox to devious use with the following proof of the existence of God. It uses the pair of sentences:

God exists.

None of the sentences in this pair is true.

The only consistent way to assign truth values (being true or being false) requires making the sentence God exists be true. In this way, Buridan has apparently proved that God does exist.
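The claim that only one assignment of truth values is consistent can be checked by brute force. The following is a minimal sketch, assuming classical two-valued logic and assuming that a sentence counts as consistently valued when its truth value matches what it says; the function name and the dictionary keys are illustrative choices, not part of Buridan's text.

```python
from itertools import product

def consistent_assignments():
    """Return every classical truth-value assignment to Buridan's pair
    in which each sentence's value matches what the sentence says."""
    results = []
    for god_exists, second in product([True, False], repeat=2):
        # The second sentence says: neither sentence in the pair is true.
        what_second_says = (not god_exists) and (not second)
        # "God exists" says nothing about the pair, so it is unconstrained;
        # the second sentence must be true exactly when what it says holds.
        if second == what_second_says:
            results.append({
                "God exists": god_exists,
                "None of the sentences in this pair is true": second,
            })
    return results

buridan_result = consistent_assignments()
print(buridan_result)
```

Under these assumptions the search finds a single assignment, the one in which the first sentence is true and the second is false. The sketch shows only which assignments are consistent; it does not settle whether the pair of sentences should be given classical truth values in the first place.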

There are many other versions of the Paradox. Some Liar Paradoxes begin with a chain of sentences, no one of which is self-referential, although the chain as a whole is self-referential or circular:

The following sentence is true.

The following sentence is true.

The following sentence is true.

The first sentence in this list is false.

There are also Contingent Liars which may or may not lead to a paradox depending on what happens in the world beyond the sentence. For example:

It is raining, and this sentence is false.

Paradoxicality here depends on the weather. If it is sunny, then the sentence is simply false, but if it is raining, then we have the beginning of a paradox.
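Both the circular chain and the Contingent Liar can be examined with the same kind of exhaustive search. The sketch below assumes classical two-valued logic and treats an assignment as consistent when every sentence's value agrees with what that sentence says; the function names are hypothetical.

```python
from itertools import product

def consistent_chain_assignments():
    """Classical assignments to the four-sentence chain that respect
    what each sentence says about another sentence in the list."""
    found = []
    for s1, s2, s3, s4 in product([True, False], repeat=4):
        ok = (
            s1 == s2              # sentence 1 says sentence 2 is true
            and s2 == s3          # sentence 2 says sentence 3 is true
            and s3 == s4          # sentence 3 says sentence 4 is true
            and s4 == (not s1)    # sentence 4 says sentence 1 is false
        )
        if ok:
            found.append((s1, s2, s3, s4))
    return found

def consistent_contingent_values(raining):
    """Classical values v for 'It is raining, and this sentence is false'
    such that v matches what the sentence says, given the weather."""
    return [v for v in (True, False) if v == (raining and not v)]

chain = consistent_chain_assignments()
sunny = consistent_contingent_values(raining=False)
rainy = consistent_contingent_values(raining=True)
print(chain, sunny, rainy)
```

On these assumptions the chain has no consistent assignment, the Contingent Liar is simply false when it is not raining, and it has no consistent value when it is raining.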

Suppose we try to solve the paradox by saying the Classical Liar Sentence, namely L, is so odd that it is neither true nor false. This way out fails for the following reason. If L were to be neither true nor false, as this treatment is suggesting, then, by the meaning of neither…nor, L is not false. But that consequence implies that what L says of itself (namely, that it is false) is false. So, L is false. This result leaves us with a contradiction (that L is false and not false). Unless there is a mistake in this reasoning, taking the route of saying the Liar Sentence is neither true nor false is not a successful treatment.

a. Strengthened Liar Paradox

Suppose we were somehow to have found a promising way out of the Classical Liar Paradox. Ahead looms the Strengthened Liar Paradox. The Strengthened Liar Paradox is called Strengthened because some promising solutions to the Classical Liar Paradox fail when faced with the Strengthened Liar Paradox.

The Strengthened Liar Paradox (also called the Strong Liar Paradox) can begin with a Strengthened Liar Sentence such as:

This sentence is not true,

to produce a contradiction. For example, let us stipulate that L’ is a name of the Strengthened Liar Sentence, and let us stipulate that the phrase This sentence within L’ refers to the full sentence L’. Surely L’ is either true or not true. Let’s examine both cases, or both disjuncts, starting with the second. Suppose L’ is not true. If L’ is not true, then that apparently implies it is true since any speaker who expresses the sentence is saying it is not true. Having established this result, now let’s make a different supposition starting with the first disjunct. Suppose L’ is true. If L’ were true, then that implies, just from the meaning of the sentence, that it is not true. That is our second result. Now, let’s combine the two results and we have established that L’ is true if and only if it is not true. Now we have a paradox because L’ is true or it is not.

Here is another version of the Strengthened Liar Paradox. Suppose you believe a promising way to solve the Classical Liar Paradox is to call the Classical Liar Sentence meaningless, with the assumption that any declarative sentence is true, false or meaningless. Before you can be content with that treatment, you must consider that it is not meaningless to call a sentence meaningless. If the Classical Liar Paradox is apparently solved formally by having an object language that allows a truth predicate and a falsehood predicate and a predicate that applies to meaningless phrases, then one could form in the object language a different Strengthened Liar Sentence, call it L”, that informally says:

This sentence is either false or meaningless.

Now we are on the road to paradox again. Surely L” is either true or it is not. Let us examine both disjuncts. (1) Suppose L” were true. If L” is true, then it is false or meaningless. If so, then it is not true. (2) Now for the second disjunct. Suppose L” were not true. Why would a declarative sentence not be true? Because it is false or meaningless. But the sentence’s being false or meaningless is precisely the claim being made by speakers of L”, so it follows that L” is true. So, by combining the results from both (1) and (2), one may conclude that L” is true if and only if it is not. We have a contradiction. It is understandable why this reasoning is often called the revenge of the liar.
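The way the strengthened sentences defeat a three-valued treatment can be illustrated with a toy model. It is only a sketch under one stated assumption: a self-referential sentence is true exactly when what it says holds, and nothing further is assumed about when a sentence is false rather than meaningless. The value labels and function names are illustrative.

```python
VALUES = ("true", "false", "meaningless")

def stable_values(what_it_says):
    """Return the values v a self-referential sentence could take under the
    assumed rule: the sentence is true exactly when what it says holds."""
    return [v for v in VALUES if (v == "true") == what_it_says(v)]

# Each function reports whether the sentence's claim holds if its value is v.
def classical_liar(v):          # "This sentence is false."
    return v == "false"

def strengthened_liar(v):       # "This sentence is not true."
    return v != "true"

def false_or_meaningless(v):    # "This sentence is either false or meaningless."
    return v in ("false", "meaningless")

classical = stable_values(classical_liar)
strengthened = stable_values(strengthened_liar)
revenge = stable_values(false_or_meaningless)
print(classical, strengthened, revenge)
```

In this model the Classical Liar Sentence has one stable value, "meaningless", while neither strengthened sentence has any stable value at all. Under a stronger assumption, namely that a sentence whose claim fails is thereby false, even the Classical Liar Sentence would lose its stable value.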

We do not want to solve the Classical Liar Paradox only to be ensnared by the Strengthened Liar Paradox. Therefore, finding one’s way out of the Strengthened Liar Paradox is the acid test of a successful solution.

In discussions below, where context does not disambiguate between the Classical Liar Paradox and the Strengthened Liar Paradox and where it is not important to distinguish them, the simple phrase the Liar Paradox is used.

b. Why the Paradox is a Serious Problem

To put the Liar Paradox in perspective, it is essential to appreciate why such an apparently trivial problem is a deep problem. Solving the Liar Paradox is part of the larger project of understanding truth. Understanding truth is a difficult project that involves finding a theory of truth, or a definition of truth, and a proper analysis of the concept of truth. These are distinct projects, but the current article does not carefully distinguish them from each other.

Some researchers believe the Liar Paradox is one of several unresolvable knots in our language that “do exist and are not merely the product of careless and confused reasoning” (Mates 1981, 3). One of the aims of this article is to assess this claim.

Before saying more about the paradox and about a theory of truth, let us be clear about what a contradiction is. When this article speaks of a contradiction in a sentence that is being asserted or can be asserted, it means a sentence that is equivalent to a compound sentence that has the logical form of an assertion and its denial. Slightly more formally, the logical form of a contradiction is P and Not P, where P is some declarative sentence or independent clause, and Not P is its negation. When a Marxist speaks of the contradiction in capitalism, the Marxist is not referring to a contradiction in the sense of that term that is of interest to this article, but rather to the fact that opposing social forces will clash and produce a restructuring of the society’s economic system.

Languages are expected to contain contradictions but not paradoxes. A contradictory sentence such as “Snow is white, and snow is not white” is just one of the many false sentences in the English language. But languages are not expected to contain or permit paradoxes, that is, apparently good inferences in support of a contradiction, at least not in the philosopher’s sense of that word. Informally, many speakers will sometimes say of any very surprising or puzzling chain of reasoning that it is a paradox, but this is not the sense of the word “paradox” used in this article. A paradox in our sense is an apparently convincing argument leading from apparently true premises to a contradictory conclusion of the logical form P and Not P.

Why is that conclusion a problem? There are many ways to show why. Here is one. Let L be the Liar sentence, and let our contradictory conclusion be that L is both true and false. Calling a sentence false is apparently equivalent to calling its negation true. So, if ~L is the formal representation of the negation of L, and if we accept the conclusion of the Liar Paradox, then the compound sentence L and ~L is true. Now the trouble begins. Let Q be some sentence we already know not to be true, say 1 + 1 = 3. Then we can reason this way:

1.	L and ~L	from the Liar Paradox
2.	L	from 1
3.	L or Q	from 2 using the Law of Addition
4.	~L	from 1
5.	Q	from 3 and 4 using Disjunctive Syllogism

This apparently legitimate proof that 1 + 1 = 3 is outrageous. That is why the paradox is a serious problem. An appropriate reaction to any paradox is to look for some unacceptable assumption made in the apparently convincing argument or else to look for a faulty step in the reasoning. Only very reluctantly would one want to learn to live with the contradiction being true, or ignore the contradiction altogether. The very existence of the Liar Paradox and other semantic paradoxes is evidence that there are principles we use which we have been taking to be obviously valid or obviously correct but which are not.
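Each step of the five-line derivation can be checked for classical validity with a truth table. The sketch below assumes the standard classical definition of validity (no valuation makes all the premises true and the conclusion false); the helper `valid` and the names `line1` through `line5` are illustrative.

```python
from itertools import product

def valid(premises, conclusion):
    """Classical validity: no valuation of L and Q makes every premise
    true while making the conclusion false."""
    for L, Q in product([True, False], repeat=2):
        if all(p(L, Q) for p in premises) and not conclusion(L, Q):
            return False
    return True

# The five lines of the derivation, as functions of the truth values of L and Q.
line1 = lambda L, Q: L and not L   # L and ~L
line2 = lambda L, Q: L             # L
line3 = lambda L, Q: L or Q        # L or Q
line4 = lambda L, Q: not L         # ~L
line5 = lambda L, Q: Q             # Q

steps = {
    "2 from 1": valid([line1], line2),
    "3 from 2": valid([line2], line3),
    "4 from 1": valid([line1], line4),
    "5 from 3 and 4": valid([line3, line4], line5),
}
whole_argument = valid([line1], line5)
print(steps, whole_argument)
```

Every step comes out valid, and so does the inference from line 1 directly to Q. The steps from line 1 are valid only vacuously, because no classical valuation makes L and ~L true; that is exactly why, in classical logic, accepting the contradiction commits one to accepting any sentence Q whatsoever.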

By the way, what this article calls paradoxes are called antinomies by Quine, Tarski, and some other authors.

Let us return to the issue of understanding truth by finding a theory of truth. We naturally want our theory of truth not to allow paradoxes. Aristotle offered what most philosophers consider to be a correct, necessary condition for any adequate theory of truth. Stripped of his overtones suggesting a correspondence theory of truth, Aristotle proposed (in Metaphysics 1011b26) what is now called a precursor to Alfred Tarski’s Convention T (or his T-scheme):

A sentence is true if, and only if, what it says is so.

In his 1933 article, “The Concept of Truth in Formalized Languages,” Tarski rephrased the idea this way:

A true sentence is one which says that the state of affairs is so and so, and the state of affairs indeed is so and so.

Before we say more about the trouble with our theories of truth and reference, it will be helpful to describe the use-mention distinction. This is the distinction between using a term and mentioning it. Let us not confuse a dog with its name. Lassie is a helpful dog, but the word “Lassie” is not a dog at all; it is a six-letter word. Placing pairs of quotation marks around a term, or italicizing it, serves to name it or mention it. The use-mention distinction applies to sentences as well as terms.
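Programming languages draw a similar line between an object and a string that names it, which gives a rough analogy for the use-mention distinction. The following sketch is purely illustrative: the class `Dog` and the variable names are assumptions made for the example.

```python
class Dog:
    """A stand-in for the animal itself (purely illustrative)."""
    def __init__(self, helpful):
        self.helpful = helpful

lassie = Dog(helpful=True)     # using the name: the variable refers to the dog
name_of_lassie = "Lassie"      # mentioning the name: a string of letters

letter_count = len(name_of_lassie)               # a fact about the word
is_helpful = lassie.helpful                      # a fact about the dog
word_is_a_dog = isinstance(name_of_lassie, Dog)  # the word is not a dog
print(letter_count, is_helpful, word_is_a_dog)
```

The quotation marks in the code play the role the article assigns to quotation marks or italics: they turn attention from the dog to the six-letter word.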

Tarski’s Convention T says a formally correct truth-definition should logically imply all sentences that say, for example: the sentence “Snow is white” is true just in case snow is white. Here is a second example of the form of the sentences Tarski is aiming at:

The sentence “Aristotle was a student of Plato” is true just in case Aristotle was a student of Plato.

If the same sentence about snow were named or mentioned not with italics or quotation marks but with the numeral 88 inside a pair of parentheses, then (88) would be true just in case snow is white. There is still another way to refer to sentences, namely via self-reference. If I say, “This sentence is written in English, and not Italian,” then the phrase “This sentence” refers to that sentence. This is all straightforward, and is a well-accepted way of doing naming and referring.

There is another important point to make about the use of quotation marks. When a logician says

For any sentence S, if “S” is true, then S,

this is not a remark about the letter of the alphabet between “R” and “T”. It is a remark about sentences.

Finally, let us be clearer about substitution of names. If we have two names with the same denotation, then usually one name can be substituted for the other in a sentence without the newly-produced sentence changing its truth-value. Mark Twain is the same person as Samuel Clemens, so substituting ‘Samuel Clemens’ for ‘Mark Twain’ in the true sentence:

Mark Twain was not a famous 21st century U.S. president

will produce:

Samuel Clemens was not a famous 21st century U.S. president

which is also true. The substitution preserves truth. At least it does here, but it does not in some other contexts. There are well known exceptions to this substitution principle. For example, suppose this is true:

John said, “Mark Twain was not a famous 21st century U.S. president.”

If John said nothing about Samuel Clemens, then the above substitution would turn a true sentence into a false one. So, in substituting we need to be careful about substituting inside a quoted phrase.

All these remarks about truth, reference, and substitution seem to be straightforward and not troublesome. Unfortunately, together they do lead to trouble, and the resolution of the difficulty is still an open problem in philosophical logic. Why is that? The brief answer is that Tarski’s sentence with the supposedly uncontroversial assumptions above can be used to produce the Liar Paradox. The less brief answer refers to Tarski’s Undefinability Theorem of 1936.

c. Tarski’s Undefinability Theorem

This article began with a sketch of the Liar Argument using Liar sentence L. To appreciate the central role in the Liar Argument of Tarski’s rephrasing of Aristotle’s point, we need to examine more than just a sketch of the argument. Alfred Tarski proposed a more formal characterization called Schema T or Convention T:

X is true if, and only if, p,

where “p” is a variable for a grammatical sentence and “X” is a name for that sentence. Here is one instance of that general schema:

“Snow is white” is true if, and only if, snow is white.

It is assumed here that we are building a theory of truth for English, and that we are using English to state the theory.

Tarski was the first person to claim that any theory of truth that could not entail all sentences of this schema would fail to be an adequate theory of truth.

If we were instead to build a theory of truth for German instead of English, but use English to state the theory, then the theory should, among other things, at least entail the T-sentence:

“Der Schnee ist weiss” is true in German if, and only if, snow is white.

A great many philosophers believe Tarski is correct when he claims his Convention T is a necessary condition on any successful theory of truth for any language, and the T sentences should be theorems in the metalanguage. But wait! Do we want all the T-sentences to be entailed and thus come out true? Probably not the T-sentence for the Liar Sentence. That T-sentence is:

T`L´ if and only if L.

Here T is the truth predicate (informally it is the predicate “__ is a true sentence”), and L is the Liar Sentence, namely ~T`L´. Substituting the latter for L on the right of the above biconditional yields the contradiction:

T`L´ if and only if ~T`L´.

That is the argument of the Liar Paradox, very briefly.

Tarski added precision to the discussion of the Liar by focusing not on a natural language such as English but on a classical, interpreted, formal language powerful enough to express at least elementary arithmetic. Here the difficulties produced by the Liar Argument became much clearer; and, very surprisingly, he was able to prove that Convention T, plus the assumption that the language contains its own concept of truth, produces semantic incoherence.

The proof requires the following additional assumptions. Here is a quotation from (Tarski 1944):

I. We have implicitly assumed that the language in which the antinomy is constructed contains, in addition to its expressions, also the names of these expressions, as well as semantic terms such as the term “true” referring to sentences of this language; we have also assumed that all sentences which determine the adequate usage of this term can be asserted in the language. A language with these properties will be called “semantically closed.”

II. We have assumed that in this language the ordinary laws of logic hold.

Tarski claimed that the crucial, unacceptable assumption of the formal version of the Liar Argument is the self-reference allowed by any semantically closed language because any semantically closed language contains its own global truth predicate, and this leads to a contradiction.

To expand on this point, in order for there to be a grammatical and meaningful Liar Sentence in a language, there must be a definable notion of is true which holds for the true sentences and fails to hold for the other sentences. If there were such a global truth predicate, then the predicate __ is a false sentence would also be definable; and [here is where we need the power of elementary number theory] a Liar Sentence would exist, namely a complex sentence ∃x(Qx & ~Tx), where Q and T are predicates that are satisfied by names of sentences. More specifically, T is the one-place, global truth predicate satisfied by all and only the names [that is, numerals for the Gödel numbers] of the true sentences, and Q is a one-place predicate that is satisfied only by the name of ∃x(Qx & ~Tx). But if so, then one can eventually deduce a contradiction. This correct deduction by Tarski is a formal analog of the informal argument of the Liar Paradox.

The contradictory result apparently tells us that the argument began with a false assumption. According to Tarski, the error that causes the contradiction is the assumption that the global truth predicate can be well-defined. Therefore, Tarski asserts that truth is not definable within a classical formal language that is classically interpreted—thus the name Undefinability Theorem or Indefinability Theorem. Tarski’s Theorem establishes that classically interpreted languages capable of expressing elementary arithmetic cannot contain their own global truth predicate, and so cannot be semantically closed.

Truth cannot be defined properly within a classical formal language, but there is no special difficulty in giving a proper definition of truth for a classical formal language, provided it is done outside the language; and Tarski himself was the first person to do this. In 1933, he created the first formal semantics for quantified predicate logic. Here are two imperfect examples of how he partly defines truth. First, the simple sentence Fa is true if, and only if, a is F (that is, a has property F, which in turn requires that a be a member of the extension of predicate F, where the extension is the set of all objects having the property F). For example, we might formalize the English sentence, “Alfred is fat,” by translating it as Fa; then Tarski is telling us that Fa is true just in case Alfred is a member of the set of all things that are fat.

For a second example of partly defining truth, Tarski says the universally quantified sentence ∀xFx is true if, and only if, all the objects in the domain are members of the set of objects that are F.

To repeat, a little more precisely but still imperfectly, Tarski’s theory implies that, if we have a simple, formal sentence ‘Fa’ in our formal language, say classical predicate logic, in which ‘a’ is the name of some object in the domain of discourse (that is, what we can talk about) and if ‘F’ is a predicate designating a property that perhaps some of those objects have, then ‘Fa’ is true in the object language if, and only if, a is a member of the set of all things having property F. That set is called the extension of ‘F’. Tarski also spoke of a satisfying ‘F’ this way. For the more complex sentence ‘∀xFx’ in our language, it is true just in case every object in the domain is in the extension of ‘F’.

These two definitions are still imprecise because the appeal to the concept of property should be eliminated, and the definitions should appeal to the notion of formulas being satisfied by sequences of objects. However, ignoring those details, what we have here are two examples of partially defining truth for the formal object language, say language 0, but doing it from outside language 0, in a meta-language, say language 1, namely English that contains some arithmetic and set theory and that might or might not contain language 0 itself. Tarski was able to show that in language 1 we do satisfy Convention T for the object language 0, because the equivalences:

`Fa´ is true in language 0 if, and only if, Fa

`∀xFx´ is true in language 0 if, and only if, ∀xFx

are both deducible in language 1, as are the other T-sentences.
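The two partial truth clauses above can be illustrated with a small sketch. It is only a toy model, not Tarski's own formulation: the domain, the individuals "betty" and "carl", and the function names are hypothetical, and the sketch appeals directly to extensions rather than to satisfaction by sequences of objects.

```python
# Toy model for the object language (all names here are illustrative assumptions).
domain = {"alfred", "betty", "carl"}
denotation = {"a": "alfred"}            # the name 'a' denotes Alfred
extension = {"F": {"alfred", "carl"}}   # the set of objects that are F (fat)

def true_atomic(pred, name):
    """'Fa' is true iff the object named by 'a' is in the extension of 'F'."""
    return denotation[name] in extension[pred]

def true_universal(pred):
    """'∀xFx' is true iff every object in the domain is in the extension of 'F'."""
    return domain <= extension[pred]

print(true_atomic("F", "a"))   # Alfred is in the extension of F
print(true_universal("F"))     # Betty is not, so the universal sentence fails
```

The Python code plays the role of the meta-language (language 1): it talks about the object-language sentences from outside, which is the point of the passage above.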

Despite Tarski’s having this success with defining truth for an object language in its meta-language, Tarski’s Undefinability Theorem establishes that there is apparently no hope of defining truth within the object language itself.

Tarski then took on the project of discovering how close he could come to having a well-defined truth predicate within a classical formal language without actually having one. That project, his hierarchy of meta-languages, is also his key idea for solving the Liar Paradox. The project is discussed below.

2. Overview of Ways Out of the Paradox

a. Five Ways Out

There are many proposed solutions to the paradox. A solution which says to quit using language will stop the Liar Paradox; but surely the Liar Paradox can be stopped by making more conservative changes than this radical, ad hoc solution. All other things being equal, adopting simple, intuitive and conservative semantic principles is to be preferred ideally to adopting ad hoc, complicated and less intuitive semantic principles that have many negative consequences. The same goes for revision of a concept or revision of a logic.

So, we will not quit using language. Nor should we try to find a way out by declaring that we must adhere to the principle, Avoid all paradoxes. Saying that is trivial and unhelpful unless it also gives us other guidance about how to avoid them.

Shall we say instead that the problem is due somehow to the notorious vagueness of English (or whatever natural language is used to create the paradox)? Perhaps. However, more needs to be said because Tarski showed that by using a vagueness-free formal language he could produce the Liar Paradox.

Maybe the route to a solution is to uncover some subtle equivocation in our concepts employed in producing the contradiction. There have been many suggestions along this line, but none have been widely accepted.

Perhaps we should learn to live with paradox. Or perhaps we should simply accept that there is a contradiction unless we make appropriate changes. Because the Liar Paradox depends crucially upon our ideas about how to make inferences and how to understand the key semantic concepts of truth, reference, and negation, one might reasonably suppose that one of these needs revision. But we should proceed cautiously. No one wants to solve the Paradox by being heavy-handed and jettisoning more than necessary. We should be alert to the fact that any changes we do make might have their own drawbacks.

One final word of caution. No doubt the ordinary meaning of the word “true” is a bit vague, but if we decide to solve the Liar Paradox by revising the concept of truth, then we must remember that explications of “true” have to be true to some core of the ordinary meaning of “true,” lest the revision be so great that it is no longer a revision but instead a change of subject.

If we adopt the metaphor of a paradox as being an argument which starts from the home of seemingly true assumptions and which travels down the garden path of seemingly valid steps into the den of a contradiction, then a solution to the Liar Paradox has to find something wrong with the home, find something wrong with the garden path, or find a way to live within the den. Less metaphorically, the main, systematic ways out of the Paradox are the following:

The Liar Sentence is ungrammatical and so has no truth value (yet the argument of the Liar Paradox depends on it having a truth value).
The Liar Sentence is grammatical but meaningless and so has no truth value.
The Liar Sentence is grammatical and meaningful but still it has no truth value; it falls into the truth gap.
The Liar Sentence is grammatical, meaningful and has a truth value, but one other step in the argument of the Liar Paradox is faulty.
The argument of the Liar Paradox is acceptable, and we need to learn how to live with the Liar Sentence being both true and false.

Two philosophers might take the same way out, but for different reasons.

In presenting any of these five proposed solutions to the Paradox, it is helpful to explore the details and the implications. For example, do they accept, reject or revise the Law of Addition that was appealed to in step 3 of the Liar Argument back in Section 1 of this article? That step permits the deduction of L or Q from L alone. A solution is unacceptable if it cannot answer this question and give the answer a principled justification of some sort.

The five proposed solutions have a key feature in common. They recommend or presuppose logical monism and not logical pluralism. That is, they suppose there is a single, universal logic. This supposition has been challenged by some twentieth-century logicians, although most others remain monists.

There are many suggestions for how to deal with the Liar Paradox, but most are never developed to the point of giving a detailed theory that can speak of its own syntax and semantics with precision. Some give philosophical arguments for why this or that conceptual reform is plausible as a way out of paradox, but then do not show that their ideas can be carried through in a rigorous way. Other attempts at solutions take the formal route and then require changes in standard formalisms so that a formal analog of the Liar Paradox’s argument fails, but then they do not offer a philosophical argument to back up these formal changes other than essentially saying, “It is successful in avoiding paradoxes so far.” A decent theory of truth showing the way out of the Liar Paradox requires both a coherent formalism (or at least a systematic theory of some sort) and a philosophical justification backing it up. The point of the philosophical justification is an unveiling of some hitherto unnoticed or unaccepted rule of language for all sentences of some category which has been violated by the argument of the Paradox. In brief, the philosophical point is that a paradox’s diagnosis should not proceed independently of its rigorous or formal treatment.

Some proponents of their own favorite solution to the paradox agree that a systematic approach to the paradox is valuable, and they point out that in some formalism, say first-order arithmetic, the Liar argument cannot be reconstructed. For one example, perhaps the proponents will argue that the sub-argument from the Liar sentence being true to its being false is acceptable, but the sub-argument from the Liar sentence being false to its being true cannot be reconstructed in their formalism. From this they conclude that the Liar sentence is simply false and paradox-free. This may be the key to solving the Paradox, but it is not successful if there is no satisfactory response to the complaint that perhaps their reconstruction using that formalism shows more about the inadequacy of the formalism than the proper way out of the paradox.

Hartley Slater offers a systematic treatment of the Liar Paradox that does not require formal languages, but that explains why treatments of the Liar with various formalizations, such as Tarski’s project of a hierarchy of metalanguages and his promotion of his Convention T in classical predicate logic, are inadequate. Slater’s systematic treatment concludes that “Indexicality infuses the whole of language, making Tarski’s Truth Scheme inappropriate, and thus resolving the Liar Paradox” (Slater, 2012, p. 85).

This need to have a systematic approach was seriously challenged by Ludwig Wittgenstein in his Philosophical Remarks:

I predict a time when there will be mathematical investigations of calculi containing contradictions, and people will actually be proud of having emancipated themselves from worries about consistency.

In 1938 in a discussion group with Alan Turing on the foundations of mathematics, Wittgenstein said one should try to overcome “the superstitious fear and dread of mathematicians in the face of a contradiction.” The proper way to respond to any paradox, he said, is by an ad hoc reaction and not by any systematic treatment designed to cure both it and any future ills. Symptomatic relief is sufficient. He said it may appear legitimate, at first, to admit that the Liar Sentence is meaningful and also that it is true or false, but the Liar Paradox shows that one should retract this admission and either just not use the Liar Sentence in any arguments, or say it is not really a sentence, or at least say it is not one that is either true or false. Wittgenstein is not particularly concerned with which choice is made. And, whichever choice is made, he claimed it need not be backed up by any theory that shows how to systematically incorporate the choice. He treated the whole situation cavalierly and unsystematically. After all, he said, the language cannot really be incoherent because we have been successfully using it all along, so why all this fear and dread? Most logicians disagree with Wittgenstein and want systematic removal of the Paradox.

Disagreeing with Wittgenstein, P. F. Strawson has promoted the performative theory of truth as a way out of the Liar Paradox. Strawson has argued that the proper way out of the Liar Paradox is to carefully re-examine how the term truth is really used by speakers. He says such an investigation will reveal that the Liar Sentence is meaningful but fails to express a proposition.

To explore Strawson’s response more deeply, notice that Strawson’s proposed solution depends on the distinction between a proposition and the declarative sentence used to express that proposition. The next section explores what a proposition is, but let us agree for now that a sentence, when uttered, either expresses a true proposition, expresses a false proposition, or fails to express any proposition. According to Strawson, when we say some proposition is true, we are not making a statement about the proposition. We are not ascribing a property to the proposition such as the property of correspondence to the facts, or coherence, or usefulness. Rather, when we call a proposition true, we are only approving it, or praising it, or admitting it, or condoning it. We are performing an action of that sort. Similarly, when we say to our friend, “I promise to pay you fifty dollars,” we are not ascribing some property to the proposition, I pay you fifty dollars. Rather, we are performing the act of promising the $50. For Strawson, when speakers utter the Liar Sentence, they are attempting to praise a proposition that is not there, as if they were saying Ditto when no one has spoken. The person who utters the Liar Sentence is making a pointless utterance. According to this performative theory, the Liar Sentence is grammatical, but it is not being used to express a proposition and so is not something from which a contradiction can be derived. Strawson’s way out has been attractive to some researchers, but not to a majority.

Is it obvious that there is a unique way out? Perhaps the best we can do is to have a variety of ways out, some of which are better than some others in certain respects. That point should be kept in mind when this article cavalierly speaks of the way out.

b. Sentences, Statements, and Propositions

The Liar Paradox can be expressed in terms of sentences, statements, and propositions.

The Strengthened Liar might begin with any of these:

This sentence is not true.
This statement is not true.
This proposition is not true.
This is not true.

The sentence “I like that” can assert two very different propositions when asserted on two different occasions, one in which the word “that” refers to the dog on the mat, and the one in which the same word refers to the cat on the mat. And two sentences can express the same proposition, such as when someone says both, “I like that” and “I like the cat on the mat.”

Sentences are linguistic expressions, whereas statements and propositions are not. A proposition is usually said to be the content of a meaningful sentence. We sometimes use sentences to make statements and assert propositions, but we sometimes use sentences to ask questions and to threaten our enemies. When speaking about sentences, we usually are speaking about sentence types, not tokens. Tokens are the sound waves or the ink marks or the electronic events. Types are what is the same when we say that the same sentence was spoken by John, recorded in ink in his notebook, and sent over the Internet to his friend. In the process of asserting the Strengthened Liar sentence, the person is using a token of the word this to refer to a special sentence type, namely to the Strengthened Liar sentence. In the process of asserting the Strengthened Liar proposition, the person is using a token of the word this to refer to the meaningful content of a special sentence type, namely to the Strengthened Liar sentence.

This is a bit vague, but it is difficult to remove the vagueness. Philosophers disagree with each other about what a statement is, and they disagree even more about what a proposition is. Most philosophers will say that sentences do not themselves make statements. Rather it is we speakers who use sentences to make statements. Some philosophers will claim that it is statements or propositions that are primarily true or false, and a sentence is true or false only in a secondary sense. But other philosophers disagree and believe that it is sentences that are primarily true or false.

Despite Quine’s famous complaint that there are no propositions because there can be no precise criteria for deciding whether two different sentences are being used to express identical propositions, there are some very interesting reasons why researchers who work on the Liar Paradox should focus on propositions rather than on either sentences or statements, but those reasons are not explored here. John Corcoran suggests the following position:

A judgment is a private act that results in a belief; a statement is a public event usually involving a sentence. Each judgment and each statement is performed by a unique person at a unique time and place. Propositions and sentences are timeless and placeless abstractions. A proposition is an intensional entity; it is a meaning composed of concepts. A sentence is a linguistic entity. A written sentence is a string of characters. A sentence can be used by a person to express meanings, but no sentence is intrinsically meaningful. Only propositions are properly said to be true or to be false—in virtue of facts, which are subsystems of the universe (Corcoran 2009, p. 71).

For a discussion of the need for propositions, see (Barwise and Etchemendy 1987). The present article continues to speak primarily of sentences rather than propositions, though only for the purpose of simplicity.

c. An Ideal Solution to the Liar Paradox

Ideally, we would like a proposed solution to the Liar Paradox to provide a solution to all the versions of the Liar Paradox, such as the Strengthened Liar Paradox, the version that led to Buridan’s proof of God’s existence, and the contingent versions of the Liar Paradoxes. The solution should solve the paradox both for natural languages and formal languages, or provide a good explanation of why the paradox can be treated properly only in one but not the other. The contingent versions of the Liar Paradox are going to be troublesome because, if the production of the paradox does not depend only on something intrinsic to the sentence but also depends on what circumstances occur in the world, then there needs to be a detailed description of when those circumstances are troublesome and when they are not, and why.

It would be ideal if we had a solution to both the Liar Paradox and Curry’s Paradox, another paradox that turns on self-reference. Haskell Curry’s paradox concerns the following sentence C:

If C is true then ⊥.

Notice that the sentence C contains itself. The symbol “⊥” abbreviates a contradiction. This leads to a paradox because one instance of Tarski’s Convention T is the equivalence:

C is true iff C.

Substituting Curry’s definition of C for the second C on the right yields:

C is true iff if C is true then ⊥.

Now let us begin to construct a multi-step Conditional Proof. Assume that C is true. Then, because of the last equivalence, if C is true then ⊥. So, by modus ponens, ⊥. Hence, by Conditional Proof, we have established that:

if C is true then ⊥.

By the definition of C, this is:

C.

Thus, by the first equivalence above (namely, C is true iff C), because we have established its right side:

C is true.

Therefore, by modus ponens on the previous two steps, we may infer:

⊥.

So, we have proved a contradiction. The outcome is a self-referential paradox that does not rely on negation, as the Liar Paradox does.
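The force of the derivation can be checked with a small truth-table sketch. It rests on two assumptions not made in the text above: the conditional is read as the classical material conditional, and ⊥ is generalized to an arbitrary sentence Q. The function name is hypothetical.

```python
from itertools import product

def curry_solutions():
    """Return every classical assignment (c, q) on which 'C is true' has the
    same truth value as 'if C is true then Q' (material conditional assumed)."""
    return [(c, q)
            for c, q in product([True, False], repeat=2)
            if c == ((not c) or q)]

print(curry_solutions())
```

Under these assumptions the only assignment satisfying the equivalence makes Q true, so a Curry sentence could be used to establish any Q whatsoever. When Q is ⊥, which is never true, no assignment remains, and that is the paradox.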

To have an ideal solution to the Liar Paradox, it would be reasonable to require a solution not only to the Curry Paradoxes but also to the Yablo Paradox which is Liar-like and Curry-like but which apparently does not rely on self-reference. In Stephen Yablo’s paradox, there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences of the form, None of the subsequent sentences are true. Imagine an unending line of people in numerical order who say, and only say, simultaneously:

1. Everybody after me is lying.

2. Everybody after me is lying.

3. Everybody after me is lying.

…

Ask yourself whether the first person’s sentence in the sequence is true or false. To produce the paradox it is crucial that the line of speakers be infinite. Notice that no sentence overtly refers to itself. There is controversy in the literature about whether the paradox actually contains a hidden appeal to self-reference or circularity. See (Beall 2001) for more discussion.
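The remark that the line of speakers must be infinite can be illustrated by brute force on finite truncations of the sequence. The following is a minimal sketch under stated assumptions: the logic is classical and two-valued, “lying” is read as “saying something untrue,” and the function name is hypothetical.

```python
from itertools import product

def consistent_assignments(n):
    """Return all truth-value assignments to a line of n speakers on which
    each speaker's sentence is true exactly when every later sentence is untrue."""
    results = []
    for values in product([True, False], repeat=n):
        if all(values[i] == all(not v for v in values[i + 1:])
               for i in range(n)):
            results.append(values)
    return results

print(consistent_assignments(4))
```

For any finite line there is exactly one coherent assignment: the last speaker speaks truly (vacuously, since nobody comes after) and everyone before is wrong. The paradox appears only when there is no last speaker to anchor the assignment.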

To summarize, an important goal for the best solution, or solutions, to the Liar Paradox is to offer us a deeper understanding of how our semantic concepts and principles worked to produce the Paradox in the first place, especially if a solution to the Paradox requires changing them. We want to understand the concepts of truth, reference, and negation that are involved in the Liar Paradox. In addition to these, there are the subsidiary principles and related notions of denial, definability, naming, meaning, predicate, property, presupposition, antecedent, and operating on prior sentences to form newer meaningful ones rather than merely newer grammatical ones. We would like to know what limits there are on all these notions and mechanisms, and how one impacts another.

What are the important differences among the candidates for bearers of truth? The leading candidates are sentences, propositions, statements, claims, and utterances. Is one primary, while the others are secondary or derivative? Ideally, we would like to know a great deal more about truth, but also falsehood and the related notions of fact, situation and state of affairs. We want to better understand what a language is and what the relationship is between an interpreted formal language and a natural language, relative to different purposes. Finally, it would be instructive to learn how the Liar Paradoxes are related to all the other paradoxes.

That may be quite a lot to ask, but then our civilization does have some time to investigate all this before the Sun expands and vaporizes our little planet.

d. Should Classical Logic be Revised?

An important question regarding the Liar Paradox is: What is the relationship between a solution to the Paradox for (interpreted) formal languages and a solution to the Paradox for natural languages? There is significant disagreement on this issue. Is appeal to a formal language a turn away from the original problem, and so just changing the subject? Can one say we are still on the subject when employing a formal language because a natural language contains implicitly within it some formal language structure? Or should we be in the business of building an ideal language to replace natural language for the purpose of philosophical study?

Is our natural language, for example, English, a semantically closed language? Does English have one or more logics? Should we conclude from the Liar Paradox that the logic of English cannot be standard logic but must be one that restricts the explosion that occurs due to our permitting the deduction of anything whatsoever from a contradiction? Should we say English really has truth gaps or perhaps occasional truth gluts (sentences that are both true and false)? So many questions.

Or instead can a formal language be defended on the ground that natural language is inconsistent and the formal language is showing the best that can be done rigorously? Can sense even be made of the claim that a natural language is inconsistent, for is not consistency a property only of languages with a rigorous structure, namely formal languages and not natural languages? Should we say people can reason inconsistently in natural language without declaring the natural language itself to be inconsistent? This article raises, but will not resolve, these questions, although some are easier to answer than others.

Many of the most important ways out of the Liar Paradox recommend revising classical formal logic. Classical logic is the formal logic known to introductory logic students as Predicate Logic in which, among other things, (i) all sentences of the formal language have exactly one of two possible truth values (TRUE, FALSE), (ii) the rules of inference allow one to deduce any sentence from an inconsistent set of assumptions, (iii) all predicates are totally defined on the range of the variables, and (iv) the formal semantics is the one invented by Tarski that provided the first precise definition of truth for a formal language in its metalanguage. A few philosophers of logic argue against any revision of classical logic by saying classical logic is the incumbent formalism that should be accepted unless an alternative is required (probably it is believed to be incumbent because of its remarkable success in expressing most of modern mathematical inference). Still, most other philosophers argue that classical logic is not the incumbent which must remain in office unless an opponent can dislodge it. Instead, the office has always been vacant.
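Feature (ii) of classical logic, along with the Law of Addition mentioned earlier, can be checked mechanically by truth tables. This is a sketch only: it covers classical propositional entailment, not full predicate logic, and the helper `entails` and the sentence letters L and Q are illustrative assumptions.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """Classical entailment by truth table: no row makes every premise
    true while making the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))
        if all(p(row) for p in premises) and not conclusion(row):
            return False
    return True

# Hypothetical sentence letters, encoded as functions from a row to a truth value.
L = lambda row: row["L"]
not_L = lambda row: not row["L"]
Q = lambda row: row["Q"]
L_or_Q = lambda row: row["L"] or row["Q"]

print(entails([L, not_L], Q, ["L", "Q"]))   # explosion: anything follows
print(entails([L], L_or_Q, ["L", "Q"]))     # Law of Addition
print(entails([L], Q, ["L", "Q"]))          # a consistent premise does not entail everything
```

Logics that restrict explosion reject the first result, which holds here only vacuously because no row makes both L and its negation true.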

In the decades since Tarski’s treatment of the Liar Paradox, there have been many new approaches that reject his classical, extensional logic in favor of alternative logics that do not require that his T-sentences be theorems of the metalanguage.

One critic of classical formal logic, Hartley Slater, says the usual formal languages fail at the crucial point of properly treating indexicals, words whose reference changes with context:

It is a recognition of the previous points about indexicality and sentence nominalisations that gets one out of the Liar… [B]ut the Truth Scheme “‘p’ is true ≡ p” does not apply when indexicals are involved, since one cannot say: ‘He is happy’ is true ≡ he is happy. (Slater 2012, p. 72)

Some philosophers object to revising classical logic if the purpose in doing so is merely to find a way out of the Paradox. They say that philosophers should not build their theories by attending to the queer cases. There are more pressing problems in the philosophy of logic and language than finding a solution to the Paradox, so any treatment of it should wait until these problems have a solution. From the future resulting theory which solves those problems, one could hope to deduce a solution to the Liar Paradox. However, for those who believe the Paradox is not a minor problem but is one deserving of immediate attention, there can be no waiting around until the other problems of language are solved. Perhaps the investigation of the Liar Paradox will even affect the solutions to those other problems.

3. Assessing the Five Ways Out

There have been many systematic proposals for ways out of the Liar Paradox. Below is a representative sample of five of the main ways out.

a. Russell’s Type Theory

Bertrand Russell said natural language is incoherent, but its underlying sensible part is an ideal formal language (such as the applied predicate logic of Principia Mathematica). He agreed with Henri Poincaré that the source of the Liar trouble is its use of self-reference. Russell’s way out was to rule out self-referential sentences as being ungrammatical or not well-formed in his ideal language.

In 1908 in his article “Mathematical Logic as Based on the Theory of Types” that is reprinted in (Russell 1956, p. 79), Russell solves the Liar with his ramified theory of types. This is a formal language involving an infinite hierarchy of, among other things, orders of propositions:

If we now revert to the contradictions, we see at once that some of them are solved by the theory of types. Whenever ‘all propositions’ are mentioned, we must substitute ‘all propositions of order n’, where it is indifferent what value we give to n, but it is essential that n should have some value. Thus when a man says ‘I am lying’, we must interpret him as meaning: ‘There is a proposition of order n, which I affirm, and which is false’. This is a proposition of order n+1; hence the man is not affirming any propositions of order n; hence his statement is false, and yet its falsehood does not imply, as that of ‘I am lying’ appeared to do, that he is making a true statement. This solves the liar.

Russell’s implication is that the informal Liar Sentence is meaningless because it has no appropriate translation into his formal language since an attempted translation violates his type theory. This theory is one of his formalizations of the Vicious-Circle Principle: Whatever involves all of a collection must not be one of the collection. Russell believed that violations of this principle are the root of all the logical paradoxes.

His solution to the Liar Paradox has the drawback that it places so many subscript restrictions on what can refer to what. It is unfortunate that the Russell hierarchy requires even the apparently harmless self-referential sentences This sentence is in English and This sentence is not in Italian to be syntactically ill-formed. The type theory also rules out explicitly saying (within his formalism) that legitimate terms must have a unique type, or saying that properties have the property of belonging to exactly one category in the hierarchy of types, which, if we step outside the theory of types, seems to be true about the theory of types. Bothered by this, Tarski took a different approach to the Liar Paradox.

b. Tarski’s Hierarchy of Meta-Languages

Reflection on the Liar Paradox suggests that either informal English (or any other natural language) is not semantically closed or, if it is semantically closed as it appears to be, then it is inconsistent—assuming for the moment that it does make sense to apply the term inconsistent to a natural language with a vague structure. Because of the vagueness of natural language, Tarski quit trying to find the paradox-free structure within natural languages and concentrated on developing formal languages that did not allow the deduction of a contradiction, but which diverge from natural language as little as possible.

One virtue of Tarski’s way out of the Liar Paradox is that it does permit the concept of truth to be applied to sentences that involve the concept of truth, provided we apply level subscripts to the concept of truth and follow the semantic rule that any subscript inside a pair of quotation marks must always be smaller than the subscript outside but still within the sentence; any violation of this rule produces a meaningless, ungrammatical formal sentence. Let the language of level 1 be the meta-language of the object language at level 0. Level 0 sentences do not contain “true” or similar terms, but level 0 would contain, say, “Paris is the capital of France.” The sentence saying that this level 0 sentence is true occurs in level 1. It would be: “Paris is the capital of France” is true₀. No sentence is allowed to contain its own truth predicate.

The rule for subscripts stops the formation of both the Classical Liar Sentence and the Strengthened Liar Sentence anywhere within the hierarchy. The subscripting also stops paradoxical chains that start as follows:

The next sentence is true.

The previous sentence is false.
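The effect of the subscript rule can be sketched as a level-assignment check. This is a simplified model, not Tarski's formal construction: each sentence is represented only by the sentences to which it applies a truth predicate, the sentence names are hypothetical, and a sentence is placed one level above the highest sentence it talks about.

```python
def assign_levels(refs):
    """refs maps each sentence name to the names of the sentences it calls
    true or false. Return a dict of levels in which every sentence sits
    strictly above each sentence it talks about, or None if no such
    assignment exists (that is, the references form a cycle)."""
    levels = {}
    visiting = set()

    def level_of(name):
        if name in levels:
            return levels[name]
        if name in visiting:
            raise ValueError("circular reference through " + name)
        visiting.add(name)
        targets = refs.get(name, [])
        level = 0 if not targets else 1 + max(level_of(t) for t in targets)
        visiting.discard(name)
        levels[name] = level
        return level

    try:
        for name in refs:
            level_of(name)
    except ValueError:
        return None
    return levels

print(assign_levels({"paris": [], "paris_is_true": ["paris"]}))
print(assign_levels({"liar": ["liar"]}))
print(assign_levels({"first": ["second"], "second": ["first"]}))
```

On this model the Liar Sentence and the two-sentence chain receive no level at all, which corresponds to their being ungrammatical in the hierarchy, while the sentence about Paris and the sentence calling it true fall at levels 0 and 1.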

Another virtue of the Tarski way out is that it provides a way out of the Yablo Paradox.

Russell’s solution calls This sentence is in English ill-formed, but Tarski’s solution does not, so that feature is also a virtue of Tarski’s way out. Tarski allows some self-reference, but not the self-reference involved in the Liar Paradox.

Tarski’s clever treatment of the Liar Paradox unfortunately has drawbacks. English has a single word true, but Tarski is replacing this with an infinite sequence of truth-like formal predicates, each of which is satisfied by the truths only of the language below it in the hierarchy of languages. Intuitively, a more global truth predicate should be expressible in the language it applies to. One hopes to be able to talk truly about one’s own semantic theory. The Tarski way out does not allow us even to say that in all languages of the hierarchy, some sentences are true. To use Wittgenstein’s phrase from his Tractatus, the character of the hierarchy can be shown but not said.

Despite these restrictions and despite the unintuitive and awkward hierarchy, Quine defends Tarski’s way out as the best of the ways. Here is Quine’s defense:

Revision of a conceptual scheme is not unprecedented. It happens in a small way with each advance in science, and it happens in a big way with the big advances, such as the Copernican revolution and the shift from Newtonian mechanics to Einstein’s theory of relativity. We can hope in time even to get used to the biggest such changes and to find the new schemes natural. There was a time when the doctrine that the earth revolves around the sun was called the Copernican paradox, even by the men who accepted it. And perhaps a time will come when truth locutions without implicit subscripts, or like safeguards, will really sound as nonsensical as the antinomies show them to be. (Quine 1976)

Tarski adds to the defense by stressing that:

The languages (either the formalized languages or—what is more frequently the case—the portions of everyday language) which are used in scientific discourse do not have to be semantically closed. (Tarski, 1944)

One criticism of Quine is that he is asking us to be patient and not to be so bothered by the complexity of the hierarchy, but he is giving no other justification for the hierarchy.

(Kripke 1975) criticized Tarski’s way out for its inability to handle contingent versions of the Liar Paradox such as one that begins with:

It is raining and this sentence is false

because Tarski cannot describe the contingency. That is, Tarski’s solution does not provide a way to specify the circumstances in which a sentence leads to a paradox and the circumstances in which it does not.
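The contingency at issue can be made concrete with a small sketch, assuming classical two-valued logic and treating the sentence as equivalent to “it is raining and this sentence is not true.” The function name is hypothetical.

```python
def consistent_values(raining):
    """Return the truth values s that the sentence 'It is raining and this
    sentence is false' can coherently take, given whether it is raining."""
    return [s for s in (True, False) if s == (raining and not s)]

print(consistent_values(False))  # dry weather: the sentence is simply false
print(consistent_values(True))   # rain: no coherent truth value remains
```

Whether the sentence is paradoxical thus depends on the weather and not on anything intrinsic to the sentence, which is the kind of circumstance a solution is being asked to describe.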

Putnam also criticized Tarski’s way out for its quietism about its own semantics:

The paradoxical aspect of Tarski’s theory, indeed of any hierarchical theory, is that one has to stand outside the whole hierarchy even to formulate the statement that the hierarchy exists. But what is this “outside place”—“informal language”—supposed to be? It cannot be “ordinary language,” because ordinary language, according to Tarski, is semantically closed and hence inconsistent. But neither can it be a regimented language, for no regimented language can make semantic generalizations about itself or about languages on a higher level than itself. (Putnam 1990, 13)

Within Tarski’s hierarchy of formal languages, we cannot say, Every language has true sentences (because no sentence can contain its own truth predicate in Tarski’s hierarchy) even though outside the hierarchy this is clearly a true remark about the hierarchy.

c. Kripke’s Hierarchy of Interpretations

Kripke’s way out of the Classical Liar Paradox requires a revision in our semantic principles but a less radical one than does the Russell solution or the Tarski-Quine solution. Kripke rejects the hierarchy of languages and retains the intuition that there is a single, semantically coherent and meaningful Liar Sentence, but argues that it is neither true nor false and so falls into a truth value gap. Kripke successfully develops the details using the tools of symbolic logic. Tarski’s Undefinability Theorem does not apply to languages having sentences that are neither true nor false. So, it can be argued that Kripke successfully shows that a semantically coherent formal language can contain its own global truth predicate in the sense that T(‘p’) is true whenever p is true, and is undefined if p is undefined. Not surprisingly, the negation of the truth predicate T does not quite express the concept of “not true” in the sense of meaning “false or undefined,” and so Kripke’s way out has a difficulty with the strengthened liar argument.

Let’s explore Kripke’s theory of truth in a bit more detail. He trades Russell’s and Tarski’s infinite syntactic complexity of languages for infinite semantic complexity of a single formal language. He rejects Tarski’s infinite hierarchy of meta-languages in favor of one formal language having an infinite hierarchy of partial interpretations. Consider a single formal language capable of expressing elementary number theory and containing a predicate T for truth (that is, for truth in an interpretation). Kripke assigns to T an elaborate interpretation, namely its extension (the set of sentences it is true of), its anti-extension (the set of sentences it is false of), and its undecideds (the set of sentences it is neither true nor false of). No sentence is allowed to be a member of both the extension and anti-extension of any predicate. Kripke allows the interpretation of T to change throughout the hierarchy. The basic predicates except the T predicate must have their interpretations already fixed at the base level. In the base level of the hierarchy, the predicate T is given a special extension and anti-extension. Specifically, its extension is all the (names of the) true sentences that do not actually contain the predicate symbol ‘T’, and its anti-extension is all the false sentences that do not contain ‘T’. The predicate ‘T’ is the formal language’s only basic partially-interpreted predicate.

As we ascend the hierarchy, distancing ourselves from the basic level, more and more complex sentences involving the symbol ‘T’ get added into the extension and anti-extension of the intended truth predicate T. Each step up Kripke’s semantic hierarchy is another partial interpretation of the language. As we go up a level we add into the extension of T all the true sentences containing T from the lower level. Ditto for the anti-extension.

For example, at the lowest level in the hierarchy we have the (formal equivalent of the) true sentence 7 + 5 = 12. Strictly speaking it is not grammatical in English to say 7 + 5 = 12 is true because we make a use-mention error. More properly we should add quotation marks and say ‘7 + 5 = 12’ is true. In Kripke’s formal language, ‘7 + 5 = 12’ is true at the base level of the hierarchy. Meanwhile, the sentence that is the best candidate for saying it is true, namely ‘T(‘7+5=12’)’, is not true at that level, although it is added to the extension of T and thus is said to be true at the next higher level. Unfortunately at this new level, the even more syntactically complex sentence ‘T(‘T(‘7+5=12’)’)’ is still not yet true. It will become true at the next higher level. And so goes the hierarchy of interpretations as it attributes truth to more and more sentences involving the concept of truth itself. The extension of T, that is, the class of names of sentences that satisfy T, grows but never contracts as we move up the hierarchy, and it grows by calling more true sentences true. Similarly the anti-extension of T grows but never contracts as more false sentences involving T are correctly said to be false.

Kripke shows that T eventually becomes a truth-like predicate for its own level when the interpretation-building reaches the unique lowest fixed point at a countably infinite height in the hierarchy. At a fixed point, no new sentences are declared true or false, and at this level Kripke shows that the language also satisfies Tarski’s Convention T, so for this reason many philosophers are sympathetic to Kripke’s controversial claim that T is a truth predicate at that point. At this fixed point, the formal equivalent of the Liar Sentence still is neither true nor false, and so falls into the truth gap, just as Kripke set out to show. In this way, the Liar Paradox is solved, the formal language has a global truth predicate, the formal semantics is coherent, and many of our intuitions about semantics are preserved.
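
The level-by-level construction described above can be imitated on a very small scale. The following Python sketch is only an illustration under strong simplifying assumptions: the language is finite, its only logical operator is negation, and the sentence encoding and the names `evaluate`, `next_level`, `least_fixed_point`, and `LANGUAGE` are hypothetical. Kripke’s actual construction concerns an infinite language and reaches its fixed point only at an infinite height.

```python
# Toy sentences (a hypothetical encoding, not Kripke's own notation):
#   ("atom", value)    a T-free sentence whose classical value is fixed
#   ("T", name)        says that the sentence named `name` is true
#   ("not", sentence)  negation; an undefined input gives an undefined output

def evaluate(sentence, ext, anti):
    """Return True, False, or None (undefined) under a partial interpretation of T."""
    kind = sentence[0]
    if kind == "atom":
        return sentence[1]
    if kind == "T":
        if sentence[1] in ext:
            return True
        if sentence[1] in anti:
            return False
        return None
    if kind == "not":
        inner = evaluate(sentence[1], ext, anti)
        return None if inner is None else not inner
    raise ValueError(f"unknown sentence kind: {kind}")


def next_level(language, ext, anti):
    """Go up one level: T's extension and anti-extension can only grow."""
    new_ext, new_anti = set(ext), set(anti)
    for name, sentence in language.items():
        value = evaluate(sentence, ext, anti)
        if value is True:
            new_ext.add(name)
        elif value is False:
            new_anti.add(name)
    return new_ext, new_anti


def least_fixed_point(language):
    """Start with T wholly uninterpreted and climb until nothing new is added."""
    ext, anti = set(), set()
    levels = []
    while True:
        new_ext, new_anti = next_level(language, ext, anti)
        if (new_ext, new_anti) == (ext, anti):
            return ext, anti, levels
        ext, anti = new_ext, new_anti
        levels.append((set(ext), set(anti)))


LANGUAGE = {
    "arith": ("atom", True),          # stands in for '7 + 5 = 12'
    "bad_arith": ("atom", False),     # stands in for a false T-free sentence
    "t1": ("T", "arith"),             # T('7 + 5 = 12')
    "t2": ("T", "t1"),                # T('T('7 + 5 = 12')')
    "liar": ("not", ("T", "liar")),   # says of itself that it is not true
}

ext, anti, levels = least_fixed_point(LANGUAGE)
```

In this toy run, `levels[0]` plays the role of the base level: it contains only the T-free sentences. The sentence `t1` enters the extension one level up, `t2` one level after that, and `liar` never enters either the extension or the anti-extension, which is the toy counterpart of falling into the truth value gap.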

However, there are difficulties with Kripke’s way out. His treatment of the Classical Liar stumbles on the Strengthened Liar and reveals why that paradox deserves its name. For a discussion of why, see (Kirkham 1992, pp. 293-4).

Some critics of Kripke’s theory say that in the fixed-point the Liar Sentence does not actually contain a global truth predicate but rather only a clever restriction on the truth predicate, and so Kripke’s Liar Sentence is not really the Liar Sentence after all; therefore we do not have here a solution to the Liar Paradox. Other philosophers say this is not a fair criticism of Kripke’s theory since Tarski’s Convention T, or some other intuitive feature of our concept of truth, must be restricted in some way if we are going to have a formal treatment of truth.

What can more easily be agreed upon by the critics is that Kripke’s candidate for the Liar sentence falls into the truth gap in Kripke’s theory at all levels of his hierarchy, so it is not true in his theory. [We are making this judgment that it is not true from within the meta-language in which sentences are properly said to be true or else not true.] However, in the object language of the theory, one cannot truthfully say the Liar Sentence is not true since the obvious candidate expression for that, namely ~Ts, is not true, but rather falls into the truth gap. Therefore, Kripke’s truth-gap theory cannot state its own thesis.

Robert Martin and Peter Woodruff created the same way out as Kripke, though a few months earlier and in less depth.

d. Barwise and Etchemendy

Another way out says the Liar Sentence is meaningful and is true or else false, but one special step of the argument in the Liar Paradox is incorrect, namely, the inference from the Liar Sentence’s being false to its being true. Arthur Prior, following the informal suggestions of Jean Buridan and C. S. Peirce, takes this way out and concludes that the Liar Sentence is simply false. So do Jon Barwise and John Etchemendy, but they go on to present a detailed, formal treatment of the Paradox that depends crucially upon using propositions rather than sentences. The details of their treatment will not be sketched here. Their treatment says the Liar Proposition is simply false on one interpretation but simply true on another interpretation, and that the argument of the Paradox improperly exploits this ambiguity. The key ambiguity is to conflate the Liar Proposition’s negating itself with its denying itself. Similarly, in ordinary language we are not careful to distinguish asserting that a proposition is false from denying that it is true.

Three positive features of the Barwise-Etchemendy solution are that (i) it applies to the Strengthened Liar, (ii) its propositions are always true or false, but never both, and (iii) it shows the way out of paradox both for natural language and interpreted formal language. Yet there is a price to pay. No proposition in their system can be about the whole world, and this restriction is there for no independent reason but only because otherwise we would get a paradox.

e. Paraconsistency

A more radical way out of the Paradox is to argue that the Liar Sentence is both true and false. This solution is a version of dialethism, the thesis that some contradictions are true. It embraces the Liar contradiction, then tries to limit the damage that is ordinarily a consequence of that embrace. This way out changes the classical rules of semantics in two ways: (1) it allows the Liar Sentence to be both true and false, and (2) it limits the damage by preventing the semantic incoherence that occurs from allowing everything to follow from any contradiction. The damaging principle of classical logic, called Explosion, is: (p & ~p) ⊧ q. A logic for which Explosion fails is called a paraconsistent logic.

This way out was initially promoted primarily by Graham Priest in 1979. It succeeds in avoiding semantic incoherence while offering a formal, detailed treatment of the Paradox. Priest is not a logical pluralist, and he proposes that there is one true paraconsistent logic. One noteworthy feature of Priest’s truth-glut semantics is that it is the same as Kleene’s strong three-valued semantics with truth-gaps if we apply this translation scheme:

Kleene	⇒	Priest
True	⇒	True only
False	⇒	False only
No Truth Value	⇒	Both True and False
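
The translation scheme can be made concrete with a short Python sketch. It assumes the usual numerical presentation of the strong Kleene tables (negation as reversal, conjunction as minimum), and it treats the difference between the two semantics as a difference in which values count as designated, that is, as preserved by valid inference. The names `entails`, `K3`, and `LP` are illustrative labels rather than anything fixed by the text.

```python
from itertools import product

# Three values ordered as False < Middle < True.
# Kleene reads MIDDLE as "no truth value"; Priest reads it as "both true and false".
FALSE, MIDDLE, TRUE = 0, 1, 2

def neg(v):
    return 2 - v          # swaps TRUE and FALSE, leaves MIDDLE fixed

def conj(a, b):
    return min(a, b)      # strong Kleene conjunction

def entails(premise, conclusion, designated):
    """Valid iff no valuation of (p, q) designates the premise but not the conclusion."""
    for p, q in product((FALSE, MIDDLE, TRUE), repeat=2):
        if premise(p, q) in designated and conclusion(p, q) not in designated:
            return False
    return True

K3 = {TRUE}               # truth gaps: only TRUE is designated
LP = {TRUE, MIDDLE}       # truth gluts: "both" also counts as (at least) true

def contradiction(p, q):
    return conj(p, neg(p))    # p & ~p

def arbitrary(p, q):
    return q                  # an unrelated sentence q

explosion_with_gaps = entails(contradiction, arbitrary, K3)
explosion_with_gluts = entails(contradiction, arbitrary, LP)
```

With gaps, Explosion holds vacuously because `p & ~p` never takes the value `TRUE`. With gluts, it fails: when `p` is `MIDDLE` and `q` is `FALSE`, the premise is designated and the conclusion is not.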

In formalizing reasoning with paradoxical sentences in Priest’s theory, a paradoxical sentence will imply some sentence P & ~P in the object language; but using Tarski’s T-scheme, this transforms immediately into:

P is true and P is not true

so the contradiction propagates into the metalanguage.

A principal virtue of the paraconsistency treatment is that, unlike with Barwise and Etchemendy’s treatment, a sentence can be about the whole world. Critics of this approach to the Liar have complained that it does not seem to solve the Strengthened Liar Paradox, nor Curry’s Paradox; and it does violence to our intuition that sentences cannot be both true and false in the same sense in the same situation. See the last paragraph of “Paradoxes of Self-Reference,” for more discussion of using paraconsistency as a way out of the Liar Paradox.

4. Conclusion

To summarize, when we treat the Liar Paradox we should provide two things, an informal diagnosis which pinpoints the part of the paradox’s argument that has led us astray, and a formalism that prevents the occurrence of the paradox’s argument within that formalism.

Russell, Tarski, Kripke, Barwise-Etchemendy, and Priest (among many others) deserve credit for providing a philosophical justification for their proposed solutions while also providing a formal treatment in symbolic logic that shows in detail both the character and implications of their proposed solutions. The theories of Russell and of Quine-Tarski do provide a treatment of the Strengthened Liar, but at the cost of assigning complex levels to the relevant sentences. On the positive side, the Quine-Tarski treatment does not take Russell’s radical step of ruling out all self-reference. Kripke’s elegant and careful treatment of the Classical Liar stumbles on the Strengthened Liar. Barwise and Etchemendy’s way out avoids these problems, but requires accepting the idea that no sentence can be used to say anything about the whole world, including the semantics of our language. Priest’s way out requires giving up our intuition that no context-free, unambiguous sentence is both true and false.

In conclusion, it appears that more work needs to be done in finding the best way, or the best ways, out of the Liar Paradox that will preserve the most important intuitions we have about semantics while avoiding semantic incoherence. In this vein, one can draw a pessimistic conclusion and an optimistic conclusion. Taking the pessimistic route, Putnam says:

If you want to say something about the liar sentence, in the sense of being able to give final answers to the questions “Is it meaningful or not? And if it is meaningful, is it true or false? Does it express a proposition or not? Does it have a truth value or not? And which one?” then you will always fail. In closing, let me say that even if Tarski was wrong (as I believe he was) in supposing that ordinary language is a theory and hence can be described as “consistent” or “inconsistent,” and even if Kripke and others have shown that it is possible to construct languages that contain their own truth-predicates, still, the fact remains that the totality of our desires with respect to how a truth-predicate should behave in a semantically closed language, in particular, our desire to be able to say without paradox of an arbitrary sentence in such a language that it is true, or that it is false, or that it is neither true nor false, cannot be adequately satisfied. The very act of interpreting a language that contains a liar sentence creates a hierarchy of interpretations, and the reflection that this generates does not terminate in an answer to the questions “Is the liar sentence meaningful or meaningless, or if it is meaningful, is it true or false?” (Putnam 2000)

In (Putnam 2012, p. 206), Putnam concluded that “a solution does not seem to be possible” if by a solution, we mean one that makes all appearance of paradox go away.

More optimistically, should there really be so much fear and loathing about limitations on our ability to formally express all the theses of our favored theory? Many fields have learned to live with their limitations. ZFC set theory cannot speak of the set of all its sets, but it remains a fruitful theory.

5. References and Further Reading

For further reading on the Liar Paradox that provides more of an introduction to it while not presupposing a strong background in symbolic logic, the author recommends reading the article below by Mates, plus the first chapter of the Barwise-Etchemendy book, and then chapter 9 of the Kirkham book. The rest of this bibliography is a list of contributions to research on the Liar Paradox, and all members of the list require the reader to have significant familiarity with the techniques of symbolic logic. In the formal, symbolic tradition, other important researchers in the last quarter of the 20th century when research on the Liar increased dramatically were Burge, Gupta, Herzberger, McGee, Parsons, Putnam, Routley, Skyrms, van Fraassen, and Yablo.

Barwise, Jon and John Etchemendy. The Liar: An Essay in Truth and Circularity, Oxford University Press, 1987.
Beall, J.C. (2001). “Is Yablo’s Paradox Non-Circular?” Analysis 61, no. 3, pp. 176-87.
Burge, Tyler. “Semantical Paradox,” Journal of Philosophy, 76 (1979), 169-198.
Corcoran, John. “Sentence, Proposition, Judgment, Statement, and Fact: Speaking about the Written English Used in Logic” in W. A. Carnielli (ed.), The Many Sides of Logic, College Publications. pp. 71-103. 2009.
Dowden, Bradley. “Accepting Inconsistencies from the Paradoxes,” Journal of Philosophical Logic, 13 (1984), 125-130.
Gupta, Anil. “Truth and Paradox,” Journal of Philosophical Logic, 11 (1982), 1-60. Reprinted in Martin (1984), 175-236.
Herzberger, Hans. “Paradoxes of Grounding in Semantics,” Journal of Philosophy, 68 (1970), 145-167.
Kirkham, Richard. Theories of Truth: A Critical Introduction, MIT Press, 1992.
Kripke, Saul. “Outline of a Theory of Truth,” Journal of Philosophy, 72 (1975), 690-716. Reprinted in (Martin 1984).
Martin, Robert. The Paradox of the Liar, Yale University Press, Ridgeview Press, 1970. 2nd ed. 1978.
Martin, Robert. Recent Essays on Truth and the Liar Paradox, Oxford University Press, 1984.
Martin, Robert and Peter Woodruff. “On Representing ‘True-in-L’ in L,” Philosophia, 5 (1975), 217-221.
Mates, Benson. Skeptical Essays, The University of Chicago Press, 1981. See especially “Two Antinomies,” on pages 15-57.
McGee, Vann. Truth, Vagueness, and Paradox: An Essay on the Logic of Truth, Hackett Publishing, 1991.
Parsons, Charles. “The Liar Paradox,” Journal of Philosophical Logic 3 (1974): 381-412.
Priest, Graham. “The Logic of Paradox,” Journal of Philosophical Logic, 8 (1979), 219-241; and “Logic of Paradox Revisited,” Journal of Philosophical Logic, 13 (1984), 153-179.
Priest, Graham, Richard Routley, and J. Norman (eds.). Paraconsistent Logic: Essays on the Inconsistent, Philosophia-Verlag, 1989.
Prior, Arthur N. “Epimenides the Cretan,” Journal of Symbolic Logic, 23 (1958), 261-266.
Prior, Arthur N. “On a Family of Paradoxes,” Notre Dame Journal of Formal Logic, 2 (1961), 16-32.
Putnam, Hilary. Realism with a Human Face, Harvard University Press, 1990.
Putnam, Hilary. “Paradox Revisited I: Truth.” In Gila Sher and Richard Tieszen, eds., Between Logic and Intuition: Essays in Honor of Charles Parsons, Cambridge University Press, (2000), 3-15.
Putnam, Hilary. Philosophy in an Age of Science: Physics, Mathematics, and Skepticism. Harvard University Press, 2012.
Quine, W. V. O. “The Ways of Paradox,” in his The Ways of Paradox and Other Essays, rev. ed., Harvard University Press, 1976.
Russell, Bertrand. “Mathematical Logic as Based on the Theory of Types,” American Journal of Mathematics, 30 (1908), 222-262.
Russell, Bertrand. Logic and Knowledge: Essays 1901-1950, ed. by Robert C. Marsh, George Allen & Unwin Ltd. (1956).
Skyrms, Brian. “Return of the Liar: Three-valued Logic and the Concept of Truth,” American Philosophical Quarterly, 7 (1970), 153-161.
Slater, Hartley. “Logic is Not Mathematical,” Polish Journal of Philosophy, Spring 2012, pp. 69-86.
Strawson, P. F. “Truth,” in Analysis, 9, (1949).
Tarski, Alfred. “The Concept of Truth in Formalized Languages,” in Logic, Semantics, Metamathematics, pp. 152-278, Clarendon Press, 1956.
Tarski, Alfred. “The Semantic Conception of Truth and the Foundations of Semantics,” in Philosophy and Phenomenological Research, Vol. 4, No. 3 (1944), 341-376.
Van Fraassen, Bas. “Truth and Paradoxical Consequences,” in (Martin 1970).
Woodruff, Peter. “Paradox, Truth and Logic Part 1: Paradox and Truth,” Journal of Philosophical Logic, 13 (1984), 213-231.
Wittgenstein, Ludwig. Remarks on the Foundations of Mathematics, Basil Blackwell, 3rd edition, 1978.
Yablo, Stephen. (1993). “Paradox Without Self-Reference.” Analysis 53: 251-52.

Author Information

Bradley Dowden
Email: dowden@csus.edu
California State University, Sacramento
U. S. A.

The Sheffer Stroke

The Sheffer Stroke is one of the sixteen definable binary connectives of standard propositional logic. The stroke symbol is “|” as in $$(p \mid q) \leftrightarrow (\neg p \vee \neg q)$$ The linguistic expression whose logical behavior is presumed modeled by this logical connective is the truth-functional phrase “not both,” from which the name NAND originates.

All sixteen connectives interpret associated functions of the Boolean algebra. In the theory of electronic circuits the Boolean functions are implemented by electronic or logic gates: the gate implementing the associated function of the Sheffer Stroke is called NAND and is known as a “universal gate.” The Sheffer Stroke has the remarkable metalogical property known as functional completeness (more precisely, weak functional completeness.) A connective is functionally complete (more precisely, weakly functionally complete) for a formal language $L$ if and only if all mathematically definable connectives of $L$ (except for the zeroary connectives or constants) can be defined by using that connective as the only connective. In using the familiar truth table for the semantics of the standard propositional logic, the functional completeness of the Sheffer Stroke means that, for every truth table labeled by a well-formed formula of the logic, there is an identical truth table whose labeling formula has the Sheffer Stroke symbol as the only connective symbol; or, every definable connective can be defined by a truth table that is labeled by a formula that has the Sheffer Stroke as its only connective symbol. (Two truth tables are identical if they agree on every truth value output corresponding to the same truth value input assignments.) The same observations about functional completeness apply to the case of the Peirce Arrow, which is the dual of the Sheffer Stroke.
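
As a small illustration of what functional completeness amounts to, the Python sketch below defines negation, conjunction, disjunction, and material implication using the Sheffer Stroke as the only connective, and then compares each definition with the familiar truth table. The function names are illustrative, and Python’s `True` and `False` are assumed to stand in for the two truth values.

```python
from itertools import product

def nand(p, q):
    """The Sheffer Stroke: false only when both inputs are true."""
    return not (p and q)

# Each connective below uses nand as its only connective.
def not_(p):
    return nand(p, p)

def and_(p, q):
    return nand(nand(p, q), nand(p, q))

def or_(p, q):
    return nand(nand(p, p), nand(q, q))

def implies(p, q):
    return nand(p, nand(q, q))

ROWS = list(product((True, False), repeat=2))

# Compare each definition with the usual truth table, row by row.
checks = {
    "not": all(not_(p) == (not p) for p in (True, False)),
    "and": all(and_(p, q) == (p and q) for p, q in ROWS),
    "or": all(or_(p, q) == (p or q) for p, q in ROWS),
    "implies": all(implies(p, q) == ((not p) or q) for p, q in ROWS),
}
```

Every entry of `checks` comes out `True`, so each of these four truth tables is matched by a formula whose only connective is the stroke.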

The Sheffer Stroke was discovered independently by Henry M. Sheffer in 1913, although Charles Sanders Peirce had already arrived at it, as attested by a fragment written in 1880 (and, again, in 1902). This landmark discovery was hailed by such seminal figures in the history of logic as Ludwig Wittgenstein and Bertrand Russell.

An elegant result due to Emil Post (1941) makes it possible to account for the property of functional completeness of the Sheffer Stroke on the grounds that it is lacking certain characteristic “hereditary” properties. This is examined in detail in the present article.

The logical-philosophic significance of the availability of a Sheffer function was taken by Ludwig Wittgenstein (in the Tractatus Logico-Philosophicus, 1922) to consist in its perspicuous illustration of deeper features of formal logic. In natural languages, the phrases whose logical behavior is captured by the Sheffer Stroke and the Peirce Arrow are, respectively, “not both” and “neither-nor”: these seem rather unremarkable, but this is a sign that what is at stake in functional completeness investigations is characteristically related to the study of formal logic and is not relevant to the goals of studying natural languages.

The Sheffer Stroke and Its Place in Propositional Logic
History
1. Peirce’s Discovery
2. The “Discovery” and Principia Mathematica
The Logical Connectives of Standard Propositional Logic and the Sheffer Stroke
Properties of the Sheffer Stroke
Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy
1. Wittgenstein’s Tractatus and the Sheffer Stroke
References and Further Reading

1. The Sheffer Stroke and Its Place in Propositional Logic

The linguistic phrase whose logical behavior is traced through the Sheffer Stroke is the truth-functional expression “not both ___ and —,” and its logical equivalent is “either not ___ or not —.” The term “Sheffer Stroke” is the name of the symbol “ $\mid$ ” denoting the binary logical connective of the standard Propositional Logic that is usually called Sheffer Stroke. Thus, the name Sheffer Stroke is used not simply for the symbol but also for the logical connective itself. This article refers to the logical connective indifferently as the Sheffer Stroke trusting that context removes any ambiguity between the connective and its symbol.

Other names of the logical connective are Alternate Denial, NAND and Negated Conjunction. Readers of older Logic textbooks are likely to find the connective called Alternate Denial. There is another, related, logical connective of standard propositional logic, called NOR, Joint Denial, Negated Disjunction, and Joint Exclusion; it is sometimes called Peirce’s Arrow or Quine’s Dagger (although the last two, as with “Sheffer Stroke” are, strictly speaking, names of symbols used for that connective.) The relationship between the Sheffer Stroke and NOR is deep and has profound interest, which will be explored in this article.

The term “Sheffer Stroke” refers also to the symbol used to denote the logical connective that has the same name; this connective is also known by other names, as will be seen. This article speaks of logical connectives. Strictly speaking, the Sheffer Stroke connective is the semantic analogue of a definable binary Boolean function which can be called the associated Boolean function of the connective. This Boolean function is known as NAND in the theory of electronic or logic gates, where it serves as one of two universal gates; precisely speaking, the physical gate is an instance of implementation of the Boolean function whose propositional-logic interpretation is the Sheffer Stroke. That interpretation is not examined in the present article.

This article focuses only upon propositional logic, unless otherwise indicated. Upon turning to the examination of Wittgenstein’s comments, this restriction will be lifted. Propositional logic can be considered as the special case of predicate or first-order logic with all predicate constants as being zero-place in its signature. Propositional logic is bereft of symbolic resources needed for checking many argument forms as valid, for the translation of mathematical statements, and for many other reasons. This article is confined to propositional logic only for the limited purpose of avoiding certain complications while our present interest is in laying out certain basic concepts.

The term Sheffer Functions is sometimes used to refer to two Boolean functions, one of which is interpreted as our Sheffer Stroke and the other is known as NOR (in some interpretations) or Peirce’s Arrow (and also by other names, as will be seen.) Both of these so-called Sheffer functions are binary truth functions (with the names also used for the uninterpreted associated Boolean functions); they both have the remarkable property of being functionally complete, in the sense defined above. The term “Sheffer functions” is also used to refer to functionally complete functions of alternate or non-standard many-valued logics. Some authors who generalize the term “Sheffer function” to many-valued logics define it so that it applies only to unary or binary functions that are, each, functionally complete. Others use the term regardless of the arity of the functionally complete function. A theorem proven by Emil Post in 1921 shows that proven existence of functions of arity n = 1 or n = 2 that define all unary and binary functions of a formal language implies that those functions can also define all functions of higher arities. This result holds regardless of the number of truth values over which the connectives are defined. In the standard two-valued propositional logic, there are no unary connectives that are functionally complete but there are exactly two binary connectives that are, and these are called the Sheffer functions of the standard propositional logic.
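
The claim that exactly two of the sixteen binary connectives are functionally complete can be checked by brute force. The Python sketch below makes an assumption about method: it represents each formula in two propositional letters by its four-row truth-table column, and it counts a connective as complete when repeated application to the columns for p and q eventually yields all sixteen columns. The names `definable_columns` and `all_binary_connectives` are hypothetical helpers.

```python
from itertools import product

# Rows of a two-letter truth table, in the order TT, TF, FT, FF.
INPUTS = list(product((True, False), repeat=2))
P = tuple(p for p, q in INPUTS)   # truth-table column of the letter p
Q = tuple(q for p, q in INPUTS)   # truth-table column of the letter q

def apply_binary(table, a, b):
    """Apply a connective (a dict from input pairs to outputs) row by row."""
    return tuple(table[(x, y)] for x, y in zip(a, b))

def definable_columns(table):
    """All columns obtainable from p and q using `table` as the only connective."""
    columns = {P, Q}
    while True:
        grown = columns | {apply_binary(table, a, b) for a in columns for b in columns}
        if grown == columns:
            return columns
        columns = grown

def all_binary_connectives():
    """Yield each of the sixteen binary connectives as a truth table."""
    for outputs in product((True, False), repeat=4):
        yield dict(zip(INPUTS, outputs))

complete = [t for t in all_binary_connectives() if len(definable_columns(t)) == 16]

NAND = {(True, True): False, (True, False): True, (False, True): True, (False, False): True}
NOR = {(True, True): False, (True, False): False, (False, True): False, (False, False): True}
```

Under this representation `complete` contains exactly two tables, `NAND` and `NOR`. The check covers only formulas in two letters; the step to higher arities relies on Post’s result mentioned above.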

The kind of inquiry that reveals the remarkable properties of the Sheffer Stroke is, properly speaking, metalogical or metatheoretical. The two logical connectives, NAND (or the Sheffer Stroke) and NOR, are sometimes referred to summarily as the Sheffer functions. Strictly speaking, those are the associated Boolean functions which are semantically interpreted by the logical connectives. For present purposes, this article does not dwell on this distinction. It speaks consistently about logical connectives. The article investigates the significance of the Sheffer Stroke and NOR in the section on the Properties of the Sheffer Stroke. It also traces the historical background of the discovery of these connectives (see History).

Note that there is inconsistency in the bibliography with respect to both notational variants and terminological jargon. The logician H. M. Sheffer, after whom the connective is named, actually used NOR but Russell-Whitehead used the NAND function when they extolled this discovery in a specially added section to the second edition of their famed Principia Mathematica in the aftermath of what they took to be Sheffer’s discovery. (Whitehead-Russell, 1925, 1927) It was Whitehead-Russell who gave the name “Sheffer Stroke” to the connective. Although the symbol itself had been used by Sheffer to denote NOR, Sheffer rather incongruously called the symbol of his connective “per”, in analogy with the symbol of algebraic division, and he called the connective (now usually called NOR) “rejection.”

As a logical connective, the Sheffer Stroke stands for a Boolean function defined over the set of two values, $$2 = \{1, 0\}$$ Insofar as the semantic connective Sheffer Stroke is being examined, think of the values as the truth values True and False and denote them respectively by T and F. It is not unusual to speak interchangeably, or indifferently, of truth functions and logical connectives. Unfortunately, as it was just noted, the bibliography, ranging over several decades in the development of modern logic, is not consistent when it comes to terminological or notational matters. For present purposes, lay down a certain convention: distinguish between logical connectives (also called truth functions) and their underlying or associated Boolean functions. If the symbol of the logical connective is, generally, “$*$” then symbolize the associated Boolean function by “$f_{*}$.” In doing so, reserve standard algebraic methods of definition for the associated functions but define the logical connectives by means of the familiar truth table.

The domain $D$ of the associated function of the Sheffer Stroke $f_{\mid}$ is the Cartesian product $$\{1, 0\} \times \{1, 0\}$$ the range $R$ of the function is $$\{1, 0\}$$ Thus, the associated function of the Sheffer Stroke connective is defined as follows:

$$f_{\mid}: D = \{1, 0\} \times \{1, 0\} = \{\langle 1, 1\rangle, \langle 1, 0\rangle, \langle 0, 1\rangle, \langle 0, 0\rangle\} \rightarrow R = \{1, 0\}$$

For the sake of completeness, alternative ways of defining this Boolean function will be shown. These, however, should be considered as notational variants; it is the same Boolean function that they all define.

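$$f_{\mid}(1,1) = 0; \quad f_{\mid}(1,0) = 1; \quad f_{\mid}(0,1) = 1; \quad f_{\mid}(0,0) = 1$$
$$f_{\mid}(x,y) = 0 \text{ when } x = y = 1; \quad f_{\mid}(x,y) = 1 \text{ otherwise}$$
$$f_{\mid} = \{\langle\langle 1, 1\rangle, 0\rangle, \langle\langle 1, 0\rangle, 1\rangle, \langle\langle 0, 1\rangle, 1\rangle, \langle\langle 0, 0\rangle, 1\rangle\}$$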
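
That these are notational variants of one and the same Boolean function can be confirmed mechanically. The Python sketch below writes each variant as a separate function and compares them over the whole domain. The function names are illustrative, and the last variant, the arithmetic expression 1 − xy, is an addition that does not appear in the text above.

```python
from itertools import product

DOMAIN = list(product((1, 0), repeat=2))   # {1, 0} x {1, 0}

def f_pointwise(x, y):
    """Variant 1: the value is listed for each argument pair."""
    values = {(1, 1): 0, (1, 0): 1, (0, 1): 1, (0, 0): 1}
    return values[(x, y)]

def f_by_cases(x, y):
    """Variant 2: 0 when x = y = 1, and 1 otherwise."""
    return 0 if x == y == 1 else 1

# Variant 3: the function as a set of (argument pair, value) pairs.
GRAPH = {((1, 1), 0), ((1, 0), 1), ((0, 1), 1), ((0, 0), 1)}

def f_from_graph(x, y):
    return dict(GRAPH)[(x, y)]

def f_arithmetic(x, y):
    """An extra variant, not in the text above: 1 - x*y over {1, 0}."""
    return 1 - x * y

variants = (f_pointwise, f_by_cases, f_from_graph, f_arithmetic)
all_agree = all(
    len({f(x, y) for f in variants}) == 1   # every variant gives the same value
    for x, y in DOMAIN
)
```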

It is customary to define logical connectives of logical systems or languages by means of the familiar truth table. The truth table for the connective called the Sheffer Stroke or NAND is given below.

p	q	p	$$\mid$$	q
T	T	T	F	T
T	F	T	T	F
F	T	F	T	T
F	F	F	T	F

One can also use the familiar truth table to ascertain that the Sheffer Stroke connective receives the same truth value outputs as the negation of the conjunction for all possible assignments of truth values to the individual propositional components. The propositional connective negation, by definition, reverses the truth value of its input, and the conjunction connective receives the output T only when both of its inputs are T while it receives F for all other possible assignments of truth values to its components. This article symbolizes the negation connective by “$\neg$” and the conjunction connective by “$\wedge$”. Because the formulas written in bold are logically equivalent, the formula formed by connecting them with “$\leftrightarrow$” (symbol of material equivalence) should be a tautology: the truth table verifies this result. (The logical connective of material equivalence is so defined that it receives the output T if and only if its input values are the same truth values.)

p	q	(p	$$\pmb{\mid}$$	q)	$$\leftrightarrow$$	$$\pmb{\neg}$$	(p	$$\pmb{\wedge}$$	q)
T	T	T	F	T	T	F	T	T	T
T	F	T	T	F	T	T	T	F	F
F	T	F	T	T	T	T	F	F	T
F	F	F	T	F	T	T	F	F	F
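
The same verification can be carried out programmatically. In the Python sketch below, the Sheffer Stroke is taken directly from its defining truth table and compared, row by row, with the negation of the conjunction. The names `SHEFFER_TABLE`, `rows`, and `show` are illustrative.

```python
from itertools import product

# The defining truth table of the Sheffer Stroke, copied row by row.
SHEFFER_TABLE = {
    (True, True): False,
    (True, False): True,
    (False, True): True,
    (False, False): True,
}

def sheffer(p, q):
    return SHEFFER_TABLE[(p, q)]

def biconditional(a, b):
    """Material equivalence: true exactly when both inputs have the same value."""
    return a == b

rows = []
for p, q in product((True, False), repeat=2):
    stroke = sheffer(p, q)
    negated_conjunction = not (p and q)
    rows.append((p, q, stroke, negated_conjunction,
                 biconditional(stroke, negated_conjunction)))

# The biconditional is a tautology iff its column is T in every row.
is_tautology = all(row[-1] for row in rows)

def show(value):
    return "T" if value else "F"

for row in rows:
    print("\t".join(show(value) for value in row))
```

Each printed row gives the values of p, q, the stroke, the negated conjunction, and the biconditional, in that order.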

The linguistic expression whose logical behavior is presumed modeled by this logical connective is the truth-functional phrase “not both,” from which the name NAND originates. This expression is logically equivalent (it yields the same truth value for the same assignments of truth values to its components) with “either not the first or not the second” for two component propositions; hence the alternative name of this connective as Alternate Denial. Making the claim that the Sheffer Stroke connective models such expressions of language means that what is modeled is taken to be truth-functional expressions of a natural language like English. Truth-functionality means that the compound proposition always takes a truth value (true or false) that can be uniquely determined when the truth values of the components or parts are known; this is because the special logical particle (in this case “not both”) that connects the component propositions is definable in terms of its truth conditions (what truth value it yields for specified assignments of truth values to the component propositions it connects). Insofar as one is dealing with truth-functional expressions, the Principle of Compositionality of Meaning applies: the logical meaning of the composite depends uniquely on the specified logical meanings of its parts. For non-truth-functional meanings of “not” or “and,” the expression “not both” is not truth-functional and cannot be modeled by the connective called Sheffer Stroke or NAND.

Connecting two propositions by means of the linguistic particle modeled by the Sheffer Stroke asserts the claim that these two propositions are mutual contraries. One can appreciate what contrariety means by checking the truth table above, by means of which the Sheffer Stroke connective is defined: the compound in which the Sheffer Stroke symbol is the principal connective symbol is false only when the connected propositional components are both true; it is true in every other case (a case being an assignment of truth values to the propositional components, also called a model or a valuation). Contrariety (or mutual contrariety), then, means that the propositions that are presumed contraries cannot possibly be true together but they can possibly be false together. One should distinguish this from the relationship known as mutual contradictoriness: two propositions are mutual contradictories if and only if they cannot possibly be true together and they cannot possibly be false together. If two propositions p and q are mutual contradictories, then the compound proposition formed by connecting them by means of the exclusive either-or is a logical truth. On the other hand, based on what has been said, and as can be seen by the truth table above, one has: when two propositions are mutual contraries, then the proposition formed by connecting them by means of the Sheffer Stroke connective is a logical truth.

a. Alternative Definitions of the Sheffer Stroke

There are other ways of defining the Sheffer Stroke connective. Its matrix definition is as follows:

$$\pmb{p \mid q}$$	T	F
T	F	T
F	T	T

The Disjunctive Normal Form (DNF) of $$\ulcorner p \mid q\urcorner$$ is $$\ulcorner \neg p \vee \neg q\urcorner$$. (Corner brackets are used because there is reference to symbols of the formal object language within the metalanguage, which is a symbolically enhanced fragment of English used to talk about the formal language. Notice that symbols like “$\varphi$”, on the other hand, are themselves metalinguistic and do not take corner brackets. Nor are such brackets needed when formulas are displayed by themselves on a separate line.)

The DNF of a well-formed formula $\varphi$ can be obtained from the truth table of $\varphi$ by means of the following method: Check the rows, and only the rows, across which $\varphi$ receives the truth value T. If an individual (or atomic) variable receives T on that row, reproduce it as it is, $\ulcorner p \urcorner$; if the individual variable receives F on that row, reproduce it as negated, $\ulcorner \neg p \urcorner$. Next, form the conjunction of the propositional variables so represented (which means that one connects them by the connective symbol $\ulcorner \wedge \urcorner$). Do this for all rows on which $\varphi$ receives T. Finally, join all the conjunctions formed in this manner by means of inclusive disjunctions, symbolized by $\ulcorner \vee \urcorner$.

Thus, examining the truth table by means of which the Sheffer Stroke was defined, one has: the value T is received on the rows for values of the single propositional variables:

$$<p^{T}, q^{F}>, <p^{F}, q^{T}>, <p^{F}, q^{F}>$$

Form the conjunctions first:

$$p \wedge \neg q, \neg p \wedge q, \neg p \wedge \neg q$$

Then, form their disjunction:

$$(p \wedge \neg q) \vee (\neg p \wedge q) \vee (\neg p \wedge \neg q)$$

This expression admits of further simplification (a subject that is beyond current concerns), to yield a logically equivalent formula:

$$\neg p \vee \neg q$$
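
The row-by-row method for obtaining a DNF can be sketched in code. This is an illustrative sketch under stated assumptions: the function names (`dnf_from_truth_table`, `render`, `evaluate`) and the representation of a literal as a pair of a variable name and a polarity are hypothetical choices, not notation from the text. The sketch builds the unsimplified DNF from the rows on which the function is true and then checks, by exhausting the valuations, that it agrees with the simplified formula.

```python
from itertools import product

VARIABLES = ("p", "q")


def stroke(p, q):
    """Sheffer Stroke, given by its truth table: false only on <T, T>."""
    return (p, q) != (True, True)


def dnf_from_truth_table(func, variables):
    """Collect one conjunction of literals per row on which func is true.

    A literal is a pair (name, polarity); polarity False means negated.
    """
    disjuncts = []
    for values in product([True, False], repeat=len(variables)):
        if func(*values):
            disjuncts.append(list(zip(variables, values)))
    return disjuncts


def render(disjuncts):
    """Write the DNF with the connective symbols used in the article."""
    def literal(name, polarity):
        return name if polarity else "¬" + name
    return " ∨ ".join(
        "(" + " ∧ ".join(literal(n, pol) for n, pol in conj) + ")"
        for conj in disjuncts
    )


def evaluate(disjuncts, valuation):
    """Truth value of the DNF under a mapping from names to truth values."""
    return any(
        all(valuation[name] == polarity for name, polarity in conj)
        for conj in disjuncts
    )


dnf = dnf_from_truth_table(stroke, VARIABLES)
print(render(dnf))

# The unsimplified DNF agrees with the simplified formula on every valuation.
simplification_holds = all(
    evaluate(dnf, {"p": p, "q": q}) == ((not p) or (not q))
    for p, q in product([True, False], repeat=2)
)
```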

A method of representation known as the Karnaugh Map is as follows for the Sheffer Stroke. Two different variants of this method are explored. The method is essentially diagrammatic: it allows for simplifications of well-formed formulas, which are first transformed into their equivalent normal forms before they are mapped by this type of diagram. The normal form for the Sheffer Stroke is:

$$(p \mid q) \leftrightarrow (\neg p \vee \neg q)$$

The expression to the right is in both Disjunctive and Conjunctive Normal Form. It has exactly two literals, $\ulcorner \neg p\urcorner$ and $\ulcorner \neg q\urcorner$. Taken as a Disjunctive Normal Form, it has as literals the negations of the two propositional variables: accordingly, we enter into the Karnaugh Map the values T and F in a way we will present now briefly. (Usually, this kind of diagram takes the values as uninterpreted or numerical, $$\{1, 0\}$$ but we can disregard this.) To enter the proper values, we follow the entire row or entire column along which the variable receives the truth value True as shown below. The remaining blocks receive F.

$$\pmb{p \mid q}$$	$$\pmb{q}$$	$$\pmb{\neg q}$$
$$\pmb{p}$$	F	T
$$\pmb{\neg p}$$	T	T

An alternative version (actually corresponding more closely to the initial design of this diagrammatic method) is as follows:

$$\pmb{p \mid q}$$	$$\pmb{T}$$	$$\pmb{F}$$
$$\pmb{T}$$	F	T
$$\pmb{F}$$	T	T

In older texts, we find definitions of connectives like the following definition of the Sheffer Stroke. We consider the propositional variables to be taking truth values in the order: $$<TT, TF, FT, FF>$$. This method of definition is found, along with the truth-tabular definition, in Wittgenstein’s Tractatus.

$$p \mid q \stackrel{\text{def}}{=} (FTTT) (p,q)$$

In textbooks like the one written by Arthur Prior (1962, pp. 5-21) the definition would be given as follows:

$$T \mid T = F ; T \mid F = T ; F \mid T = T ; F \mid F = T$$

Because Prior uses the Polish notation (see section 1c below), he defines the Sheffer Stroke and Peirce Arrow, symbolized respectively by “D” and “X”, as follows—with “N” symbolizing negation, “A” symbolizing inclusive disjunction, “K” symbolizing conjunction, while prefix notation is used throughout:
$$Dpq \stackrel{\text{def}}{=} NKpq = ANpNq$$
$$Xpq \stackrel{\text{def}}{=} NApq = KNpNq$$
Another way of defining the Sheffer Stroke and Peirce Arrow is given (Prior, 1962, p. 12), reading the output values from left to right inside the parenthesis as corresponding to value assignments for the atomic components as $$<1, 1>, <1, 0>, <0, 1>, <0, 0>$$
$$Dpq: (0, 1, 1, 1)pq$$
$$Xpq: (0, 0, 0, 1)pq$$
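
The output tuples can be recomputed from the prefix definitions. In this sketch the single-letter function names follow the Polish notation letters used in the text, while `VALUATIONS` and `output_vector` are illustrative assumptions.

```python
from itertools import product

# Valuations in the order <1, 1>, <1, 0>, <0, 1>, <0, 0>.
VALUATIONS = list(product([1, 0], repeat=2))


def N(x):
    """Negation."""
    return 1 - x


def K(x, y):
    """Conjunction."""
    return x & y


def A(x, y):
    """Inclusive disjunction."""
    return x | y


def output_vector(func):
    """Outputs of a binary truth function, read from left to right."""
    return tuple(func(p, q) for p, q in VALUATIONS)


# Dpq defined as NKpq and, equivalently, as ANpNq.
D_as_NK = output_vector(lambda p, q: N(K(p, q)))
D_as_AN = output_vector(lambda p, q: A(N(p), N(q)))

# Xpq defined as NApq and, equivalently, as KNpNq.
X_as_NA = output_vector(lambda p, q: N(A(p, q)))
X_as_KN = output_vector(lambda p, q: K(N(p), N(q)))
```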

In the set-theoretic interpretation of Boolean functions, the operation that corresponds to the Sheffer Stroke or NAND is complementation of intersection of sets. Clearly, complementation (symbolized by “$ ‘ $”) is the set-theoretic analogue of negation and intersection (symbolized by “$\cap$”) is the set-theoretic analogue of conjunction. The symbol “$\in$” stands for set membership.

$(A \cap B)’ = \{x:$ it is not the case that both $x \in A $ and $x \in B\}$

A Venn diagram can be drawn of the operation.

General Venn Diagram Regions

A: 1 and 2

B: 2 and 3

A $\cap B$: 2

A$‘$: 3 and 4

B$‘$: 1 and 4

A$‘ \cap B’$: 4

NAND: $(A \cap B)’$: 1 and 3 and 4


NAND Venn Diagram (yellow area)
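
The region bookkeeping can be reproduced with finite sets. In this sketch the four numbered regions are taken as the elements of a small universe; the names `UNIVERSE`, `complement`, and `nand_region` are illustrative assumptions.

```python
# The four regions of the general Venn diagram serve as the universe.
UNIVERSE = {1, 2, 3, 4}
A = {1, 2}
B = {2, 3}


def complement(region):
    """Complement relative to the four-region universe."""
    return UNIVERSE - region


# NAND corresponds to the complement of the intersection.
nand_region = complement(A & B)

# By De Morgan's law, the same region is the union of the complements.
union_of_complements = complement(A) | complement(B)

print(sorted(nand_region))
```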


Boolean functions can be represented as operations in an algebra,
$$\mathscr{B} = <\{1, 0\}, \{\times, +\}, 1>$$

with carrier set $\{1, 0\}$ and adequately equipped with a set of operations of multiplication and addition-modulo-2 along with the constant or zero-ary function 1. The definitions of the operations over the carrier set’s values are:
$$1 \times 1 = 1, 1 \times 0 = 0, 0 \times 1 = 0, 0 \times 0 = 0$$
$$1 + 1 = 0 + 0 = 0, 1 + 0 = 0 + 1 = 1$$
The Sheffer Stroke and Peirce Arrow are definable in this algebra as:
$$f_{\mid}(x, y) = (x \times y) + 1$$
$$f_{\downarrow}(x, y) = (x \times y) + x + y + 1$$
The multiplication sign is omitted as is conventional in standard notations. So, we have:
$$f_{\mid}(x, y) = xy + 1$$
$$f_{\downarrow}(x, y) = xy + x + y + 1$$
Considering that the general form for binary polynomials representing functions is
$$f^{*}(x, y) = \alpha xy + \beta x + \gamma y + \delta$$
the coefficients for the Sheffer Stroke are $$\alpha = 1, \beta = 0, \gamma = 0, \delta = 1$$. The general form can also be represented as follows and, by having recourse to the familiar semantic truth table, we can determine the values of the coefficients which are, in this representation form, the values of the function for the shown pairs (i.e., $f_{\mid}(1, 1) = 0, f_{\mid}(1, 0) = 1, f_{\mid}(0, 1) = 1, f_{\mid}(0, 0) = 1$).
$$f_{\mid}(x, y) = f_{\mid}(1, 1)xy + f_{\mid}(1, 0)x(1 + y) + f_{\mid}(0, 1)(1 + x)y + f_{\mid}(0, 0)(1 + x)(1 + y)$$
By carrying out the operations in the algebra, we obtain the expected result, keeping in mind that $2 = 0$ (modulo 2):
$$\begin{aligned}
& f_{\mid}(1, 1)xy + f_{\mid}(1, 0)x(1 + y) + f_{\mid}(0, 1)(1 + x)y + f_{\mid}(0, 0)(1 + x)(1 + y) \\
&= 0xy + 1x(1 + y) + 1(1 + x)y + 1(1 + x)(1 + y) \\
&= x + xy + y + xy + 1 + x + y + xy \\
&= 2x + 2y + 2xy + xy + 1 \\
&= xy + 1
\end{aligned}$$
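
The polynomial representations can be checked by direct computation modulo 2. The following sketch is illustrative: the function names (`times`, `plus`, `f_stroke`, `f_arrow`, `expansion`) are assumptions introduced here, not notation from the text.

```python
from itertools import product


def times(x, y):
    """Multiplication in the two-element algebra."""
    return (x * y) % 2


def plus(x, y):
    """Addition modulo 2."""
    return (x + y) % 2


def f_stroke(x, y):
    """Sheffer Stroke as the polynomial xy + 1."""
    return plus(times(x, y), 1)


def f_arrow(x, y):
    """Peirce Arrow as the polynomial xy + x + y + 1."""
    return plus(plus(plus(times(x, y), x), y), 1)


def expansion(func, x, y):
    """General form: each value of func weighted by the term for its input pair."""
    total = (
        func(1, 1) * x * y
        + func(1, 0) * x * (1 + y)
        + func(0, 1) * (1 + x) * y
        + func(0, 0) * (1 + x) * (1 + y)
    )
    return total % 2


PAIRS = list(product([1, 0], repeat=2))  # <1,1>, <1,0>, <0,1>, <0,0>

stroke_outputs = [f_stroke(x, y) for x, y in PAIRS]
arrow_outputs = [f_arrow(x, y) for x, y in PAIRS]

# The expanded general form reduces to the same function on every input pair.
expansion_agrees = all(
    expansion(f_stroke, x, y) == f_stroke(x, y) for x, y in PAIRS
)
```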

b. Decision-Procedural Rules for the Sheffer Stroke

The decision procedure in propositional logic known as the Tree Method can incorporate rules for “$\mid$” as follows:

In the Beth-Tableau Method, the rules for “$\mid$” should be represented as follows:

It is possible to develop a Gentzen-sequent rule for the Sheffer Stroke. (See Riser, 1967; Béziau, 2001; for a more detailed analysis, Read, 1999.) The theoretical significance of enacting proof-theoretic procedures, like Gentzen’s, lies in the fact that the connectives are then defined by means of the rules for their introductions and/or eliminations; there is a substantive philosophic view that this is the proper approach to assessing the meanings of logical connectives. In Gentzen-style sequents, the variables to the left of the turnstile symbol (“$\vdash$”) are presumed joined by conjunction and those to the right are presumed joined by inclusive disjunction. A variable may be shifted from left to right or from right to left by being negated. Repeated variable letters may be deleted (by means of a rule known as Contraction) and variable letters may be shifted freely (or permuted) insofar as they stay on the same side of the turnstile.

c. Alternative Symbols

Another symbol for the Sheffer Stroke or NAND connective is “$\uparrow$” and this symbol is, appropriately, called the “Sheffer Dagger” or “Sheffer Upward Arrow.” An older symbol is “$\veebar$” (for instance, in Alonzo Church’s influential text on Mathematical Logic, Introduction to Mathematical Logic, p. 37), but this symbol is now more commonly used in certain notational variants to symbolize exclusive disjunction. (See History below for symbols used by Sheffer himself and by C. S. Peirce.)

In Polish notation, which uses not infix but prefix placement for connective symbols and neatly dispenses with parentheses, the symbolization for NAND is:
$$Dpq$$

To write in Polish notation that material equivalence (symbolized by “$E$”) obtains between NAND and the negation (symbolized by “$N$”) of conjunction (symbolized by “$K$”), we write:
$$EDpqNKpq$$
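
Because prefix notation needs no parentheses, a formula can be evaluated by a short recursive reader. The following is a minimal sketch under stated assumptions: it handles only the connective letters mentioned in this article (N, K, A, E, D, X), treats any other character as a propositional variable, and uses the hypothetical names `BINARY` and `evaluate`.

```python
from itertools import product

# Binary connectives in Polish notation, keyed by their letter.
BINARY = {
    "K": lambda a, b: a and b,        # conjunction
    "A": lambda a, b: a or b,         # inclusive disjunction
    "E": lambda a, b: a == b,         # material equivalence
    "D": lambda a, b: not (a and b),  # Sheffer Stroke (NAND)
    "X": lambda a, b: not (a or b),   # Peirce Arrow (NOR)
}


def evaluate(formula, valuation):
    """Evaluate a prefix formula under a mapping from variables to truth values."""

    def parse(pos):
        # Returns the value of the subformula starting at pos
        # together with the position just after that subformula.
        symbol = formula[pos]
        if symbol == "N":
            value, pos = parse(pos + 1)
            return (not value), pos
        if symbol in BINARY:
            left, pos = parse(pos + 1)
            right, pos = parse(pos)
            return BINARY[symbol](left, right), pos
        return valuation[symbol], pos + 1

    value, end = parse(0)
    if end != len(formula):
        raise ValueError("trailing symbols after a complete formula")
    return value


# EDpqNKpq is true under every valuation, so it is a tautology.
is_tautology = all(
    evaluate("EDpqNKpq", {"p": p, "q": q})
    for p, q in product([True, False], repeat=2)
)
```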

The symbolic variant used for logical gates in electronic circuitry also deploys prefix notation (with the symbol of the function written before and not in between the input variables). Thus,
$$NAND (A, B)$$

As the case usually is with writing out functions, it should be noted that there is ambiguity surrounding the notation used for representing the Boolean function interpreting the Sheffer Stroke: it is not clear if it is the operation that is represented or if a name of the function is given. The notation of the so-called lambda-calculus (or $\lambda$-calculus) can be used to disambiguate. Accordingly, to indicate unambiguously that we are giving the name of the underlying function of the NAND (or Sheffer Stroke) connective, we can write:

$\lambda x.\lambda y (f_{\mid}(x, y))$ (—) (___)

with possible specification of the underlined input variables from the set $\{1, 0\}$
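
The distinction between naming a function and applying it can be illustrated with a curried function in a programming language. This is only a loose analogy and a sketch: the name `f_stroke` and the use of the polynomial $xy + 1$ modulo 2 as the body are assumptions made for illustration.

```python
# Curried form: a function of x that returns a function of y.
f_stroke = lambda x: lambda y: (x * y + 1) % 2

# Supplying only the first input yields another function, not a truth value.
partially_applied = f_stroke(1)

# Supplying both inputs yields a value from the set {1, 0}.
value = f_stroke(1)(0)

# Outputs for the inputs <1,1>, <1,0>, <0,1>, <0,0>.
outputs = [f_stroke(x)(y) for x in (1, 0) for y in (1, 0)]
```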

2. History

The logical connective we call the Sheffer Stroke and its symbol are named after Henry Maurice Sheffer who, in 1913, published a paper in which he introduced a connective (called a “primitive idea” in the jargon of the times) with remarkable logical properties. Sheffer’s project was motivated by the purpose of using this connective to provide a more parsimonious or economical rendering of Huntington’s axiom system for standard propositional logic. In the parlance of the times, the purpose was to “reduce” the number of “primitive” connectives of standard propositional logic. We will see in a subsequent section what all this amounts to.

It so happens that Sheffer used another logical connective which, like the Sheffer Stroke, allows for a reduction of the number of logical connectives that are used. This connective is usually called NOR, Peirce’s Arrow or Joint Denial. The name Sheffer’s Stroke was coined by the authors of Principia Mathematica (Whitehead-Russell, 1963) who extolled the significance of the discovery of this connective and proceeded to add an entire section to the 2$^{nd}$ edition of the Principia utilizing the connective. We will be able to fully appreciate the claims made about the significance of this discovery after we have studied the section on the Properties of the Sheffer Stroke. An entire section, Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy, will be devoted to assessing the importance of this connective.

Sheffer himself had called his connective “rejection,” inspired by the correspondence of this connective to the linguistic expression “neither-nor.” Another name that was once in usage for this connective is “dispersion.” As we have mentioned, this connective is usually called NOR or Peirce’s Arrow today. Sheffer called the propositional variables that are the connective’s related variables or inputs “rejects.” Rather inopportunely, he gave to the connective symbol the name “per” in analogy to the name of the symbol of the standard algebraic division: in terms of the underlying algebra of modern propositional logic, however, there is no satisfactory Boolean analogue to algebraic division and, so, the name “per” is misleading.

a. Peirce’s Discovery

It turns out that the American logician and philosopher Charles Sanders Peirce (1839-1914) had already discovered the logical connective we call the Sheffer Stroke, as well as the related connective NOR (also called Joint Denial, and quite appropriately Peirce’s Arrow, with other names in use being Quine’s Arrow or Quine’s Dagger and today usually symbolized by “$\downarrow$”). The relevant manuscript, dating to 1880, numbered MS 378 in a subsequent edition and titled “A Boolian [sic] Algebra with One Constant” (Peirce, 1971), was actually destined for discarding and was salvaged for posterity in the nick of time in 1926. A fragmentary text by Peirce dating from 1880 also shows familiarity with the remarkable metalogical characteristics that make a single function functionally complete, and this is also the case with Peirce’s unfinished Minute Logic (1902, ch. 3): these texts were eventually published posthumously (1933, vol. 4, pp. 13-18, 215-216).

Peirce designated the two truth functions, NAND and NOR, by using the symbol “$\curlywedge$” which he called Ampheck, coining this neologism from the Greek word ἀμφήκης which means “of equal length in both directions.” (Peirce, 1933: 4.264) Peirce’s editors disambiguated the use of symbols by assigning “$\overline{\curlywedge}$” to the connective we call the Sheffer Stroke while preserving the symbol “$\curlywedge$” for NOR.

(More about Peirce’s work in logic, including reference to the 1880 manuscript, can be found in another encyclopedia article.)

As Sheffer did later, Peirce understood that these two connectives can be used to “reduce” all mathematically definable connectives (also called “primitives” and “constants”) of propositional logic: this means that all definable connectives of propositional logic can be defined by using only the Sheffer Stroke or NOR as the single connective. No other connective (or associated function) that takes one or two variables as inputs has this property. Standard, two-valued propositional logic has no unary functions that have the property of functional completeness. In a subsequent section, we will explore this remarkable logical property in detail. At first blush, availability of this option ensures that economy of resources can be obtained, at least in terms of how many functions or connectives are to be included as undefined. Unfortunately, there is a trade-off between this gain in economy of symbolic resources and the unwieldy length and rather counterintuitive appearance of the formulas that use only the one connective.

It is characteristic of Peirce’s logical genius and emblematic of his rather under-appreciated contributions to the development of modern logic that he grasped the significance of functional completeness and figured out what truth functions, up to arity 2, are functionally complete for two-valued propositional logic. (Strictly speaking, this is the property of weak functional completeness, given that we disregard whether constants or zero-ary functions like 1 or 0 can be defined.) Peirce subscribed to a Semeiotic view, according to which the fundamental nature and proper tasks of the formal study of logic are defined by the rules set down for the construction and manipulation of symbolic resources. A proliferation of symbols for the various connectives that are admitted into the signature of a logical system suffers from a serious defect on this view: the symbolic grammar fails to match or represent the logical fact of interdefinability of the connectives. Peirce was willing sometimes to accept constructing a formal signature for two-valued propositional logic by using the two-member set of connectives $\{\rightarrow , \bot \}$, which is minimally functionally complete. This means that these two connectives (or, if we are to stick to an approach that emphasizes the notational character of logical analysis, these two symbols) are adequate expressively: every mathematically definable connective of the logic can be defined by using only these two; and the set is minimally functionally complete in the sense that neither of these connectives can be defined by the other (so, as we say, they are both independent relative to each other). The symbol $\ulcorner \bot \urcorner$ can be viewed as representing a constant truth function (either unary or binary) that returns the truth value False for any input or inputs. Or it can be regarded as a constant, which means that it is a zero-ary (zero-input) function, a degenerate function, which refers to the truth value False.
Although not using our contemporary terminology, Peirce took the second option. This set has cardinality 2 (it has exactly two members) but it is not the best we can do. Peirce’s discovery of what we have called the Sheffer Functions (anachronistically and unfairly to Peirce, but bowing to convention) shows that we can have a set of cardinality 1 (a one-member set or a so-called singleton) that is minimally functionally complete with respect to the definable connectives of two-valued propositional logic. Thus, either one of the sets $\{\mid\}$ and $\{\downarrow\}$ can do. The sets are functionally complete and, because they have only one member each, we say that the connectives themselves have the property of functional completeness. $\ulcorner \mid \urcorner$ is the symbol of the Sheffer Stroke or NAND and $\ulcorner \downarrow \urcorner$ is the symbol of the Peirce Arrow or NOR. (We stipulate as such, even though we have not introduced our grammar formally.)

It is important to show, albeit briefly, how these functions can define other functions. Algebraically approached, this is a matter of functional composition but we do not enter into such details here. We will have more details in subsequent sections. One may wonder why it is enough to define the connectives of the set that comprises the symbols for negation, inclusive disjunction, and conjunction, namely $\{ \neg, \vee, \wedge \}$. There is an explanation: there is an easy, although informal, way to show that this set is functionally complete. It is not minimally functionally complete because, given negation, $\ulcorner \vee \urcorner$ and $\ulcorner \wedge \urcorner$ are inter-definable. But it is functionally complete. Thus, showing that one can define these functions suffices for achieving functional completeness. Definability should be thought of as logical equivalence: one connective can be defined by means of others if and only if the formulas in the definition (what is defined and what is doing the defining) are logically equivalent. (Presuppose the truth-tabular definitions of the connectives.)

$$\neg p \stackrel{\text{def}}{=} (p \mid p)$$

$$(p \vee q) \stackrel{\text{def}}{=} ((p \mid p) \mid (q \mid q))$$

$$(p \wedge q) \stackrel{\text{def}}{=} ((p \mid q) \mid (p \mid q))$$

$$\neg p \stackrel{\text{def}}{=} (p \downarrow p)$$

$$(p \vee q) \stackrel{\text{def}}{=} ((p \downarrow q) \downarrow (p \downarrow q))$$

$$(p \wedge q) \stackrel{\text{def}}{=} ((p \downarrow p) \downarrow (q \downarrow q))$$
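
Since definability amounts to logical equivalence, each definition by a single connective can be checked against the truth table of the connective it defines. The sketch below is illustrative; the function names are assumptions. It defines negation, disjunction, and conjunction once from NAND alone and once from NOR alone, and compares each with the standard connective on every valuation.

```python
from itertools import product


def nand(p, q):
    """Sheffer Stroke, by its truth table: false only on <T, T>."""
    return (p, q) != (True, True)


def nor(p, q):
    """Peirce Arrow, by its truth table: true only on <F, F>."""
    return (p, q) == (False, False)


# Definitions that use NAND as the only connective.
def not_by_nand(p):
    return nand(p, p)


def or_by_nand(p, q):
    return nand(nand(p, p), nand(q, q))


def and_by_nand(p, q):
    return nand(nand(p, q), nand(p, q))


# Definitions that use NOR as the only connective.
def not_by_nor(p):
    return nor(p, p)


def or_by_nor(p, q):
    return nor(nor(p, q), nor(p, q))


def and_by_nor(p, q):
    return nor(nor(p, p), nor(q, q))


VALUATIONS = list(product([True, False], repeat=2))

nand_definitions_hold = all(
    not_by_nand(p) == (not p)
    and or_by_nand(p, q) == (p or q)
    and and_by_nand(p, q) == (p and q)
    for p, q in VALUATIONS
)

nor_definitions_hold = all(
    not_by_nor(p) == (not p)
    and or_by_nor(p, q) == (p or q)
    and and_by_nor(p, q) == (p and q)
    for p, q in VALUATIONS
)
```

Note that the two families mirror each other: the nesting pattern that yields disjunction from NAND yields conjunction from NOR, and conversely.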

b. The “Discovery” and Principia Mathematica

Bertrand Russell hailed this development (which he considered to be Sheffer’s “discovery”) and, with the co-author of Principia Mathematica Alfred Whitehead, added an entire section in the 2$^{nd}$ edition to take advantage of the discovery. Russell was not aware that Peirce had already made the discovery in the 19$^{th}$ century. Prompted by this applause and urged on by the weight of renewed expectations, Sheffer, who was not a prolific author, returned to the task of taking further advantage of his discovery, but he did not succeed in advancing beyond his initial contribution.

Not only did Russell hail this discovery, but also the oracular thinker and profoundly influential philosopher Ludwig Wittgenstein (Tractatus, 1922) used grandiloquent language in celebrating the discovery that the “ideal” formal language of standard logic can be “reduced” to a single “primitive.” What this all means is discussed in the section on the Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy. Two other influential authors of an early logic textbook, David Hilbert and Wilhelm Ackermann, regarded this development as a rather unimpressive detail.

Despite the hullabaloo about the “single primitive,” efforts to take advantage of this result for constructing economical versions of Predicate Logic were sparse. No doubt, one reason is that a system that would have only the Sheffer Stroke as its connective would require use of unwieldy formulaic expressions. Abbreviation conventions would be needed, at a minimum. Another reason for the neglect, at least in Quine’s case, was that he was preoccupied with generating other, similarly parsimonious, notational variants of logic (including variable-free grammars). It was Moses Schönfinkel, one of the originators of Combinatory Logic, who adopted the Sheffer Stroke as single connective to construct a notational idiom of predicate logic. (See Bimbó, 2010.)

3. The Logical Connectives of Standard Propositional Logic and the Sheffer Stroke

It is time to briefly introduce a notational variant or idiom of standard propositional logic (SPL), within which one can locate the truth function NAND (or Sheffer’s Stroke); by referring to this formal language, one can examine and explicate the properties and significance of the Sheffer Stroke. Because one wants to be able to refer to other logical connectives besides the Sheffer Stroke, one actually lays out an expanded variant of SPL, which is called here SPLexp. Talk about SPLexp is within a fragment of English; this fragment is enhanced with specially designated symbols and, as such, it serves as our Metalanguage (ML) while SPLexp is the Object Language (OL). Some ML symbols are borrowed from the OL; this is done without danger of ambiguity because the context makes clear whether OL or ML is employed. As is customary, when symbols are mentioned rather than used, they are placed within quotation marks.

The formal language SPLexp has symbolic resources for single or atomic propositional variables (up to the infinity of the natural numbers), and for logical connectives. It also has auxiliary symbols, and parentheses to be used only for the sake of preventing ambiguity of well-formed expressions. The metalinguistic symbol “$\in$” means “___ is a member of set —”. For connectives, the expansive idiom includes symbols for all definable unary and binary connectives of the standard propositional logic. For present purposes, there is no need to supply names for all the definable connectives denoted by these symbols. Definitions of the connectives are given by means of the familiar truth table. In brief,

PROPOSITIONAL VARIABLES $= \{p, q, r, \ldots, p_{i}, \ldots, q_{i}, \ldots\}, i \in N$

CONNECTIVE SYMBOLS $= \{ \top^{1}, \bot^{1}, id, \neg, \top^{2}, \bot^{2}, 1, 2, \neg 1, \neg 2, \vee, \wedge, \rightarrow, \leftarrow, \leftrightarrow, \nrightarrow, \nleftarrow, \nleftrightarrow, \downarrow, \mid \}$

Standard grammatical conventions for the construction of well-formed formulas are used.

N is the set of natural numbers. “$\mid \varphi \mid$” denotes the truth value of a well-formed formula $\varphi$. Symbols from the object language are appropriated, trusting that the context removes ambiguity.

There are $2^{2^{1}} = 4$ unary connectives, and there are $2^{2^{2}} = 16$ binary connectives that are mathematically definable in the standard (two-valued) propositional logic. (In general, if n is the number of inputs to the connective, there are $2^{n}$ possible input assignments, each of which can be mapped to either of two truth values, so the number of mathematically definable n-ary connectives in standard propositional logic is $2^{2^{n}}$.)
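
The count of definable connectives can be confirmed by enumeration. In this sketch, which uses the hypothetical helper name `truth_functions`, an n-ary connective is identified with an assignment of an output to each of the valuations of its n inputs.

```python
from itertools import product


def truth_functions(n):
    """All n-ary truth functions, each as a mapping from valuations to outputs."""
    valuations = list(product([True, False], repeat=n))
    return [
        dict(zip(valuations, outputs))
        for outputs in product([True, False], repeat=len(valuations))
    ]


# Number of definable connectives for arities 1, 2, and 3.
counts = {n: len(truth_functions(n)) for n in (1, 2, 3)}
print(counts)
```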

Some characteristic equivalences, which can be checked by the familiar truth table method, are:

$$(p \mid q) \leftrightarrow \neg (p \wedge q)$$

$$(p \mid q) \leftrightarrow (\neg p \vee \neg q)$$

$$(p \mid q) \leftrightarrow (p \rightarrow \neg q)$$

$$(p \mid q) \leftrightarrow (q \rightarrow \neg p)$$

$$(p \mid q) \leftrightarrow \neg (\neg p \downarrow \neg q)$$

$$(p \downarrow q) \leftrightarrow \neg (p \vee q)$$

$$(p \downarrow q) \leftrightarrow \neg (\neg p \mid \neg q)$$
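
The truth table check of these equivalences can be automated. The following sketch is illustrative: the labels are plain-text renderings chosen here (with `NOR` standing for the Peirce Arrow), and the names `stroke`, `arrow`, `implies`, `PAIRS`, and `results` are assumptions.

```python
from itertools import product


def stroke(p, q):
    """Sheffer Stroke (NAND), by its truth table: false only on <T, T>."""
    return (p, q) != (True, True)


def arrow(p, q):
    """Peirce Arrow (NOR), by its truth table: true only on <F, F>."""
    return (p, q) == (False, False)


def implies(p, q):
    """Material implication."""
    return (not p) or q


# Each entry: a label, the left-hand side, and the right-hand side.
PAIRS = [
    ("(p | q) <-> ~(p & q)", stroke, lambda p, q: not (p and q)),
    ("(p | q) <-> (~p v ~q)", stroke, lambda p, q: (not p) or (not q)),
    ("(p | q) <-> (p -> ~q)", stroke, lambda p, q: implies(p, not q)),
    ("(p | q) <-> (q -> ~p)", stroke, lambda p, q: implies(q, not p)),
    ("(p | q) <-> ~(~p NOR ~q)", stroke, lambda p, q: not arrow(not p, not q)),
    ("(p NOR q) <-> ~(p v q)", arrow, lambda p, q: not (p or q)),
    ("(p NOR q) <-> ~(~p | ~q)", arrow, lambda p, q: not stroke(not p, not q)),
]

# An equivalence holds if both sides agree on all four valuations.
results = {
    label: all(
        left(p, q) == right(p, q)
        for p, q in product([True, False], repeat=2)
    )
    for label, left, right in PAIRS
}

for label, holds in results.items():
    print(label, holds)
```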

4. Properties of the Sheffer Stroke

With the formal idiom SPLexp in place, an examination of the properties of the Sheffer Stroke can begin. Introductory logic textbooks usually omit references to the special properties of the Sheffer Stroke; more advanced logic texts and mathematical logic or metalogic texts typically make special mention of this connective and of its dual, the NOR or Peirce Arrow connective. (What “duality” means in this context will be examined soon.)

The student of logic learns that the Sheffer Stroke or NAND, like NOR, has a remarkable characteristic that is called functional completeness or expressive completeness. No other unary or binary connective, besides the Sheffer Stroke and its dual NOR, has this property. No connective of lesser arity (thus, zeroary or unary) has this property, either. When alternative logics are investigated, a fundamental metalogical task consists in querying the existence of functionally complete functions, which may be called Sheffer Functions. Present observations are limited to what is known as standard (sometimes called classical) logic: when it comes to alternative or non-classical logics, the connective defined as negation of conjunction should not be presumed to have the property of functional completeness. (It should be borne in mind that negation and conjunction themselves have different, non-standard, meanings in alternative logics since they are defined over more than the two truth values of standard logic.)

After defining functional completeness, it will be shown that indeed the Sheffer Stroke (or NAND) possesses this remarkable property. One needs to ask also why this is the case and why this is an important characteristic.

This property, functional or expressive completeness, is not to be confused with what is called simply “completeness.” Completeness in that sense means this: relative to what are the logical truths of a formal language $\mathcal{L}$, whose logical consequence relation is symbolized as “$\Vdash_{\mathcal{L}}$”, a proof system L is complete if and only if L’s derivability relation, symbolized “$\vdash_{L}$”, is such that:

for every well-formed formula $\varphi$: $\Vdash_{\mathcal{L}} \varphi$ if and only if $\vdash_{L} \varphi$

This is equivalent to:

not-$\Vdash_{\mathcal{L}} \varphi$ if and only if not-$\vdash_{L} \varphi$

Roughly, what this means is that a complete system, and only a complete system, will have failures of proof or failures of derivation in all the cases, and only in the cases, in which one expects the corresponding semantical language to be failing in establishing semantic conclusions or logical truths. Think of a logical truth as a semantical conclusion of any, including the empty, set of premises.

There is more about this fundamental topic of Metalogic in other articles (see Propositional Logic and references there), but caution is needed here to note that functional completeness is not related to that other concept called simply “completeness.”

A logical connective $f$ of a formal language $\mathcal{L}$ is functionally complete with respect to $\mathcal{L}$ if and only if every mathematically definable logical connective $f_{j}$ of $\mathcal{L}$ can be defined in terms only of $f$.

For the case of a binary connective $f_{j}^{2}(p, q)$ that is functionally complete, all mathematically definable connectives $f_{i}^{n}(p_{1}, p_{2}, \ldots, p_{n})$ can be defined by using only propositional variables and the connective $f_{j}^{2}(p, q)$.

If one wants to define functional completeness in terms of the familiar semantic device of the truth table, one can do so in the following way:

a connective $f$ is functionally complete for the language of standard propositional logic if and only if the truth tables for all mathematically definable connectives (of any arity) can be constructed with labels on the top row having the symbol for $f$ as the only connective symbol.

For example, this can be done by using only the Sheffer Stroke symbol, $\ulcorner \mid \urcorner$, for certain familiar connectives of the standard propositional logic. Two truth tables are considered identical if they agree on all the truth value outputs corresponding to the same valuations (truth value assignments as inputs.) Thus, the truth-tabular definition of $\ulcorner \wedge \urcorner$ coincides with the truth table with outputs as in “$\wedge$/$\mid$” and the truth-tabular definition of $\ulcorner \vee \urcorner$ coincides with the truth table whose output column is as in “$\vee$/$\mid$”. Notice that the labels for $\wedge$/$\mid$ and $\vee$/$\mid$ use propositional variables and the only connective symbol they use is of $\ulcorner \mid \urcorner$.

An alternative and equivalent definition is:

a connective $f$ is functionally complete for the language of standard propositional logic if and only if for every truth table labeled by a well-formed formula of the logic there is an identical truth table whose label has the symbol for $f$ as the only connective symbol.

(Two truth tables are identical if they agree on every truth value output corresponding to the same truth value input assignments.)

The non-trivial question now faced is whether such functionally complete connectives are definable and how high one has to ascend in arity (to unary, binary, and so on) before finding a functionally complete connective. The answer is that one can stop at the level of binary connectives in the case of the standard two-valued logic: the Sheffer functions (the Sheffer Stroke and NOR) are, each, functionally complete. The details are examined below.
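
The claim that exactly two binary connectives are functionally complete can be tested by brute force. The sketch below rests on a stated assumption: a binary connective counts as functionally complete here when every one of the 16 binary truth functions of p and q can be built from p and q by repeated application of that connective alone (this suffices because negation and conjunction are among the 16). The names `definable_from` and `sheffer_functions` are illustrative.

```python
from itertools import product

VALUATIONS = list(product([True, False], repeat=2))  # TT, TF, FT, FF

# A truth function of p and q is stored as its output column over VALUATIONS.
P = tuple(p for p, _ in VALUATIONS)
Q = tuple(q for _, q in VALUATIONS)


def definable_from(column):
    """Close {p, q} under the connective whose output column is given."""
    table = dict(zip(VALUATIONS, column))
    found = {P, Q}
    while True:
        # Apply the connective to every pair of functions found so far.
        new = {
            tuple(table[(a, b)] for a, b in zip(f, g))
            for f in found
            for g in found
        } - found
        if not new:
            return found
        found |= new


# Try each of the 16 binary connectives in turn.
sheffer_functions = [
    column
    for column in product([True, False], repeat=4)
    if len(definable_from(column)) == 16
]

for column in sheffer_functions:
    print("".join("T" if value else "F" for value in column))
```

The two columns that survive are the output columns of NAND and NOR in the order TT, TF, FT, FF.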

One can also define functional completeness as a property of groups or sets of connectives. Sometimes one finds references to functional completeness as a property of systems of connectives.

A set $X = \{f_{1}, \ldots, f_{n}\}$ of connectives is functionally complete with respect to a formal language $\mathcal{L}$ if and only if every mathematically definable connective $f$ of $\mathcal{L}$ can be defined by using only members of $X$.

Such a set $X$ is then itself a member of the set of functionally complete sets of the language, $FC(\mathcal{L})$. Thus, using “$\in$” as the symbol for set membership,
$$\{\mid\} \in FC(\mathcal{L})$$
This means that the one-member set (singleton set) with the Sheffer Stroke as its only member is a functionally complete set or is a member of the set of functionally complete sets.

Of special interest is a proper subset of the functionally complete sets of connectives $FC(\mathcal{L})$: sets that are minimally or non-redundantly functionally complete, $MFC(\mathcal{L})$. Here is what this means.

A set $X = \{f_{1}, \ldots, f_{n}\}$ of logical connectives is minimally functionally complete (MFC) if and only if it is functionally complete (FC) and also it is the case that no connective in the set can be defined by using other connectives in the set.

If so, then each connective in the set is independent of the other connectives or simply independent.

Now, consider the set that is comprised only of the Sheffer Stroke connective:
$$X_{\mid} = \{\mid\}$$

This set is functionally complete. Since it has only one member, this set has to be MFC (minimally functionally complete) if it is FC (functionally complete). This is because there is only one connective; so, it is impossible for it to be defined in terms of other connectives in the set: this connective has to be independent! There are exactly two such MFC singletons in standard propositional logic (up to binary connectives):

$$X_{\mid} = \{\mid\} \in MFC(\mathcal{L})$$

$$X_{\downarrow} = \{\downarrow\} \in MFC(\mathcal{L})$$

This section concludes by highlighting certain other properties possessed by the Sheffer Stroke connective. This is done because having those properties is the underlying reason that the Sheffer Stroke connective is functionally complete. This brief investigation of those properties and of how they are related to functional completeness encapsulates the results established by the mathematicians Emil Post (1921, 1941) and William Wernick (1942).

1. Before examining the relationship between functional completeness and certain other logical properties of the Sheffer Stroke, there is a straightforward way of establishing that $$\{\mid \}$$ is FC (functionally complete). To do this, consider how one can use the truth-tabular definition of any connective to extract its Disjunctive Normal Form (DNF). Here is an example. The same can be done with any connective regardless of its arity. This example is of a ternary or three-place connective. Extra columns are added to the truth table for illustrative purposes. In these columns the truth values of the individual propositional variables are traced across the rows in which the connective receives T as its truth value; then, it is shown how the DNF is formed.

Consider the Disjunctive Normal Form (DNF) of the given ternary connective. The method is this: form conjunctions of the atomic propositional variables across each row in which the connective receives the truth value T; the variables are written as seen in the added row of the example. Then form the inclusive disjunction of the conjunctions constructed in the previous step. Based on this truth table and the DNF that can be obtained, the definition of this connective in DNF could be given as follows:

$$f_{\#}(p, q, r) \stackrel{\text{def}}{=} (p \wedge q \wedge \neg r) \vee (p \wedge \neg q \wedge r) \vee (\neg p \wedge \neg q \wedge r)$$
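The DNF method just described can be sketched in a few lines of Python. This is only an illustration: the function name `dnf_from_truth_table` and the helper `f_hash` are hypothetical, and the truth table of the ternary connective is assumed to be the one implied by the DNF above (T exactly on the rows TTF, TFT, and FFT).

```python
from itertools import product

def dnf_from_truth_table(f, names):
    """Return a DNF string for the truth function f over the given variable names."""
    disjuncts = []
    for row in product([True, False], repeat=len(names)):
        if f(*row):  # only rows where the connective takes T contribute
            literals = [n if v else "¬" + n for n, v in zip(names, row)]
            disjuncts.append("(" + " ∧ ".join(literals) + ")")
    if not disjuncts:  # no T rows: fall back on a contradiction
        return "(" + names[0] + " ∧ ¬" + names[0] + ")"
    return " ∨ ".join(disjuncts)

# Assumed truth table: T exactly on the rows TTF, TFT and FFT (read off the DNF above).
T_ROWS = {(True, True, False), (True, False, True), (False, False, True)}

def f_hash(p, q, r):
    return (p, q, r) in T_ROWS

print(dnf_from_truth_table(f_hash, ["p", "q", "r"]))
```

Because the procedure only inspects the rows on which the connective is T, it works unchanged for a connective of any arity.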

A note about formal grammar: One takes advantage of the associativity of conjunction, inclusive disjunction and equivalence to omit unnecessary parentheses without ambiguity: when a connective $\#$ is associative, “$(\varphi \# \chi) \# \psi$” and “$\varphi \# (\chi \# \psi)$” are equivalent; hence, they can both be written as “$\varphi \# \chi \# \psi$”. Note also the omission of outer parentheses.

Because the above can be done for any connective, one can conclude that the set $$X_{1} = \{\neg, \vee, \wedge \}$$ is FC (functionally complete): any mathematically definable connective of the propositional language can be defined by using only connectives from the set $X_{1}$. This includes the unary connectives. First, notice that negation is included in the set $X_{1}$. The other three unary connectives are also definable as shown below.

$$id\, p \stackrel{\text{def}}{=} p; \quad \top^{1} p \stackrel{\text{def}}{=} p \vee \neg p; \quad \bot^{1} p \stackrel{\text{def}}{=} p \wedge \neg p$$

And when it comes to connectives of arity $\geq 2$, the truth table shows the way to define them by using only connectives from the set $X_{1}$, as above.

The set $X_{1} = \{\neg, \vee, \wedge \}$ is functionally complete, as just established, but it is not minimally functionally complete. There is redundancy in it because the connectives conjunction and inclusive disjunction are inter-definable, as can be seen in light of the following so-called De Morgan equivalences:

$$(p \vee q) \leftrightarrow \neg (\neg p \wedge \neg q)$$

$$(p \wedge q) \leftrightarrow \neg (\neg p \vee \neg q)$$

The following two sets are not only FC (functionally complete) but also MFC (minimally functionally complete):

$$X_{2} = \{\neg, \vee \}$$

$$X_{3} = \{\neg, \wedge \}$$

Now consider the Sheffer Stroke. In order to show that the set $X_{\mid} = \{\mid \}$ is FC, show that negation and either conjunction or inclusive disjunction are definable in terms of the Sheffer Stroke. Because $\{\neg, \vee \}$ and $\{\neg, \wedge \}$ are, each, functionally complete, if the symbolized connectives in any one of these sets are definable in terms of the connective in $\{\mid \}$, then this latter set also must be functionally complete.

It can be shown that, indeed, negation and inclusive disjunction, as well as conjunction, are definable in terms of the Sheffer Stroke. The truth table method can be used to verify that the following equivalences indeed obtain:

$$\neg p \leftrightarrow (p \mid p)$$
$$(p \vee q) \leftrightarrow ((p \mid p) \mid (q \mid q))$$
$$(p \wedge q) \leftrightarrow ((p \mid q) \mid (p \mid q))$$
These equivalences can be justified in another way. Taking advantage of certain valid equivalences of the standard propositional logic, which are used to make replacements of phrases by their equivalents without alteration to truth value, one has:
$$\neg p \leftrightarrow \neg (p \wedge p) \leftrightarrow (p \mid p)$$
$$(p \vee q) \leftrightarrow \neg (\neg p \wedge \neg q) \leftrightarrow (\neg p \mid \neg q) \leftrightarrow ((p \mid p) \mid (q \mid q))$$
$$(p \wedge q) \leftrightarrow \neg \neg (p \wedge q) \leftrightarrow \neg (p \mid q) \leftrightarrow ((p \mid q) \mid (p \mid q))$$
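The truth table check mentioned above can also be carried out mechanically. The following is a minimal sketch; the function names (`stroke`, `neg`, `disj`, `conj`, `definitions_hold`) are chosen here for illustration only.

```python
from itertools import product

def stroke(p, q):
    """Sheffer Stroke (NAND): F only when both inputs are T."""
    return not (p and q)

def neg(p):
    return stroke(p, p)

def disj(p, q):
    return stroke(stroke(p, p), stroke(q, q))

def conj(p, q):
    return stroke(stroke(p, q), stroke(p, q))

def definitions_hold():
    """Compare each stroke-only definition with Python's built-in operators on every row."""
    return all(
        neg(p) == (not p) and disj(p, q) == (p or q) and conj(p, q) == (p and q)
        for p, q in product([True, False], repeat=2)
    )

print(definitions_hold())
```

Each defined connective calls nothing but `stroke`, which mirrors the requirement that the Sheffer Stroke be the only connective symbol in the defining formula.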

2. Emil Post (1921, 1941; see also Pelletier and Martin, 1990) showed that any set $X$ of definable logical connectives of the standard propositional logic is functionally complete if and only if $X$ is not a subset of any of the following sets of connectives:

the set of monotonic connectives (MC($\mathcal{L}$));
the set of linear (also called countable, counting, or affine) connectives (L($\mathcal{L}$));
the set of self-dual connectives (SD($\mathcal{L}$));
the set of truth-preserving connectives (TP($\mathcal{L}$));
and the set of falsehood- (or falsity-) preserving connectives (FP($\mathcal{L}$)).

If one single logical connective $f$ is to be functionally complete by itself (or if the singleton set with the function symbolized by $f$ as its only member is functionally complete), then the function $f$ must lack all of the above properties of connectives. In other words,

$f$ should not be monotonic;
$f$ should not be linear;
$f$ should not be self-dual;
$f$ should not be truth-preserving;
$f$ should not be falsehood-preserving.

After briefly defining these interesting properties, it can be shown that, among definable binary connectives, only the Sheffer functions (the Sheffer Stroke or NAND and the Peirce Arrow or NOR) lack all of those properties when one considers all the definable unary and binary connectives of the standard propositional logic. If one is examining a set of functions to determine whether it is functionally complete, check that there is at least one function in the set, which lacks one of the above properties; and perform this check for each property. So, one needs to ensure that, for each of the properties above, there is at least one function that lacks this property.

One can engage briefly the deeper analysis behind this seminal result, which can be called the Post Result (while the test presented above can be called the Post Test): All the enumerated properties are so-called Hereditary Properties. This means that if a function $f$ (or corresponding semantic connective) has a property like this, then all functions that can be defined by using only $f$ must also have this property. This means that every such hereditary property $\mathbb{P}$ is “inherited” necessarily by all functions that are defined by means only of the function $f$ that has $\mathbb{P}$. But these hereditary properties are not characteristic of all definable functions. In other words, there are definable functions that lack $\mathbb{P}$, for each hereditary property $\mathbb{P}$. This explains Post’s result. A function that can indeed define, just by itself, all definable functions should not have any one of the hereditary properties because, if it had any such property, it would necessarily transmit it to every function it defines; but, then, the function could not define functions that lack this property.

Monotonicity:

Consider the case of binary functions, given the present inquiry. Note that these definitions of properties apply for n-ary functions in general. Note also that the two truth values $$(2 = \{T, F\})$$ are ordered so that the truth value denoted by “F” is lower than that denoted by “T”. The table makes this point:

This is called partial ordering and, as a relation, it can be defined set-theoretically as:

$$\{<F, F>, <F, T>, <T, T>\}$$

A binary function $$f(x, y)$$ is monotonic if and only if, for all input values $x_{1}, x_{2}, y_{1}, y_{2}$:

If $x_{1} \leq x_{2}$ and $y_{1} \leq y_{2}$, then $f(x_{1}, y_{1}) \leq f(x_{2}, y_{2})$

What does this mean in our case of binary logical connectives? There is a test, which follows from this definition, for determining whether a given binary connective is monotonic or not. The test goes like this. Start by writing out the input pairs of truth values ($<T, T>, <T, F>, <F, T>, <F, F>$) by using a diagram as shown below. This diagram arranges the pairs of truth values so that the arrows show the ordering just discussed. Next, write as a superscript for each pair of input values the truth value taken by the connective for that pair. For instance, for conjunction one has
$$<T, T>^{T}, <T, F>^{F}, <F, T>^{F}, <F, F>^{F}$$

For the Sheffer Stroke (as can be checked from its truth table), one has:
$$<T, T>^{F}, <T, F>^{T}, <F, T>^{T}, <F, F>^{T}$$

There is failure of monotonicity if and only if there is at least one case in which one can proceed down the arrows from a T to an F superscript. This type of diagram is used below to show that the Sheffer Stroke is non-monotonic.

Since there are instances in which there is a shift in truth value of the connective from T to F as one goes down the red arrows, one can infer that this connective is not monotonic.

The set of monotonic unary or binary connectives is:

MC$(\mathcal{L}) = \{\wedge, \vee, \top^{2}, \bot^{2}\}$

The Sheffer Stroke is not among them. Likewise, the Sheffer Stroke fails to be included among the other types of connectives identified above. And, so, by Post’s result, the Sheffer Stroke is functionally complete.
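The definition of monotonicity given above translates directly into a brute-force check. The sketch below assumes that F is represented by `False` and T by `True` (so that F is lower than T); the name `is_monotonic` is illustrative.

```python
from itertools import product

def is_monotonic(f, arity=2):
    """True if raising inputs (F < T) never lowers the output of f."""
    rows = list(product([False, True], repeat=arity))
    for low in rows:
        for high in rows:
            pointwise_leq = all(a <= b for a, b in zip(low, high))
            if pointwise_leq and f(*low) > f(*high):
                return False  # output dropped from T to F while inputs rose
    return True

def conj(p, q):
    return p and q

def stroke(p, q):
    return not (p and q)

print(is_monotonic(conj), is_monotonic(stroke))
```

For the Sheffer Stroke the check fails already on the pair of rows $<F, F>$ and $<T, T>$, where the output drops from T to F.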

Linearity:

A logical connective is linear (also called countable, counting, or affine) if and only if each of its propositional inputs either always or never affects the truth value of the output. This means that for a linear connective, and only in the case of such a connective, for each of its inputs, changing the value of the input results in one of the following two cases: either the output value always changes or it never changes. For present purposes, concentrate on unary and binary connectives and proceed straight to presenting a test that can be used to determine whether a connective is linear or not. Here is how the test works: Check the cases in which the connective takes T as its truth value. Call these the T-cases. Similarly, call the rest the F-cases. Then count the number of input variables that take T. If, and only if, the connective is linear, this number is always even for the T-cases and odd for the F-cases, or it is always odd for the T-cases and even for the F-cases. One can show that this is not the case for the Sheffer Stroke. Hence, the Sheffer Stroke is not countable.

The rule is violated. The number of input variables that take T is even (two) in the one case in which the connective takes F as its truth value; but for the cases in which the connective takes T as its truth value, there is a mixture of even and odd numbers of input variables that are T (zero in one case, one in each of the other two). Hence, the Sheffer Stroke is not a linear connective.

The linear unary and binary connectives are as follows, and the Sheffer Stroke, again, is not among them.

L($\mathcal{L}) = \{\top^{1}, \bot^{1}, \neg, \nleftrightarrow, \top^{2}, \bot^{2}\}$
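The characterization of linearity in terms of inputs (changing an input either always changes the output or never does) can be checked by brute force. The sketch below uses that characterization rather than the counting test; the name `is_linear` is illustrative.

```python
from itertools import product

def is_linear(f, arity):
    """True if each input of f either always or never changes the output when flipped."""
    rows = list(product([False, True], repeat=arity))
    for i in range(arity):
        changes = []
        for row in rows:
            flipped = row[:i] + (not row[i],) + row[i + 1:]
            changes.append(f(*row) != f(*flipped))
        if any(changes) and not all(changes):
            return False  # input i sometimes matters and sometimes does not
    return True

def xor(p, q):
    return p != q

def stroke(p, q):
    return not (p and q)

def neg(p):
    return not p

print(is_linear(xor, 2), is_linear(stroke, 2), is_linear(neg, 1))
```

For the Sheffer Stroke, flipping the first input changes the output when the second input is T but not when it is F, so the check fails.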

Self-Duality:

Consider the case of unary and binary connectives. The dual of a unary or binary connective, $(f(p))’$ and $(f(p, q))’$ respectively, can be defined as follows:

$$(f(p))’ \stackrel{\text{def}}{=} \neg f(\neg p); (f(p, q))’ \stackrel{\text{def}}{=} \neg f(\neg p, \neg q)$$

Interestingly, the dual of the Sheffer Stroke is the other Sheffer function, NOR.

$$(p \mid q)’ = \neg (\neg p \mid \neg q) \leftrightarrow \neg \neg (\neg p \wedge \neg q) \leftrightarrow (\neg p \wedge \neg q) \leftrightarrow \neg (p \vee q) \leftrightarrow (p \downarrow q)$$

Now, a connective has the property of self-duality if and only if it is its own dual. As just seen, the dual of the Sheffer Stroke is the other Sheffer function, NOR; hence, the Sheffer Stroke does not have the property of self-duality. It is not among the members of the set of self-dual unary and binary connectives of the standard propositional logic.

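SD($\mathcal{L}) = \{\neg, 1, 2, \neg 1, \neg 2 \}$

Here “1” and “2” can be read as the projections onto the first and second argument, and “$\neg 1$” and “$\neg 2$” as their negations.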
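The definition of the dual lends itself to a short computational check. The following sketch (with illustrative names `dual` and `same_table`) confirms that the dual of the Sheffer Stroke has the truth table of NOR, and that negation is its own dual.

```python
from itertools import product

def dual(f):
    """Dual of f: negate every input and negate the output."""
    return lambda *args: not f(*[not a for a in args])

def same_table(f, g, arity):
    """True if f and g agree on every row of the truth table."""
    return all(f(*row) == g(*row) for row in product([True, False], repeat=arity))

def stroke(p, q):
    return not (p and q)

def nor(p, q):
    return not (p or q)

def neg(p):
    return not p

print(same_table(dual(stroke), nor, 2))     # dual of the Sheffer Stroke vs NOR
print(same_table(dual(stroke), stroke, 2))  # self-duality of the Sheffer Stroke
print(same_table(dual(neg), neg, 1))        # self-duality of negation
```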

Truth-Preservation and Falsehood-Preservation:

Finally, consider the two remaining properties of connectives, of interest for these purposes: truth-preservation and falsehood-preservation. It can be shown again that the Sheffer Stroke lacks these properties as well.

A connective is truth-preserving if and only if it yields the truth value T for all cases in which all its variable inputs are T.

In the general case of an n-ary connective, one has:

$$|f(T, \ldots, T)| = T$$

A connective is falsehood-preserving if and only if it yields the truth value F for all cases in which all its variable inputs are F.

$$|f(F, \ldots, F)| = F$$

The truth-preserving and falsehood-preserving unary and binary connectives of the standard propositional logic are given below, and, once again, the Sheffer Stroke is not among them.

TP($\mathcal{L}) = \{id, \top^{1}, \wedge, \vee, \rightarrow , \leftarrow , \top^{2} \}$

FP($\mathcal{L}) = \{id, \bot^{1}, \wedge, \vee, \nrightarrow , \nleftarrow , \bot^{2} \}$

The same tests could be applied on the other Sheffer function, the NOR connective, to show that this connective is also excluded from all these sets. No other unary or binary connectives would be excluded from all these sets. Therefore, the Sheffer Stroke and NOR are functionally complete and are the only connectives (among unary and binary connectives) that are functionally complete.
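The Post Test can be run exhaustively over all sixteen binary connectives. The sketch below is one possible encoding, not a canonical one: connectives are represented by their output columns on the rows FF, FT, TF, TT, and all function names are illustrative. A unary connective such as negation is represented as a binary connective that ignores its second input.

```python
from itertools import product

ROWS = list(product([False, True], repeat=2))  # FF, FT, TF, TT

def connective(outputs):
    """Binary connective given by its outputs on the rows FF, FT, TF, TT."""
    table = dict(zip(ROWS, outputs))
    return lambda p, q: table[(p, q)]

def truth_preserving(f):
    return f(True, True)

def falsehood_preserving(f):
    return not f(False, False)

def self_dual(f):
    return all(f(p, q) == (not f(not p, not q)) for p, q in ROWS)

def monotonic(f):
    return all(f(*a) <= f(*b) for a in ROWS for b in ROWS
               if a[0] <= b[0] and a[1] <= b[1])

def linear(f):
    for i in range(2):
        changes = [f(*r) != f(*(r[:i] + (not r[i],) + r[i + 1:])) for r in ROWS]
        if any(changes) and not all(changes):
            return False
    return True

PROPERTIES = [truth_preserving, falsehood_preserving, self_dual, monotonic, linear]

def post_test(connectives):
    """True if, for every property, some member of the set lacks it."""
    return all(any(not prop(f) for f in connectives) for prop in PROPERTIES)

def complete_singletons():
    """Output columns (rows FF, FT, TF, TT) of binary connectives that pass alone."""
    return [out for out in product([False, True], repeat=4)
            if post_test([connective(out)])]

print(complete_singletons())
```

Under this encoding exactly two output columns pass the test on their own: T, F, F, F (NOR) and T, T, T, F (the Sheffer Stroke). The same `post_test` function accepts sets with several members, for instance negation together with conjunction.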

It can be shown that any logical connective, regardless of arity, is functionally complete if it has a property that is called complete symmetry (see Bimbó, 1992).

It can then be ascertained that no unary connectives have this property and that the only binary connectives that have the property are the two Sheffer functions.

A binary logical connective $f$ is completely symmetrical if and only if the following conditions hold. (“$| \varphi |$” denotes truth value.)

$$|f(T, T)| = F$$
$$|f(F, F)| = T$$
$$|f(T, F)| = |f(F, T)|$$

In the literature, other functionally complete connectives (or, rather, their associated Boolean functions) are also called Sheffer functions. This applies in the case of non-standard or alternative logics, but these fall outside the scope of this article. In the case of the standard propositional logic, a Sheffer function is a function of any arity n (n $\geq$ 2) that is, taken by itself, functionally complete. The relevant fact to consider is this: Regardless of arity, a connective is functionally complete if it is completely symmetrical. This result applies only in the case of the standard (two-valued) propositional logic.

The definition of a completely symmetrical n-ary connective $f^{n}$ is now given:

$$|f^{n}(T, \ldots, T)| = F$$

$$|f^{n}(F, \ldots, F)| = T$$

For all other cases (that is, when $p_{1}, \ldots, p_{n}$ are not all T or all F):
$$|f^{n}(p_{1}, \ldots, p_{n})| = |f^{n}(\neg p_{1}, \ldots, \neg p_{n})|$$

Here is a suggestive sketch of a proof of the fact that $f^{n}$ is functionally complete insofar as it is completely symmetrical (see Bimbó, 1992).

Assume a completely symmetrical n-ary connective, $f^{n}$. Now, take the case of the truth table that can be constructed for $$f^{n}(p, \ldots, p, q, \ldots, q)$$. This truth table must have only four rows, since there are exactly two propositional variables; it will have n columns since the function is n-ary.

Consider the results from the truth table above.

Since the connective is completely symmetrical, it must return or yield the same truth values (either T or F) for input values <T, F> and <F, T>. This yields exactly two cases: one in which the two truth values are T and one case in which the two truth values are both F. The first case has the truth table for the Sheffer Stroke; the second case has the truth table for the NOR connective. Thus,

a. $f^{n}(p, \ldots, p, q, \ldots, q) \leftrightarrow (p \mid q)$, or

b. $f^{n}(p, \ldots, p, q, \ldots, q) \leftrightarrow (p \downarrow q)$

In either case, a functionally complete connective (either the Sheffer Stroke or NOR) can be defined in terms of $f^{n}$.

Accordingly, every definable function can be defined in terms of $f^{n}$, since a functionally complete connective is itself definable in terms of $f^{n}$. This shows that $f^{n}$ is itself functionally complete.
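The collapsing step of the proof sketch can be illustrated for the ternary case. The following Python sketch enumerates every completely symmetrical ternary connective and checks that $f^{3}(p, p, q)$ always has the truth table of the Sheffer Stroke or of NOR. The choice of arity three, the argument pattern $(p, p, q)$, and all names are assumptions made for the illustration.

```python
from itertools import product

NAND = (False, True, True, True)    # outputs on the rows TT, TF, FT, FF
NOR = (False, False, False, True)

def completely_symmetric_ternary_tables():
    """Yield every truth table (as a dict) of a completely symmetrical ternary connective."""
    # One representative of each pair of mixed rows related by negating all inputs.
    reps = [(True, False, False), (True, False, True), (True, True, False)]
    for values in product([False, True], repeat=len(reps)):
        table = {(True, True, True): False, (False, False, False): True}
        for rep, value in zip(reps, values):
            table[rep] = value
            table[tuple(not x for x in rep)] = value  # same value on the negated row
        yield table

def collapse(table):
    """Truth table of f(p, p, q) on the rows TT, TF, FT, FF."""
    return tuple(table[(p, p, q)] for p, q in product([True, False], repeat=2))

collapsed = [collapse(t) for t in completely_symmetric_ternary_tables()]
print(len(collapsed), all(c in (NAND, NOR) for c in collapsed))
```

There are eight such ternary connectives, since each of the three pairs of mixed rows can independently be assigned T or F.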

Apply the Post Test to determine whether a given set of functions is functionally complete, which means that, by using only the functions in the set, all mathematically possible functions of the formal language can be defined. There are examples of sets of functions of the standard propositional logic, which are functionally complete, and one can see how the members of these sets lack, taken together, the hereditary properties discussed above. The Sheffer Stroke, and the Peirce Arrow, lack all those properties; therefore, the one-member sets that have as their single members the Sheffer Stroke or the Peirce Arrow are functionally complete. On the other hand, some sets are not functionally complete because at least one of the identified hereditary properties is possessed by every function in the given set. “TP” abbreviates “Truth-Preservativeness”, “FP” abbreviates “Falsehood-Preservativeness”, “SD” abbreviates “Self-Duality”, “M” abbreviates “Monotonicity”, and “L” abbreviates “Linearity.” Lacking the property is indicated by “x” while having the property is labeled by “+”. Thus, look for a set to have some “x” underneath each property across the row if this set is to be functionally complete.

It can be shown that the Sheffer Stroke possesses the property of functional completeness by examining its polynomial representation, which was introduced in section 1a; and the result is:

$$f_{\mid}(x, y) = xy + 1$$

Linearity can be defined for the polynomial representations of the functions as absence of any multiplicative products from the polynomial. (This also means that all definable unary functions are linear, since they have the general form $f_{*}(x) = \alpha x + \beta$. So, no unary function can be functionally complete, since it has to be linear.) By examining the polynomial representation of the Sheffer Stroke, one sees that it is non-linear, as it has a multiplicative product in it. Therefore, it lacks the hereditary property of linearity.

Next, show that it also lacks monotonicity.

$$0 \leq 1$$
$$f_{\mid}(0, 0) = 1 \nleq f_{\mid}(1, 1) = 0$$

Next, establish that the Sheffer Stroke is not self-dual. For a binary function in polynomial form, the self-duality condition can be given as follows.

$$f_{*}(x, y) = f_{*}(1 + x, 1 + y) + 1$$
$$f_{\mid}(x, y) = xy + 1 \neq f_{\mid}(1 + x, 1 + y) + 1 = (1 + x)(1 + y) + 1 + 1 = 1 + x + y + xy$$

In fact, the dual of the Sheffer Stroke is the other binary function that is functionally complete, the Peirce Arrow, whose polynomial representation is indeed: $$1 + x + y + xy = 1 + f_{\vee}(x, y)$$.

Finally, it can be shown that the Sheffer Stroke is not truth-preserving and is not falsehood-preserving.

$$f_{\mid}(1, 1) = 1 \times 1 + 1 = 1 + 1 = 0$$
$$f_{\mid}(0, 0) = 0 \times 0 + 1 = 0 + 1 = 1$$
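The polynomial computations above can be replayed with arithmetic modulo 2. In the sketch below, T is assumed to be encoded as 1 and F as 0, and the function names are illustrative.

```python
from itertools import product

BITS = list(product([0, 1], repeat=2))

def f_stroke(x, y):
    """Polynomial form of the Sheffer Stroke: xy + 1 (mod 2)."""
    return (x * y + 1) % 2

def f_or(x, y):
    """Polynomial form of inclusive disjunction: x + y + xy (mod 2)."""
    return (x + y + x * y) % 2

def f_arrow(x, y):
    """Polynomial form of the Peirce Arrow: 1 + x + y + xy (mod 2)."""
    return (1 + x + y + x * y) % 2

def dual(f):
    """Dual in polynomial form: f(1 + x, 1 + y) + 1 (mod 2)."""
    return lambda x, y: (f((1 + x) % 2, (1 + y) % 2) + 1) % 2

# The dual of the Sheffer Stroke coincides with the Peirce Arrow, not with the stroke itself.
print(all(dual(f_stroke)(x, y) == f_arrow(x, y) for x, y in BITS))
print(all(dual(f_stroke)(x, y) == f_stroke(x, y) for x, y in BITS))
# Neither truth-preserving nor falsehood-preserving.
print(f_stroke(1, 1), f_stroke(0, 0))
```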

5. Significance of the Sheffer Stroke for Mathematical Logic, Philosophical Logic, and Philosophy

The significance of the Sheffer Stroke connective for mathematical logic and metalogic (the study of formal systems of logic) is evident from the observations made in the preceding section regarding the properties of this connective. Those properties are shared by its dual, the NOR connective. These two connectives are the only binary connectives that are functionally or expressively complete. They are also the first such connectives discovered to have this property as one ascends from zeroary or unary connectives. Examination of such properties belongs to what is known as Metalogic (sometimes called Metatheory). The possibility of economizing in the use of theoretical resources is greatly appealing to mathematicians and scientists. The principle widely known as Occam’s razor roughly states that stipulated entities should not be multiplied beyond the bare minimum of what is needed for a proposed theory to be fully constructible. Economy or parsimony with respect to the resources of a theory is considered a virtue and is demanded methodologically in the sense that, between two theories that have equal explanatory power and/or applications, the one that is more parsimonious should be adopted. It is not claimed that we have some independent insight into the subject addressed by the theories (for instance “nature” or a pre-theoretic structure of independent reality.) What is claimed is simply that parsimony or economy of resources is a methodological and theoretical requirement that good theories must meet.

Formal systems of logic, and formal languages, have expressive resources that are symbolic. Economic use of those resources means using in the construction and implementation of the theory as few such resources as is possible without loss of any systemic powers of expression. Ideally, economy dictates that only one resource of a certain kind is to be used, if such a resource is available or definable and is effective in the construction of all the remaining expressive resources. In the case of the connective symbols of a formal language of propositional logic, this reduction to one effective symbol proves to be feasible in the case of the standard propositional logic: hence, the revelatory significance of Sheffer’s discovery (which, as seen, had already been achieved by Peirce.) For the reduction to be effective, of course, it must be the case that all other connectives (of any arity $\geq 1$) must be definable in terms of the single connective symbol; in this way all the other connectives can be eliminated as expressive resources without causing a loss of the ability to express what those symbols refer to. Thus, for example, instead of “$\neg \varphi$”, one can write “$\varphi \mid \varphi$”, and the same for all other connective symbols.

The advantages obtained from reduction of resources can be concrete in the case of implementations or applications of formal systems. For instance, in the construction of logic gates in electronic circuitry, the gate-types NAND and NOR are the electronic-theoretical interpretations of the same Boolean functions that are propositionally interpreted as the Sheffer connectives. As one ought to expect, NAND and NOR are universal gates. This means that any theoretically definable gate can be actually constructed from using just NAND gates or just NOR gates. Discoveries of this kind signal that a reduction in complexity is feasible, and this result can have economic and design advantages.
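As a small illustration of universality, an exclusive-or gate can be assembled from NAND gates alone. The particular four-gate wiring below is a commonly used construction and is not taken from this article; the names `nand` and `xor_from_nand` are illustrative, and signals are assumed to be encoded as 0 and 1.

```python
from itertools import product

def nand(a, b):
    """NAND gate on bits: 0 only when both inputs are 1."""
    return 0 if (a and b) else 1

def xor_from_nand(a, b):
    """Exclusive-or built from four NAND gates."""
    n = nand(a, b)                         # gate 1, shared by gates 2 and 3
    return nand(nand(a, n), nand(b, n))    # gates 2, 3 and 4

def xor_is_correct():
    return all(xor_from_nand(a, b) == (a ^ b) for a, b in product([0, 1], repeat=2))

print(xor_is_correct())
```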

In practice, the advantage claimed from this reduction is outweighed by the fact that writing out well-formed expressions becomes prohibitively unwieldy if only one kind of connective symbol is used. For example, in the history of modern logic, Gottlob Frege’s notational variant never had a chance of being widely adopted because of the practically unmanageable demands it placed on typographical execution. One can think of this challenge as posing a trade-off between economy of resources and notational convenience. Or the trade-off is between reducing the type of resource (for instance, gate) used and needed, on the one hand, and the length or extension of the constructions that will have to be made, on the other. For example, to return to propositional logic, to express a well-formed formula like “$\neg p \vee q$” in terms of a single connective symbol, $\mid$, one must write out the much longer equivalent well-formed formula shown below. The notational version being used in this way is significantly more unwieldy than a notational version (a grammar) that uses more, not fewer, connective symbols. Consider the formula

$$((p \mid p) \mid (p \mid p)) \mid (q \mid q)$$

It is possible to adopt conventions that remove its complexity to some degree. For instance, stipulating that “$\varphi \mid \varphi$” is to be written as “$\varphi^{2}$”, permits simplification of the formula above to

$$(p^{2})^{2} \mid q^{2}$$
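The growth in formula length can be made concrete with a small translator into stroke-only notation. This is a sketch under stated assumptions: formulas are represented as nested tuples such as `("or", A, B)`, a representation chosen here for illustration, and the translation uses the definitions of negation, disjunction, and conjunction in terms of the Sheffer Stroke.

```python
def to_sheffer(formula):
    """Translate a formula built from var / not / or / and into stroke-only notation."""
    op = formula[0]
    if op == "var":
        return formula[1]
    if op == "not":
        a = to_sheffer(formula[1])
        return f"({a} | {a})"
    a, b = to_sheffer(formula[1]), to_sheffer(formula[2])
    if op == "or":
        return f"(({a} | {a}) | ({b} | {b}))"
    if op == "and":
        return f"(({a} | {b}) | ({a} | {b}))"
    raise ValueError(f"unknown connective: {op}")

# The formula ¬p ∨ q from the text.
example = ("or", ("not", ("var", "p")), ("var", "q"))
translated = to_sheffer(example)
print(translated)
print(translated.count("|"), "stroke symbols")
```

The translator keeps outer parentheses, so its output for the example differs from the displayed formula only in that respect. Each level of nesting roughly doubles the length of the subformulas beneath it, which is the unwieldiness described above.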

It is less obvious whether there is a deeper philosophical significance of the fact that a connective like Sheffer’s Stroke is available in a system of logic. Whitehead and Russell expressed boundless enthusiasm about Sheffer’s discovery, hinting only at an underlying significance of this while adopting the connective symbol in the second edition of Principia Mathematica. On the other hand, two other pioneering writers of logic textbooks, Hilbert and Ackermann, were unimpressed and reported on the Sheffer Stroke as if they were referring to trivia. Certainly, the Sheffer functions do not add to the logical system of standard propositional logic in any way. The simplification they make possible is an internal matter. If there are other logics for which, hypothetically, Sheffer functions are not available, this does not automatically mean that there is something wrong with those other systems insofar as they are assessed as formally constructed languages.

It was the influential thinker Ludwig Wittgenstein who attributed far-reaching significance to the fact that Sheffer functions are available. He did this in a somewhat obscure fashion in an influential logical-philosophical work.

a. Wittgenstein’s Tractatus and the Sheffer Stroke

In his Tractatus Logico-Philosophicus (1922, 5.1311, 6.001) Wittgenstein extolled the significance of the Sheffer functions, hinting that discovery of the functions vindicates some of the seminal claims he was raising in this famous text. It is not clear that Wittgenstein knew that there are two binary functions with the same property of being functionally complete. Wittgenstein’s connective symbol may appear, at first blush, to be the same symbol as NOR, which is the connective used by Sheffer himself in his alternative axiomatization of Huntington’s system. Wittgenstein’s connective has been mistaken as such even by Bertrand Russell, but this is a mistake. Wittgenstein uses a rather eccentric function, known in the literature as the N-operator, which has attracted attention and even led to disputes. Although this is not the place to enter into details, a few words are in order about Wittgenstein’s N-operator which is not the sentential NOR operator although it is inspired by it. A technical study of the subject is given by Soames (1983; see also Geach, 1981.)

Wittgenstein’s N-operator is defined over an open-ended set of propositional variables. Because the language that is needed is that of first-order or predicate logic, an atomic proposition is a predicate symbol, of any arity n, accompanied by n individual constants, all of which have as specified referents members of the universe of discourse (or domain). It is an open problem for Wittgenstein’s language (whose grammar specification is rudimentary) that the domain set may or may not have a denumerably infinite number of subjects. Assume a finite domain for this brief excursion, and bear in mind that whatever fixes are available to address problems with Wittgenstein’s operator are not efficient in the case of an infinite domain. Consider a grammar that comprises symbol letters for individual variables, individual constants, predicate (non-logical) constants, and the operator symbol. (These are not Wittgenstein’s symbols. Instead he legislates $\{ẑ, Nẑ\}$, where the circumflex hints at the recursive mode of defining what expressions are grammatically correct. He uses “ξ” instead of “z” for molecular, not necessarily atomic, well-formed formulas.)

$$\{x/a_{i}/F_{j}^{n}/N\}$$

Then, application of the N-operator is, by definition, to negate all atomic propositions $\ulcorner F^{n}a_{1} \ldots a_{n}\urcorner$ in the set. This means that the N-operator can be defined through the following logical equivalences (insofar as the additional symbols are allowed in the metalanguage). The symbols for the universal and existential quantifiers are “$\forall$” and “$\exists$”, respectively. These are missing from Wittgenstein’s language, which is more parsimonious; but, as will be seen, Wittgenstein’s language, constructed on the N-operator, is expressively incomplete! Take, as an example, the case of a unary predicate constant:

$$N\{Fa_{1}, \ldots, Fa_{n}\} \leftrightarrow \forall x \neg Fx \leftrightarrow \neg \exists xFx$$

One could then proceed to iterated applications of the N-operator, which will now give a clue as to how Wittgenstein’s operator is expressively incomplete.

$$
N(N\{Fa_{1}, \ldots, Fa_{n}\}) \leftrightarrow \forall x \neg (\forall x \neg Fx) \leftrightarrow \forall x\exists xFx \leftrightarrow \exists xFx$$
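Over a finite domain, the two equivalences just given can be checked by brute force. The sketch below models the N-operator as joint denial of a list of truth values; the function names and the domain size of three are assumptions made for the illustration.

```python
from itertools import product

def N(propositions):
    """Joint denial: T exactly when every proposition in the collection is F."""
    return not any(propositions)

def check_over_domain(size=3):
    """Try every extension of a unary predicate F over a domain of the given size."""
    for extension in product([False, True], repeat=size):
        atoms = list(extension)            # truth values of Fa_1, ..., Fa_n
        exists_F = any(atoms)              # ∃xFx over the finite domain
        if N(atoms) != (not exists_F):     # N{Fa_1, ..., Fa_n} ↔ ¬∃xFx
            return False
        if N([N(atoms)]) != exists_F:      # N(N{Fa_1, ..., Fa_n}) ↔ ∃xFx
            return False
    return True

print(check_over_domain())
```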

The symbolic language cannot sort out more than one individual variable within the scope of another variable. It can express a formula like the following:
\begin{multline*}N \lbrack N\{F_{1}a_{1}, \ldots, F_{1}a_{n}\}, N\{F_{2}a_{1}, \ldots, F_{2}a_{n}\}\rbrack \leftrightarrow \forall x \neg (\forall x \neg F_{1}x \vee \forall x \neg F_{2}x) \leftrightarrow \\ \forall x(\exists xF_{1}x \wedge \exists xF_{2}x) \leftrightarrow (\exists xF_{1}x \wedge \exists xF_{2}x)
\end{multline*}
But the language lacks the resources to express formulas like the following, for which differentiation of individual variables within scopes is required:

$$\forall x \exists yFxy$$
$$\exists y \forall xFxy$$

Interestingly, the language also lacks resources for expressing $\ulcorner \exists x \neg Fx\urcorner$. As Soames (1983) shows, the defect can be remedied by adopting some additional symbolic convention that permits differentiation of individual variables within scopes. Thus, ironically, Wittgenstein’s constructed analogue to a Sheffer function, his N-operator, lacks expressive completeness. The set $\{N\}$ dispenses with the need for other connective symbols, and also for quantifier symbols (which Wittgenstein thinks are defined through inclusive disjunction or conjunction, again disregarding the prospect of an infinite domain); yet, the language cannot express all constructible formulas of first-order logic. It was Moses Schönfinkel, the originator of combinatory logic (Bimbó, 2010), who constructed a functionally complete language for first-order logic using one Sheffer function.

To conclude, consider the discussion of functional completeness, as touted by Wittgenstein in the Tractatus, putting aside the vicissitudes of his symbolic language. Although Wittgenstein claimed that the main subject of his Tractatus is ethical, the work examines a plethora of philosophical and logical subjects. An oft-discussed overriding objective of the work is to demarcate the limits of language; what cannot be expressed by language can be “shown,” as Wittgenstein famously claimed. The present subject fits within the Tractatus’ discussion of the nature of propositional logic and its relationship to the task of elucidation of meaning. (See Wittgenstein.)

Bursting onto the scene on the heels of advances in modern logic made by Frege and Russell, the Tractatus is remarkable for its contributions to the philosophical discussion of the new logic as an instrument for clarification of logical meaning. Wittgenstein later abandoned the work’s objective of constructing an ideal formal language that would be “isomorphic” to the world of empirically ascertainable facts; he also moved away from a version of the Correspondence Theory of Truth that seems to underpin the Tractatus.

In the Tractatus, Wittgenstein explains that the logic of our theories about the world is not itself to be sought in the world. Let us assume that “A” symbolizes the proposition expressed by the sentence “snow is white” and “B” symbolizes the proposition “snow is a kind of precipitation.” Let us also assume for our present purposes that the truth or falsehood of propositions A and B is to be established by referring to empirical facts. It so happens in this example that both propositions, expressed by the two English sentences, are true in our actual, empirically accessible, world. Now form the compound proposition “A and B.” This new proposition must be true because both its component propositions are true. This is evident because of the meaning of “and.” But how is this known? The empirical world itself does not come to our assistance. We know this regardless of empirical experience: what we know is that any compound proposition of the logical form “p and q” has to be true if, and only if, both of its components, the individual or atomic propositions p and q, are true. Thus, given p and q, the conclusion “p and q” follows validly: it is logically impossible to have a case in which the given premises are all true but the conclusion is false. Nevertheless, the logical meaning of any conjunctive proposition of the logical form “p and q” is identical with its truth conditions, which comprise the determinate relations between truth value assignments to the components (whether p and q are true or false) and the functionally determined truth value of the whole conjunction.
Thus, the empirical fact that the conjunctive sentence is true in our actual world is irrelevant from the standpoint of the logical meaning (the truth conditions) of the logical form exemplified by the sentence “snow is white and snow is a kind of precipitation.” The valuation dependency $\langle\langle T, T\rangle, T\rangle$ is one of four logically possible combinations which comprise the truth conditions of the conjunctive logical form: $$\{\langle\langle T, T\rangle, T\rangle, \langle\langle T, F\rangle, F\rangle, \langle\langle F, T\rangle, F\rangle, \langle\langle F, F\rangle, F\rangle\}$$ The actual world is not logically privileged, and Wittgenstein’s conceit that an isomorphic mapping can be accomplished, which would produce an ideal language of comprehensive applicability, was bound to be frustrated. Disregarding this rather metaphysical aspect, which Wittgenstein later also disregarded, the Tractatus contains an astute understanding and analysis of the formal logical instrument that has arisen out of modern mathematical developments. Wittgenstein’s contribution to the discussion of functional completeness fits under this aspect of the work.
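
The set of four valuation dependencies can be generated mechanically, which makes plain that nothing empirical enters into it. The following is a minimal sketch; the function name is illustrative, and Python's `True` and `False` stand in for T and F.

```python
from itertools import product

def conjunction_truth_conditions():
    """Pair every valuation of (p, q) with the resulting value of 'p and q'."""
    return {((p, q), p and q) for p, q in product([True, False], repeat=2)}

# Print the four dependencies, starting with the valuation in which both are true.
for (p, q), value in sorted(conjunction_truth_conditions(), reverse=True):
    print((p, q), value)
```

The actual-world valuation, in which both components are true, appears as just one entry among the four.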

Wittgenstein makes the point that “internal,” or “structural,” features of propositional forms account for truth preservation from the joint premises to the conclusion of a valid argument form. It is structural features that account, for instance, for the equivalence of logical meaning between any two propositions. This means that the propositions have forms that receive the same truth values for the same valuations (truth value assignments to their components). Cases or valuations (also called interpretations and models) are determined by assigning truth values, true and false, to all the components of a propositional form. Wittgenstein uses the terms “truth grounds” and “(logically) possible worlds” when referring to truth value assignments or valuations. Wittgenstein says that “these relations are internal and they exist as soon as, and by the very fact, that the propositions exist” (1922, 5.13). The next thesis in Wittgenstein’s text (5.1311) is the one in which he uses his N-operator. The point made there is now presented roughly: having briefly examined the complications that arise out of Wittgenstein’s definition of an N-operator, one adjusts, instead, to a propositional language, pretending that Wittgenstein actually used the NOR function to make his case. Nothing is lost in this way because the point is to illustrate Wittgenstein’s remarks on the significance of functionally complete operators rather than to pursue further any details attaching to the N-operator itself.

Consider a valid argument form:

$$p \vee q, \neg p \vdash q$$

The usual name of this valid argument form is Disjunctive Syllogism. This is not a string of propositional forms; it is a schema, and so is something like a recipe for how to proceed correctly when drawing inferences. Wittgenstein makes the point that conventions of symbolism may create the wrong impression that there is no internal, structural connection running through all propositional forms; that there is something newly productive introduced by the multiple (connective) symbols. This, however, would be wrong. The accidental fact that many different symbols are used is what is misleading. Moreover, Wittgenstein has philosophical objections to working from the semantic side of constructing logical systems, and this has consequences for the subject under discussion. Wittgenstein considers semantic attempts to be nonsensical: for instance, to specify the referent of conjunction, in order to obtain a working semantics, commits one to the nonsense of speaking about extra-empirical items and, indeed, about things that cannot be talked about. This way of thinking shows certain underlying philosophical assumptions, which lie beyond this article’s scope, but the problem that arises is this: The construction of a logical system is to be understood as a matter of specifying formal-grammatical rules for concatenating and transforming the available symbolic resources of the system. Because of this, the failure of the grammatical or syntactical setup to show perspicuously what happens in logical operations is serious. Hence, it is imperative to show solely by manipulating the symbolic resources that there is an internal structural connection that relates all possible transformations. This is accomplished by using only one functionally complete operator symbol. This is the reason Wittgenstein extols the “discovery”. 
Even if one opts to multiply connective symbols, because of the greater simplicity and even intuitive appeal gained in that way, it is still crucial to be able to show that only one connective symbol suffices. Indeed, as is known from the above study of functional completeness, one could have opted for eliminating all but one connective symbol, one of the Sheffer functions. Consider further how the claim is to be made that single-connective symbolism reveals something deeper about logic itself.

Logical properties are structural features of the forms: thus, one can have tautologous, contradictory, and indeterminate (also called contingent and indefinite) propositional forms. All tautologies would have to have the same referent which, in the Fregean analysis, is the truth value true. If semantic referents are rejected, however, that leaves the grammatical means for showing the collapse of all tautologies, namely that they all have the same logical meaning. The same is the case with all contradictory logical forms; they check as false for all logically possible assignments of values to their components. The remaining structural type, the contingent propositional form, is basically not logic’s business! This is indicated by the convention of assigning both truth values to a single propositional variable to generate two cases: these are two logically possible worlds if one is to semantically model the setup. The proposition can logically be true in one case and false in another; as a proposition it must be one or the other and it is not logically possible for it to be both true and false. Notice then that the two logical possibilities (p-T and p-F) have the same status. It does not matter if one of those, for an interpretation of the propositional symbol, happens to be the actual world. Logic, not depending on the workings of the empirical world, is attuned to characteristics that are invariable across all possible cases: this means, tautologies, which are true in all logically possible cases, and contradictions, which are false in all logically possible cases. The validity of the inferential schema above guarantees, for two-valued logic, that the following is a propositional tautology:

$$\vdash ((p \vee q) \wedge \neg p) \rightarrow q$$
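
The link between the validity of the schema and the tautology can be checked by running through all four valuations. The following is a minimal sketch, assuming two-valued logic; the helper names `implies`, `is_tautology`, and `schema_is_valid` are illustrative and not drawn from the text.

```python
from itertools import product

def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b

def is_tautology(formula, arity):
    """True if the formula comes out true under every valuation of its variables."""
    return all(formula(*vals) for vals in product([True, False], repeat=arity))

def ds_conditional(p, q):
    # ((p or q) and not p) -> q
    return implies((p or q) and (not p), q)

def schema_is_valid():
    # Valid: no valuation makes both premises true and the conclusion false.
    return all(q for p, q in product([True, False], repeat=2)
               if (p or q) and (not p))

print(is_tautology(ds_conditional, 2), schema_is_valid())
```

Both checks quantify over the same four cases, which is why validity of the schema and tautologousness of the conditional stand or fall together.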

Once again, the proliferation of symbols obscures the facts about the internal structural simplicity of logic. All compound propositional forms are internally connected because they result from elementary propositional forms by means of connectives. The logic is determined by how the logical connectives are defined. Starting with elementary (also called individual or atomic) propositions, one always proceeds by combining them by means of connectives: the compounds generated are in every case dependent for their meanings (truth and falsehood) on the meanings (truth and falsehood) of their components. If one were to proceed in the opposite direction, from compounds toward the elementary propositions, there would be a decomposition of the compound propositions; the process would terminate with the elementary propositions. This is possible because all the connectives are truth-functional connectives. Thus, if, for instance, “p and q” is given as true, one can dissolve this into “p is true” and “q is true” given the definition of “and.” Once again, one sees that propositional forms are related with each other and, ultimately, they are related to two basic propositions, the true and the false, out of which any complex can be generated by using truth-functional connectives. This also shows that nothing in the logic of propositions can ever be arbitrary.

The symbolism that uses multiple connective symbols obscures this. A stronger point can be made: Something is wrong with a notational idiom, a symbolism, that fails to capture the identity of logical meanings (logical equivalence). For instance, consider the following two logically equivalent expressions or formulas, which are well-formed, it is assumed, in the idiom or notation (and represented here in the symbolically enriched metalanguage):

$$\neg (p \wedge q) \dashv \vdash (\neg p \vee \neg q)$$

Even though the expressions are logically equivalent, the grammatically correct formulas representing them are not the same! This may be considered a radical notational or formal-grammatical defect. It gets even worse. There is a view that formalism is fundamentally a matter of systematic and specified manipulation of symbolic resources. Consequently, the defect faced in this case goes all the way to the roots of the most basic task of all: how to construct a faithful symbolic system relative to a given purpose. In that case, it would appear that the correct way to construct a formal system is exclusively through its minimally functionally complete sets of operators. If one has to switch to alternative idioms that have redundant operators in them (operators that can be defined by the other operators in the system), that would have to be justified by pleading such a reason as expediency or convenience.

The symbolic notation of a formal language idiom that uses only one connective symbol would remove this notational illusion, or, to make the stronger case, would remedy the deep formal-grammatical defect: then it could be perspicuously shown that all that is had is an unfolding of internal connections that run across propositional forms. Wittgenstein proceeds to write the above argument schema by using one single connective symbol which allows elimination of the symbols for disjunction and negation in order to make “the inner connection” obvious. (The contemporary symbol is employed here for the connective used by Wittgenstein, which is NOR.)

$$(p \downarrow q) \downarrow (p \downarrow q), p \downarrow p \vdash q$$

To do this, replace “$p \vee q$” by “$(p \downarrow q) \downarrow (p \downarrow q)$” (thus eliminating the inclusive-disjunction symbol) and “$\neg p$” by “$p \downarrow p$” (thus eliminating the negation symbol). The NOR symbol is used to effect both eliminations. As a result, we have the schema shown above, in which only one connective symbol is used. Of course, one could have used the NAND or Sheffer Stroke function to effect the same elimination, in which case the result would be:

$$(p \mid p) \mid (q \mid q), p \mid p \vdash q$$
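
That the two single-connective rewrites leave the schema untouched in logical meaning can be confirmed by comparing truth tables. The following is a minimal sketch; all function names are illustrative assumptions.

```python
from itertools import product

def nor(a, b):
    return not (a or b)

def nand(a, b):
    return not (a and b)

VALUATIONS = list(product([True, False], repeat=2))

def or_via_nor(p, q):
    return nor(nor(p, q), nor(p, q))      # (p NOR q) NOR (p NOR q)

def or_via_nand(p, q):
    return nand(nand(p, p), nand(q, q))   # (p NAND p) NAND (q NAND q)

def not_via_nor(p):
    return nor(p, p)

def not_via_nand(p):
    return nand(p, p)

def schema_valid(disjunction, negation):
    """No valuation makes both rewritten premises true while q is false."""
    return all(q for p, q in VALUATIONS if disjunction(p, q) and negation(p))

print(schema_valid(or_via_nor, not_via_nor), schema_valid(or_via_nand, not_via_nand))
```

The rewritten premises have exactly the truth tables of the originals, so the validity of Disjunctive Syllogism carries over unchanged.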

Moreover, when multiple logical connectives are used in the construction of a formal system, an impression of arbitrariness may be created. Why, one may ask, is one set of logical connectives used instead of another set? The right answer is that nothing depends on which connectives are used because all the propositional formulas are internally related in strict, non-arbitrary fashion, and the construction ultimately depends on the basic building blocks and connectives. To illustrate this point, construct a formal system of the standard propositional logic by using as its set of connectives either $\{\neg, \rightarrow\}$, $\{\neg, \vee\}$, or $\{\neg, \wedge\}$. Basically, it amounts to the same thing whichever one is used. This is not immediately obvious regarding the plurality of connectives seen above. But now consider how all of the connectives in these sets are definable in terms of the connective in $\{\mid\}$. Thus, $\{\neg, \rightarrow\}$ can be replaced by $\{\mid\}$; and $\{\neg, \vee\}$ can be replaced by $\{\mid\}$; and $\{\neg, \wedge\}$ can be replaced by $\{\mid\}$. This fact makes clear that nothing depends on arbitrary choices about the connectives used. This discovery can be used as proof that there is a strict internal connection that runs through all the expressive resources.
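
The definability claim can be made concrete by writing each connective in the three sets as a NAND-only expression and checking it against the usual truth table. The following is a minimal sketch; the particular NAND definitions chosen are standard ones, and the function names are illustrative.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

def neg(p):
    return nand(p, p)                      # not p

def conj(p, q):
    return nand(nand(p, q), nand(p, q))    # p and q

def disj(p, q):
    return nand(nand(p, p), nand(q, q))    # p or q

def cond(p, q):
    return nand(p, nand(q, q))             # p -> q

# Compare each NAND-only definition with the familiar connective.
for p, q in product([True, False], repeat=2):
    print(p, q, neg(p), conj(p, q), disj(p, q), cond(p, q))
```

Since negation, conjunction, disjunction, and the conditional all reduce to the same single connective, the choice among the three sets makes no difference to what can be expressed.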

Wittgenstein even points out (5.42) that having connectives in the formal system, which are interdefinable, means that they should not be properly regarded as “primitives.”

Now one can revisit the subject of the triviality of tautologies (and of logical contradictions), which is another subject on which Wittgenstein touches. There is one tautology, and a contradiction is the negation of the tautology (for the standard definition of negation). Of course, negation itself can be expressed in terms of a Sheffer function. The ultimately perspicuous manifestation of the inner structural inter-connectedness of all logical propositions can be shown insofar as all tautologies can be derived from one single axiom that uses a single connective symbol. Rules of transformation and inference can be specified, to be applied to the axiom schema, to generate all tautologies. This is indeed possible, as French logician Jean Nicod (1917) demonstrated by constructing a one-postulate axiomatization of the standard propositional logic. Nicod’s postulate, written with metalinguistic symbols for writing a schema, is:

$$(\Pi \mid (\Sigma \mid \Psi)) \mid ((\Theta \mid (\Theta \mid \Theta)) \mid ((X \mid \Sigma) \mid ((\Pi \mid X) \mid (\Pi \mid X))))$$

An alternative and equivalent formulation of the Nicod Postulate, which avoids having any sub-formulas of the postulate schema being tautologous, is the following. (Notably, in the original formulation, the sub-formula $\ulcorner \Theta \mid (\Theta \mid \Theta)\urcorner$ is tautologous.)

$$(\Pi \mid (\Sigma \mid \Psi)) \mid ((\Pi \mid (\Psi \mid \Pi)) \mid ((X \mid \Sigma) \mid ((\Pi \mid X) \mid (\Pi \mid X))))$$

The Nicod Postulate can be deployed as the single axiom in a formal system whose only rule of inference is given by the following rule schema:

$$\Pi, \Pi \mid (\Sigma \mid \Psi) \vdash_{nicod} \Psi$$
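
A truth-table check confirms that both formulations of the postulate are tautologies and that the rule preserves truth. It does not establish completeness, which is Nicod's further result. The following is a minimal sketch; the parameter names transliterate the schematic letters, and the helper names are illustrative.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

def is_tautology(formula, arity):
    return all(formula(*vals) for vals in product([True, False], repeat=arity))

def nicod_postulate(pi, sigma, psi, theta, chi):
    return nand(nand(pi, nand(sigma, psi)),
                nand(nand(theta, nand(theta, theta)),
                     nand(nand(chi, sigma),
                          nand(nand(pi, chi), nand(pi, chi)))))

def nicod_alternative(pi, sigma, psi, chi):
    return nand(nand(pi, nand(sigma, psi)),
                nand(nand(pi, nand(psi, pi)),
                     nand(nand(chi, sigma),
                          nand(nand(pi, chi), nand(pi, chi)))))

def rule_preserves_truth():
    # From Pi and Pi | (Sigma | Psi), infer Psi: whenever both premises
    # are true, the conclusion must be true as well.
    return all(psi for pi, sigma, psi in product([True, False], repeat=3)
               if pi and nand(pi, nand(sigma, psi)))

print(is_tautology(nicod_postulate, 5),
      is_tautology(nicod_alternative, 4),
      rule_preserves_truth())
```

The same helper can be applied to the sub-formulas: the one built from $\Theta$ alone is a tautology, whereas its replacement in the alternative formulation is not.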

6. References and Further Reading

Béziau, Jean-Yves. 2001. “Sequents and Bivaluations”, Logique et Analyse 176: 373-94.
Bimbó, Katalin. 2010. “Schönfinkel-Type Operators for Classical Logic”, Studia Logica 95: 355-78.
Church, Alonzo. 1996. Introduction to Mathematical Logic, revised and enlarged edition, Princeton: Princeton University Press.
Church, Alonzo. 1953. “Review of Sobociński (1953)”, Journal of Symbolic Logic 18: 284-85.
Geach, P. T. 1981. “Wittgenstein’s Operator N”, Analysis 41: 168-71.
Goodell, John D. and Tenny Lode. 1953. “Decision Elements”, Journal of Symbolic Logic 18: 283-84.
Hilbert, D., and W. Ackermann. 1928. Grundzüge der theoretischen Logik. Berlin: Springer.
Houser, N., Roberts, Don D., and Van Evra, James (eds.). 1997. Studies in the Logic of Charles Sanders Peirce. Bloomington, IN: Indiana University Press.
Nicod, Jean G. P. 1917. “A Reduction in the Number of Primitive Propositions of Logic”, Proceedings of the Cambridge Philosophical Society 19: 32-41.
Peirce, Charles S. 1931-1966. Collected Papers of Charles Sanders Peirce. 8 volumes, ed. by Hartshorne, C., Weiss, P., and Burks, A. W. Cambridge, MA: Harvard University Press.
Peirce, Charles S. 1967. “Annotated Catalogue of the Papers of Charles S. Peirce”, Manuscripts in the Houghton Library of Harvard University, as identified by Richard Robin. Amherst: University of Massachusetts Press.
Peirce, Charles S. 1971. “The Peirce Papers: A Supplementary Catalogue”, Transactions of the C. S. Peirce Society 7: 37-57.
Pelletier, Francis Jeffry and Norman M. Martin. 1990. “Post’s Functional Completeness Theorem”, Notre Dame Journal of Formal Logic 31: 462-75.
Post, Emil L. 1921. “Introduction to a General Theory of Elementary Propositions”, American Journal of Mathematics 43: 163-85.
Post, Emil L. 1941. The Two-Valued Iterative Systems of Mathematical Logic, vol. 5 of Annals of Mathematics Studies, Princeton: Princeton University Press.
Prior, Arthur N. 1962. Formal Logic. Oxford: Clarendon Press.
Quine, Willard Van O. 1995. Selected Logic Papers, enlarged edition, Cambridge, MA: Harvard University Press.
Read, Stephen. 1999. “Sheffer’s Stroke: A Study in Proof-Theoretic Harmony”, Danish Yearbook of Philosophy 34: 7-24.
Riser, John. 1967. “A Gentzen-Type Calculus for Sequents for Single-Operator Propositional Logic”, The Journal of Symbolic Logic 32: 75-80.
Sheffer, H. M. 1913. “A Set of Five Independent Postulates for Boolean Algebras, with Application to Logical Constants”, Transactions of the American Mathematical Society 14: 481-88.
Soames, Scott. 1983. “Generality, Truth Functions, and Expressive Capacity in the Tractatus”, The Philosophical Review 92: 573-89.
Sobociński, Bolesław. 1953. “On a Universal Decision Element”, Journal of Computing Systems 1: 71-80.
Wernick, William. 1942. “Complete Sets of Logical Functions”, Transactions of the American Mathematical Society 51: 117-32.
Whitehead, Alfred North and Bertrand Russell. 1910, 1912, 1913. Principia Mathematica, 3 volumes. Cambridge: Cambridge University Press; 1925, 1927. Principia Mathematica, second edition, 2 volumes, Cambridge: Cambridge University Press.
Wittgenstein, Ludwig. 1922. Tractatus Logico-Philosophicus, tr. C.K. Ogden. London: Routledge & Kegan Paul.

Author Information

Odysseus Makridis
Email: makridis@fdu.edu
Fairleigh Dickinson University
U. S. A.

Material Composition

A material composite object is an object composed of two or more material parts. The world, it seems, is simply awash with such things. The Eiffel Tower, for instance, is composed of iron girders, nuts and bolts, and so on. You and I, as human beings, are composed of flesh and bone, and various organs. Moreover, these parts themselves are composed of further parts, such as molecules, which themselves are composed of atoms, which are composed of sub-atomic particles. Material composite objects are, it seems, ubiquitous. However, despite their ubiquity, a little philosophical reflection on the matter, as is so often the case, reveals that they are also deeply puzzling.

The question which has received most attention from philosophers interested in material composition is: under what circumstances do two or more material objects compose a further object? Why is it, for instance, that a collection of iron girders that are bolted together in the centre of Paris do compose an object (that is, the Eiffel Tower), but that there is no object composed of the Eiffel Tower and the Moon? What conditions are satisfied by the first set of objects, and not by the second set of objects, which make this the case? In short, what are the necessary and sufficient conditions for composition to occur?

Since the 1980s, philosophers have devoted considerable attention to this question, and it has proved difficult to answer. This article provides a survey of the various answers that have been given to this question, plus the arguments that have been offered in their defence.

1. Some Important Preliminaries
  a. Mereological Technicalities
  b. Composition and Constitution
2. The Special Composition Question
  a. Answering the Special Composition Question
3. Compositional Restrictivism
4. Compositional Universalism
  a. Arguments for Universalism
    i. The Argument from Elimination
    ii. The Argument from CAI
  b. Arguments against Universalism
5. Compositional Nihilism
  a. Arguments for Nihilism
  b. Arguments against Nihilism
6. Deflationism
  a. Hirsch and Quantifier Variance
7. References and Further Reading

1. Some Important Preliminaries

a. Mereological Technicalities

The topic of material composition falls under the wider purview of mereology, which is simply the study of parts and wholes. Much of the focus of mereology over the last hundred years or so has been on producing a formal theory of part–whole relations, that is, a formal theory of the logical relations that hold between parts and the wholes they compose (examples include Leśniewski, 1916; Leonard and Goodman, 1940; Simons, 1987). The current entry will overlook much of the formal side of the study of mereology, and will instead concentrate on some of the key metaphysical questions concerning the nature of material composite objects, such as whether there are any such things, and what criteria some things need to satisfy in order to compose a composite object. However, it will be useful in the first instance to define a few of the key technical terms and expressions that are peculiar to the field of mereology:

Part:

The term ‘part’ has a slightly different meaning in mereology to that which it has in ordinary language. In ordinary language, we use the term ‘part’ to mean a portion or subsection of an object, for example, the Earth is part of the Solar System, the tail is part of the cat and so forth. In mereology, however, the term is used such that not only are an object’s subsections its parts (for example, the tail is part of the cat), but objects are also taken to be parts of themselves (for example, the cat is part of the cat). So if you were tasked with writing an exhaustive list of all the cat’s parts, on this understanding of the term, you should include the cat itself on the list.

Proper Part:

Philosophers have taken to distinguishing parts from what are called ‘proper parts’. ‘Proper part’ is the mereological term that would best tally up with our ordinary or common-sense use of the term ‘part’, in that an object’s proper parts exclude the object itself. Thus, if you were tasked with writing an exhaustive list of all the cat’s proper parts, the cat itself should not be included on the list.

Plurally Referring Expressions:

Following Peter van Inwagen (van Inwagen, 1990), it has become common to use the plurally referring expression, ‘the xs’, to refer to some plurality of material objects. This enables one to refer to a number of objects at a time in a neutral manner, without supposing that those objects do (or do not) compose a further object.

Composition:

Some xs compose a further object, y =_df the xs are all parts of y, no two of the xs overlap, and every part of y overlaps at least one of the xs.

(The qualifications about ‘overlap’ in the above definition can make it sound a bit more complicated than it really is. They merely stipulate that one should not list overlapping parts of an object when listing the parts that compose it. For instance, suppose a necklace were made entirely of pearls. In that case, it would be correct to say that the pearls compose the necklace. But, given that the pearls themselves are made of atoms, it would also be correct to say that the atoms compose the necklace. However, it would be wrong to say that the pearls and the atoms compose the necklace, since the pearls overlap the atoms.)

Fusion:

y is a fusion of the xs =_df the xs compose y

(Note: the term ‘sum’ is sometimes used instead of ‘fusion’.)

Simple:

x is simple =_df x has no proper parts

(Note: ‘simple’ is sometimes used as a noun, as well as an adjective, thus one might speak of, ‘a simple’, or ‘the simples’.)

These are just a few of the many technical terms involved in formal mereology, and they are defined here quite informally, for ease of understanding. For those interested in formal mereology, Peter Simons’ 1987 book, Parts: A Study in Ontology, provides an excellent place to start.
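
The definitions above can be tried out on the pearl necklace example in a toy model. The following is a minimal sketch that assumes, purely for illustration, that every object can be represented by the set of atoms it is made of, so that parthood becomes the subset relation and overlap becomes sharing an atom. That assumption builds in substantive metaphysical commitments (for instance, that there are simples) on which the entry itself remains neutral, and all names in the code are hypothetical.

```python
from itertools import combinations

def is_part(x, y):
    """x is part of y; every object counts as part of itself."""
    return x <= y

def is_proper_part(x, y):
    """x is part of y and is not y itself."""
    return x < y

def overlap(x, y):
    """x and y have a part in common."""
    return bool(x & y)

def parts_of(y):
    """Every part of y in the toy model: each non-empty set of y's atoms."""
    atoms = sorted(y)
    return [frozenset(c) for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)]

def compose(xs, y):
    """The three clauses of the definition of composition."""
    all_are_parts = all(is_part(x, y) for x in xs)
    no_two_overlap = all(not overlap(a, b) for a, b in combinations(xs, 2))
    y_is_covered = all(any(overlap(p, x) for x in xs) for p in parts_of(y))
    return all_are_parts and no_two_overlap and y_is_covered

def is_simple(x):
    """x has no proper parts."""
    return not any(is_proper_part(p, x) for p in parts_of(x))

# A necklace of two pearls, each made of two atoms.
pearl_1 = frozenset({"a1", "a2"})
pearl_2 = frozenset({"a3", "a4"})
necklace = pearl_1 | pearl_2
atom_objects = [frozenset({a}) for a in sorted(necklace)]

print(compose([pearl_1, pearl_2], necklace))                 # the pearls
print(compose(atom_objects, necklace))                       # the atoms
print(compose([pearl_1, pearl_2] + atom_objects, necklace))  # pearls and atoms together
```

In this model the pearls compose the necklace and so do the atoms, while the pearls and atoms taken together do not, because each pearl overlaps its own atoms.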

b. Composition and Constitution

The debate over material composition should be distinguished from a related debate concerning material constitution. Material composition concerns the question of when two or more objects compose a further, composite object. (For instance, if you attach four wooden legs to a flat wooden surface, do those five objects now compose a new object: a table?) Those interested in material constitution, by contrast, are interested in the question of when one object (for example, a lump of bronze) constitutes another object (for example, a statue of Napoleon), and indeed, what the relation of constitution actually consists in. Material constitution presents some real puzzles of its own. For instance, is the lump of bronze a distinct object from the statue of Napoleon, or are they numerically identical? If we adopt the latter view, that is, that there is just a single object there, but one that can be called by different names (that is, ‘lump of bronze’ or ‘statue of Napoleon’), we seem to run into trouble. The trouble emerges when you consider what happens if you were to melt down the statue and form it into a shapeless lump. The lump of bronze, it seems, would still exist; but the statue would clearly not. By melting it down, you destroy the statue, but you do not destroy the lump. This might suggest, therefore, that the lump and the statue were not identical after all. Perhaps, then, we should adopt the former view, and say that the statue and the lump are not identical but, in fact, distinct objects. The problem with this, however, is that it now looks as though, before the melting down occurred, we had two distinct objects occupying exactly the same space at exactly the same time, which, one could plausibly argue, is impossible. This is the central problem of material constitution, and it has generated a considerable literature (see Rea, 1997).

Although the debate over material constitution is certainly a puzzling one, it is quite distinct from the debate over material composition. However, the two are related in certain ways, and there are times at which adopting a view on one debate might well have an impact on one’s view concerning the other. The differences, and similarities, between these two distinct debates, and how they interrelate, will become clearer as the entry progresses.

2. The Special Composition Question

Questions concerning material composition have a long history in philosophy, but they have attracted increased attention over recent years thanks largely to the work of Peter van Inwagen. In a 1987 article, and at greater length in his 1990 book, Material Beings, van Inwagen posed what he called the Special Composition Question (hereafter, SCQ). (It is only fair to note here that van Inwagen actually credits Hestevold, 1981, with originally formulating the SCQ, but it is van Inwagen who made it well known.) This question can be phrased as follows:

(SCQ): Under what conditions do two or more material objects compose a further, composite object?

In other words: what is required in order for some objects to be parts of another object? Or as van Inwagen has put it, if you had two objects, what would you need to do to them in order to get them to compose something?

It is perhaps worth noting here that van Inwagen called this the ‘Special’ Composition Question in order to differentiate it from what he called the ‘General Composition Question’ (GCQ). The GCQ asks the broader question of what the composition relation actually is, in general. Van Inwagen was sceptical about the prospects of answering this question, stating he did not even know how to approach it, let alone answer it. It seems that most philosophers have followed suit, as there is not a great deal of literature on the GCQ. (However, see Hawley, 2006, for an attempt to shed some more light on the matter.)

a. Answering the Special Composition Question

A satisfactory answer to this question should take something like the following form:

(ANSWER): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if _______________________________________.

The task, therefore, is to fill in the right-hand side of the above biconditional. But as van Inwagen went on to show, this is no easy task. In particular, it seems very difficult to provide a principled and systematic answer to the SCQ that accommodates our common-sense intuitions about when composition does and does not occur. In the end, he concluded that it is impossible to provide such an answer. Instead, his own answer is radically counter-intuitive. Van Inwagen’s answer to the SCQ, which has come to be known as ‘organicism’, is:

(ORGANICISM): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the collective activity of those xs constitutes a life.

The reason that this answer is so counter-intuitive is that if it is true, it means that the only composite objects in existence are living beings. Inanimate composite objects, according to this view, do not exist. There are no cars or buildings, tables or chairs, planets or stars and so forth. Van Inwagen recognises just how radical this view is—indeed, he calls it ‘the denial’—but he insists that a thorough analysis of the SCQ leads inexorably and inevitably to it. Van Inwagen’s own answer to the SCQ has not proved to be all that popular. However, the SCQ itself has generated huge amounts of subsequent interest and swathes of further literature.

It is important to note that any proposed answer to the SCQ will fall into one of the three following categories:

Compositional Universalism:

Whenever you have two or more material objects, there is always a further object that they compose.

Compositional Nihilism:

No objects compose, and no objects have parts. That is, there are no composite objects in existence.

Compositional Restrictivism:

Some collections of material objects compose further objects, but others do not.

Each of these approaches to the SCQ comes with its own merits and demerits, and each has been defended (and attacked) in the contemporary literature. Van Inwagen’s own answer falls into the third category, as it says composition occurs sometimes, but only sometimes (specifically, when some xs partake in collective activity which constitutes a life). What follows will survey some of the central arguments that have been given for and against each of these three positions.

3. Compositional Restrictivism

There is one very compelling reason to think that some variety of compositional restrictivism must be true: common sense. On first inspection, it seems simply obvious that composition is restricted, that is, it occurs sometimes, but not all the time. After all, one does not need to engage in much serious reflection to realise that the Eiffel Tower, for instance, is composed of iron girders, and that the Great Pyramid of Giza is composed of limestone blocks. Yet equally obvious is the fact that there is no object which these two great edifices together compose (that is, there is no object which has just the Eiffel Tower and the Great Pyramid of Giza as parts). So, since it is plainly evident that there are some cases in which objects do compose and other cases in which they do not, it also seems plainly evident that composition must be restricted.

The challenge for the restrictivist, however, is to formulate an answer to the SCQ that accommodates these common-sense intuitions. That is to say, she must specify the necessary and sufficient conditions under which composition occurs, such that they are satisfied by the iron girders in Paris (which compose the Eiffel Tower), and the limestone blocks in Giza (which compose a pyramid), but not satisfied by the girders and blocks taken together (so that we do not end up with some rather unusual composite pyramid-tower, or suchlike). The literature that has emerged on this topic shows that providing such an answer is no easy task.

a. Simple Bonding Answers

In Material Beings, Peter van Inwagen tried to formulate an answer to the SCQ that preserves some of our common-sense intuitions about composition. He noted that these intuitions very often seem to be based on certain facts about how objects are grouped or connected together. That is, we often seem to think that objects compose a further object if they are bonded together in some appropriate way. The reason that the iron girders in Paris compose a tower, for instance, is that they are fastened together with many millions of bolts and rivets, and what have you, to form a solid and rigid structure. Moreover, the reason that the Eiffel Tower and the Great Pyramid of Giza do not compose a material object is precisely because they lack any such bonding or unity; they are completely distinct and disconnected objects separated by well over a thousand miles. Perhaps, then, bonding could be the secret to unlocking the SCQ?

Van Inwagen labelled his first attempt at a bonding-style answer to the SCQ ‘CONTACT’. Put simply, it states that objects need to be physically touching one another if they are to compose a further object.

(CONTACT): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the xs are in contact with one another.

Although this answer certainly does give us the intuitive result that the iron girders in Paris do compose a tower, and the limestone blocks in Giza do compose a pyramid, it also entails certain conclusions which simply fly in the face of common sense. As van Inwagen notes, if CONTACT were true, it would mean that every time you shook someone’s hand, a new material object would instantaneously pop into existence, only to vanish back into nothingness once the handshake ceased. The sheer absurdity of this consequence seems to suggest that CONTACT cannot possibly be the correct answer to the SCQ, particularly when you remember that what originally motivated it was a desire to preserve common sense.

Van Inwagen went on to consider a number of other bonding-style answers to the SCQ, which he called FASTENING, COHESION, and FUSION. Each of these solutions involves a greater strength of bond than the last, culminating in FUSION, whereby for objects to compose, they must be fused together, which means they must be ‘melt[ed] into each other in a way that leaves no discernible boundary’ (van Inwagen, 1990, 59).

In light of the above comments, however, it should be fairly straightforward to see that none of these answers are going to work (at least, none of them will satisfy common sense). If we return to the example of two people shaking hands, it seems evident that even if you stick their hands together, even if you fuse them with an unbreakable adhesive, you will never make them compose a single object. You will simply have two objects—two distinct persons—in the rather unfortunate situation of being stuck.

Moreover, all these bonding answers fail to account for the possibility of what are known as scattered composite objects, that is, composite objects whose parts are not in contact with one another. Yet common sense suggests that there are in fact such scattered objects. A bikini, for instance, seems to be an ordinary composite object, yet it is composed of two distinct, and spatially separated, parts. Or the USA, to give another example, seems to be a composite object, yet it is composed of spatially disconnected parts: the islands of Hawaii are separated from the contiguous states by a considerable distance, as is Alaska. If any variety of bonding-style answer were correct, then it would turn out that there are not, in fact, any bikinis in existence, and even more worryingly, many Hawaiians and Alaskans would lose their country of residence! Bonding-style answers, therefore, have found very few supporters.

b. Series-Style Answers

Van Inwagen then went on to consider the idea that there is perhaps not a single, one-size-fits-all answer to the SCQ, but instead, that different criteria will apply to different types of objects, according to which they will compose or fail to compose. The thought is that the criteria that a bunch of cells need to satisfy in order to compose a human being, for instance, might be very different from the criteria that a bunch of bricks might need to satisfy in order to compose a house. If this is right, then perhaps when answering the SCQ, we need to set out the specific criteria of composition for different types of material object. Such answers have come to be known as series-style answers (SSAs), since they will consist of a long series of different criteria that different types of object must satisfy in order to compose. An SSA to the SCQ will look something like the following:

(SSA): for any xs (where those xs are material objects), there is a further material object, y, composed of those xs if and only if the xs are F1s and stand in relation R1, or the xs are F2s and stand in relation R2, or …, or the xs are Fns and stand in relation Rn.

The attraction of this kind of answer is that it looks like it might accommodate certain intuitions we have about composition, such as the fact that by fastening bricks together with cement, you can compose a further object (for example, a house), but by fastening human beings together with cement, you cannot.

Van Inwagen was fairly quick to dismiss the prospects of a satisfactory SSA to the SCQ, however, as he thought that such answers suffered from a number of difficulties. One of the main problems he foresaw was that an SSA to the SCQ would violate the transitivity of parthood, which he took to be an unacceptable consequence. It is easy to see why one might assume that parthood is a transitive relation. For if x is a part of y, and y is a part of z, then it just seems evident that x must also be a part of z. For example, if the bearing is part of the wheel, and the wheel is part of the car, then the bearing must also be part of the car.
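Stated a little more formally, and writing P(x, y) as shorthand for ‘x is a part of y’ (a notation assumed here for illustration, not van Inwagen’s own), the transitivity of parthood is the principle:

```latex
\forall x \, \forall y \, \forall z \,
  \bigl( ( P(x, y) \land P(y, z) ) \rightarrow P(x, z) \bigr)
```

In the example just given, x is the bearing, y is the wheel, and z is the car.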

Van Inwagen claimed, however, that SSAs to the SCQ would violate this transitivity. For instance, suppose we endorsed an SSA according to which xs compose a y if and only if they are related by R1, and ys compose a z if and only if they are related by R2. In that case, an x could be a part of a y which was itself part of a z, yet x would not be part of z (because, as per the answer, xs cannot compose zs; zs can only be composed by ys related by R2).

Since the publication of Material Beings, surprisingly little attention has been paid to the possibility of SSAs. However, some recent work on the topic suggests that van Inwagen’s dismissal of such answers may have been a little hasty. Silva (2013) has responded to van Inwagen’s objections and argued that SSAs need not be inconsistent with the transitivity of parthood. Carmichael (2015) has gone one step further and formulated a clearly defined SSA to the SCQ, one which he claims satisfies our common-sense intuitions about composition and overcomes van Inwagen’s objections.

c. Sorites Paradoxes and Sharp Cut-Off Points

A significant problem that affects all restrictivist positions (or, at least, virtually all of them; see section 3d below for one exception) is that they are susceptible to sorites-style arguments. This style of argument takes its name from the ancient sorites paradox, or the paradox of the heap (soros, from which the term ‘sorites’ derives, is Greek for ‘heap’). The paradox, which is usually attributed to the Greek philosopher Eubulides, is simple to set up. First, consider a single grain of rice. It seems quite clear that a single grain of rice is not a heap of rice, and neither are two grains, nor three. But if we had ten thousand grains, we would most certainly have a heap. The paradox arises because it is difficult, if not impossible, to state the precise point at which the heap emerges. The crucial thought that drives the paradox is that a single grain of rice, it is supposed, is simply not significant enough to make the difference between a heap and a non-heap. Adding or removing just one grain of rice could never create or destroy a heap. But if this is right, it seems to follow that if you start off without a heap, then by adding grains one at a time, you will never be able to make a heap, no matter how many grains you add. Conversely, if you start off with a heap, then by removing grains one at a time, you will never get rid of the heap, even if you were to remove all the grains! Thus, the paradox ensues.

The force of the sorites paradox strikes right at the heart of compositional restrictivism. To see why, just consider any ordinary composite object; let us say a chair. That chair will be composed of many billions of atoms, each one of which will be very small indeed. Now when you consider just how small a single atom actually is, it seems quite clear that the difference of a single atom could not possibly make the difference of there being a chair or there not being a chair. To suppose otherwise seems, frankly, preposterous. But, now suppose that, with some ultra-high-precision tweezers, you began the long and laborious task of removing atoms from the chair, one by one. Eventually, you would reach a stage at which you had removed all the atoms except one; at which point, you would clearly no longer have a chair in front of you. (A single atom doth not a chair make.) What seems to follow from all this is that there must be a cut-off point at some stage of the atom-removal process at which the removal of a particular atom makes the composite object—the chair—suddenly cease to exist. To many, however, this is a simply fantastical proposal! To suppose that a single, nugatory atom could make the difference between a chair’s existing and not existing is a very hard conclusion to swallow.
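The step from ‘the chair exists at the start but not at the end’ to ‘there must be a cut-off point’ can be made concrete with a small sketch. It assumes, purely for illustration, that whether some atoms compose a chair is a yes-or-no matter fixed by how many atoms remain; the names `find_cutoff` and `toy_is_chair` and the threshold of 600 are hypothetical and are not drawn from the literature. The point is only that any such two-valued predicate that holds at the start and fails at a single atom must flip somewhere.

```python
from typing import Callable, Optional


def find_cutoff(is_chair: Callable[[int], bool], start: int) -> Optional[int]:
    """Scan downward from `start` atoms and return the first count n such
    that n atoms compose a chair but n - 1 atoms do not.

    Returns None only if no such flip occurs between `start` and 1.
    """
    for n in range(start, 1, -1):
        if is_chair(n) and not is_chair(n - 1):
            return n
    return None


def toy_is_chair(n: int) -> bool:
    # Hypothetical, arbitrary threshold chosen only for illustration.
    return n >= 600


# A chair at 1000 atoms and no chair at 1 atom: a sharp cut-off must exist.
print(find_cutoff(toy_is_chair, 1000))  # prints 600
```

Nothing in the sketch says where the threshold should be; that is exactly the arbitrariness the sorites-style argument presses against the restrictivist.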

These sorts of considerations, concerning sorites-style arguments and sharp cut-off points, have led many to believe that restricted composition, in any of its possible guises, is untenable. Peter Unger (1979; 1980) is perhaps the most notable advocate of using sorites-style arguments against the existence of ordinary objects.

It remains to be said, however, that although these sorites-style arguments certainly have force, they are not without opposition. Both Korman (2015) and Carmichael (2011), for instance, have articulated responses to the arguments and maintain a resolute conviction that composition is restricted.

d. Brutal Composition

Ned Markosian is one of the few philosophers who have persevered with restrictivism. In a 1998 paper, he outlines and argues for a novel view which he calls ‘Brutal Composition’. According to this view, ‘there is no true, non-trivial, and finitely long answer to the SCQ’ (Markosian, 1998, 213).

Instead, Markosian claims that whether composition does or does not occur in any given case is simply a brute fact. That is to say, it is a fact, but it does not obtain in virtue of any other facts, and there can be no illuminating explanation of why it obtains. It is a fact, and that is just the way it is.

On this view, then, the iron girders in Paris do compose the Eiffel Tower, and the limestone blocks in Giza do compose the pyramid. Likewise, it is also true that the Eiffel Tower and the Pyramid, taken together, do not compose any further object. These are just some of the facts about composition that obtain in the world. According to Markosian, however, there is no principled explanation of why these facts obtain. They just do.

An advantage of Markosian’s view, he claims, is that it is capable of accommodating all of our common-sense intuitions about composition (although this could be resisted—see below). Ordinary composite objects really do exist, and exotic, gerrymandered composite objects (like the object composed of the Eiffel Tower and the Great Pyramid) do not.

Likewise, brutal composition has a clear answer to the sorites-style arguments that we encountered above. There are sharp cut-off points between cases of composition and non-composition; a single atom really can make the difference between a chair’s existing and not existing. We do not know exactly where the cut-off points will lie, of course, but they will be there somewhere. And there is no pressure on the brute compositionalist to explain why a cut-off point lies where it does, precisely because compositional facts are brute; they admit of no further explanation. As Markosian notes, the brute compositionalist ‘can just shrug and say, “there is no reason. It is a brute fact”’ (Markosian, 1998, 37).

However, there are a number of reasons one might be suspicious about brutal composition. The main reason is that Markosian’s only real motivation for endorsing the view is that it is meant to be the only theory capable of preserving our common-sense intuitions about composition. The problem is, however, that it is not at all clear that it actually does this.

For instance, as James Van Cleve has pointed out, common sense may well point to the fact that composition is restricted, but it also surely points to the fact that there is a reason why it is restricted (Van Cleve, 2008, 333). Yes, common sense suggests that the Eiffel Tower exists, and that it is composed of iron girders, but it also suggests that there is a reason it exists, namely, that it was purposely built, and that the parts are fixed together in an appropriate way, and so on and so forth. It is not by sheer arbitrary chance that these items compose a tower, or so common sense would have it.

According to brutal composition, there is no reason why some objects compose; that is what it means to say that compositional facts are brute. It therefore follows that the arrangement of the iron girders, and the way they are fixed together, has nothing to do with the fact that they compose the Eiffel Tower. We could dismantle the tower completely, we could fire the girders into the furthest depths of the universe, but according to brutal composition, they would still compose an object. But this could hardly be said to be consistent with our intuitions about composition!

It is for reasons such as this, as well as the more general concern that it seems just too ad hoc, that Brutal Composition has not proved at all popular among those philosophers who have worked on this topic.

e. Concluding Remarks

Because of the problems raised above, the majority of writers on this topic have concluded that compositional restrictivism, in any of its guises, is an untenable position. There are exceptions to this, of course, with van Inwagen, Markosian, and Korman being notable among them, but these exceptions are undoubtedly in the minority. Our initial intuitions may well point to the fact that composition is restricted, but close philosophical analysis reveals that a principled theory that can accommodate such intuitions seems very difficult, if not impossible, to come by.

But if this majority is correct, and material composition is not restricted, then we are left only with what van Inwagen called the ‘extreme answers’ to the SCQ (van Inwagen, 1990, 72). That is, one must say that composition always occurs (that is, endorse compositional universalism) or say that composition never occurs (that is, endorse compositional nihilism). Of these two options, it is the former that has proved the more popular among contemporary philosophers; indeed, it would probably be fair to say that universalism is the default view. (Although this may be beginning to change: in very recent years, nihilism has begun to grow in popularity.)

One of the main advantages that both universalism and nihilism wield over restrictivism, and one of the main reasons they are the most popular answers to the SCQ, is that they are completely unaffected by the sorites-style arguments articulated above, in section 3c. For neither answer has to state where the cut-off points will lie between cases of composition and cases of non-composition, because neither answer admits that there are such points. According to universalism, there are no cases of non-composition, and according to nihilism, there are no cases of composition, thus neither theory admits the existence of cut-off points.

The following two sections will give an overview of both universalism and nihilism, and the main arguments that have been given for and against them.

4. Compositional Universalism

Compositional Universalism (CU) can be defined as follows:

(CU): for any xs whatsoever (where those xs are material objects), there is a further material object, y, which those xs compose.

For the reader unfamiliar with this debate, it may come as something of a surprise to learn that the view of the informed majority is that compositional universalism is true. The reason for this is that the truth of universalism implies the existence of a vast number of weird and wonderful composite objects. After all, if universalism is true, then for any collection of material objects whatsoever, there will be a further object that they compose. Thus, there will be a material object composed of your favourite shirt, Donald Trump’s hair, and the top half of the planet Mars. And it would turn out that there is, after all, an object composed of the Eiffel Tower and the Great Pyramid of Giza. Universalism is entirely indiscriminate. It matters not how disparate or incontiguous two objects may be; according to universalism, they will compose something. Despite this rather unusual fact, however, universalism remains a popular view.

a. Arguments for Universalism

i. The Argument from Elimination

There is an argument for universalism which seems to hold considerable sway with a number of philosophers, even though it is rarely explicitly stated. It is an argument from elimination, and it consists of two claims. The first claim is that composition is not restricted (based on the type of consideration covered in the previous section), and the second claim is that composition clearly occurs in some cases (for example, I exist, and I am composed of parts). The conjunction of these claims is taken to entail the truth of universalism:

1. Composition is not restricted.
2. Therefore, composition must either always occur or never occur.
3. Composition definitely occurs in some cases.
4. Therefore, composition must always occur.
5. Therefore, compositional universalism is true.

David Lewis has endorsed precisely this type of argument. He says: ‘no restrictions on composition can serve the intuitions that motivate it. So restriction would be gratuitous. Composition is unrestricted’ (Lewis, 1986, 213). Ted Sider has also advanced an argument similar to this (see Sider, 2001, 120–132. It is interesting to note, however, that Sider has now changed his view and endorses compositional nihilism).

The argument appears clearly valid, but in premise 3, it includes a significant assumption. Many, like Lewis, think that premise 3 is obviously true. Indeed, you will note in the above quote from Lewis that he does not even state anything like premise 3. He jumps straight from the claim that composition is not restricted to the conclusion that it must be unrestricted. The truth of premise 3 must have been so obvious to Lewis as to be not worth mentioning.
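The claim that the argument is valid, and that premise 3 is doing real work, can be checked mechanically in a toy model. The sketch below is an illustration under stated assumptions rather than anything found in Lewis or Sider: it supposes a fixed, finite number of collections of objects, treats each possible ‘world’ as an assignment of composes or does-not-compose to every collection, and reads ‘restricted’ as ‘composition occurs in some cases but not all’. The function name `argument_is_valid` and its parameters are hypothetical.

```python
from itertools import product


def argument_is_valid(n_cases: int, use_premise_3: bool = True) -> bool:
    """Check the argument over every way composition could go in n_cases.

    Each world is a tuple of booleans, one per collection of objects:
    True means that collection composes something.
    """
    for world in product([False, True], repeat=n_cases):
        restricted = any(world) and not all(world)
        premise_1 = not restricted                         # not restricted
        premise_3 = any(world) if use_premise_3 else True  # occurs somewhere
        conclusion = all(world)                            # always occurs
        if premise_1 and premise_3 and not conclusion:
            return False                                   # counterexample
    return True


print(argument_is_valid(4))                       # True
print(argument_is_valid(4, use_premise_3=False))  # False
```

With premise 3 in place no counterexample exists; without it, the world in which nothing composes (compositional nihilism) satisfies the remaining premise while falsifying the conclusion.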

However, for many philosophers, premise 3 is not obviously true, and cannot simply be assumed. One reason to think this is that once we reject compositional restrictivism, then we seem to reject most (if not all) of our common-sense intuitions about composition along with it. As such, it looks questionable to make any assumptions about whether composition does or does not occur in any given case. If these assumptions are given up, then the above argument loses its force, and collapses into a mere restatement of the fact that composition is not restricted.

ii. The Argument from CAI

It has been suggested that composite objects are identical to their parts taken together. That is to say, if a composite object, o, is composed of some parts, the xs, then o is not an additional object to the xs; it just is the xs, taken collectively. This thesis has come to be known as Composition as Identity (CAI), and has its most notable proponent in Donald Baxter, who has provided some compelling examples in its support. For instance:

Someone with a six-pack of orange juice may reflect on how many items he has when entering a ‘six items or less’ line in a grocery store. He may think he has one item, or six, but he would be astonished if the cashier said ‘Go to the next line please, you have seven items’. We do not ordinarily think of a six-pack as seven items, six parts plus one whole. (Baxter, 1988, 579)

The thought is, therefore, that composite objects are identical, in the strict sense of numerical identity, to the parts that compose them. The six-pack literally is the six bottles taken together—nothing more, and nothing less. (See Wallace, 2011, for a nice introduction to the topic of CAI, and Baxter and Cotnoir (eds.) 2014 for more in-depth discussion.)

Moreover, it has also been suggested, by Trenton Merricks, that CAI entails universalism. That is, if CAI is true, then universalism must also be true. CAI, therefore, offers another potential line of argument in favour of universalism (albeit, a line of argument that is dependent on the truth of CAI).

The thrust of the argument is that if composition is identity, then the fusion of any objects just is those objects taken together. So for any objects whatsoever, you automatically get their fusion, because their fusion just is those objects. Trenton Merricks puts forward just such a proposal, making the seemingly plausible claim that ‘it seems nonsensical to deny the existence of something that would, if it existed, be (identical with) things whose existence one already affirms’ (Merricks, 2005, 629; it should be noted, however, that Merricks does not endorse universalism: although he does claim that CAI entails universalism, he does not believe that CAI is true). But that is precisely what someone would be doing if they endorsed CAI but did not endorse universalism, or so the argument goes. Therefore, we are led to conclude that if CAI is true, universalism must be true also. To illustrate, consider once more our six-pack of orange juice. First, suppose that you accept, unreservedly, the existence of the six individual bottles of juice. Now according to CAI, the six-pack (the whole) just is the six bottles taken together, nothing more, and nothing less. So given the fact that you accept the existence of the six bottles, you already accept the existence of the six-pack. And the same goes for any collection of objects you can think of. It is as simple as that: CAI entails universalism.

There are two potential problems with the argument from CAI. First, as Ross Cameron has argued, there are reasons to think that the argument is not valid. Cameron’s central point is that CAI is a thesis about the nature of composition (that is, it tells us what composition is—identity), but it does not tell us when composition does and does not occur. For CAI tells us that when there is a composite object, that object is identical to its parts taken together. Furthermore, it tells us that when some objects are, taken together, identical to some single object, then they compose that object. Crucially, however, it does not tell us when some objects are identical to a single object and when they are not. As Cameron says: ‘[CAI] does not tell us whether, given some xs, they in fact compose; it only settles the biconditional: they compose iff there is some one to which they are identical’ (Cameron, 2012, 534). In order for CAI to entail universalism, one must already assume that given any xs whatsoever, there is a single object to which those xs are identical—in other words, that there is a single object which those xs compose. But that is just to beg the question in favour of universalism.

The second problem with the argument is that CAI itself is a highly controversial thesis. Indeed, for many, CAI is not just controversial, but incoherent. The main problem with it is that it seems to twist and contort the standard understanding of the relation of identity to unacceptable extremes. For instance, it appears that CAI violates Leibniz’s law, which states that if x = y, then anything true of x must also be true of y, and vice versa. If CAI is true, then it seems that this principle no longer holds. To see why, consider again our six-pack of juice. CAI says the six-pack is identical to the six bottles of juice. But the six-pack is a single object, whereas the six bottles are six objects. Therefore, it looks like something is true of the six bottles (that is, they are six) which is not true of the six-pack (that is, it is one), which is a violation of Leibniz’s law.
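The objection can be set out schematically. The notation is assumed here for illustration only: s names the six-pack, bb names the six bottles taken together, and Six(x) abbreviates ‘x is six in number’. The first line is Leibniz’s law as stated above; the remaining lines apply it to the six-pack.

```latex
% Leibniz's law: identicals share all their properties
\forall x \, \forall y \, \bigl( x = y \rightarrow \forall F \, ( F(x) \leftrightarrow F(y) ) \bigr)

% Application to the six-pack
s = bb                                   % from CAI
\mathrm{Six}(bb) \land \lnot \mathrm{Six}(s)   % the bottles are six; the six-pack is one
\mathrm{Six}(bb) \leftrightarrow \mathrm{Six}(s)   % from the first two lines
\bot                                     % contradiction
```

A defender of CAI must therefore reject one of these lines, for instance by denying that ‘being six’ is true of the bottles in the same sense in which it is false of the six-pack.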

b. Arguments against Universalism

i. The Gratuitousness of Universalism

An often-noted drawback of universalism is that it posits the existence of too many objects. The ontology of the universalist is vast. The reason for this is that universalism states that for any collection of objects whatsoever, there will always be an object which those objects compose. It should be quite clear, therefore, that universalism implies the existence of a simply astronomical number of objects. For some, this objection is enough to reject universalism out of hand. Markosian, for instance, claims, ‘there is what seems to me to be a fatal objection to universalism: universalism entails that there are far more composite objects than common sense intuitions allow. […] On the basis of this objection, I reject universalism’ (Markosian, 1998, 22–23).
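To give a rough sense of the scale involved, the following sketch counts the composites universalism would deliver from a given stock of partless objects. It rests on assumptions that go beyond the text: that the n starting objects do not overlap, that only collections of two or more count as composing a further object, and that distinct collections compose distinct objects. On those assumptions the number of composites is 2^n − n − 1. The function names are hypothetical.

```python
from itertools import combinations


def composite_count(n_simples: int) -> int:
    """Number of collections of two or more simples: 2^n - n - 1."""
    if n_simples < 0:
        raise ValueError("n_simples must be non-negative")
    return 2 ** n_simples - n_simples - 1


def composite_count_by_enumeration(n_simples: int) -> int:
    """Count the same collections directly, as a check for small n."""
    simples = range(n_simples)
    return sum(
        1
        for size in range(2, n_simples + 1)
        for _ in combinations(simples, size)
    )


print(composite_count(3))    # 4: three pairs plus one triple
print(composite_count(10))   # 1013
print(len(str(composite_count(100))))  # 31 digits for just 100 simples
```

Since the number of composites roughly doubles with each additional simple, even a modest stock of basic objects yields far more composites than common sense recognises.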

There are two main strategies universalists employ to overcome this objection. The first, endorsed by the likes of David Lewis and David Armstrong, is to say that, although universalism does posit a vast number of composite objects, this should not count against the theory because these composite objects are taken to be ontologically innocent.

The idea here is that composite objects do not contain any extra matter, over and above their constituent parts, and therefore they somehow come for free, ontologically speaking. Armstrong, for instance, tells us that ‘mereological wholes are not ontologically additional to their parts’ (Armstrong, 1997, 12), whilst Achille Varzi states, ‘the whole and the parts encompass the same amount of reality and should not, therefore, be listed separately in an inventory of the world’ (Varzi, 2000, 285). David Lewis, too, echoes these sentiments by saying ‘it would be double counting to list the cats and then list their fusion’ (Lewis, 1991, 81).

The main problem with this strategy is that the notion of ontological innocence is somewhat mysterious; that is, it is not clear what it is meant to consist in. If a table, for instance, is taken to be ontologically innocent, yet one of the atoms that composes it is not, then are these two entities supposed to exist in the very same sense? If so, then it is not clear why only one of them should ‘count’, ontologically speaking. But if not, then one might think that we need a clearer explanation of what this existential difference actually consists in. This suggests that the notion of ontological innocence is perhaps not as informative as it needs to be.

It is perhaps worth mentioning, however, that this objection to ontological innocence loses its force if the proponent of ontological innocence also endorses CAI. After all, if you already accept the existence of some parts, then accepting their fusion does not seem like an extra ontological commitment if it is identical to those very parts. Without the addition of CAI, however, the problem persists.

The second strategy that has been proposed by universalists, in response to the charge of ontological gratuity, is to simply bite the bullet. That is, admit that universalism is not very parsimonious with respect to the number of composite objects it posits, but then deny that parsimony in that respect is particularly important. This line has been taken by Lewis, who makes a distinction between quantitative and qualitative parsimony. Qualitative parsimony is concerned only with the number of types of entity that a theory posits, whereas quantitative parsimony concerns the number of tokens of those types. Lewis has argued that only qualitative parsimony is an important theoretical virtue; once you have admitted a particular type of entity into your ontology (for example, composite objects), then it does not matter how many tokens of that type your ontology contains. Given that most of us already accept the existence of the type—material composite object—then it does not matter that universalism posits a lot of them; this should not count against the theory.

There are two potential sticking points for this response. The first is that some thinkers, such as Daniel Nolan (1997), have argued that quantitative parsimony is in fact a theoretical virtue. If these thinkers are right, then Lewis’s response looks clearly flawed. The second thing to note is that compositional nihilists are likely to remind the universalist that they do not countenance material composite objects at all. Therefore, even if we ignore quantitative parsimony, nihilism has the advantage of being qualitatively more parsimonious than universalism, since it posits one fewer type of thing.

ii. The Counter-Intuitiveness of Universalism

A different objection that is sometimes levelled at universalism is that it flies in the face of common sense. The vast majority of the composite objects that universalism posits are just not the sort of object that common sense would countenance. Think of any collection of objects you like—no matter how random, how disparate, and how disconnected they may be, there will, according to universalism, be a further object they compose. As Lewis (1991, 7–8) reminds us, universalism admits the existence of trout-turkeys: entities composed of the undetached front half of a trout, and the undetached rear half of a turkey. Some may think, therefore, that these sorts of objects simply make universalism too counter-intuitive to be true.

Lewis, however, has a solution ready at hand. He claims that in ordinary thought and talk we restrict the domain of our quantifiers such that they range only over the ordinary objects of common sense, and not over extraordinary, gerrymandered objects such as trout-turkeys. It is only because of this that universalism seems so counter-intuitive.

In Lewis’s defence, we do often use quantifiers in a restricted sense in ordinary communication. For instance, if a mugger stole your wallet, you may tell the police that he stole all your money. But you would not literally mean all your money. (Presumably, the mugger did not empty your bank account and gather all the loose change from the back of your sofa.) What you would have meant, of course, is that the mugger stole all the money you had with you at the time. Thus, you would have been tacitly restricting the domain of your quantifiers such that they ranged only over the contents of your wallet, or perhaps over whatever you had on your person. Once this is recognised, it becomes clear that we actually employ restricted quantification all the time. (Note: that does not actually mean all the time.)

Lewis suggests this is what happens when we talk about composite objects. We tacitly restrict our domain of quantification such that it includes only those composite objects recognised by common sense, and does not include exotic composites like trout-turkeys:

Restrict quantifiers not composition. […] We have no name for the mereological sum of the right half of my left shoe plus the moon plus the sum of all her Majesty’s ear-rings, except for the long and clumsy name I just gave it; we have no predicates under which such entities fall, except for technical terms like ‘physical object’ (in a special sense known to philosophers) or blanket terms like ‘entity’ and maybe ‘thing’; we seldom admit it to our domains of restricted quantification. It is very sensible to ignore such a thing in our ordinary thought and language. But ignoring it won’t make it go away. (Lewis, 1986, 213)

The restricted quantification strategy is quite popular among universalists, but it is not without its problems. A central problem with it is that it looks prima facie implausible (see Korman, 2007). Returning to our example of the mugger, imagine that a particularly meticulous police officer responded to your claim with an arched eyebrow and asked, ‘you really mean he stole all your money; every last penny you owned?’. You may well be exasperated by such a response, but you would probably understand what the officer meant. You would simply have to re-iterate more precisely that you meant the mugger stole all the money that was in your wallet.

But now suppose, on telling the officer that there were precisely two items in the wallet—two twenty-pound notes, say—he were to respond, ‘only two items, you say? But what about the object that those two notes compose? And what about the object composed of the left half of one note and the right half of the other?’. Such a question would not exasperate, but completely befuddle! It seems highly implausible that one might casually respond, ‘Oh, sorry, I didn’t realise you were counting those types of object too’.

What these observations suggest is that although we certainly do restrict our quantifiers in certain circumstances, it usually takes only minimal reflection (or perhaps a prompt from someone, like a fussy police officer) for us to realise, and to accept, that we are doing so. There is no controversy there; it is just something that we do. In contrast, it appears much more controversial to suggest that we regularly restrict our quantifiers to exclude exotic composite objects. For if you tried to point out to someone that they were doing that, it is unlikely that they would even understand what you were talking about, let alone accept that what you said was true. Moreover, even once you had explained what you meant, it is still unlikely that they would accept what you had said. Much more likely is that they would simply insist that the exotic composites you were attempting to refer to did not exist. In light of this, some, like Korman, claim that it stretches the limits of credibility to suggest that, in ordinary thought and talk, we restrict our quantifiers so as to exclude exotic composites.

iii. The Argument from Primitive Cardinality

Juan Comesaña (2008) has presented an argument against universalism on the grounds that it places unacceptable restrictions on the number of material objects that a world could contain. More technically, universalism conflicts with a principle that he calls primitive cardinality (PC).

(PC): For any n, there could have been exactly n material things.

PC simply states that there is a possible world containing just one material thing, a possible world containing just two material things, a possible world containing just three material things, and so on and so forth, for every positive integer. Comesaña makes the plausible claim that PC seems obviously true. After all, why could there not be a possible world with just seven material objects in it, for instance, or any other whole number? There seems no good reason to think that this could not be the case.

However, according to universalism, PC is false. For instance, it is impossible, if universalism is true, to have a world in which there are just two material objects. For according to universalism, if you have two objects, you always get a further object that they compose. Thus, it is impossible to have a two-object world, because there will automatically be a third object at such a world: the mereological fusion of those two objects.

Furthermore, universalism does not only rule out the possibility of two-thing worlds, but it also rules out the possibility of four-thing worlds, five-thing worlds, six-thing worlds, eight-thing worlds, and countless more. The reason for this is that with the addition of each individual simple, there will also be the automatic addition of numerous fusions composed of the previously existing simples and the newly added simple. More precisely, for any world with a particular number of simples, n, the total number of material things (that is, simples and fusions) at that world will be 2ⁿ-1. Therefore, universalism is incompatible with PC.
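The arithmetic behind this claim can be checked with a short sketch. It assumes, as the formula above does, a world with finitely many simples in which every non-empty collection of simples has exactly one fusion (a single simple counting as its own fusion). The function names are illustrative only and are not drawn from Comesaña's paper.

```python
def total_things(n_simples: int) -> int:
    """Material things (simples plus fusions) at a world with n simples,
    assuming one fusion per non-empty collection of simples."""
    return 2 ** n_simples - 1


def permitted_totals(limit: int) -> list[int]:
    """Totals up to `limit` that universalism allows a world to contain."""
    totals = []
    n = 1
    while total_things(n) <= limit:
        totals.append(total_things(n))
        n += 1
    return totals


def excluded_totals(limit: int) -> list[int]:
    """Totals up to `limit` that PC allows but universalism rules out."""
    allowed = set(permitted_totals(limit))
    return [k for k in range(1, limit + 1) if k not in allowed]


print(permitted_totals(20))  # [1, 3, 7, 15]
print(excluded_totals(10))   # [2, 4, 5, 6, 8, 9, 10]
```

On these assumptions, the only finite totals universalism permits are 1, 3, 7, 15, and so on, which is why two-, four-, five-, six-, and eight-thing worlds are all excluded.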

How seriously one takes this argument will depend on the strength of one’s conviction in the truth of PC. Comesaña claims that intuition supports the truth of PC. On his view, we have ‘particular pre-theoretical judgments that there could have been exactly two things, and exactly three things, and…’, whereas universalism is supported only by abstract and theoretical principles. Moreover, he claims that it is ‘standard methodological procedure’ in many areas of philosophy to give precedence to pre-theoretical judgements over general theoretical principles, when they conflict. He concludes that these judgements constitute prima facie evidence in favour of PC (Comesaña, 2008).

The argument from PC is unlikely to be considered fatal to universalism. After all, the universalist can just bite the bullet and admit that it is simply a consequence of the theory that PC is rendered false. This may well violate an intuition we have, but it is not clear how strong an intuition that is in the first place. Moreover, if compositional restrictivism is false, we have already had to concede that many of our intuitions about material objects are false, so one further concession may not be that hard to take.

Finally, the universalist can remind us that although her theory renders PC, as stated above, false, it is perfectly compatible with a similar principle, which one could call the primitive cardinality of simples (PCS).

(PCS): For any n, there could have been exactly n simples.

Universalism is perfectly compatible with PCS, and, indeed, it may well be PCS, not PC, that our pre-theoretical judgements are driving at.

iv. The Identity Argument

One final argument against universalism suggests that the universalist owes us answers to some particularly tricky questions concerning the identity of composite objects. The argument was originally proposed by van Inwagen (1990, 75), but the version presented below is a modified, somewhat more neutral version of his.

The argument rests on the fact that according to universalism, any collection of objects composes a further object, regardless of any facts concerning those objects’ nature, their locations, or the spatial or causal relations that hold between them. Indeed, according to universalism, it is enough that two objects merely exist, that they compose a further object. No other conditions need be satisfied.

Given this fact, the argument can be set up as follows. Consider an ordinary composite object, let us say, a tree, and let us call this tree, ‘Spruce’. According to universalism, Spruce is a composite object and is composed of a large number of simples (sub-atomic particles, or what-have-you) that are arranged in a tree-like fashion. If we call the fusion of those simples, ‘F’, we can say that Spruce = F.

Now suppose that a bolt of lightning were to strike Spruce and vaporise it. The force of the bolt was such that Spruce was completely destroyed, and all her constituent simple parts were scattered far and wide throughout the surrounding area.

In this eventuality, it would seem quite clear that Spruce no longer exists. If you were to look at the exact spot of the incident, there would be no tree present. However, F does still exist. The simples that composed Spruce have not been destroyed but merely rearranged, scattered far and wide. But according to universalism, their spatial location does not affect their compositional status—they still compose the very same fusion they composed before. Thus, we now have a situation in which F exists, but Spruce does not. But this contradicts our earlier claim that Spruce = F. For if x = y, it is impossible for x to exist although y does not.
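The structure of the argument so far can be set out schematically. The notation is introduced here purely for illustration and is not van Inwagen's: s stands for Spruce, f for the fusion F, E(x, t) for ‘x exists at time t’, and t₂ for a time after the lightning strike.

```latex
\begin{align*}
&\text{(1)}\quad s = f
  && \text{assumption: Spruce is identical to } F \\
&\text{(2)}\quad \forall x\,\forall y\,\bigl(x = y \rightarrow \forall t\,(E(x,t) \leftrightarrow E(y,t))\bigr)
  && \text{identical things exist at the same times} \\
&\text{(3)}\quad E(f,t_2) \wedge \neg E(s,t_2)
  && \text{after the strike, } F \text{ exists but Spruce does not} \\
&\text{(4)}\quad E(s,t_2)
  && \text{from (1), (2), and the first conjunct of (3)} \\
&\text{(5)}\quad \bot
  && \text{(4) contradicts the second conjunct of (3)}
\end{align*}
```

Since (2) is an instance of the indiscernibility of identicals and (3) follows from universalism together with the description of the case, the premise most naturally given up is (1).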

The upshot of this argument is that although universalism does posit lots of mereological fusions (like F), these fusions are clearly not the ordinary objects of common sense (like Spruce). This is because these fusions are virtually indestructible—you can scatter their parts to the furthest corners of the known universe, and they will still exist. But the same cannot be said of trees, like Spruce, or indeed any ordinary objects of common sense. In light of all this, it looks like the universalist needs to answer two particularly difficult questions:

What are ordinary objects, if not mereological fusions of simples?
Why should we accept the existence of all these peculiar mereological fusions, if they do not include, after all, the ordinary objects of common sense we thought they did?

There are a couple of ways in which the universalist could respond to this argument. The first is to endorse a relation of constitution, and the second is to endorse four-dimensionalism.

On the first option, the universalist would deny the premise in the argument that states Spruce = F. Instead, ordinary objects are not taken to be identical to mereological fusions, but constituted by mereological fusions. (Recall the discussion of material constitution in section 1b). According to this view, one can say that F constitutes Spruce whilst its parts are arranged in a tree-like fashion, but when the parts are spread far and wide, after the lightning bolt, F no longer constitutes Spruce.

Although this view certainly overcomes the argument, it leaves many questions unanswered. First and foremost, what is this relation of constitution meant to be? Moreover, if it is the case that F constitutes Spruce at some points of its existence but not others, it implies that constitution is restricted (in the same sense that composition was taken to be restricted in section 3). But this seems to leave the view open to the sorites-style arguments we encountered earlier, that is, where will the cut-off points lie between cases of constitution and cases of non-constitution? It also seems to invite a question similar to the SCQ, that we could call the Special Constitution Question; that is, under what conditions does some object, o, constitute an F? This question may well prove to be just as difficult to answer as the original SCQ.

The second option for the universalist would be to endorse four-dimensionalism: the view which states that material objects are extended through time, in much the same way they are extended through space. Hence, material objects are four-dimensional (extended in the three dimensions of space, and the fourth dimension of time). As such, material objects have not only spatial parts, but also temporal parts.

A consequence of this view is that objects are not wholly present at any particular moment of time. Rather, they merely have a temporal part that is wholly present. To illustrate, consider an analogy. The river Thames is not wholly present at London Bridge. Rather, only a part of the river exists there. The entire river stretches all the way from the Cotswolds to the North Sea. In the same way, four-dimensionalists would say that the river Thames is not wholly present at any given time. Rather, only a (temporal) part of it is. The whole river stretches (temporally) all the way from that moment of time at which it came into existence, to that moment of time at which it will cease to be.

Interestingly, just like the constitution theorist, the four-dimensionalist will deny the premise which claims Spruce = F, but for very different reasons. Instead, Spruce and F are taken to be distinct, four-dimensional objects that merely share some temporal parts (in the way that two distinct streets could share some spatial parts, at the region at which they cross one another). Specifically, they share the temporal part at which all the parts of F are arranged in a tree-like manner. So according to this view, there are not two distinct objects located in the same place at the same time. Rather, at any time t at which the parts of F are arranged in a tree-like manner, there is a single object present, which is a (temporal) part of two distinct objects, Spruce and F.

Each of these two responses does enough to overcome the identity argument, but they both represent a cost to the universalist. Thus, although it is not insurmountable, the identity argument seems to show that accepting universalism (which is already a controversial metaphysical thesis) forces one into accepting at least one other controversial metaphysical thesis: either the constitution view or four-dimensionalism. This is unlikely to be considered a fatal cost, but it is a cost that must be recognised nonetheless.

5. Compositional Nihilism

The remaining answer to the SCQ is compositional nihilism (CN):

(CN): for any xs (where those xs are material objects), there is never a further material object which those xs compose.

More simply put, according to nihilism, there are no material composite objects at all; all material objects in existence are mereologically simple.

Nihilism, on the face of it at least, is even more radical than universalism. Think of any object at all that you consider to be composite, that is, to have parts. According to the nihilist, it does not exist. For the nihilist, there are no tables, there are no buildings, there are no planets or stars. There are not even any human beings. (That is, so long as you take such entities to be composite). For this reason, nihilism is often dismissed as obviously false. Any theory which entails the view that there are no human beings is obviously false, or so one might well be tempted to think. However, nihilism has recently been growing in popularity and has been defended in print by a number of philosophers (for instance, Cameron, 2010; Sider, 2013; Cornell, 2017). These philosophers tend to claim that these supposedly absurd consequences of the view (for example, that there are no human beings) are not, in fact, as absurd as they may seem. Once the view is properly understood, they maintain, these apparent absurdities can easily be explained away.

a. Arguments for Nihilism

i. The Causal Overdetermination Argument

One type of argument that has proved to be quite influential in the debate over material composition is that which suggests we should reject the existence of composite objects because, if there were any such things, they would be causally redundant.

Causal redundancy arguments of this ilk are probably more familiar within the philosophy of mind, as they have often been employed in support of physicalism. The idea is that we can give a full causal explanation of human action in terms of the physical states and processes that occur in the brain. As such, there is no need to posit any non-physical, mental entities, as such things would have no causal role to play; they would be causally redundant. (See Kim, 1993.) A similar type of argument can be formulated in support of nihilism. That is, we can give a full causal explanation of any physical event solely by appealing to the microphysical particles involved, their properties, and the relations in which they stand. Thus, there is no need to posit any macroscopic, composite objects, because such things would have no causal role to play; they would be causally redundant.

Trenton Merricks has provided the clearest, and most forceful, version of this argument (Merricks, 2001). (It should be made clear, however, that Merricks uses the argument only to support a quasi-nihilistic view rather than a full-blown compositional nihilism. He does, for instance, allow that human beings exist and are material composite objects. However, he rejects the existence of all inanimate composite objects.) Central to Merricks’s argument is the notion of causal overdetermination. Causal overdetermination occurs when there are multiple, individually sufficient, causes for an event. That is, when an event has more than one cause, each of which would have been fully sufficient, on its own, to bring that event about. It is widely agreed that causal overdetermination is objectionable, and that we should avoid endorsing any theories which involve it (see, for instance, Bunzl, 1979; Loeb, 1974; Kim, 1993). Merricks seizes on this claim and uses it to argue against inanimate material composite objects.

To see how the argument works, consider Merricks’s example of a baseball smashing a window. The thought is that the activity of the atoms which are taken to compose the baseball is quite enough on its own to give a complete causal explanation of the shattering of the window. Therefore, if there exists a baseball in addition to the atoms, then that baseball cannot play any causal role in the shattering of the window; if it did, the shattering of the window would be causally overdetermined. We must therefore conclude that baseballs (and, by extension, all other material composite objects), if they were to exist, would have no causal powers at all.

Merricks completes the argument by making the seemingly plausible claim that material composite objects, like baseballs, surely would have causal powers if they existed. A baseball, if it existed, would be a physical object, with physical properties such as mass and so on and so forth. Thus, it would be implausible to suggest that such a thing would be causally inert; indeed, such a suggestion may well contravene basic laws of physics. As such, he argues, we have no option but to conclude that material composite objects, like baseballs, do not in fact exist.

There are ways in which the argument can be resisted. The most straightforward way of doing so is to simply allow that physical events (like the shattering of windows by baseballs) are in fact causally overdetermined, that is, that they are caused by composite objects and by the constituent parts of those objects. Allowing this would certainly undermine Merricks’s argument, but at the same time, it would also entail that there is widespread and systematic causal overdetermination in the world. For most, however, this conclusion is simply too unpalatable to accept.

A more sophisticated response has been offered by Amie Thomasson, who suggests that the argument is flawed because it is based on the incorrect assumption that composite objects are separate and independent entities from the simple parts of which they are composed (Thomasson, 2006). Thomasson accepts that causal overdetermination is highly objectionable, but only in those cases in which the two overdetermining causes are completely separate and independent from one another. (To take a well-used example, consider a person executed by firing squad, who is hit by two bullets at exactly the same time, each one of which was fully sufficient to kill them.) Thomasson claims, however, that composite objects are clearly not separate and independent from their constituent parts, thus the worry about causal overdetermination is misplaced.

Thomasson certainly has a point that there is a particularly intimate connection between a composite object and its parts. They are not separate and independent in the same way that the two bullets in the firing squad example are. It would be impossible to throw the baseball, for instance, without also throwing its constituent parts. However, provided that one does not endorse CAI, the baseball must be considered a distinct object from the parts that make it up. As a result, it is not obvious just how concerned we should be about the claim that both the baseball and its constituent parts have causal powers.

ii. The Problem-Solving Argument

Another point in favour of compositional nihilism is that it provides a straightforward solution to a number of long-standing problems generated by ordinary material objects. For instance, in section 1b, we considered the problem that arises when one considers a statue and the lump of bronze (or clay, or whatever) that it is made of. The puzzle emerges because it looks like we need to say that the statue and the lump of bronze are distinct objects, as they have different properties (for example, the lump existed before the statue did, and it would survive being squashed into a ball, whereas the statue would not). But this leads to the seemingly bizarre conclusion that we have two distinct objects (a statue and a lump of bronze) occupying exactly the same space at exactly the same time. Other recalcitrant problems which are similar include the Ship of Theseus, the case of Tibbles the cat (see Wiggins, 1968), and the problem of the many (see Unger, 1980).

Various potential solutions have been offered to these problems, but most of them involve the acceptance of some controversial metaphysical thesis or other, such as four-dimensionalism, or the constitution view. The compositional nihilist, however, avoids all these problems in their entirety. This is because, according to the nihilist, there are no composite objects at all. There are no statues, and there are no lumps of bronze, thus the question of how these things relate to one another never arises. Likewise, the nihilist does not have to worry about the problems of the Ship of Theseus or of Tibbles the cat, because there are no such things as ships or cats.

Compositional nihilists often point to this fact as providing support to their view: it offers a simple and elegant way of dissolving (or, rather, avoiding) all the problems generated by material constitution. Indeed, the nihilist could well go one step further and say that the only reason these puzzles have arisen at all is that we have been mistakenly assuming that statues, lumps, ships, cats, and so forth exist in the first place. The puzzles are a direct product of a confused and fallacious understanding of the world. Once we understand the true nature of the world (that is, that there is no such thing as material composition), the puzzles never even get off the ground.

The obvious counter-response to this argument, however, is that compositional nihilism is a far more extreme and controversial metaphysical thesis than any of those which are invoked to solve the problems of material constitution. On this view, therefore, the nihilist cuts off their nose to spite their face. Sure, nihilism might avoid these philosophically puzzling problems of material constitution, but it does so at the exorbitant cost of denying the existence of any ordinary material objects whatsoever. This, for some, is just far too high a cost to pay.

iii. The Argument from Ideological Parsimony

Ted Sider has recently put forward an argument for compositional nihilism based on what he calls ‘ideological parsimony’ (Sider, 2013). The argument appeals to a distinction, originally made by Quine, between a theory’s ‘ontology’ (which consists of the objects the theory posits) and its ‘ideology’ (which consists of the primitive, or unexplained, terms or notions the theory employs).

Arguments that appeal to ontological parsimony (that is, arguments which suggest one theory is better than another because it posits fewer objects) are fairly commonplace in philosophy. But Sider claims that similar arguments can be made which appeal to ideological parsimony (that is, arguments which claim one theory is better than another because it employs fewer primitive terms). Sider’s claim is that nihilism is not only ontologically more parsimonious than universalism (it posits fewer objects—only simples, and no composites), but it is also more ideologically parsimonious, because it can completely do away with the notion of parthood and the related mereological terms and concepts that go with it. The general idea is that this makes nihilism an ideologically simpler theory than universalism (or, indeed, than any theory that accepts the existence of any composite objects), and this should count in its favour.

One way to respond to Sider’s argument would be to accept it in spirit, but to question its strength. That is, it is not obvious how much weight one should afford the notion of ideological parsimony in the first place. If one is unconvinced of its importance, then the advantage offered by nihilism in this regard may seem marginal at best. But for those who place great value on ideological parsimony, by contrast, the argument might have considerable power. The jury, it seems, is still out on this issue (although see Cowling, 2013, for a defence of the virtues of ideological parsimony). A final point worth considering here is that one may well think that any advantage that nihilism gains in ideological parsimony is going to be outweighed by the various costs it incurs, such as the fact that it denies the existence of ordinary composite objects, like tables, chairs, and human beings.

b. Arguments against Nihilism

i. The Common-Sense Argument

By far the most common objection to compositional nihilism in the extant literature is one that appeals to common sense. It is simply obvious that composite objects exist, thus it is simply obvious that nihilism is false, or so the argument goes. This view is shared by many eminent thinkers (such as Markosian, 1998, 221; Schaffer, 2009, 358), and is perhaps best summed up by Michael Rea, who says: ‘it is just obvious that there are tables, chairs, computers and cars. The fact that some philosophical arguments suggest otherwise seems simply an indication that something has gone wrong with those arguments’ (Rea, 1998, 348).

This kind of argument shares many similarities with G. E. Moore’s famous, hand-raising, refutation of idealism. Essentially, the idea is that we can be far more certain of the common-sense fact that tables, chairs, and other composite objects exist, than we can of any of the abstract and theoretical premises employed in arguments for nihilism. Therefore, common sense should win out—we should accept the existence of ordinary composite objects and conclude that nihilism, regardless of any theoretical advantages it may offer, is false.

Those attracted to compositional nihilism have employed a number of different strategies to combat this objection. A theme that is common to many of them is that the common-sense objection is simply misjudged. That is, it misunderstands, or misconstrues, precisely what nihilism actually states. The point, which has been made by a number of contemporary thinkers (for instance, Sider, 2013; Cornell, 2017), is that although nihilism does deny the existence of ordinary objects like tables and chairs, it does not deny the existence of the physical matter that allegedly composes those objects. Once this fact is recognised, it becomes clear that nihilism does not violate our common-sense intuitions in the objectionable way it is often claimed to.

As an example, consider an ordinary composite object: a house. According to common sense, this house is made up of many parts. At base, these parts will be very small indeed, that is, some kind of sub-atomic particles or whatever our scientific theories tell us are the fundamental constituents of matter. The point is that the nihilist accepts the existence of all these sub-atomic particles. All she denies is that these particles compose some single, composite object: a house. When seen like this, the common-sense objection seems to lose some of its bite. As Cian Dorr has observed:

If all the plates in my kitchen dresser were to cease to exist, but all the molecules in my dresser were to stay arranged exactly as they are, I wouldn’t care very much. My guests would have no new reason to worry about their food getting all over the tablecloth. In fact, they would never know unless I told them—but come to think of it, I would never know either. (Dorr, 2002, 42–43)

In light of all this, there appears to be a question mark over just how much of a conflict there really is between compositional nihilism and common sense. Taken at face value, with its outright denial of all composite objects, nihilism seems about as controversial a theory as one could wish for, but once it is recognised that nihilism still acknowledges the matter that is taken to compose these composite objects, then the power of the common-sense objection seems to wane.

ii. The Argument from Emergence

A further argument against compositional nihilism is based on what has been called the problem of emergence. In its basic form, the argument begins with the claim that nihilism is incompatible with the existence of emergent properties. It then goes on to say that since there are very good reasons to believe that there are emergent properties, there are equally good reasons to think that nihilism must be false. The beginnings of an argument like this can be found in van Inwagen (1990) and Merricks (2001), but perhaps its clearest articulation is in Schaffer (2007).

To appreciate the force of the argument, one first has to understand what emergent properties actually are. An emergent property is a property of an object or system that cannot be explained or accounted for solely by the properties of that object’s parts. That is to say, emergent properties are taken to be things that are somehow over and above a mere combination of the properties and relations of their bearer’s base constituents. Most familiar properties are clearly not emergent in this sense. Take mass, for instance. Mass, like most properties, is reducible; one can explain the mass of an object or system reductively, in terms of the mass of each of its constituent parts. (For example, the mass of a 100kg pile of bricks can be explained or accounted for solely by the fact that each of the one hundred bricks in the pile has a mass of 1kg.)

Emergent properties, by contrast, resist this kind of reduction. If a property, F, of an object or system is emergent, then it cannot be explained or accounted for solely by an appeal to the properties and relations of its constituent parts. To illustrate, consider the water in a swimming pool. It has the property of being wet. But none of the individual H₂O molecules that make it up have that property (a single molecule is not wet). Thus, the property of wetness seems to emerge at the macro-level, and it cannot be reduced to a mere aggregation of properties and relations at the micro-level. (Note: this is just an illustration. It is far from clear whether wetness is, in fact, a genuinely emergent property.)

The most common arena in which emergent properties are postulated is the philosophy of mind and consciousness. The thought is that mental states (an excruciating pain, or a sharp pang of guilt, for example) are so entirely distinct in character from the electro-chemical, neurological properties that are instantiated by parts of the brain that they cannot be explained purely in terms of those properties. (Just as a single molecule of water is not wet, a single cell in the brain does not feel pain, guilt, love, and so forth.) They may well be caused by activity in the brain, but they emerge holistically as being far greater than the sum of their causal beginnings.

Another quite distinct field in which emergence plays a prominent role is quantum mechanics. Very roughly, the thought is that certain composite quantum objects or systems (often referred to as ‘entangled systems’) can exhibit properties that are quite inexplicable in terms of the object’s/system’s sub-atomic constituents alone (see Schaffer, 2007, for more details of emergence in quantum physics).

With this understanding of emergent properties in hand, it is only a short step to see why they cause such a problem for the nihilist. The reason is that emergent properties seem to imply a stratified picture of the world, whereby reality is divided up into levels of mereological complexity. At the base level, you have the mereological simples, and then you have higher levels populated by the composite objects those simples compose. Emergent properties are those which emerge at higher levels than the base level—that is, which are instantiated by composite objects—and which cannot be explained purely by appealing to the objects and properties at the base level. The problem for the nihilist is that they deny this stratification of reality: there is the base level and nothing else. As such, there are no candidate objects in the nihilist’s ontology that could have emergent properties. Quite simply, there is nowhere for emergent properties to emerge.

If this is right, and nihilism is incompatible with emergent properties, then given that there are good reasons to think that emergent properties do exist (and both quantum physics and philosophy of mind suggest that there are such reasons), these reasons also seem to suggest that nihilism is false.

There appear to be three possible ways in which the nihilist could respond to this charge. The first strategy would be to simply reject the possibility of emergent properties. But since this would conflict with popular views in both quantum physics and philosophy of mind, it is not a particularly attractive route to take. The second strategy would be to endorse a fairly radical form of compositional nihilism, known in the literature as existence monism, which claims that there is only a single material object in existence, the world itself, and that it is mereologically simple (that is, has no parts). Schaffer (2007) argues that this is the only way for the nihilist to overcome the problem of emergence (although it should be noted that Schaffer himself does not endorse existence monism). The problem with this strategy is that existence monism is considered by many to be even more extreme and implausible than standard nihilism. Schaffer himself, for instance, labels it a ‘crazy view’. However, see Horgan and Potrč (2008), or Cornell (2016), for recent defences of monism.

The final, and most promising, strategy would be to argue that nihilism is in fact compatible with emergent properties. This strategy has been put forward by Caves (2018) and Cornell (2017), who both argue that simples can collectively instantiate emergent properties, even if none of them individually instantiate that property, and that no composite objects are required in order for this to be possible. Therefore, emergent properties (such as mental states) can still emerge at the macro-level, even though there are no composite objects at that level to instantiate them.

iii. The Problem of Atomless Gunk

According to our current scientific theories, physical matter bottoms out at a ‘base level’. For instance, an ordinary object, like a table, is made of molecules; those molecules are made of atoms, and those atoms are made of even smaller parts such as leptons and quarks. However, that, we are told, is as far as we can go. Leptons and quarks themselves have no smaller parts. They are fundamental particles; they are simple; they represent the ‘bottom layer’ of reality.

But what if this view was wrong? What if the particles that we currently think to be fundamental are in fact made of even smaller parts? This is surely a live possibility. After all, before we discovered the existence of sub-atomic particles, it was presumed that atoms themselves were the smallest constituents of reality. (Indeed, the term ‘atom’ was used precisely because it is derived from the Greek for ‘indivisible’.) We were wrong then, so we could surely also be wrong now.

Some have suggested, however, that it is possible that there may not be a ‘base level’ at all. That is, matter could be infinitely divisible. Another way of saying that is that for any bit of physical matter you choose, all of its parts will have further parts. This rather exotic type of physical matter was labelled by David Lewis as ‘atomless gunk’ (Lewis, 1991, 20), although it is more commonly referred to now as plain ‘gunk’.

The possibility of gunk represents a threat to nihilism. The reason for this is that according to nihilism, the only material objects that exist are simples (that is, objects with no parts). But if matter were ‘gunky’, then it would turn out that there were no simples at all (because every part of gunky matter has further parts—there are no simple parts of gunk). Therefore, if matter were gunky, the nihilist would be committed to saying that there were, in fact, no material objects in existence at all.

The most common response to this problem is to deny flat-out that gunk is a real possibility. It may seem as though it is possible, in the sense that we can conceive of such stuff without running into any obvious contradiction, but this appearance is illusory. Gunk is not possible, and matter must bottom out at some point, thus nihilism is preserved. See Williams (2006) for a defence of this approach.

6. Deflationism

So far, it has been suggested that all answers to the SCQ must fall into one of the three categories: restrictivism, universalism, or nihilism. However, there is in fact a fourth way in which one could respond to the question: to dismiss it altogether. This kind of response has been articulated by a number of philosophers, who have dismissed the SCQ for a variety of reasons. These views fall under the more general heading of ‘deflationism’, as they attempt to ‘deflate’ the importance of the debate over material composition.

Some examples of such deflationist views include that of Amie Thomasson, who claims the SCQ to be an unanswerable question (Thomasson, 2006), and that of Jonathan Schaffer, who takes the existence of composite objects to be obvious and trivial (Schaffer, 2009). But the most influential deflationary account is that of Eli Hirsch, and is discussed below.

a. Hirsch and Quantifier Variance

Hirsch argues that the debate over material composition is not a genuinely ontological debate, but rather, merely a verbal dispute (see Hirsch, 2005). What this means is that when a compositional nihilist argues with a compositional universalist about whether there are any tables, for instance, they are merely talking past one another rather than having a genuine disagreement about what things exist. The source of confusion is that they are using the same words to mean different things. In slogan-like fashion: they agree about the facts, they disagree about the semantics.

More specifically, Hirsch has proposed a theory of ‘quantifier variance’, with roots in the thought of Rudolf Carnap, whereby different speakers use quantifiers (that is, quantificational expressions, such as ‘exists’ and ‘there is’) with different meanings. Thus, when a universalist says ‘tables exist’, and a nihilist responds, ‘tables do not exist’, they are both in fact speaking the truth, but merely taking the term ‘exist’ to mean different things. Central to Hirsch’s view, moreover, is that there is no correct or privileged way to use quantificational language. There are many different ways in which one can describe reality (for example, ways in which mereological fusions of the Eiffel Tower and the Great Pyramid at Giza are said to exist, and ways in which they are not), but none of these ways is any more correct than any other. The upshot is that disputes like that between the nihilist and the universalist arise because the two parties are speaking different (albeit similar) languages. They are, at base, disputes about the meaning of words, not about the nature of reality.

There is strong opposition to Hirsch’s view, however, largely because it is often supposed to involve a radical anti-realism about the nature of reality. In short, if there is no correct way in which to describe reality, then it seems to follow that reality does not have an objective nature at all (for if it did, one could describe it rightly or wrongly). Many thinkers claim that reality does have an objective nature. What this means is that there is a correct way to describe the world, and that some descriptions are better (that is, more accurate) than others. Ted Sider, in his 2011 book, Writing the Book of the World, gives a comprehensive defence of this kind of view.

This debate over the legitimacy, or substantiality, of the SCQ is just a small part of the larger debate between ontological realists and ontological anti-realists. The former camp, including the likes of Sider, maintains that the ontological questions that metaphysicians often concern themselves with (concerning disputed entities such as composite objects, temporal parts, possible objects, abstract objects, universals, and so on) are important and substantive, and need to be answered satisfactorily. The latter camp, by contrast, including the likes of Hirsch, argues that these disputes are, for a variety of reasons, either defective or unimportant. This debate remains unresolved, and takes centre stage in the currently burgeoning field of metametaphysics (see Chalmers, Manley, and Wasserman (eds.) 2009).

7. References and Further Reading

Armstrong, D. M. (1997) A World of States of Affairs (Cambridge: CUP).
Baxter, D. (1988) ‘Identity in the Loose and Popular Sense’, Mind, 97, 388, pp.575–582.
Baxter, D. and Cotnoir, A. (eds.) (2014) Composition as Identity (Oxford: OUP).
Bunzl, M. (1979) ‘Causal Overdetermination’, The Journal of Philosophy, 76, 3, pp.134–150.
Cameron, R. (2010) ‘How to Have a Radically Minimal Ontology’, Philosophical Studies, 151, 2, pp.249–264.
Cameron, R. (2012) ‘CAI Doesn’t Settle the SCQ’, Philosophy and Phenomenological Research, 84, 3, pp.531–554.
Carmichael, C. (2011) ‘Vague Composition Without Vague Existence’, Nous, 45, 2, pp.315–327.
Carmichael, C. (2014) ‘Toward a Common Sense Answer to the Special Composition Question’, Australasian Journal of Philosophy, 93, 3, pp.475–490.
Caves, R. (2018) ‘Emergence for Nihilists’, Pacific Philosophical Quarterly, 99, pp.2–28.
Chalmers, D., Manley, D., and Wasserman, R. (2009) Metametaphysics: New Essays on the Foundations of Ontology (Oxford: OUP).
Comesaña, J. (2008) ‘Could There Be Exactly Two Things?’, Synthese, 162, 1, pp.31–35.
Cornell, D. (2016) ‘Taking Monism Seriously’, Philosophical Studies, 173, 9, pp.2397–2415.
Cornell, D. (2017) ‘Mereological Nihilism and the Problem of Emergence’, American Philosophical Quarterly, 54, 1, pp.77–87.
Cowling, S. (2013) ‘Ideological Parsimony’, Synthese, 190, 17, pp.3889–3908.
Dorr, C. (2002) The Simplicity of Everything, PhD Thesis, University of Oxford, England.
Goldstein, L. (2012) ‘The Sorites Is a Nonsense Disguised by a Fallacy’, Analysis, 72, 1, pp.61–65.
Hawley, K. (2006) ‘Principles of Composition and Criteria of Identity’, Australasian Journal of Philosophy, 84, 4, pp.481–493.
Hestevold, S. (1981) ‘Conjoining’, Philosophy and Phenomenological Research, 41, 3, pp.371–385.
Hirsch, E. (2005) ‘Physical-Object Ontology, Verbal Disputes, and Common Sense’, Philosophy and Phenomenological Research, 70, 1, pp.67–97.
Horgan, T. and Potrč, M. (2008) Austere Realism (London: MIT Press).
Kim, J. (1993) Supervenience and Mind (Cambridge: CUP).
Korman, D. (2007) ‘Unrestricted Composition and Restricted Quantification’, Philosophical Studies, 140, 3, pp.319–344.
Korman, D. (2015) Objects: Nothing Out of the Ordinary (Oxford: OUP).
Leonard, H. S. and Goodman, N. (1940) ‘The Calculus of Individuals and Its Uses’, Journal of Symbolic Logic, 5, pp.45–55.
Lesniewski, S. (1916), ‘Foundations of the General Theory of Sets’ in Lesniewski, S. Collected Works, eds. S. J. Surma, J. Srzednicki, D. I. Barnett, and F. V. Rickey, trans. D. I. Barnett (Dordrecht, Kluwer, 1992) vol. 1, pp.129–173.
Lewis, D. (1986) On the Plurality of Worlds (Oxford: Basil Blackwell).
Lewis, D. (1991) Parts of Classes (Oxford: Basil Blackwell).
Loeb, L. E. (1974) ‘Causal Theories and Causal Overdetermination’, The Journal of Philosophy, 71, 15, pp.525–544.
Markosian, N. (1998), ‘Brutal Composition’, Philosophical Studies, 92, 3, pp.211–249.
Merricks, T. (2001) Objects and Persons (Oxford: OUP).
Merricks, T. (2005) ‘Composition and Vagueness’, Mind, 114, 455, pp.615–637.
Nolan, D. (1997) ‘Quantitative Parsimony’, The British Journal for the Philosophy of Science, 48, 3, pp.329–343.
Rea, M. (1998) ‘In Defence of Mereological Universalism’, Philosophy and Phenomenological Research, 58, 2, pp.347–360.
Rea, M. (ed.) (1997) Material Constitution: A Reader (Oxford: Rowman & Littlefield).
Schaffer, J. (2007a) ‘From Nihilism to Monism’, Australasian Journal of Philosophy, 85, 2, pp.175–191.
Schaffer, J. (2009) ‘On What Grounds What’ in Metametaphysics: New Essays on the Foundations of Ontology, edited by David J. Chalmers, David Manley and Ryan Wasserman. (Oxford: OUP), chapter 12.
Sider, T. (2001) Four-Dimensionalism (Oxford: OUP).
Sider, T. (2011) Writing the Book of the World (Oxford: OUP).
Sider, T. (2013) ‘Against Parthood’, in Bennett, K. and Zimmerman, D. (eds.) Oxford Studies in Metaphysics, vol. 8. (Oxford: OUP), pp.237–293.
Silva, P. (2013) ‘Ordinary Objects and Series-Style Answers to the Special Composition Question’, Pacific Philosophical Quarterly, 94, 1, pp.69–88.
Simons, P. (1987) Parts: A Study in Ontology (Oxford: Clarendon).
Thomasson, A. (2006) ‘Metaphysical Arguments against Ordinary Objects’, The Philosophical Quarterly, 56, 224, pp.340–360.
Unger, P. (1979) ‘There Are No Ordinary Things’, Synthese, 41, pp.117–154.
Unger, P. (1980) ‘The Problem of the Many’, Midwest Studies in Philosophy, 5, 1, pp.411–468.
Van Cleve, J. (2008) ‘The Moon and Sixpence: A Defence of Mereological Universalism’, in Hawthorne, J., Sider, T., and Zimmerman, D. (eds.) Contemporary Debates in Metaphysics (Oxford: Blackwell), pp.321–340.
Van Inwagen, P. (1987) ‘When Are Objects Parts?’ Philosophical Perspectives, 1, pp.21–47.
Van Inwagen, P. (1990), Material Beings (Ithaca: Cornell UP).
Varzi, A. (2000) ‘Mereological Commitments’, Dialectica, 54, pp.283–305.
Wallace, M. (2011) ‘Composition as Identity’, parts I and II, Philosophy Compass, 6, 11, pp.804–827.
Wiggins, D. (1968) ‘On Being in the Same Place at the Same Time’, Philosophical Review, 77, 1, pp.90–95.
Williams, J.R.G. (2006) ‘Illusions of Gunk’, Philosophical Perspectives, 20, 1, pp.493–513.

Author Information

David Cornell
Email: DMCornell@uclan.ac.uk
University of Central Lancashire
United Kingdom

Chinese Philosophy: Overview of History

There was no effort to write a comprehensive history of Chinese philosophy until the modern period of Western influence on Chinese culture. This is not to say that Chinese thinkers did not engage selectively with philosophers of earlier or contemporary eras.

What has come down to us as the final chapter of the Zhuangzi (ch. 33, Tian Xia “Under Heaven”) offers a sort of history of the development of Chinese philosophy. Of the writers of texts that survive to this day, it was Sima Tan (165?-110 B.C.E.) who made the first real attempt to classify Chinese thinkers into six major schools: Yin-Yang, Confucianism (Rujia), Mohism (Mojia), the School of Names (Mingjia), Legalism (Fajia), and Daoism (Daojia). As the history of Chinese philosophy evolved, more categories were added to these six, as well as various permutations and blends of them (for example, Profound Learning/Xuanxue and Neo-Confucianism/Lijia).

Hu Shi’s An Outline of the History of Chinese Philosophy (1919) is the first work by a Chinese scholar to undertake the project of writing a comprehensive history of the transformations of Chinese philosophical thought, although it is presented by the author as only an outline. Feng Youlan (Fung Yu-lan, 1895-1990) wrote the most widely known and used work on the history of Chinese philosophy in the 20th century. His two-volume History of Chinese Philosophy (volume 1, 1931 and volume 2, 1934) is a landmark work having a range and depth far exceeding that of Hu Shi’s Outline. In his History of Chinese Philosophy (1982), Lao Siguang makes it quite clear that his intention was to write a work that made use of Western critical standards in all respects. One of the most thorough and well-informed studies of the history of Chinese philosophy in a single volume is The History of Chinese Philosophy, edited by Bo Mou.

It is a common characterization of the history of Chinese philosophy to say that its overall trajectory may be captured in the concept of “the three teachings” (sanjiao): Confucianism, Daoism, and Buddhism. If we acknowledge the numerous permutations, revisions, re-conceptualizations, and syntheses of them, and if we speak of the three teachings as analogous to streams of influence flowing together into the broad river of Chinese philosophy, then this is still a fruitful way of conceiving of the major historical forces at work in the tradition, from at least the 3rd century C.E. down at least to the modern period. Beginning in the late 18th century, Western philosophical influences began to flow into the stream of Chinese philosophy, as well.

Classical Chinese Philosophy in the Pre-Qin Period (before 221 B.C.E.)
Philosophy from the Qin (221 B.C.E.) to the Tang (618 C.E.)
Early Buddhism in China
The Song Period (960-1279 C.E.) and Neo-Confucianism
1. Morality Books of the Three Teachings (Sanjiao) Tradition
2. Neo-Confucianism: The Original Way of Confucius for a New Era
The Chinese and Western Encounter in Philosophy
Whither China? Philosophical Views
1. Kang Xiaoguang (b. 1963 C.E.)
2. Tu Wei-ming (1940-) and New Confucianism
References and Further Reading

1. Classical Chinese Philosophy in the Pre-Qin Period (before 221 B.C.E.)

a. The “Great Commentary (Da Zhuan)” to the Classic of Changes (Yijing)

As a repository of philosophical reflection on the nature of reality and the human place in it, the story of Chinese philosophy may be said to begin with the Classic of Changes (Yijing). This work is composed of two parts: 1) a quite ancient manual of divination known simply as the Changes (Yi), or, more correctly, as the Zhouyi, because it is a handbook whose practices and procedures are traceable to the period of the Western Zhou dynasty (c. 1046-771 B.C.E.), and 2) a set of seven commentaries (zhuan) attached to the Yi and traditionally ascribed to Confucius, although there is no firm evidence that he wrote them, or even that he used them. Three of the commentaries are composed of two sections each, so taken as a whole, the commentary set is known as “The Ten Wings (Shiyi).” One of the commentaries to the Yi is known by the various titles of “The Great Commentary (Da Zhuan)” or “Appended Statements (Xici).” For a study of philosophy, “The Great Commentary” is arguably the single most important of these, offering the earliest written account of Chinese ontology currently available to us.

Edward Shaughnessy (1997) has done a recent translation of The Classic of Changes based on the Mawangdui archaeological finds. He offers reasons for thinking the work was edited most likely during the long period from 320-168 B.C.E. While it is true that some material in the “Great Commentary” may have its origin as late as the Han dynasty, there is clear evidence in concepts and reasoning of a much earlier period in the text, as well. The “Great Commentary” provides a clear exposition of the early Chinese worldview that all things are in a constant process of change. Readers will notice that “Yi” is sometimes used for the Zhouyi/Yijing as a divination guide and sometimes simply for “the process of reality” itself.

b. Confucius (551-479 B.C.E.) of the Analects

The earliest association of Chinese philosophy with a specific figure whose work is not only still extant, but widely used, is that of Confucius (personal name Kong Qiu, also known as “Master Kong” or Kongzi, 551-479 B.C.E.). Confucius was born, lived, and taught during the classical period of China. His philosophical teachings were gathered and transmitted largely, but not exclusively, in a work known as the Analects (also known as Lunyu, meaning “Selected Sayings”). This book is composed of short texts and brief conversations in which Confucius is often, but not always, the main teacher. The received version of the Analects is divided into 20 books that are further categorized with the convention of listing the book first, then the analect (that is, 3.1 is Book Three, analect one). Recent textual critical studies of the received text of the Analects have identified various strata in the collection, according to which some analects are more likely to be traceable to the historical Confucius, others to his disciples, others to master teachers associated with him or a generation removed, and still others that may be several generations removed from Confucius himself.

Ronnie Littlejohn’s understanding of this structure divides the text into the following categories: basic teachings on philosophical concepts probably traceable to Confucius (for example, Book 4); comments on disciples and personages by Confucius (for example, Book 5); collection of teachings to specific students by topic (for example, Book 12); and later materials codified for transmission by students and later masters (for example, Books 19 and 20) (Littlejohn 2011). There are many fine complete English translations of the Analects, some of which are available online.

Confucius stood within the tradition of scholars called Ru (儒). In the Han dynasty (206 B.C.E.-220 C.E.), centuries after Confucius’s life, Liu Xin (46? B.C.E.-23 C.E.) wrote that the Ru first appeared as an identifiable professional group in the early Zhou dynasty (c. 1046-256 B.C.E.). They were noted for their allegiance to the sage kings of ancient China who followed what they called “the Way (Dao) of Heaven” and its social, religious, and moral proprieties (li 禮). Liu Xin tells us the Ru were devoted to the “Six Classics (Liujing)” and took Confucius as their master teacher.

The “Confucianism” of teachers and literati who studied, modified, and applied Confucius’s ideas began robustly during the Han dynasty and continues in some forms down to the 21st century. Confucianism struggled throughout Chinese history with other intellectual streams, including Daoism and Buddhism. In the 12th century, Zhu Xi (1130-1200 C.E.) assembled a set of works known simply as the “Four Books,” which he took to represent the core of Confucian teachings: the “Great Learning (Daxue),” the “Way of Balance (Zhongyong),” the Analects, and the Mencius. This collection became the curriculum for China’s civil service examination system down to the year 1911, as well as for similar national exams in Japan, Korea, and Vietnam. During the period 1966-1976, Confucius and Confucianism were attacked as feudalistic and oppressive. Not until the mid-1980s did a recovery of Confucian philosophy begin with the so-called New Confucians.

Unlike the “Great Commentary,” Confucius’s teachings in the Analects are not concerned with ontology or cosmology but rather with human self-development and social ethics. Because of the prominence of Confucian thought both in China and in the early encounter of Western philosophy with Chinese philosophy, it was often said that Chinese thought is only socio-political or ethico-moral in its interests. This is, of course, not true at all. However, it is not an inaccurate representation of Confucius’s thought.

Based on the development of Confucianism in history, the following concepts from the Analects can be identified with confidence as central to Confucius’s thought and contribution to Chinese philosophy: ren 仁 (humaneness, benevolence); junzi 君子 (exemplary person, gentleman); yi 義 (righteousness, appropriate behavior for the situation); xiao 孝 (filiality); and li 禮 (ritual propriety).

c. Mozi (c. 470-391 B.C.E.) and Mohism

Although it is little known and not influential today, the “Mohist School” was one of the most influential movements in pre-Qin China. The thinkers in this tradition were students and later followers of Mo Di (also known as “Master Mo” or Mozi, c. 470–391 B.C.E.). According to the Records of the Historian (Shiji) by Sima Qian, Mozi was an official of the state of Song, and he lived just after Confucius. Our primary source for his thought is a collection of materials edited into a large anthology simply called the Mozi, although this text also contains materials much later in origin than the historical Mo Di, thus representing more fully Mohism as a school or movement. The Mozi contains essays, short dialogues, anecdotes, and compact philosophical discussions. One part of the text sets out the “Ten Core Doctrines” of Mohism in triads of essays, each triad exploring the same principal ideas and often containing repeated language and examples. The essay layers are designated as shang 上, zhong 中, and xia 下 (that is, upper, middle, and lower). Just why there are three versions of each core doctrine is not certain, but the prevailing theory is that the triads are probably versions of oral and/or written traditions representing three different lineages of Mohism coming down from the historical Mozi. It is clear, though, that the versions within a triad sometimes diverge in their philosophical positions. Some of these divergences are of little consequence, but others are more significant. There is no attempt to harmonize all of these into a monolithic version.

An English translation of the entire Mozi is Ian Johnston’s The Mozi: A Complete Translation (2010). The Mozi contains philosophical reflections on a wide range of questions and problems. Mozi is revealed as a master of argument, making him the foremost representative of the “debaters (bianshi)” of the classical period. He sets forward the earliest form of consequentialism in political and moral thought, opposes military aggression, advocates state welfare for the people, holds to an absolute merit-based principle for most levels of political leadership, and consistently advocates for the folk religious beliefs of his day. The Mozi takes sophisticated positions on logic, epistemology, causality, and language. Arguably, the central moral idea of Mozi is jian ai 兼爱, which may be rendered as “universal love” or “impartial concern.”

d. The School of Names, Mingjia 名家 (Disputers, Dialecticians, Bianshi)

Among the dialecticians (debaters, bianshi) associated with what we may call the School of Names, Hui Shi (350-260 B.C.E.) and Gongsun Long (320-250 B.C.E.) are easily the most prominent. Other thinkers often mentioned are Deng Xi and Yin Wen. Unfortunately, with the exception of the partial anthology Gongsunlong Zi, the works of these thinkers have all been lost.

In both the Zhuangzi’s chapter 33 and in Sima Tan’s remarks on “the six schools,” the School of Names and its dialecticians are somewhat ridiculed for making minute examinations of trifling points or intricate distinctions in the use of terms. For example, they are associated with making philosophical points about the distinctions between shapes and colors, unity and plurality, similarity and difference.

However, the philosophers practicing this methodology were interested in demonstrating that our understanding of the world is radically perspectival or relative. They sought to move persons out of dogmatic positions, which tended to elevate particular points of view to absolute truths. Zhuangzi 33 includes twenty-one theses associated with these thinkers, and interpreters most often take them as examples full of counterintuitive and even absurd implications, in ways reminiscent of the characterizations of Zeno in Western thought. Ten theses of Hui Shi, also reported in Zhuangzi 33, are examples of these methods.

e. The Daodejing

The long-standing tradition in China is that an individual philosophical master named Laozi was the author of a philosophical work known as the Daodejing, which means the “Classic of Dao (the Way) and its De (virtuous power).” This understanding of authorship is almost universally rejected by scholars now in favor of the view that the text is a collection of materials from “ancient masters” collected in different versions beginning in about 300 B.C.E. and continuing until the standard edition made by Wang Bi sometime between 226 and 249 C.E. Nonetheless, the impact of the Daodejing has been monumental as the classical representative of the tradition of Daoism, often characterized as the yin (passive) spirit in Chinese philosophy, where Confucianism is regarded as the yang (active). The remarks in the Daodejing counsel naturalness, simplicity, and spontaneous action without effort according to the movement of Dao, while Confucianism advocates active cultivation of one’s nature by learning, vigorous effort and political involvement, and conformity with established proprieties reconceived in their application by each new generation.

The Daodejing is one of the most translated texts of China into other world languages and, as with the Analects, there are many English versions of the complete text. Among these, some that stand out include P. J. Ivanhoe, The Daodejing of Laozi, D. C. Lau, Tao Te Ching, and Michael LaFargue, The Tao of the Tao-te-ching. Textual, literary, and redaction-critical approaches have shown that the received text is a collection of teachings used across lineages of Daoist teachers and not the work of a single author, although in its received form a final editor did collate its current arrangement. The component aphorisms and remarks of the Daodejing are strung together somewhat like beads on a string by an editor or editors.

f. The Zhuangzi

The Zhuangzi is one of the formative texts of classical Chinese Daoism traditionally ascribed to the philosopher Zhuang Zhou (c. 365?-290? B.C.E.). The received text was edited by a scholar official named Guo Xiang (d. 312 C.E.) and contains 33 chapters. Most of these, like those in the Daodejing, contain many component logia. However, unlike the case of the Daodejing, we know that there was a much larger and older Zhuangzi. This “lost Zhuangzi” consisted of 52 chapters, and it is mentioned on a list in Imperial bibliographies dating from about 110 C.E. Within the Zhuangzi are cycles of materials related to Laozi and other characters, real and imaginary. Contemporary scholars, such as Liu Xiaogan (1994), Harold Roth (1991), and Ronnie Littlejohn (2010), have all suggested models for understanding the structure of the text of the Zhuangzi. The following represents the textual division by Littlejohn:

Inner Chapters (chs. 1-7) contain a number of logia that may be attributed to Zhuang Zhou and very likely represent the oldest material in the book.

Daode Chapters (chs. 8-10e) represent a clear break in the text and form a coherent essay, often using the first person and employing illustrations of its points internal to the essay. The essay is not interrupted by any disconnected logia. As such, it is likely that the essay was written by a single individual who made use of texts and themes, some of which are also found in the Daodejing.

Yellow Emperor-Laozi Chapters (also known as Huang-Lao Daoism) (largely chs. 11-16, 18, 19, and 22) are traceable to a lineage of Daoist teachers that developed during and after the heyday of the Jixia Academy (318?-284? B.C.E.) and had distinctively different emphases than those found in the other layers of the Zhuangzi and in the Daodejing. The earliest look that we get at the characteristics of this important tradition in Daoist history is in the Zhuangzi itself. The Masters of Huainan (Huainanzi, 139 B.C.E.) represents a continuation and maturing of these ideas.

Zhuangzi Disciples Chapters (largely chs. 17-28) contain logia that are associated with the earliest disciples and second-generation transmitters of Zhuang Zhou’s teachings and that have close connections with ideas in the Inner Chapters (Littlejohn 2010).

g. Mencius (c. 372-289 B.C.E.)

If our ancient sources are correct in their chronologies, Mencius (that is, Meng Ke, Mengzi, or “Master Meng,” c. 372?-289? B.C.E.) was a contemporary of Zhuang Zhou, the Daoist master. The text coming down to us as the Mengzi contains virtually all of his significant teachings. Within the Confucian stream of Chinese philosophy, Mencius’s influence was so significant that he became recognized as the most authoritative interpreter of Confucius’s teachings and was known as “Mengzi the Second Sage.” He was a defender of Confucianism in the era of the Hundred Schools of Thought, which flourished during the so-called Spring and Autumn (771-476 B.C.E.) and Warring States (475-221 B.C.E.) periods of Chinese history. Mencius was likely one of the major teachers at what has been called the Jixia Academy (318?-284? B.C.E.). The Mengzi, which contains his philosophical remarks, later became one of the “Four Books (Sishu)” that formed the core of the Confucian examination and education system for centuries.

The Mengzi appears to have been collected by Mencius’s disciples, some of whom are referred to in the text as “masters” themselves, indicating a later period of composition for those passages. The received text was edited by Zhao Qi (d. 201 C.E.) into seven books, each in two parts, and each part with a number of passages. When scholars cite the Mengzi, the form is always book, section, passage (for example, 3B9). This citation form enables the reader to locate the passage in any of the complete translations of the text. Among the best full-text translations of the Mengzi are D. C. Lau, Mencius; Bryan Van Norden, Mengzi with Selections from Traditional Commentaries; and Irene Bloom, Mencius (completed and edited by Philip J. Ivanhoe).

h. Xun Kuang or Xunzi (c. 325?-235? B.C.E.)

What little is known of the life of Xun Kuang (also known as Master Xun or Xunzi, c. 325?-235? B.C.E.) is culled from evidence in his own writings and from the brief biography written by the historian Sima Qian some hundred years or so after Xunzi’s death. If we are right about Xunzi’s year of birth, he would have been around 20 years old when Mencius died. Sima Qian reports that Xunzi studied at the Jixia Academy, and it is quite possible that he was well acquainted with Mencius’s ideas, either directly or through first-generation disciples. He and his disciples seem to have been highly regarded by the rising Qin rulers. In fact, two of his students, Han Fei and Li Si, were instrumental in developing the theory of law and justice used during the Qin dynasty (221-206 B.C.E.) and known simply as Legalism. The primary source for Xun Kuang’s thought is known simply as the Xunzi. This book consists of 32 chapters that are essentially well-crafted, self-contained essays. Interestingly, though, the Xunzi was not a part of any of the later lists of Confucian classics in the canon, very much unlike the Mengzi, which became part of the Four Books and occupied a central place in Confucian learning.

For years, the standard English translation of Xunzi was that by John Knoblock (1988, 1990, 1994), but in 2014 a new complete version appeared by Eric Hutton.

2. Philosophy from the Qin (221 B.C.E.) to the Tang (618 C.E.)

a. Syncretic Philosophies in the Qin and Han Periods

During the Qin and Han periods, it was not uncommon to gather communities of scholars together and also collect numerous texts, all from different philosophical traditions. A result of this process was the creation of works that attempted to unify and synthesize previous learning, representing an effort to create a harmonized body of truth. Two of these syncretic works are the Hanfeizi and the Masters of Huainan (Huainanzi).

i. Master Han Fei (c. 280-233 B.C.E.) and Legalist Philosophy

Master Han Fei (Hanfeizi, c. 280-233 B.C.E.) was a student of Xunzi, probably at the Jixia Academy. His essays, gathered into the work Hanfeizi, were most likely written for the kings of the Han state, King Huan Hui (r. 272-239 B.C.E.) and King An (r. 238-230 B.C.E.). Han Fei is regarded as a principal representative of the “Legalist School (fa jia).” The “Legalist School” refers loosely to Chinese philosophers of the classical period whose common conviction was that law rather than morality was the most reliable ordering mechanism for society. A number of philosophers associated with this school were active in government and as imperial consultants. Han Fei himself was an advisor in the Han state just prior to its annexation by the Qin during the consolidation of China’s first empire in 221 B.C.E. Wenkui Liao’s translation, The Complete Works of Han Fei Tzu with Collected Commentaries, is available electronically at The Institute for Advanced Technology in the Humanities, University of Virginia, ed. Anne Kinney.

ii. The Masters of Huainan (Huainanzi)

According to his biography in the Book of the Early Han, Liu An (179-122 B.C.E.), the king of Huainan (in modern Anhui province) and uncle of Han Emperor Wu, gathered a large number of philosophers, scholars, and practitioners of esoteric techniques to Huainan roughly in the period 160-140 B.C.E. to debate and synthesize all learning. The collected volume now known as Masters of Huainan (Huainanzi) was a product of this interchange of ideas. It was presented to Emperor Wu in 139 B.C.E. as a suggested program of rulership, although it was rejected, leading perhaps to the death of Liu An himself.

The Masters of Huainan is a synthetic document meant to harmonize the thought of the so-called “Hundred Schools (zhuzi baijia)” as a sort of universal encyclopedia of knowledge, although most scholars hold that its primary influence is associated with what is known as Yellow-Emperor Daoism (Huang-Lao Daoism). In its received form, it is a work of 21 essays ranging in subject matter from cosmology and astronomy to inner qi (vital energy) cultivation, bio-spiritual transformation, and political rulership.

The first complete English translation of the text is The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China, by John Major, Sarah Queen, Andrew Set Meyer, and Harold Roth (2010).

iii. The Luxuriant Dew of the Spring and Autumn Annals of Dong Zhongshu

Dong Zhongshu (c. 198-104 B.C.E.) was more successful than Liu An in crafting a philosophical vision attractive to the Han rulers. Dong was one of the central figures involved in the resurgence of Confucianism and the Confucian classics in the Han Dynasty. His version of Confucianism incorporated the cosmologies of the five phases (wuxing) and the yin-yang school prominent during the Han period. The Luxuriant Dew of the Spring and Autumn Annals (Chun Qiu Fan Lu) is a work in 17 parts, containing 123 chapter titles, of which 79 chapters survive. Although traditionally ascribed solely to Dong Zhongshu, it shows the signs of multiple editorial hands and cannot be attributed in its entirety to him.

Selections from Dong have been translated by Mark Csikszentmihalyi and are included in Readings in Later Chinese Philosophy: Han to the Twentieth Century.

b. The Rise of Critical Philosophy in China: Wang Chong (25-100 C.E.)

Wang Chong (25-100 C.E.) studied in the imperial school in Luoyang, Henan province. After his training, he returned to his home near modern Shangyu, Zhejiang province, to take up the position of Officer of Merit. His writings on subjects ranging from morality to government to science and technology were compiled into the work Critical Essays (Lunheng). Each of the essays is meant to stand alone as a separate philosophical analysis, and there is no attempt to harmonize the seeming contradictions or inconsistencies between the essays that come into view when the collection is read as a whole. Wang is generally acknowledged as a philosopher who is critical of many traditional beliefs of his day. He does not believe that Heaven interferes with natural happenings, nor that it rewards and punishes persons for their actions, as Mozi thought it did. Destiny, chance, and luck are more important operators for describing what happens to us in his philosophy. Wang thinks human activity is actually of little consequence in the grand sweep of reality, and he largely disconnects happiness and unhappiness from the notions of legal reward and punishment, or even from any direct connection to our moral actions. He especially rejects reports of what we would call supernatural occurrences and interventions in human life and nature.

Alfred Forke’s translation of Wang’s Critical Essays is available online at http://www.humanistictexts.org/wangchung.htm.

c. Profound Learning (Xuanxue)

The movement known as Profound (or Mysterious) Learning (Xuanxue) has been labeled “Neo-Daoism.” This generalized term once was used to refer to the period of development of Chinese philosophy from the decades immediately preceding the fall of the Han dynasty to approximately the early 300s C.E. However, the term is misleading and no longer in favor because the movement it seeks to describe claimed no particular Daoist sectarian identity but instead encompassed a complex set of fresh insights and intense debates about new directions in Chinese thought.

Major figures generally associated with Profound Learning include He Yan (c. 207-249 C.E.), Wang Bi (226-249 C.E.), and Guo Xiang (d. 312 C.E.). Generally speaking, all three of these philosophers were working with the syncretic philosophies of the late Han as their background, but they were seeking to make new interpretations of original classical sources such as the Yijing, the Daodejing, and the Zhuangzi, or what were known as “the Three Profound Treatises (sanxuan).” Wang Bi and Guo Xiang edited and commented on what may now be called the standard texts of the Daodejing and the Zhuangzi. He Yan commented on the Analects. All of the philosophers in this tradition were seeking to demonstrate a unity of the Chinese classical texts into one tradition. However, this is not to say that the three thinkers mentioned here shared the same interpretations of Chinese concepts, nor that they even ranked the classical texts and thinkers in the same priority, although all valorized Confucius.

3. Early Buddhism in China

Buddhism first reached China from India roughly 2,000 years ago during the Han dynasty. It is generally agreed now that Buddhism entered along several different trade routes in the 1st century C.E., both in northern and southern regions of China, but the northern route known simply as the Silk Road is still regarded as the most prominent line of entry for Buddhist monks, believers, and traders. The Buddhists entering along this route established famous monastic and study sites at places such as Dunhuang, Chang’an (Xi’an), and Luoyang, leaving behind marvels of art and architecture, as well as fascinating texts. As early as the 2nd century C.E., a few Buddhist monks, such as Lokaksema (Zhi Loujiachen, 147-? C.E.), a monk from Gandhara, began translating Buddhist sutras and commentaries from Sanskrit into Chinese. The most famous of such monks was Xuanzang (602-664 C.E.), whose travels to India to acquire texts and whose creation of a translation school at Chang’an were made famous both in historical records and in the classic Chinese novel Journey to the West (Xiyou ji).

a. The Dhammapada (Chinese translation, c. 224 C.E.)

The Dhammapada (Fa Jujing) was translated into Chinese about 224 C.E., and the tradition is that it represents a 423-verse sermon attributed to the historical Buddha, that is, Siddhartha. This work is often neglected in a study of Buddhism’s early impact on Chinese philosophy. While it is arguably the most popular work in the Pali Canon, how it came to China and just how widely it was used are still matters of debate. Nevertheless, it represents well the earliest texts introducing the new way of thinking known as Buddhism into the Chinese philosophical tradition. The selection that follows is taken from John Richards’s 1993 translation available electronically at http://www.geocities.ws/sharibushariputra/SharibuShariputra/BuddhaDharma-Dhammapada_1.htm. The verse number is provided at the end of each teaching.

b. Tiantai Buddhism

The Tiantai School of Buddhism (Tiantai zong) was entirely of Chinese origin. Tiantai grew and flourished as a Buddhist school under its fourth patriarch, Zhiyi (538-597 C.E.), who asserted that the Lotus Sutra (that is, The Sutra of the Lotus Blossom of the Subtle Dharma, Miaofa Lianhua Jing) contained the supreme teaching of Buddhism. The school derives its name from the Tiantai Mountain that served as its most important monastic community and the one at which Zhiyi studied.

Two of Zhiyi’s most important philosophical teachings are “the Ten Ways of Existing in Reality” and “the Threefold Truth.”

The most distinctive ontological claim of Tiantai is that there is only one reality, which is both the phenomenal existence of our everyday experience and nirvana itself. There is no transcendent dimension or place that exists apart from the reality we are experiencing here and now. In fact, Tiantai writings describe 10 ways one may exist in reality:

Hell Beings
Hungry Ghosts
Beasts (that is, beings of animal nature)
Asuras (demons)
Human Beings
Gods or celestial creatures
Voice-hearers (Sravakas)
Self-enlightened Ones (Pratyekabuddhas)
Bodhisattvas
Living Buddhas

In Tiantai ontology, the reality that the Hell Beings inhabit is the same reality in which the Buddhas live. There is no supernatural boundary between these ways of existing or transcendent place to which some go (for example, Heaven), while others dwell elsewhere (Hell). Living and working next to us may be one who is a Hell Being or a Bodhisattva or even a Buddha. Indeed, we ourselves may be demons or Bodhisattvas, depending on whether we follow the Buddhist way.

Zhiyi’s Teaching of the Threefold Truth (san di) may be summarized in the following way. 1) We can make true statements about the world of existing things. These truths are about things that exist and their interactions in a network of interdependent causes. These are the truths of history, science, and so forth. The truth of a statement here is verified by testing it against the world of our experience. 2) It is also true to say that all things are empty (kong di) and have no permanence. There is no permanent essence to anything in our world of experience, including ourselves. 3) The third character of truth is that the mundane or phenomenal world is real and at the same time it is impermanent and ultimately empty.

The Great Calming and Contemplation (Mohe zhiguan), a massive treatise of edited lectures by Zhiyi on meditation, offers the teaching that we may dwell in one or more of the Buddhist 10 realms at any given time. The more one moves in calm and contemplation toward Buddha consciousness, however, the more the other realms of consciousness recede and eventually dissipate. The contents of the work are organized into 10 chapters, which systematically trace the perfect path of calming and contemplation to the final actualization of Buddhahood itself. The translation by Daniel Stevenson (1996) is available electronically at http://chancenter.org/cmc/1996/08/26/selections-from-chi-is-great-calming-and-contemplation/.

c. Consciousness-only Buddhism

Xuanzang (602-664 C.E.), born Chen Hui, was a Chinese Buddhist monk, scholar, traveler, and translator in the early Tang dynasty. He was born in 602 into a well-educated family in Chenhe village, near present-day Luoyang in what is now Henan province. Although he received an orthodox Confucian education, he lived for five years at Jingtu monastery (Jingtu si) in Luoyang. He spent more than 10 years traveling and studying in India. When he returned, he brought back 657 Buddhist texts and devoted the remainder of his life to a translation school he established in Chang’an (Xi’an). His travels in India are recorded in detail in the classic Chinese text Great Tang Records on the Western Regions (Da Tang Xiyuji), which in turn provided the inspiration for the fictitious religious novel Journey to the West (Xiyou ji) written by Wu Cheng’en during the Ming dynasty, around nine centuries after Xuanzang’s death. Xuanzang’s creation in China of the “Consciousness-only” School of Buddhism (Weishi zong) was greatly influenced by the writings of the Indian Yogacara master, Vasubandhu (Chinese name, Shi Qin). Xuanzang wrote an extensive commentary in 10 volumes on Vasubandhu’s text Thirty Stanzas of Consciousness-Only, entitled A Treatise on the Establishment of Consciousness-Only (Cheng Wei-shi Lun), and used it to set out his own views of this tradition of Buddhist teaching. The only complete English translation of Xuanzang’s Treatise is by Tat Wei (1973), but Chan’s Sourcebook (1973: ch. 23) contains excerpts from it.

The central ontological tenet of Consciousness-only Buddhism is that nothing exists but consciousness. Of course, this is in direct conflict with early Chinese ontology, since qi is an energy that may produce consciousness but is not itself a form of consciousness. According to Consciousness-only philosophy, we have a flow of experienced ideas that we also call perceptions. However, these ideas or perceptions are not caused by concrete or material things that are external to us and that continue to exist whether we are conscious of them or not. In philosophical language, the ontology of Consciousness-only is simply called Idealism.

d. Chan Buddhism

Chan Buddhism developed in China between the 6th and 8th centuries C.E. It is regarded as a uniquely Chinese form of Buddhism that later was transplanted into Japan, where it became prominent as Zen. The Chinese word chan is used to translate the Sanskrit dhyana, which means “meditation.” Although Chan is regarded as Chinese in origin and tenor, its founding legend is that the Buddha transmitted a private esoteric teaching, never written in any sutra, but passed only from one teacher to another. The twenty-eighth patriarch in this lineage of transmission is known as Bodhidharma (470-543 C.E.), and he is said to have brought the teaching to China.

In the history of Chan, there is a Northern and a Southern School. The Northern followed Shenxiu (c. 605-706 C.E.) as its patriarch, and the Southern followed Dajian Huineng (638-713 C.E.). According to the Platform Sutra of the Sixth Patriarch, regarded as the canonical expression of Chan philosophy, the split between these schools arose over who should succeed Hongren (601-674 C.E.), who was the fifth patriarch of Chan. The sutra tells the story of Huineng’s ascendancy to that role.

Chan’s theory of knowledge is concerned with one’s own mind, elevating in importance that which is known by the mind through immediate, direct acquaintance. When Chan philosophers claim that we know the world through our own minds, they do not identify mind with the thoughts presently in front of us. They mean one’s original mind, before the mind was clouded over by experiences or human distinctions made in language. It is in our original minds that we have awareness of absolutely certain truth. D. T. Suzuki says this knowledge “is not derivative but primitive; not inferential, not rationalistic, not mediational, but direct, immediate; not analytical but synthetic; not cognitive, but symbolical; not intending but merely expressive; not abstract, but concrete; not processional, not purposive, but ultimate, final and irreducible; not eternally receding, but infinitely inclusive; etc.” (1956: 34).

The Platform Sutra of the Sixth Patriarch (Liuzu Tanjing) presents itself as a written transcription of the lectures of Huineng. Philip J. Ivanhoe’s translation is based on the Dunhuang version.

4. The Song Period (960-1279 C.E.) and Neo-Confucianism

a. Morality Books of the Three Teachings (Sanjiao) Tradition

The village lecture system of Song dynasty China made use of morality books (shanshu) to create, shape, and transmit a unified moral culture throughout the empire. Arguably, the most important of these was Tract of the Most Exalted on Action and Response (Taishang ganying pian). The Tract likely reached its final form between the 10th and 12th centuries C.E., and it is still widely available today. Although primarily a work of Daoist spiritual piety and relatively brief in length, having only 1,277 characters, it shows numerous Buddhist and Confucian influences and moral injunctions as well, thereby representing in itself the “Three Teachings (sanjiao)” of China. The work attributes its own authorship to Taishang, by which is meant Laozi. The term ganying employed in the title of the work is a way of speaking about sowing and reaping or receiving the results of one’s action, a concept used often in the Masters of Huainan and frequently associated with the Buddhist notion of karma when it entered China. The work builds on earlier Chinese moral tracts of similar scope and teaching such as The Code of Nuqing for Controlling Ghosts (probably composed sometime between 143 and 224 C.E.) and Ge Hong’s (283-343 C.E.) merit system in The Master Who Embraces Simplicity (c. 316 C.E.).

Like these earlier works, the Tract represents an extremely quantitative view of morality, relying on the counting of good and evil deeds as a way of predicting coming blessings or punishments, including the shortening or lengthening of life. In the introductory remarks of the Tract, Taishang says that moral transgressions reduce a person’s lifespan and that poverty comes upon the immoral person. The immoral person meets with calamity and misery, and all men hate him. In this work, Taishang reports a bureaucracy of numinal beings who are record keepers in charge of recording the good and evil deeds of every individual. According to the text, those who wish to attain to a celestial spiritual life should perform a net result of 1,300 good deeds, and those who wish to attain an indefinite earthly life should perform 300. One’s moral deeds are kept, as it were, on a ledger and counted by the celestial powers. Evil deeds do not disappear from one’s ledger, indicating their lasting effect, but they may be counter-balanced by good works.

T. Suzuki’s and Paul Carus’s translation is entitled Treatise on Response & Retribution.

b. Neo-Confucianism: The Original Way of Confucius for a New Era

i. Zhou Dunyi (1017-1073)

Zhou Dunyi’s (1017-1073 C.E.) Diagram of the Supreme Ultimate Explained (Taiji tushuo) is a work of importance to the articulation of the common understanding of the structure of reality that we find in the Neo-Confucian thinkers, including Cheng Hao (1032-1085 C.E.), Cheng Yi (1033-1107 C.E.), and Zhu Xi (1130-1200 C.E.). All of the most important concepts of the Chinese worldview as it was being understood and remade during the 11th to 13th centuries C.E. are present in Zhou Dunyi’s essay: qi, yin and yang, the five phases (wuxing), principle (li 理), and the trigrams and hexagrams of the Yijing. A translation may be found in Bryan W. Van Norden’s and Justin Tiwald’s Readings in Later Chinese Philosophy: Han to the Twentieth Century.

ii. Cheng Hao (1032-1085 C.E.) and Cheng Yi (1033-1107 C.E.)

The Cheng brothers made a powerful impact on the development of Neo-Confucian thought. Cheng Hao was one of the principal figures of the Neo-Confucian movement, and his work connected ontology and morality in a skillful way. His brother Cheng Yi reinterpreted a number of key figures and ideas in Chinese classical philosophy, giving them a distinctive Neo-Confucian flavor. The translations of their work by Philip J. Ivanhoe in Readings in Later Chinese Philosophy: Han to the Twentieth Century are based upon the Chinese texts found in Collected Works of the Two Chengs (Er Cheng ji).

iii. Zhu Xi (1130-1200 C.E.) and the Neo-Confucian Synthesis

Zhu Xi was born in Youqi in Fujian province, China in 1130 C.E. His early interests were in Daoism and Buddhism, but he became the student of Li Tong (1093-1163 C.E.). Li worked within the philosophical tradition of Cheng Hao and Cheng Yi. Zhu Xi compiled an anthology of these thinkers known as Reflections on Things at Hand that became essentially the primer for Neo-Confucianism for generations. If we were to compare him to Western philosophers of the same far-reaching influence, we would take note of Aristotle’s influence in the classical period, Thomas Aquinas in the Medieval period, and Immanuel Kant in the Enlightenment period. He ranks along with Confucius and Mencius as one of the three preeminent thinkers of China. As such, his philosophy represents the most thoroughgoing example of Neo-Confucianism.

One of Zhu Xi’s greatest accomplishments was collecting and compiling the Four Books (sishu), which were made the foundation of the all-important imperial examinations. His systematization of Confucianism into a coherent program of education became the foundation for educational systems in China, Korea, and Japan. Zhu Xi’s oral teachings to students are preserved in Conversations of Master Zhu, Arranged Topically or Categorized Conversations (1270). The translation of excerpts from this text by Bryan W. Van Norden in Readings in Later Chinese Philosophy: Han to the Twentieth Century is based on Zhuzi yulei, vol. 1 (1986 reprint edition) as well as Fung Yulan (Feng Youlan)’s A History of Chinese Philosophy: The Period of Classical Learning, vol. 2.

iv. Wang Yangming (1472-1529 C.E.)

Wang Yangming, a Ming dynasty general and official, practiced Daoist “sitting in forgetfulness (zuowang),” grasped the realization of the unity of knowledge and action that Daoist thinkers know as wu-wei, and taught that the highest form of knowledge was what he called “pure knowledge (liangzhi),” resembling in many ways the epistemology of Chan (Zen) Buddhism. The principal sources for Wang’s ideas are his works, A Record for Practice (1518 C.E., Chuan Xilu) and “Inquiry on the Great Learning” (1527 C.E., Daxue Wen). Excerpts from these texts have been translated by Philip J. Ivanhoe in Readings from the Lu-Wang School of Neo-Confucianism.

Wang actually had a rather stormy career due in large measure to his opposition to the philosophy of Zhu Xi. He departed from Zhu in both his ontology and epistemology. In fact, during the Ming dynasty (1368-1644 C.E.), Wang Yangming became the most deliberative of Zhu Xi’s critics, even if he continued to use much of the philosophical vocabulary of Zhu Xi and other Neo-Confucians.

5. The Chinese and Western Encounter in Philosophy

a. Dai Zhen (1724-1777 C.E.)

Dai Zhen was born in Longfu City (Tunxi city) in Anhui Province into the family of a poor cloth merchant. He devoted himself to the study of the basic works of Chinese philosophy. His two most prominent philosophical works are entitled On the Good (Yuanshan) and An Evidential Commentary on the Meaning of Terms in the Mengzi (Mengzi Ziyi Shu). Dai Zhen’s Evidential Commentary is organized into several parts, each devoted to a particular philosophical term or phrase. His approach is to begin the analysis of each important concept with a philological analysis. Sometimes he shows how the term or phrase was used in selected passages in the history of Chinese philosophy. One of his principal goals is to correct misunderstandings, most particularly those he associates with the Neo-Confucians with respect to their views on reality’s Principle(s) (li) and our human desires. Dai makes the point that, unlike Buddhism’s rejection of all desire as the root of suffering, some desires may actually be positive. Selections from this text, translated by Justin Tiwald, may be found in Readings in Later Chinese Philosophy: Han to the Twentieth Century.

b. Kang Youwei (1858-1927 C.E.)

Kang Youwei was a committed Chinese nationalist in the last years of the Qing dynasty (1644-1912 C.E.). He developed a philosophical construction of a utopian state entitled Book of Great Unity (Da Tong Shu), which should be considered along with other political visions developed in world philosophy, such as Plato’s Republic. The work was not published in its entirety until 1935, eight years after Kang’s death. Laurence G. Thompson’s translation of Book of Great Unity is entitled The One-World Philosophy of K’ang Yu-wei.

c. Zhang Dongsun (1886-1973 C.E.)

Zhang Dongsun was well educated in the philosophy and method of the Western philosopher Immanuel Kant. He even interpreted Confucianism along Kantian lines. As an intellectual, he was quite active in government during the early years of the People’s Republic of China but was sent to a re-education camp during the Cultural Revolution (1966-1976 C.E.). He is best known for articulating a “pluralistic epistemology” that emphasizes the importance of sociology, culture, and language in the shaping of worldviews and philosophical approaches. His essay, “A Chinese Philosopher’s Theory of Knowledge,” appears in Our Language and Our World: Selections from Etc.: A Review of General Semantics.

d. Hu Shi (1891-1962 C.E.)

Although influenced by Buddhism in his youth, Hu Shi studied in Shanghai in three schools known for their curriculum called “the New Education,” which was a reference both to a Western style of learning and its content. He later completed his Ph.D. in Philosophy under the direction of John Dewey at Columbia University in the U.S.A. After completing his doctorate in 1917 C.E., he returned to China to become professor of Chinese and Western philosophy at Beijing University. He was instrumental in the development of the New Culture Movement (1912-1920 C.E.) that was dedicated to the modernization of Chinese learning and social progress. He was also a key figure in introducing Pragmatism and scientific research methodologies to China. A succinct representation of the shift to Western science in China during the 20th century is Hu’s “New Credo.”

e. Mao Zedong (1893-1976 C.E.)

Mao Zedong was born in a village in Hunan province into a well-to-do farming family. He was influenced by Sun Yat-sen’s calls for a Republic of China and read widely from Western texts, including Darwin, Mill, and Rousseau. When the May Fourth Movement (May 4, 1919) erupted in Beijing as a response to imperialism, Mao started a magazine in Changsha and called for a union of the popular masses, the liberation of women, and a new Chinese nationalism. When the Communist Party was founded in Shanghai in 1921, Mao started a branch in Changsha. From 1923 to 1925, he worked as a member of the Party Committee alongside the KMT (Kuomintang/Nationalists) and even ran the KMT activities in Hunan. After 1927, Mao became commander of the Red Army or People’s Liberation Army. He later became the first Chairman of the Central Committee of the Chinese Communist Party of the People’s Republic of China and the “Father of the Nation.”

The complete collected works of Mao from 1917-1945 are available in English from the U. S. Government’s Joint Publications Research Service, which includes all articles signed by Chairman Mao individually or jointly, as well as those unsigned but verified as his: http://marxists.org/reference/archive/mao/works/collected-works-pdf/index.htm. For works after 1945, Selected Works of Mao Zedong (1968) is also a good source: http://www.marxists.org/reference/archive/mao/selected-works/index.htm.

Although some question Mao’s credentials as a philosopher, he educated himself extensively in Chinese history and philosophy. Mao’s concerns, however, are directed toward a relatively narrow range of philosophical inquiry: specifically, social, political, and economic thought.

6. Whither China? Philosophical Views

a. Kang Xiaoguang (b. 1963 C.E.)

Kang Xiaoguang has taken up the challenge to offer a political philosophy for China’s post-Mao years in several works. A good overview of his views in English is David Ownby’s “Kang Xiaoguang: Social Science, Civil Society, and Confucian Religion.” Kang’s principal philosophical claim is that the Chinese Communist Party must be Confucianized. He thinks that what remains of Marxism in the Party’s socio-political ideology should be replaced with a reconstituted and adapted version of the philosophies of Confucius and Mencius. In his program, while the educational system will be kept within the party schools, their syllabi should be changed, listing the Four Books and Five Classics as required courses of study. He calls for a return to the examination system for all promotions within the bureaucracy and argues that Confucian philosophical teachings should be a major component of each examination. Moreover, he maintains that not merely the political system of China, but also the society must be Confucianized. Kang holds that only by introducing Confucianism into the national education system can China regain its value system, as well as possess again a faith and soul for its culture. In his view, this can be achieved only if Confucianism becomes the state’s civil value system.

b. Tu Wei-ming (1940-) and New Confucianism

The New Confucian Movement is a complex and overlapping group of scholars from mainland China to the U. S. A. One thinker who is contributing to this movement is Tu Wei-ming (Du Weiming, 1940-). Having taught and written for many years in the U. S. A., Tu became the founding Dean of the Institute for Advanced Humanistic Studies at Beijing University in 2010. A five-volume anthology of his works was published in Chinese in 2001. One representative example of his work is the essay entitled “Beyond the Enlightenment Mentality: A Confucian Perspective on Ethics, Migration, and Global Stewardship.”

7. References and Further Reading

Ames, Roger T. and Henry Rosemont, Jr., trans. The Analects of Confucius: A Philosophical Translation. New York: Ballantine, 1998.
Bloom, Irene, trans. Completed by Philip J. Ivanhoe. Mencius. New York: Columbia University Press, 2009.
Brooks, E. Bruce and A. Taeko Brooks, trans. The Original Analects: Sayings of Confucius and His Successors. New York: Columbia University Press, 1998.
Chan, Wing-tsit, trans. A Sourcebook in Chinese Philosophy, 4th ed. Princeton: Princeton University Press, 1963.
Csikszentmihalyi, Mark, trans. “The Way of the King Joins the Three” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 15-18.
Forke, Alfred, trans. Philosophical Essays of Wang Ch’ung. London: Luzac, 1907. Available online at http://www.humanistictexts.org/wangchung.htm.
Fung, Yu-lan, trans. and ed. A History of Chinese Philosophy, 2 vols. Princeton: Princeton University Press, 1953.
Hu Shi. “My Credo and Its Evolution,” in Living Philosophies: A Series of Intimate Credos, Henry Goddard Leach, ed. (New York: Simon and Schuster, 1931), 235-63.
Hutton, Eric, trans. “Xunzi” in Readings in Classical Chinese Philosophy, eds. P.J. Ivanhoe and Bryan Van Norden (Indianapolis: Hackett Publishing, 2001), 255-311.
Hutton, Eric, trans and ed. Xunzi: The Complete Text. Princeton: Princeton University Press, 2014.
Ivanhoe, Philip J., trans. “Cheng Hao, Selected Sayings” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 143-152.
Ivanhoe, Philip J., trans. “Cheng Yi, Selected Sayings” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 158-168.
Ivanhoe, Philip J., trans. The Daodejing of Laozi. New York: Seven Bridges Press, 2002.
Ivanhoe, Philip J., trans. “The Platform Sutra of the Sixth Patriarch” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 91-98.
Ivanhoe, Philip J., trans. “A Record of Practice” in Readings from the Lu-Wang School of Neo-Confucianism, Philip J. Ivanhoe, trans. and ed. (Indianapolis: Hackett Publishing, 2009), 131-160.
Ivanhoe, Philip J. “Whose Confucius? Which Analects?” in Confucius and the Analects: New Essays, ed. Bryan W. Van Norden (Oxford: Oxford University Press, 2002), 119-133.
Johnston, Ian, trans. The Mozi: A Complete Translation. New York: Columbia University Press, 2010.
Kang, Xiaoguang. “Confucianization: A Future in the Tradition,” trans. Huiqing Liu, Social Research 73:1 (2006): 77-120.
LaFargue, Michael, trans. The Tao of the Tao-te-ching. Albany: State University of New York Press, 1992.
Lau, D. C. trans. Mencius. 2 vols. Hong Kong: Chinese University Press, 1984.
Lau, D. C. “On Mencius’ Use of the Method of Analogy in Argument” in Lau, trans., Mencius (London: Penguin Books, 1970), 235-263.
Liao, W.K., trans. Complete Works of Hanfeizi. London: Arthur Probsthain, 1939. http://www2.iath.virginia.edu/saxon/servlet/SaxonServlet?source=xwomen/texts/hanfei.xml&style=xwomen/xsl/dynaxml.xsl&chunk.id=d1.1&toc.depth=1&toc.id=0&doc.lang=bilingual.
Littlejohn, Ronnie. Confucianism: An Introduction. London: I.B. Tauris, 2011.
Littlejohn, Ronnie. Daoism: An Introduction. London: I.B. Tauris, 2010.
Liu, Xiaogan. Classifying the Zhuangzi Chapters. Trans. by Donald Munro. Ann Arbor, Michigan: The University of Michigan, 1994.
Major, John, Sarah Queen, Andrew Seth Meyer, and Harold Roth, trans. The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia University Press, 2010.
Mao, Zedong (1917-45). Collected Works of Mao Zedong. US Government’s Joint Publications Research Service. http://marxists.org/reference/archive/mao/works/collected-works-pdf/index.htm.
Mao, Zedong. Quotations from Mao Tse Tung. Beijing: Peking Foreign Languages Press, 1966. Mao Tse Tung Internet Archive, http://www.marxists.org/reference/archive/mao/works/red-book/index.htm.
Mao, Zedong.《毛泽东選集》Mao Zedong Xuanji, Selected Works of Mao Zedong. Beijing: Renmin Press, 1968. http://www.marxists.org/reference/archive/mao/selected-works/index.htm.
Ownby, David. “Kang Xiaoguang: Social Science, Civil Society, and Confucian Religion.” China Perspectives 4 (2009): 101-111.
Roth, Harold. “Who Compiled the Chuang-tzu?” in Chinese Texts and Philosophical Contexts, ed. Henry Rosemont. La Salle: Open Court, 1991.
Sanderovitch, Sharon, trans. “The Way of the King Joins the Three” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 13-15.
Shaughnessy, Edward, trans. The I Ching: The Classic of Changes. New York: Ballantine Books, 1997.
Slingerland, Edward, trans. Confucius: Analects, with Selections from Traditional Commentaries. Indianapolis: Hackett Publishing, 2003.
Suzuki, D.T. and Carus, Paul, trans. Treatise on Response & Retribution. Chicago: Open Court, 1906.
Thompson, Laurence, trans. Ta T’ung Shu: The One-World Philosophy of K’ang Yu-wei. London: George Allen and Unwin, 1958.
Tiwald, Justin, trans. “An Evidential Commentary on the Meaning of Terms in the Mengzi” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans. and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 318-337.
Tu, Weiming, “Beyond the Enlightenment Mentality: A Confucian Perspective on Ethics, Migration, and Global Stewardship.” International Migration Review 30.1 (Spring 1996), 58-75.
Van Norden, Bryan W. ed. Confucius and the Analects: New Essays. Oxford: Oxford University Press, 2002.
Van Norden, Bryan W, trans. Mengzi, with Selections from Traditional Commentaries. Indianapolis: Hackett Publishing, 2008.
Van Norden, Bryan W., trans. “Categorized Conversation of Zhu Xi” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, Justin Tiwald and Bryan Van Norden, eds. (Indianapolis: Hackett, 2014), 168-184.
Van Norden, Bryan W. and Tiwald, Justin, trans. “Explanation of the Diagram of the Great Ultimate” in Readings in Later Chinese Philosophy: Han to the Twentieth Century, trans and ed., Justin Tiwald and Bryan W. Van Norden (Indianapolis: Hackett, 2014), 136-140.
Watson, Burton, trans. The Complete Works of Chuang Tzu. New York: Columbia University Press, 1968.
Wei, Tat, trans. Ch’eng Wei-shi lun (Doctrine of Mere-Consciousness by Hsüan Tsang). Hong Kong: The Ch’eng Wei-shi lun Publication Committee, 1973.
Zhang Dongsun, “A Chinese Philosopher’s Theory of Knowledge” in Our Language and Our World: Selections from Etc.: A Review of General Semantics, ed., S.I. Hayakawa and trans., Li An-che (New York: Harper, 1959), 299-324.

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Plato: The Laws

The Laws is Plato’s last, longest, and, perhaps, most loathed work. The book is a conversation on political philosophy among three elderly men: an unnamed Athenian, a Spartan named Megillus, and a Cretan named Clinias. These men work to create a constitution for Magnesia, a new Cretan colony. The government of Magnesia is a mixture of democratic and authoritarian principles that aim at making all of its citizens happy and virtuous.

Like Plato’s other works on political theory, such as the Statesman and the Republic, the Laws is not simply about political thought, but involves extensive discussions on psychology, ethics, theology, epistemology, and metaphysics. However, unlike these other works, the Laws combines political philosophy with applied legislation, going into great detail concerning what laws and procedures should be in Magnesia. Examples include conversations on whether drunkenness should be allowed in the city, how citizens should hunt, and how to punish suicide. Yet, the legal details, clunky prose, and lack of organization have drawn condemnation from both ancient and modern scholars. Many have attributed this awkward writing to Plato’s old age at the time of writing; nonetheless, readers should bear in mind that the work was never completed. Although these criticisms have some merit, the ideas discussed in the Laws are well worth our consideration, and the dialogue has a literary quality of its own.

In the 21st century, there has been a growing interest among philosophers in the study of the Laws. Many of the philosophical ideas in the Laws have stood the test of time, such as the principle that absolute power corrupts absolutely and that no person is exempt from the rule of law. Other significant developments in the Laws include the emphasis on a mixed regime, a varied penal system, its policy on women in the military, and its attempt at rational theology. Yet, Plato took his most original idea to be that law should combine persuasion with compulsion. In order to persuade citizens to follow the legal code, every law has a prelude that offers reasons why it is in one’s interest to obey. The compulsion comes in the form of a punishment attached to the law if the persuasion should fail to motivate compliance.

In addition, in the Laws Plato defends several positions that appear in tension with ideas expressed in his other works. Perhaps the largest difference is that the ideal city in the Laws is far more democratic than the ideal city in the Republic. Other notable differences include appearing to accept the possibility of weakness of will (akrasia)—a position rejected in earlier works—and granting much more authority to religion than any reader of the Euthyphro would expect. By exploring these apparent differences, students of Plato and the history of philosophy will come away with a more nuanced and complex understanding of Plato’s philosophical ideas.

Setting and Characters
The Laws, Customs, and Political Structure of Magnesia
The Relationship between the Laws and the Republic
Overview of the Laws
Book 1 and 2
Book 3
Book 4
Book 5
1. Ethics
2. Geography and Population
Book 6
1. Voting and Offices
2. Marriage
Book 7 and 8
1. Musical Education
2. Gymnastics
Book 9
1. Responsibility
2. Punishment
Book 10
1. Atheism
2. Deism and Traditional Theism
Book 11 and 12
1. Laws
2. Nocturnal Council
References and Further Reading

1. Setting and Characters

The dialogue is set on the Greek island of Crete in the 4th century B.C.E. Three elderly men are walking from Cnossos to the sacred cave and sanctuary of Zeus located on Mount Ida. This setting is crucially linked to the theme of the Laws. These three men are walking the path that Minos (a legendary lawgiver of Crete) and his father followed every nine years to receive the guidance of Zeus. As these men trace Minos’ steps, they seek to discover what the best political system and laws are. Like Minos, they too will found their political system on their understanding of the gods.

Each man is from a different Greek city-state (polis). Clinias is from Cnossos, Crete; Megillus is from Sparta; and the unnamed individual is from Athens. There is some speculation as to who this unnamed Athenian might be. Aristotle (Politics 2.6.1265a) thinks he is Socrates. Cicero (Laws 1.5.15) holds that he is Plato himself, while others speculate that he is supposed to remind the reader of the Athenian politician Solon. Another interpretation holds that the Athenian is unnamed because Plato doesn’t intend for him to represent any particular historical figure.

Setting aside the issue of who the Athenian is, readers might wonder whether they should interpret his views as Plato’s own. There is no easy and uncontroversial answer to the question. Indeed, it is a problem that pervades all of Plato’s work. Scholars adopt a variety of approaches towards this issue. Some scholars take the protagonist to represent Plato’s own view, while others hold that Plato’s view isn’t identified with any single character, but is found in the overall discussion indirectly. Furthermore, some interpreters maintain that Plato intentionally leaves his direct voice out of the dialogues because he isn’t interested in putting forth specific theses, but rather, is interested in generating thought about a set of related questions.

Although Spartans, Cretans, and Athenians are unified in the sense that they are all Greek, they differ culturally. Spartans and Cretans are from the Dorian ethnic group, while Athenians are Ionian. This is relevant for two reasons. First, the Ionians and Dorians have not always been on friendly terms. Indeed, this conflict culminates in the Peloponnesian War (431-404 B.C.E.). Second, Dorians are stereotyped as having an exclusive military focus and a distaste for intellectual pursuits, while Athenians are seen as being more artistic and philosophical. Both of these features will play out in the drama of the dialogue as each interlocutor will defend views characteristic of his home institutions and will behave in ways that are stereotypical of his culture.

2. The Laws, Customs, and Political Structure of Magnesia

Magnesia, the theoretical colony of Crete that is developed in the Laws, is a self-sufficient agricultural state located nine to ten miles from the sea. Its remote location will deter the influence of visitors, who might corrupt the culture of Magnesia. That being said, Magnesia will have a population of slaves and foreigners who carry out necessary tasks forbidden to citizens, such as trading and menial labor. The city will consist of 5,040 households. The Athenian is adamant about this number because it is divisible by any number from 1 to 12 (with the exception of 11), making it convenient for purposes of administration. Each household will be allotted two plots of land (one near the city center and one located further away) and these plots of land are inalienable to the holder’s family. The intention is to prevent members of the community from becoming wealthy at the expense of other citizens. Indeed, the city is designed in such a way as to prevent citizens from becoming extremely wealthy or poor. Nevertheless, there will be four property classes based on the wealth one’s family accumulated before coming to Magnesia. Although the land will not be farmed in common, it is to be considered a part of the common property, and shareholders must make public contributions. Women will not be allowed to own property, but will be considered citizens and can hold political office. In fact, women are able to participate in the military as soldiers and can attend their own private common meals, two practices usually reserved for men in ancient Greece.
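
The arithmetic behind the Athenian's preference for 5,040 can be checked directly. The short Python sketch below is offered only as an illustration and is not drawn from the Laws; the helper name divisors_in_range and the constant HOUSEHOLDS are hypothetical names introduced here. It lists which of the numbers from 1 to 12 divide 5,040 evenly and which do not.

```python
# Illustrative check of the divisibility claim about 5,040 households.
# divisors_in_range is a hypothetical helper, not anything named in the text.

def divisors_in_range(n, low, high):
    """Return every d with low <= d <= high that divides n evenly."""
    return [d for d in range(low, high + 1) if n % d == 0]

HOUSEHOLDS = 5040

# Which of the numbers 1 through 12 divide the number of households?
divisors = divisors_in_range(HOUSEHOLDS, 1, 12)
non_divisors = [d for d in range(1, 13) if d not in divisors]

print(divisors)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
print(non_divisors)  # [11]
```

On this check, 11 is the only number from 1 to 12 that fails to divide 5,040, which matches the exception noted above. The practical upshot is that the households can be split into equal groups of many different sizes for administrative purposes.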

The political system of Magnesia will be mixed, blending democratic and authoritarian elements. This can be seen in how political offices are handled. There are a vast number of different political offices in Magnesia, some of which will be made up of the general citizen body. The benefit of this is that it will make the citizens feel that they have a stake in Magnesia. However, at the same time, there will be particular offices made up of more elite citizens. For example, the “guardians of the law” will supervise the general citizen body. In order to ensure that the guardians of the law are accountable for their conduct, there will be a powerful board of “scrutineers” that provides a check on their authority. The most distinguished office is the “nocturnal council,” which will be in charge of researching the philosophical nature of law and offering insight into how its findings can be applied in Magnesia.

3. The Relationship between the Laws and the Republic

Although the Republic and the Laws share many similarities, those who come to the Laws after reading the Republic will likely be surprised at what they find insofar as these texts differ with respect to both content and style. In terms of style, the Laws has far less literary quality than Plato’s masterpiece, the Republic. This is partly a result of the fact that the Laws deals with the details of legal and governmental policies, while the Republic doesn’t; rather, the Republic focuses on politics and ethics at a much more general level. Furthermore, unlike Plato’s other works, the character Socrates is noticeably absent in the Laws.

Turning now to content, in the Republic, Socrates develops an ideal city, referred to as the Callipolis (literally, the beautiful or noble city). The Callipolis consists of three classes: a large working class of farmers and craftspeople, an educated military class, and a small number of elite philosophers who will rule the city. The military and ruler classes are called “guardians,” and they will not have any private property. Indeed, they will hold everything in common including women, men, and children. Unlike in the Callipolis, private property is allowed throughout Magnesia and political power spreads throughout the city. Another notable difference is that only philosophers possess fully-developed virtue in the Republic (and in the Phaedo) while in the Laws the Athenian says that correct legislation aims at developing virtue in the entire citizen body (1.630d-631d, 4.705d-706a, 4.707d, 6.770c, 12.962b-963a). To be sure, the political structure of the Callipolis secures the correct behavior of all citizens. However, because complete virtue involves knowledge, which only philosophers have, non-philosophers can only approximate virtue. In other words, the Laws seems to express more optimism than the Republic with respect to the average citizen’s ability to be virtuous.

This leaves readers to wonder what could explain these apparent differences. Although many different answers have been presented, the most prevalent answer is that the texts were written for two different purposes. The Republic represents Plato’s ideal vision of a political utopia, while the Laws represents his vision of the best attainable city given the defects of human nature. Aristotle, for example, holds that the Republic and the Laws share many of the same features, but that the Laws offers a system that is more capable of being generally adopted (Politics 2.6.1265a-b). Many scholars have supported this reading by pointing out that Magnesia is said to be the second best city, with the ideal city being one in which women, children and property are held in common (Laws 5.739a-740a). Additionally, this interpretation explains why the Laws goes into greater detail concerning day-to-day activities than the Republic does. Because the Callipolis is an unattainable utopia, there is no point to discussing the customs in any sort of detail, but because Magnesia is attainable, this is a worthwhile project. Trevor Saunders captures the essence of this interpretation when he says, “The Republic presents merely the theoretical ideal…The Laws describes, in effect, the Republic modified and realized in the conditions of this world” (1970, 28).

An alternative answer is that Plato changed his mind. On this reading, the views defended in the Laws are an advancement on the ideas expressed in the Republic. This reading denies that 5.739a-740a provides support for the claim that the Callipolis is the ideal city. Strictly speaking, the passage only says that the ideal city is one where everything is held in common, and in the Callipolis only the guardians hold things in common. This lends credence to thinking that the ideal city described in the Laws is not the Callipolis. Christopher Bobonich (2002) has argued that this new perspective is the result of Plato changing his mind about psychology, abandoning the view of the Republic in which the soul has parts and replacing it with a more unified conception of human agency and motivation. However, readers should note that this is merely a cursory discussion of a very large and important issue—there are many other ways to account for the differences between the texts.

4. Overview of the Laws

The Laws is made up of twelve books. Books 1 and 2 explore what the purpose of government is. This exploration takes the form of a comparative evaluation of the practices found in the interlocutors’ homelands. Through the course of this discussion, a preliminary account of education and virtue is offered. Book 3 examines the origins of government and the merits of different constitutions. At Book 3’s conclusion, it is revealed that Clinias is in charge of developing a legal code for a new colony of Crete, Magnesia. After discussing the appropriate population and geography of Magnesia, Book 4 analyzes the correct method of legislating. Book 5 begins with various moral lessons and then shifts to an account of the correct procedure for founding Magnesia and distributing the land within it. Book 6 presents the details of the various offices and legal positions in Magnesia and ends by examining marriage. Books 7 and 8 discuss the musical and physical education of the citizens. Book 8 concludes with a discussion of sexuality and economics. Book 9 introduces criminal law and analyzes what factors should be taken into account when determining a punishment. Book 10 examines laws concerning impiety and presents an account of theology. Books 11 and 12 continue with the legal code. The Laws ends with an account of the “Nocturnal Council,” the “anchor” of the city.

5. Book 1 and 2

a. Virtue

The dialogue begins with the Athenian inquiring into the origin of law, asking whether it comes from a divine or a human being. Clinias states that Zeus is credited as the originator of Crete’s laws, while Apollo is credited as the founder of Sparta’s (624a-625a). The conversation shifts to the question of the purpose of government. Megillus and Clinias hold that the goal of government is to win in war, since conflict is an essential condition of all human beings (625c-627c). Because the fundamental goal is victory in war, Clinias and Megillus maintain that the primary purpose of education is to make citizens courageous. The Athenian responds by pointing out that reconciliation and harmony among warring parties is superior to one group defeating another. This demonstrates that peace is superior to victory (627c-630d). Consequently, the educative system should not focus exclusively on cultivating courage in its citizens, but should develop virtue in its entirety, including not only courage but wisdom, moderation and justice as well (630d-631d). Indeed, courage, the Athenian argues, is the least important virtue (631d). The goal of law is to help its citizens flourish, and the most direct route to this is developing virtue in them.

It is during this discussion that the Athenian makes an important distinction between “divine” and “human” goods. Divine goods are the virtues, whereas human goods are things like health, strength, wealth, and beauty. Divine goods are superior to human goods in that human goods depend on divine goods, but divine goods do not depend on anything. The idea is that the virtues always contribute to human flourishing, but things that are commonly thought to do so, such as wealth and beauty, will not do so unless one possesses virtue. In fact, things like beauty and wealth in the hands of a corrupt person will enable him or her to act in ways that will lead to failure.

Now that the importance of virtue is established, the Athenian challenges his interlocutors to identify the laws and customs of their home cities that develop virtue. Megillus easily identifies the Spartan practices that cultivate courage. The Spartans’ educational method primarily focuses on exposing citizens to fear and pain so that they might develop a resistance to each (633b-c). The Athenian responds by pointing out that this practice does nothing to develop resistance to desire and pleasure. He argues that the Spartans only have partial courage because complete courage involves not only overcoming fear and pain, but desire and pleasure as well (633c-d).

This leads to an inquiry into what customs Sparta and Crete have for developing moderation. Megillus expresses uncertainty, but suggests it likely has to do with gymnastics and common meals (essentially an all-male club with a military emphasis). The conversation becomes contentious as the Athenian says that these practices are the cause of the Dorians’ reputation for pederasty, homosexuality, and the vicious pursuit of pleasure (636a-e). (To see Plato express an alternative attitude towards these practices, readers should turn to the Phaedrus and Symposium.) Megillus defends the nobility of the Spartans, proclaiming that they do not get drunk and that they would beat any drunkard they encountered even if it were during the festival of Dionysus (636e-637a). The Athenian thinks this is a bad practice, because under the appropriate conditions intoxication can help one cultivate moderation and courage.

In having the characters put forth the particular positions that they do, Plato is asking us to reflect on the way in which political institutions shape citizens’ values. For instance, Clinias and Megillus, who both come from cultures that center on the military, hold that conflict is a fundamental part of human nature and that courage is the greatest virtue. In contrast, the Athenian, who comes from a culture of art and philosophy, sees harmony, peace, and leisure as ideal. Hence, in order for citizens to cultivate the appropriate dispositions, it is essential that the city have the correct policies and that citizens receive the correct education.

b. Education and Moral Psychology

In defense of moderated intoxication, the Athenian offers an account of education and moral psychology. By education, the Athenian does not mean technical skills, but rather things that direct one towards virtue. The bulk of education is meant to instill the appropriate feelings in citizens so that they feel pleasure and pain with respect to the appropriate things. Just as the Spartan practice of exposing citizens to fear and pain can help cultivate the appropriate feelings with respect to pain, drinking parties can help citizens develop the appropriate feelings with respect to pleasure. The idea is that one can learn to resist negative pleasures and desires only by being exposed to these things. Supervised drinking parties provide a safe and inexpensive way to do this.

Megillus and Clinias are quite skeptical and ask the Athenian to explain how wine affects the soul. It is here that we get an account of moral psychology (644c-645c). The Athenian asks us to imagine a puppet made by the gods with various cords in it. These cords, which represent affections (pleasure, pain, and the emotions) in the soul, pull the puppet in various directions. One cord is sacred and golden. This cord represents reason or calculation and when one follows it, one is virtuous. However, because reason/calculation is soft and gentle it requires the assistance of the other cords (which are hard and violent) to move the puppet in the correct way. The general idea is that virtue not only requires reason/calculation, but also the cultivation of the correct feelings.

The puppet metaphor raises a number of philosophical issues surrounding strength of will (enkrateia) and weakness of will (akrasia). Roughly put, weakness of will is when one intellectually grasps that one should do a certain action, but one’s emotions and desires overrule this judgement, leading to ethical failure. Strength of will is the contrary phenomenon. Like the weak-willed person, the strong-willed person desires to do other than what they intellectually judge they should do. Unlike the weak-willed person, the strong-willed person overcomes these desires and behaves correctly. In the Protagoras (352a-c), Socrates denies the possibility of weakness of will and in the Republic the virtuous agent is not the strong-willed individual who overcomes contrary emotions, but one whose psychic forces exist in perfect harmony. On the face of it, the puppet metaphor raises trouble for both of these commitments. It presents a problem for the former because it suggests that the pull of reason/calculation can be overcome by the emotions (the hard and violent cords) (see also 3.689c and 5.734b). However, this interpretation faces a problem in that the cord called reason/calculation in the metaphor is itself described as an emotion/force, which raises doubts that Plato’s intent is to draw a contrast between reason and the emotions.

The puppet metaphor also raises problems for the view that virtue is harmony because virtue in the puppet metaphor involves mastering the pull of contrary cords. This suggests that virtue amounts to being strong-willed. However, in Book 2 the Athenian describes virtue as the agreement between pleasure and pain and the account, or reason, that one grasps (653a). This description is in line with thinking that virtue is a harmony in the soul between the different psychic forces.

Another issue disputed by scholars is whether the soul in the puppet metaphor consists of three parts as it does in the Republic. In the Republic (see also the Phaedrus 246a-254e), the three parts of the soul are the reasoning/calculating part, the spirited part, and the appetitive part. Some scholars defend a continuity between the Laws and the Republic, while others argue that the metaphor suggests a bipartition between the rational and non-rational. In other words, in the Laws, the non-rational part of the soul subsumes both the appetitive and the spirited part. Additionally, other scholars have argued that in the Laws, Plato no longer treats the soul as having parts, but more as a unitary agent with different forces in it.

c. Happiness and Virtue

Book 2 continues the discussion surrounding drinking parties and education. Musical education forms the foundation of one’s character because it is through song and dance that one cultivates the appropriate affective responses (654a-d). By taking pleasure in virtuous actions depicted in song and dance, one begins to cultivate virtue (655d-655b). The contrary is true too: one will cultivate vice if one takes pleasure in vicious actions depicted in song and dance (655b-656b). Because of this, it is paramount for the legislature to establish what music should be allowed in the city, a task that the Athenian believes is best handled by the elderly given their wisdom (658a-e).

One of the most important things music should teach is that justice produces happiness, while injustice produces unhappiness (660b-664b). Clinias and Megillus are skeptical about the connection between virtue and happiness. Clinias will concede that an unjust person lives shamefully, but does not think they live an unsuccessful life if they have wealth, strength, health, and beauty (661d-662a; compare Gorgias 474c-475e). The Athenian will respond by offering four arguments for why it is necessary that the legislators teach that happiness is linked to justice. The first argument is that a legislator who does not teach this to the citizens is sending contradictory messages (662c-663a). On the one hand, the legislators are telling citizens that they should be just so that they may live a good life, but, on the other hand, they are teaching them that they will be deprived of a benefit (namely, pleasure) by living justly. The second argument is that a legislator who does not teach this will find it impossible to persuade the citizens to be just (663b-c). The third argument is that the statement is true: justice is linked to happiness (663c-d). The fourth argument is that even if the doctrine were not true, it ought to be taught anyway because of the social benefits that it provides (663d-e).

d. Symposium

Having secured the importance of teaching the connection between justice and happiness, the Athenian continues his discussion of the symposium. He explains that drinking parties and drunkenness should be reserved for citizens in mid-to-late adulthood and must be supervised by a wise leader. The young have lots of energy and are already eager to participate in musical education. Thus, participating in drinking parties would overstimulate the youth and would lead to negative consequences. However, as one ages, one grows despondent and less interested in song and dance. Thus, drinking parties will return older adults to a youthful state in which they are more eager to participate in musical education (671a-674c).

6. Book 3

Book 3 surveys the successes and failures of different political constitutions throughout history. Readers should bear in mind that the historical accounts given by Plato are not entirely accurate, but are rather being used to illustrate certain philosophical points.

a. The Origin of Legislation

The Athenian begins by talking about the traditional idea that developed culture is repeatedly annihilated by a great flood. From this flood emerged a primitive culture. During this time life was simple and peaceful. Because there were so few people, individuals were delighted to see each other and resources were abundant (678e-679a). Despite not having any formal law, people lived according to a political system called autocracy or dynasty (680b). In this system the eldest ruled, with authority being passed down through one’s parents.

Eventually, small clans merged together and formed cities. Once this happened, conflict arose because there were different elders, each claiming to have authority. In addition, each clan brought with it different religious customs. From this conflict, legislation arose (681c). Individuals were selected to represent the interests of the various clans that made up the city. These representatives spoke to the respective leaders of the clans about what rules should be adopted (681c-d).

From these digressions into the origin of legislation three lessons can be drawn. First, cities and civilization are a natural development. The Athenian is rejecting the idea that the city and law are unnatural (see 10.888e-890a; Protagoras 320d-322d; Republic 358b-359b). Second, humans are not naturally opposed to one another as Clinias suggested in Book 1, but share mutual goodwill. Third, a necessary feature of legislation is the reconciliation of conflicts of interest (see Stalley 1983, 71-2).

b. Sparta

After discussing the rise and fall of Troy, the Athenian turns to the history of the three allied Dorian states of the Peloponnese: Sparta, Argos, and Messene. The leaders and citizens of each state bound themselves by oaths to respect each other’s rights and to come to each other’s aid if they should be threatened. However, the alliance dissolved, with only Sparta surviving the fallout with any kind of success. Why did the alliance fail? The Athenian asserts that it was the result of a type of ignorance, namely the discordance between one’s emotions and one’s judgments (689a-c). From this, it is agreed that no citizen who suffers from this ignorance should have any degree of power (689c-e). This returns us to the discussion of education in Books 1 and 2, where we are told that in order for a city to flourish its citizens must cultivate the appropriate affective responses.

Argos’ and Messene’s respective leaders suffered from this type of ignorance and the negative consequences of this were exacerbated by the fact that they had absolute power (690d-691d). Sparta, in contrast, was safeguarded from disaster because it distributed political power between multiple actors (or positions of power), including two kings (rather than one), a council of elders, and officials chosen by lot (called ephors) (691d-692bc). Here, the Athenian is introducing the key political idea that a successful constitution will distribute power by mixing various ruling elements.

c. Persia and Athens

Having described a moderate political system in Sparta, the Athenian discusses two states that stand as opposites to each other: Athens and Persia. Athens represents the extreme democracy and Persia the extreme monarchy. According to the Athenian, Persia fluctuated between periods of success and failure. Under the rule of Cyrus, there was a balance of freedom and subjection. Soldiers were granted freedom of speech and the king took counsel from wise citizens. The result was that the soldiers had positive feelings towards their leaders and the state was guided in a wise direction (694b-c). However, upon the death of Cyrus, disaster ensued. Cyrus’ sons were raised in luxury and were never properly educated (694c-b). Instead of blending freedom and subjection as their father did, his sons were violent and demanded submission (695b). Eventually, Darius took control of the empire and this process repeated itself. Darius salvaged the empire by embracing freedom and subjection, but when his pampered son, Xerxes, took over, the empire suffered (695d-e).

According to the Athenian, the history of Athens is very much the opposite of Persia’s. If Persia failed because its rulers did not grant enough freedom, Athens failed because it granted too much. When the Persians attacked the Greeks, out of fear and necessity the Athenians lived according to certain honor codes that bound the community together. During this time, Athenians would voluntarily submit themselves to authority and because of this Athens was successful in its defense (698b-700a). However, once the threat from Persia was gone, the fear and honor codes that held the community together and naturally restricted freedom disappeared as well. Athenians began to consider themselves the authority on various matters and let pleasure guide them. This resulted in a community of ignorance and excess (700a-701d).

The Athenian’s point is two-fold. First, if a political system is to succeed it must be a mixture of subjection and freedom. It must grant enough freedom such that citizens are not oppressed and do not resent the leaders, but follow them willingly. Indeed, the political system should be concerned about the welfare of the entire citizen body. Nevertheless, a political system must grant authority only to those who are wise since the masses will simply pursue what they find most pleasant. Hence, there must be some restrictions on the freedom of citizens. Second, the only way to consistently achieve a balanced political system is if the citizens receive a proper education.

7. Book 4

a. Geography of Magnesia

At the end of Book 3, Clinias reveals that he is one of ten Cretans assigned to compose a legal code for a new colony, Magnesia. Book 4 begins the construction of this new colony. Magnesia will be located on an isolated Cretan island, roughly nine or ten miles inland. Although the terrain is rough, the land has many resources. The Athenian is pleased to find this out because it means that Magnesians will not require a significant amount of trading with different communities. This is beneficial because it will restrict foreign influence on the city (704a-705b).

b. Colonists and Legislation

Colonists will mostly come from Crete, though individuals from the greater Peloponnese will be welcome as well. Initially, this poses a problem. Magnesia will consist of individuals with different cultural customs, so how can these be reconciled under a single system of law? The Athenian’s solution at this stage of the argument is that a moderate dictator and a wise legislator should develop the legal code and constitution (709a-710e). The advantage of a dictatorship is that the laws and customs can easily be altered since power is located in one individual. It should be noted that after the dictator and legislator create the legal code, power will be transferred to various officials.

The next project is to describe what constitution this benevolent dictator will create. No straight answer is given; instead, the Athenian proceeds to offer a myth of life during the time of Cronos (Zeus’ father). The myth explains that during Cronos’ rule, life was blessed and happy. Cronos, knowing that human nature is corrupt, put divine beings in charge of humans. This is similar to how humans rule over farm animals. The lesson is that one should not be ruled by one’s equal, but by one’s superior. The Athenian explains that although Cronos’ reign is over and divine beings no longer guide us, within human beings is a divine element, namely, reason. By following reason, the laws will mirror the divine rule that occurred during the time of Cronos and humans will be happy (713c-714a). This myth connects the reader back to the initial topic of the Laws, which concerns the connection between law and the divine. The Athenian is explicitly linking together reason, law, and the divine.

From the myth of Cronos, it is clear that the law should be rational, but whom should it serve and where does its authority lie? The Athenian maintains that any law that does not serve the interest of the whole city is a bogus law (715b). For this reason, those who hold political positions will be called servants of the law rather than rulers. Since the law is connected to the divine, those who serve the interests of the city are really serving the gods (715c-d). From this it is clear that the law is to have authority over all citizens and that the law is fundamentally concerned with the welfare of the whole community and not any particular group or individual.

c. Preludes

The initial framing of the laws comes directly from the legislator and the dictator. The Athenian remarks that this is the best and most efficient means to establishing good laws in the city. But if law comes entirely from the outside, why would a citizen follow it willingly? How is the Athenian not simply making the same mistake he accused the Persian leaders of making? The Athenian solves this problem by inventing the idea of a prelude in law.

He begins his explanation with a medical analogy in which he compares the medical practices of a free doctor with those of a slave doctor (720a-720e). The doctors differ in terms of whom they treat and how they treat them. The slave doctor primarily treats slaves and acts like a tyrant, simply issuing commands and forcing his patients into obedience. In contrast, the free doctor primarily treats free people and is attentive to his patients before he issues prescriptions. In fact, the free doctor will offer no prescription until he has persuaded his patient about what is the correct medical procedure. The slave doctor is like a tyrant, relying solely on compulsion; in contrast, the free doctor utilizes both persuasion and compulsion. The Athenian wants the legislator to be like the free doctor, using both persuasion and compulsion.

Persuasion is achieved by attaching preludes to the law. In musical compositions, preludes are brief musical performances that precede the main composition. Musical preludes are designed to complement the forthcoming performance so that it is better received by the audience. Similarly, the legislator can preface the law with brief statements that will make the citizens more cooperative and ready to learn, and thus more likely to accept the laws freely (722d-723a). Compulsion is achieved by attaching penalties to the law if citizens should choose not to comply.

The Athenian clearly wants citizens to obey the law voluntarily. He realizes that in order for this to happen the citizens must see the law as serving their interests and the preludes are meant to accomplish this. But what is the nature of the persuasion underlying the preludes? There are three main interpretations. The first interpretation is that the persuasion is rational. Defenders of this view maintain that the point of the preludes is to explain to citizens the actual reasons that underlie the law. The evidence in favor of this reading is mainly found in how the Athenian describes the preludes. When discussing the preludes, the Athenian repeatedly says that they involve teaching, learning, and reason (4.718c-d, 4.720d, 4.723a, 9.857d-e, 9.858d, and 10.888a). If this interpretation is correct, then the Laws presents a much more optimistic view of the average citizen than the Republic does. In the Republic, farmers and artisans do not receive philosophical training, but on this reading the citizens of Magnesia will come to grasp some of the underlying philosophical reasons behind the law.

The second interpretation holds that the persuasion is non-rational and does not appeal to citizens’ reason, but rather their emotion. The main evidence in support of this reading is found in the preludes themselves. Many (though not all) of the preludes are like conventional sermons, merely shaming the citizens into obedience. A favorite example of those who support the non-rational reading is the prelude to hunting laws. In this prelude, the Athenian simply asserts that only hunting land animals with horses, dogs, or on foot is worthy of courage, and that other forms of hunting such as trapping, are lazy and should not be done (7.823d-824b; see also 5.726a-734e, 6.772e-773c, 9.854b-c, 10.904e-905c, and 11.927a-d). The Athenian makes no attempt to explain why some forms of hunting are lazy, while others are courageous, nor does he explain why a lazy form of hunting is bad and not simply an efficient use of one’s time.

The third interpretation lies in the middle of the first two: it attempts to reconcile the rational and non-rational readings. Suppose that the preludes are described by the Athenian as appealing to reason and suppose that the actual preludes do not appeal to reason, but instead to emotion. What could explain this inconsistency? Two answers present themselves and represent the main readings that could be classified as being in the middle. The first is that the Athenian is using the description of the preludes to offer an ideal of law according to which the citizens freely and rationally obey the law. However, due to the psychological limitations of humans, the actual preludes will not live up to this ideal. The second answer is more pragmatic. The Athenian wants citizens to be motivated to obey the law. He recognizes that citizens will be diverse in both their interests and intellectual abilities. Because of this, the lawgiver will have to appeal to different types of things in order to motivate citizens, some rational and others non-rational.

8. Book 5

a. Ethics

Having explained the concept of a prelude, the Athenian proceeds to offer a prelude which will preface the entire legal code of Magnesia. This prelude provides the moral foundation for the city, explaining the general duties of the citizens. These duties fall under three main headings: to the soul, to the body, and to other citizens. The prelude ends with an attempt to show that the virtuous life leads to the maximum amount of pleasure and the vicious life leads to the maximum amount of pain. What follows is an outline of the main ideas expressed in this section of Book 5.

The Athenian explains that the soul is the master of the body and because of this it should be given priority over the body. Nevertheless, most humans fail to do this, and instead pursue beauty, wealth, and pleasure at the expense of virtue, and as a result, they prioritize the body over the soul (726a-728d). Although humans should prioritize the soul over the body, they are also obligated to take care of their bodies. However, people do not honor the body by being extremely beautiful, healthy, and strong. Rather, they honor the body by achieving a mean between the extremes of each of these states. The same principle applies to wealth. Too much wealth will lead to feuds and greed, while too little wealth will make one vulnerable to exploitation (728d-729a).

Readers might find the idea of honoring the soul and body not only mystical-sounding, but also wrong. After all, it might be good for me to be physically healthy, but it doesn’t seem like I’m violating a duty if I’m not. However, these oddities can be explained away if we consider three things. First, the Athenian’s division between honoring the soul and honoring the body maps onto the distinction he articulated in Book 1 between divine and human goods. Humans honor the soul by pursuing virtue. This is a divine exercise because the soul itself is divine (726a). Although the religious connection is important for Plato, this distinction is really between “internal” and “external” goods. Internal goods are the goods of the mind and character, while external goods are everything that is potentially good that lies outside the mind and character. For Plato, the value of external goods depends on the presence of internal goods, while the value of internal goods in no way depends on the presence of external goods. In other words, internal goods are good in every situation, while external goods are only good in some situations. Because of this, Plato finds it odd that humans devote so much time and energy to pursuing external goods and so little to achieving internal goods.

Second, Ancient Greek ethics is usually interpreted as egoistic in the sense that ethical inquiry centers on the question of what is the best life for an individual. In this framework, discussions about why one should become virtuous are put in terms of how virtue relates to well-being. In other words, the Ancient Greek ethicists argue that we have self-regarding reasons to become virtuous; namely, that virtue will help us live a successful and happy life. With this in mind, it makes sense that Plato would think that we are obligated to care for the soul and body, since the good life requires it.

Third, it is worth bearing in mind that the main ethical theories today have self-regarding features built into them and thus this idea is not entirely unique to Plato (and other Ancient Greek ethicists). The three main ethical theories today are virtue ethics (advocated by Plato), deontology, and consequentialism. Immanuel Kant, the inspiration for deontology, held that we have the duty of self-improvement, while consequentialism, in its most traditional form, holds that when determining how I ought to act, my own personal welfare is given consideration.

After expressing that citizens ought to care for others, the Athenian offers a fascinating argument in defense of the virtuous life. The crux of the argument is that vice leads to emotional extremes, while virtue leads to emotional stability. Because emotional extremes are painful, it follows that the virtuous life will be more pleasant (732e-734e).

The Athenian aims to show that the virtuous life will lead to more pleasure than pain. In doing this, he hopes to undermine the all too common thought that the life of vice, though morally bad, is still enjoyable.

b. Geography and Population

The remainder of Book 5 returns to discussing the structure of Magnesia. This discussion covers a wide array of topics, which include: the selection of citizens (735a-736e), the distribution of land (736c-737d and 740a), the population (737e-738b and 740b-744a), religion (738c-738e), the ideal state (739a-739e), the four property classes (744b-745b), administrative units of the state (745b-745e), the flexibility of the law in light of facts (745e-746d), the importance of mathematics (746d-747d), and the influence of the climate (747d-747e). The main philosophical ideas in this part of the book are covered in sections 3 and 4 above.

9. Book 6

a. Voting and Offices

With the geography and population of Magnesia established, the Athenian begins to describe the various offices in the city and the electoral process (751a-768e). The electoral process is quite complicated and difficult to understand, but typically has four stages: nomination, voting, casting lots, and scrutiny. All citizens who have served (or are serving) in the military will nominate candidates by writing their names on publicly displayed tablets. During this time, they are permitted to erase any names they find unsuitable. The names that appear most frequently will be assembled into a list from which citizens will cast their votes. This process will then repeat; the names of citizens who have the most votes will be assembled into another list. From this list, lots will be drawn to determine who gets the position. If the selected names pass scrutiny, they will be declared elected.

One might wonder what value casting lots adds to the electoral process, especially since the practice is no longer that common. In Plato’s time, casting lots was seen as a democratic process, while voting was seen as being more of an oligarchic process (Aristotle Politics 4.9.1294b8-13). The idea is that if all citizens are equal, then they all equally deserve to hold office; thus, the only fair procedure would be to have the office chosen randomly. To have citizens vote for a candidate is to admit that some citizens are more qualified than others. Hence, the inclusion of lot casting is a concession to the egalitarian sentiment found in democracies.

This is most clearly seen in the Athenian’s discussion of equality (756e-758). The Athenian distinguishes between two types of equality: arithmetic equality and geometric equality (these are Aristotle’s terms, see Politics 5.1.1301b29-1302a8, Nicomachean Ethics 5.3.1131a25-5.5.1133b28). Arithmetic equality treats everyone as equal and corresponds to the lot, while geometric equality treats everyone based on their nature and abilities and corresponds more closely to voting. The Athenian maintains that geometric equality is the true form of equality since humans have different natures and to treat them as equal is actually a form of inequality. However, most citizens will not see things this way and thus the inclusion of the lot is a way to avoid dissension.

There are various offices described in Book 6, but three are worthy of note: the assembly, the council, and the guardians of the law. The assembly is open to all citizens who are serving or have served in the military. Its main function is to elect members of the council and other officials, though there are other functions (753b, 764a, 767e-768a, 772c-d, 8.850b, 11.921e, 12.943c). The council comprises ninety members from each of the four property classes, totaling 360 members. Membership lasts one year and the main function is to conduct the day-to-day business of the state, such as supervising elections and organizing the assembly (756b-758d). The guardians of the law are made up of thirty-seven citizens aged at least fifty. They will hold the position for no more than twenty years and their primary function is to guard the law (752-755b). They guard the law by supervising both officials and ordinary citizens, by helping resolve difficult judicial cases, and by supplementing and revising the law. Within both the electoral process and the offices held, we see the Athenian’s attempt to develop a constitution that mixes various political elements.

b. Marriage

The conversation abruptly shifts to the topic of marriage and child-rearing, with an aside on slavery. In continuing with his emphasis on moderation and mixed constitutions, the Athenian encourages people to marry partners who have opposite characteristics. Although people are attracted to those who are like them, citizens will be encouraged to put the good of the state above their own preferences. However, because citizens will find such laws to be excessively restrictive, the Athenian only wants to encourage, but not require, citizens to marry people with opposite qualities (773c-774a). If male citizens do not marry by the age of thirty-five, they will be subject to fines and dishonors.

These laws might strike one as rather draconian; nonetheless, one should keep in mind three things. First, the marriage laws in Magnesia are inspired by actual practices in Crete and Sparta. Second, the laws are less severe than the ones expressed in the Republic, in which there is no private marriage for the guardian class (that is, soldiers and philosophers). In the Republic, the guardians will consider each (appropriately aged) person of the opposite sex to be their spouse. Mating will be arranged by using a lottery. However, the lottery is rigged such that a select few will actually be controlling the sexual relationships so as to avoid incest, control the population, and implement eugenics (Republic 5.459d-460c). Of course, Plato does not provide the details of the marriage laws surrounding the working class citizens and for all we know these might have been similar to the ones in Magnesia. Third, for his time, Plato is actually progressive in his views of women. In Book 6, the Athenian advocates for the inclusion of women in the practice of common meals, an inclusion that Aristotle lists as something peculiar to Plato (Politics 2.12.1274b10-11). The Athenian emphasizes that a city cannot flourish unless all citizens receive a proper education.

10. Books 7 and 8

Traditional Greek education involved both musical and gymnastic training. Musical education includes all of the subjects of the Muses, subjects such as music, poetry, and mathematics. Gymnastics is education related to physical activity. It includes things like military training and sports. Books 7 and 8 provide the details of Plato’s account of education, which extends to both males and females. Education, for Plato, mostly comes in the form of play and its importance cannot be overstated. The following passage captures this idea, as well as Plato’s conservatism:

If you control the way children play, and the same children always play the same games under the same rules and in the same conditions, and get pleasure from the same toys, you’ll find that the conventions of adult life too are left in peace without alteration… Change, we shall find, except in something evil, is extremely dangerous (Saunders trans., 797a-c)

Below is a sketch of the main educative laws and principles.

a. Musical Education

The poetry and theatre allowed in Magnesia will mostly present images and sounds that provide positive moral lessons (814e-816d, 817b-817d). The underlying idea behind these restrictions is that humans will develop characteristics of the people they observe in poetry and theatre. If they see bad people doing well or acting as cowards, they will be more inclined to become bad and cowardly. There is a notable exception, however, in that comedy will be allowed as long as it is performed by slaves or foreigners (816d-e).

The Athenian’s policy concerning musical education extends the views discussed in Books 1 and 2 in two ways. First, the policies reflect the view that the character we develop is largely shaped by what we find pleasurable and painful. The art and entertainment in the city should be such that we take pleasure in good and beautiful things and are pained by bad and ugly things. Second, the inclusion of comedy reflects the lessons of the discussion concerning drunkenness; we can only learn to resist shameful behavior if we have some exposure to it.

All Magnesians will learn basic mathematics, with some advancing to study astronomy. This is significant because in the Republic, Plato says that it is through mathematics that we come to learn about non-sensible properties, which are the subject of philosophical thought (7.522c-540b). In the Republic, this study is commonly thought to be reserved for the most elite and talented citizens, while in the Laws a portion of it is given to the entire citizen body. This suggests that, on some level, all Magnesians will have some awareness of philosophy.

b. Gymnastics

Physical education aims at achieving two things: (1) the development of good character traits and (2) military training. Because physical education is meant to provide military training, sports will be modified to emphasize this. For example, impractical and unrealistic techniques will be forbidden (796a, 813e, and 814d) and armed competitions will be emphasized (833e-834a).

It is clear enough how physical education could prepare one for the military, but how does it contribute to one’s character? There are two related ways in which physical movement affects one’s character. First, the Athenian argues that physical movement directly affects one’s emotions. For example, the Athenian insists that fetuses and infants must constantly be moved around so that their excessive fears and anxieties are purged (789b-791d). Another example of this kind of thinking is the Athenian’s claim that a moderate amount of physical hardship is required for children to develop virtue; too much luxury will make one spoiled and lack moderation, but too much hardship will make one misanthropic (791d-794a). Second, the Athenian maintains that humans take on the characteristics of the things that they imitate. Dancers will become graceful and courageous by imitating graceful and courageous movements, while they will become the opposite by imitating the opposite (814e-816e).

11. Book 9

a. Responsibility

In Plato’s so-called “early dialogues,” Socrates defends the paradoxical claim that injustice is always involuntary because it is a result of ignorance. The evildoer actually desires what is good, so when they act wrongly, they are not doing what they actually want to do (Protagoras 352a-c; Gorgias 468b; Meno 77e-78b). We can break this paradoxical view into two claims:

Involuntary Thesis: No one is voluntarily unjust.

Ignorance Thesis: All wrongdoing is the result of ignorance.

In Book 9 of the Laws, Plato will grapple with both claims. On the one hand, the Athenian is adamant that the involuntary thesis is true, but on the other hand, he acknowledges that all lawgivers seem to deny it. Lawgivers treat voluntary wrongdoing as deserving a more severe punishment than involuntary wrongdoing. Moreover, the concept of punishment seems to presuppose that the criminals are responsible for their actions and this seems to presuppose that they act voluntarily when they act unjustly. The Athenian, thus, faces a dilemma: he must either abandon the involuntary thesis or he must explain how the involuntary thesis is able to preserve the underlying thought in law that some crimes are accidental and others are not (860c-861d).

The Athenian refuses to abandon the involuntary thesis and attempts to resolve this difficulty by offering a distinction between injury and injustice. Injury explores what kind of harms were done to the victim and what the criminal owes to the victim, their family, or the state. Injustice explores the psychological conditions under which the crime was committed. He mentions three main conditions: anger (thumos), pleasure, and ignorance (862b-864c).

Although there is much scholarly debate surrounding this issue, the general idea appears to be that a criminal can harm someone voluntarily or involuntarily, but can never be unjust voluntarily. For example, I might intentionally bump my coffee cup so that it spills on your computer or I might accidentally do this. The former is a voluntary harm, while the latter is an involuntary harm. Accordingly, the former should be punished more severely than the latter. Nevertheless, even in the instance when I voluntarily damage your computer, I am not voluntarily unjust. This is because no one desires what is bad for them and injustice is bad for one, so no one desires injustice. If I truly knew what was good or was not overcome by pleasure or anger, I would not engage in vicious behavior because my soul would be just. Thus, Plato wants to preserve the involuntary thesis, while abandoning (or qualifying) the ignorance thesis by allowing for the possibility that anger and pleasure can move one to act unjustly.

Many scholars have pointed out that the Athenian appears to equivocate on the terms “voluntary” and “involuntary.” When discussing voluntary and involuntary harms the terms are used in the ordinary sense, reflecting what an agent actively or consciously desires and wishes. However, when discussing voluntary and involuntary injustice the terms are used in the Socratic sense, reflecting what an agent deeply desires and wishes. Hence, the ordinary sense only refers to conscious psychological states, while the Socratic sense can refer to unconscious states or what is entailed by desiring the good.

In any case, the Athenian’s overall point is clear. Punishment must not simply look to the harm that is caused, but must look to the psychological state under which injury resulted. This has the benefit of allowing for nuance when punishing agents since the degree of culpability can be found in the agent’s psychological state. An agent who deliberates and then kills someone should not be treated the same as someone who kills someone in anger or as the result of some unforeseen accident.

b. Punishment

The Athenian’s distinction between injury and injustice accords with his commitment to punishment as a means of recompense for the victim and as a cure for criminality. The purpose of the former is rather self-explanatory, but more needs to be said about the latter. As the Athenian explained in Book 1, the purpose of legal codes is to make citizens happy. Since happiness is linked to virtue, the law must try to make citizens virtuous. Seeing punishment as curative is really just an extension of this idea to the criminal. If justice is a healthy state of the soul, then injustice is a disease of the soul in need of curing via punishment. For passages that express this idea, see 5.728c, 5.735e, 8.843d, 9.854d-855b, 9.862d-863c, 11.933e-934c, 12.941d, and 12.957d. Unfortunately, the Athenian never explains how particular punishments will achieve this goal.

One might think that the Athenian’s curative view of punishment results in soft penalties, but this is far from true. Punishment will take six forms: death, corporal punishment, imprisonment, exile, monetary penalties, and dishonors. It is worth pointing out that the use of imprisonment as punishment in Greek society appears to be an innovation of Plato. One might wonder how capital punishment is compatible with a curative theory of punishment. The answer is that some people are beyond cure and death is best for them and the city (862d-863a). For Plato, psychological harmony, virtue, and well-being are all interconnected. Accordingly, the completely vicious who cannot be cured will always be in a state of psychological disharmony and will never flourish. Death is better than living in such a condition.

12. Book 10

Book 10 is probably the most studied and best known part of the Laws. The Book concerns the laws of impiety of which there are three varieties (885b):

Atheism: The belief that the gods do not exist.

Deism: The belief that the gods exist but are indifferent to human affairs.

Traditional Theism: The belief that the gods exist and can be bribed.

The Athenian believes that these impious beliefs threaten to undermine the political and ethical foundation of the city. Because of this, the lawgiver must attempt to persuade the citizens to abandon these false beliefs. If citizens refuse, they must be punished.

a. Atheism

Clinias is surprised that atheists exist. This is because he thinks that it is well agreed by Greeks and non-Greeks that certain visible celestial bodies are gods (885e). The Athenian takes Clinias to be too dismissive of atheists, since Clinias attributes their belief to a lack of self-control and desire for pleasure (886a-b). The Athenian explains that the cause of atheism is not a lack of self-control, but, rather, a materialistic cosmology (888e-890a). Atheists believe that the origins of the cosmos are basic elemental bodies randomly interacting with each other via an unintelligent process. Craft, which is an intelligent process, only comes into effect later once humans are created. There are two types of craft. First, there are those that cooperate with natural processes and are useful, such as farming. Second, there are those that do not cooperate with natural processes and are useless, such as law and religion. Hence, atheists hold that the cosmos is directed via blind random chance and things like religion and law are products of useless crafts.

The Athenian responds by defending an alternative cosmology, which reverses the priority of soul and matter. Readers should be warned that the argument is obscure, difficult, and probably invalid; let this merely serve as a sketch of the main moves in it. The Athenian begins by explaining that there are two types of motions. On the one hand, there is “transmitted motion,” which moves other things, but cannot move unless another motion moves it. On the other hand, there is “self-motion,” which moves itself as well as other things (894b-c). The first motion cannot be a transmitted motion or else there would have to be an infinite series of transmitted motions (894e). Additionally, if everything were to come to a complete rest, the only thing that could initiate motion again would be self-motion (895a-b). Thus, the first motion must be self-motion (895c).

Having established that the first cause is self-motion, the Athenian examines the nature of self-motion. He argues that a thing that moves itself must be said to be alive and whatever has a soul is alive (895c). In fact, the definition of soul is motion capable of moving itself (895e-896a). From this he concludes that soul is the first source of movement and change in everything and is prior to material things (896c-d). The Athenian asserts that if soul is prior to material bodies, then the attributes of soul (such as true belief and calculation) are also prior to material things (896d). Since soul is the cause of all things, it follows that it is the cause of both good and bad (896d). The Athenian concludes that since the soul dwells in and governs all moving things, it must govern the universe (896d-e).

The argument is not yet complete, however. At this point, even if the argument is sound, it does not establish that there are gods. At best, it only shows that at least one or two souls are responsible for the motions in the world. The Athenian must show that the qualities that this self-moving soul possesses are divine and worthy of being called a god. This is what he does next by connecting the rationality of the soul with the divine and virtue (897b-899b).

The argument raises a number of interpretative and philosophical questions. One of the more tantalizing questions concerns Plato’s inclusion of a bad soul which is responsible for evil (896e). What is the nature of this bad soul and why does Plato include it? Most commentators have denied that the bad soul is anything like the devil; some hold it is cosmic evil in the universe generally, while others maintain it is located in humans. The inclusion of this issue is related to the problem of evil. The general worry is that if the world is governed by a rational, powerful, and good god (or gods), what explains the inclusion of evil in the world? Why would a rational, powerful, and good god allow for evil? Plato offers various answers. For example, in the Timaeus (42e-44d), evil is said to come from disorderly movements associated with necessity; in the Theaetetus (176a-b), evil is said to come from mortals; and in the Statesman (269c-270a), evil is said to come from god releasing control. Accordingly, the Laws is unique in that evil is explicitly tied to the soul. How we understand the nature of this evil soul will determine whether the view articulated in the Laws is compatible or incompatible with these other texts.

b. Deism and Traditional Theism

Having taken himself to have refuted atheism, the Athenian takes on deism and traditional theism. He notes that some youths have come to believe that the gods do not care about human affairs because they have witnessed bad people living good lives (899d-900b). The Athenian responds to this charge by arguing that the gods know everything, are all powerful, and are supremely good (901d-e). Now if the gods were to neglect humans, it would be through ignorance, lack of power, or vice. However, because the gods clearly are not like this, the gods must care about the affairs of humans (901e-903a).

However, the Athenian recognizes that not everyone will be moved by this argument and offers a myth that he hopes will persuade doubters (903b-905d). The myth declares that each part of the cosmos was put together with a mind towards the well-being of the whole cosmos and not any single part. Humans go wrong in thinking that the cosmos is created for them; in reality, humans are created for the good of the cosmos. After this, the Athenian describes a process of reincarnation in which good souls are transferred to better bodies and bad souls to worse bodies. Thus, the unjust will wind up with bad lives and the just will wind up with good lives in the end.

The first part of this myth is important for what it teaches us about Plato’s ethical theory. Ancient ethical theories are often criticized as being too egoistic; that is, they overly focus on the happiness of the individual and not on the contribution to the happiness of others. However, this myth reveals that, at least for Plato in the Laws, this is inaccurate. The myth moves individuals away from their own selfish concerns to the good of everyone generally.

After this, the Athenian swiftly dismisses traditional theism. He maintains that the gods are rulers since they manage the heavens (905e). But what type of earthly rulers do the gods resemble? If traditional theism were true, the gods would resemble petty and greedy rulers (906a-e). But this is an absurd conception of the gods, who are the greatest of all things (907b). Hence, traditional theism must be wrong.

Setting aside issues of how to understand Plato’s theology in the Laws, there is the general question of why Plato thinks impiety will undermine the political system of Magnesia. It is easy enough to see why the deist and traditional theist pose a threat. If the gods are indifferent to human affairs or can be persuaded, then either the gods do not care about citizens disobeying the law or they can be bribed out of caring. It is less clear why the Athenian is concerned about atheists, however. Although he thinks that cultural relativism is a consequence of the atheist’s cosmological views, he admits that not all atheists are vicious and some are good (908b-c). Whatever the answer is, it is clear that Plato thinks that belief in god is in some way tied to thinking that morality is objective. This is a surprising stance in light of the claims put forth in the Euthyphro in which it is argued that ethical truths do not depend on the gods. These two texts are not necessarily inconsistent with each other; nonetheless, there is clearly a tension that requires explanation (see Divine Command Theory).

13. Books 11 and 12

a. Laws

Book 11 and the beginning of Book 12 discuss various laws, which have only a loose relation to each other. Most of this section is relatively self-explanatory and does not warrant additional comment. This section addresses: property law (913a-915c), commercial law (915d-922a), family law (922a-932d), and miscellaneous laws (932e-960c). Within the discussion of miscellaneous laws, the Athenian discusses an important office, “the scrutineers” (12.945b-948b). The function of scrutineers is to audit the officials of the city and to punish them when necessary. The scrutineers play an essential role in the system of checks and balances in Magnesia. But what ensures that the scrutineers themselves are not corrupt? To guard against this, they must be citizens with a proven reputation for good character who are capable of approaching matters impartially. However, if an official feels they are being unfairly treated by a scrutineer, they can accuse the scrutineer and a trial will be held to determine the truth.

b. Nocturnal Council

The Laws ends with a discussion of the “nocturnal council,” so named because they meet daily from dawn until sunrise (951c-952d, 961a-968e). The nocturnal council is an elite group of elderly citizens, who have proven their worth by winning honors and have traveled abroad to learn from other states. The nocturnal council plays three roles in the city. First, they will be in charge of supplementing and revising the law in light of changing circumstances, while still keeping with the original spirit of the law. Second, the nocturnal council will study the ethical principles underlying the law. This involves studying the nature of virtue itself, discovering the ways in which the individual virtues of moderation, courage, wisdom and justice are really one Virtue. In addition, members of the nocturnal council will study cosmology and theology. Third, they will explore how these philosophical and theological ideas can be applied to the law. They are to ensure that, as far as possible, the law is in harmony with the philosophical principles they have learned.

The nocturnal council will bring to mind the Republic’s philosopher rulers in charge of the Callipolis. How similar they are depends on what kind of authority is granted to the nocturnal council. In the Callipolis, the philosopher rulers have absolute power, but it is far from clear whether this is the case for the nocturnal council. Indeed, it is a subject of much dispute. The difficulty stems from the fact that a few passages suggest that the nocturnal council will be entrusted with unrestricted power (7.818c, 12.968c, 12.969b). That being said, much of the Laws issues warnings about unrestricted power (see especially 3.691a-d, 4.713c, 9.875a-b); thus, it would be strange for the book to end with a renunciation of this thesis.

14. References and Further Reading

a. Standard Greek Texts

Burnet, J. (ed.), Platonis Opera. Vol. 5. (Oxford: Oxford Classical Texts, 1907).
Des Places, É. and Diès, A. (eds. and trans.), Platon: Oeuvres Complètes. Vols. 11-12. (Budé edn. Paris: Société d’Édition Les Belles Lettres, 1951-1956).

b. English Translations

Bury, R. G. Plato: Laws (Vol. 1 and 2). Loeb Classical Library, Plato Volume 10 and 11. (Cambridge, MA: Harvard University Press).
- English translation side by side with the Greek text.
Pangle, T. The Laws of Plato, translated with Notes and Interpretative Essay. (Chicago: University of Chicago Press, 1980).
- A more literal translation of the text, matching English words and Greek words with precision.
Griffith, T. Plato: The Laws. Cambridge Texts in the History of Political Thought, ed. M. Schofield (Cambridge: Cambridge University Press, 2016).
Saunders, T. Plato: The Laws, translated with an Introduction. (London: Penguin Books, 1970).
- A more stylized translation of the text that aims for readability. In addition, it breaks the text into smaller sections, offering a brief analysis of each.

c. General Discussions and Anthologies

Bobonich, C. (ed.), Plato’s ‘Laws’: A Critical Guide. (Cambridge: Cambridge University Press, 2010).
- An anthology that surveys philosophical debates concerning the Laws. Chapter 1, authored by Malcolm Schofield, provides a helpful overview of the Laws.
Laks, A. “The Laws” in C. Rowe and M. Schofield, eds., The Cambridge History of Greek and Roman Political Thought. (Cambridge: Cambridge University Press, 1998).
- A brief article that provides an overview of the Laws with a focus on political thought.
Sanday, E. (ed), Plato’s Laws: Force and Truth in Politics. Studies in Continental Thought. (Bloomington: Indiana University Press, 2012).
- An anthology with chapters dedicated to each book of the Laws.
Stalley, R. F. An Introduction to Plato’s Laws. (Indianapolis: Hackett Publishing, 1983).

d. Culture, Laws, and Context

Cohen, D. “The Legal Status and Political Role of Women in Plato’s Laws.” Revue Internationale des Droits de l’Antiquité, 34 (1987): 27-40.
- An optimistic assessment of the role of women in the Laws.
Morrow, G. Plato’s Cretan City: An Historical Interpretation of the Laws. (Princeton: Princeton University Press, 1960).
- Details the various religious and political policies in the Laws, as well as placing them in a historical and cultural context.
Nightingale, A. W. “Plato’s Lawcode in Context: Rule by Written Law in Athens and Magnesia.” Classical Quarterly 49 (1999): 100-122.
- Discusses the historical and cultural context underlying the laws of Magnesia.
Nightingale, A. W. “Writing/Reading a Sacred Text: A Literary Interpretation of Plato’s Laws.” Classical Philology 88 (1993): 279-300.
- Offers a literary interpretation of the Laws.
Okin, Susan M. “Philosopher Queens and Private Lives: Plato on Women and the Family.” Philosophy & Public Affairs 6 (1977): 345-369.
- Discusses how private property affects gender politics in Plato’s philosophy. Okin argues that Plato’s reintroduction of private property in the Laws results in more traditional roles for women than in the Republic.
Peponi, A-E (ed.). Performance and Culture in Plato’s Laws. (New York: Cambridge University Press, 2013).
- An anthology that focuses on culture and music in Plato’s Laws.
Reid, J. “The Offices of Magnesia.” Polis 37 (2020): 567-589.
Saunders, T. J. Plato’s Penal Code. (Oxford: Oxford University Press, 1991).

e. The Preludes

Annas, J. Virtue and Law in Plato and Beyond. (New York: Oxford University Press, 2017).
Baima, N. R. and T. Paytas. Plato’s Pragmatism: Rethinking the Relationship between Ethics and Epistemology. (New York: Routledge, 2021).
- Chapter 2 argues that the persuasion in the Laws is sometimes rational and truthful, and other times non-rational and deceptive.
Buccioni, E. “Revisiting the Controversial Nature of Persuasion in Plato’s Laws.” Polis 24 (2007): 262-283.
- Defends a middle reading of the preludes, which compares the use of rhetoric in the Laws to that of the Phaedrus.
Bobonich, C. “Persuasion, Compulsion and Freedom in Plato’s Laws.” Classical Quarterly 41 (1991): 365-387.
- Defends the rational interpretation of the preludes.
Laks, A. “Legislation and Demiurgy: On the Relationship between Plato’s Republic and Laws.” Classical Antiquity 9 (1990): 209-229.
- Defends a middle reading of the preludes, according to which the preludes offer an ideal of law, but because of the psychological limitations of the citizens, the actual preludes are non-rational.
Morrow, G. “Plato’s Conception of Persuasion.” Philosophical Review 62 (1953): 234-250.
- Defends a non-rational interpretation of persuasion.
Stalley, R. “Persuasion in Plato’s Laws.” History of Political Thought 15 (1994): 157-177.
- Defends a non-rational interpretation of persuasion.
Williams, D. L. “Plato’s Noble Lie: From Kallipolis to Magnesia.” History of Political Thought 34 (2013): 363-392.
- Argues that there is less political deception in Magnesia than in the Callipolis.

f. Ethics, Moral Psychology, and Political Thought

Barker, E. Greek Political Theory: Plato and his Predecessors. (London: Methuen, 1960).
- A classic study of Plato’s political thought.
Belfiore, E. “Wine and Catharsis of the Emotions in Plato’s Laws.” Classical Quarterly 35 (1992): 349-361.
- Compares the moral psychology advanced in the Republic to that of the Laws. Argues that the moral psychology in the Laws shares commonalities with Aristotle’s view of the effects of poetry.
Bobonich, C. Plato’s Utopia Recast: His Later Ethics and Politics. (Oxford: Oxford University Press, 2002).
- Examines Plato’s moral psychology from the Phaedo to the Laws and concludes that Magnesia is Plato’s new utopia.
Bobonich, C. “Akrasia and Agency in Plato’s Laws and Republic.” Archiv für Geschichte der Philosophie 76 (1994): 3-36.
- Argues that Plato does allow for weakness of will in the Laws.
Klosko, G. “The Nocturnal Council in Plato’s Laws.” Political Studies 36 (1988): 74-88.
Klosko, G. The Development of Plato’s Political Theory. (London: Methuen, 1986).
Meyer, S. S. Plato: The Laws 1 & 2. Translated with an Introduction and Commentary. (Oxford: Oxford University Press, 2015).
Samaras, T. Plato on Democracy. (New York: Peter Lang Publishing, 2002).
- Part three discusses Plato’s political thought in the Laws.
Sassi, M. “The Self, the Soul, and the Individual in the City of the Laws.” Oxford Studies in Ancient Philosophy 35 (2008): 125-148.
- Discusses the moral psychology in the Laws.
Saunders, T. J. “The Socratic Paradoxes in Plato’s Laws.” Hermes 96 (1968): 421-434.
- An influential article on voluntary wrongdoing in the Laws.
Weiss, R. The Socratic Paradox and its Enemies. (Chicago: University of Chicago, 2006).
- Chapter 9 discusses Plato’s distinction between injury and injustice and relates it to the idea that justice is beautiful and injustice is shameful.
Wilburn, J. “Tripartition and the Causes of Criminal Behavior in Laws 9.” Ancient Philosophy 33 (2013): 111-134.
- Discusses Plato’s account of moral psychology and its relation to Book 9.
Wilburn, J. “Akrasia and Self-Rule in Plato’s Laws.” Oxford Studies in Ancient Philosophy 43 (2012): 25-33.
- Presents an alternative reading of the puppet metaphor according to which it does not support weakness of will.

g. Theology

Carone, G. R. Plato’s Cosmology and its Ethical Dimensions. (Cambridge: Cambridge University Press, 2005).
- Chapter 8 discusses Plato’s account of cosmic evil in Laws 10.
Mayhew, R. Plato: Laws 10. (Oxford: Oxford University Press, 2008).
- Offers a line by line commentary and discussion of Book 10.
Mohr, R. God and Forms in Plato. (Las Vegas: Parmenides, 2006).
- Chapters 8 and 11 focus on theology in the Laws.
Powers, N. “Plato’s Cure for Impiety in Laws 10.” Ancient Philosophy 34 (2014): 47-63.
- Discusses how the context in which the Athenian presents his theology constrains the account given.
Solmsen, F. Plato’s Theology. (Ithaca: Cornell University Press, 1942).
Trelawny-Cassity, L. “On the Foundation of Theology in Plato’s Laws,” Epoché: A Journal for the History of Philosophy 18 (2014): 325-49.
- Discusses Plato’s cosmology and theology in the Laws by connecting it to Plato’s methodology and ideas explored in the Phaedo, Statesman, Philebus, and Timaeus.

Author Information

Nicholas R. Baima
Email: nichbaima@gmail.com
Florida Atlantic University
U. S. A.

	Piaget	Gottschalk	Löbner	Peters & Westerståhl
\(\small{ID}\)	identité (\(\small{I}\))	identity (\(\small{E}\))	identity
\(\small{ENEG}\)	inversion (\(\small{N}\))	negational (\(\small{N}\))	negation	outer negation
\(\small{INEG}\)	réciprocation (\(\small{R}\))	contradual (\(\small{C}\))	subnegation	inner negation
\(\small{DUAL}\)	corrélation (\(\small{C}\))	dual (\(\small{E}\))	dual	dual

Fallacies

Table of Contents

1. Introduction

2. Taxonomy of Fallacies

3. Pedagogy

4. What is a Fallacy?

5. Other Controversies

6. Partial List of Fallacies

Abusive Ad Hominem

Accent

Accentus

Accident

Ad Baculum

Ad Consequentiam

Ad Crumenum

Ad Hoc Rescue

Ad Hominem

Ad Hominem, Circumstantial

Ad Ignorantiam

Ad Misericordiam

Ad Novitatem

Ad Numerum

Ad Populum

Ad Verecundiam

Affirming the Consequent

Against the Person

All-or-Nothing

Ambiguity

Amphiboly

Anecdotal Evidence

Anthropomorphism

Appeal to Authority

Appeal to Consequence

Appeal to Emotions

Appeal to Force

Appeal to Ignorance

Appeal to Money

Appeal to Past Practice

Appeal to Pity

Appeal to Snobbery

Appeal to the Gallery

Appeal to the Masses

Appeal to the Mob

Appeal to the People

Appeal to the Stick

Appeal to Unqualified Authority

Appeal to Vanity

Argument from Ignorance

Argument from Outrage

Argument from Popularity

Argumentum Ad ….

Argumentum Consensus Gentium

Availability Heuristic

Avoiding the Issue

Avoiding the Question

Bad Seed

Bald Man

Bandwagon

Begging the Question

Beside the Point

Biased Generalizing

Biased Sample

Biased Statistics

Bifurcation

Black-or-White

Caricaturization

Changing the Question

Cherry-Picking

Circular Reasoning

Circumstantial Ad Hominem

Clouding the Issue

Common Belief

Common Cause

Common Practice

Complex Question

Composition

Confirmation Bias

Conjunction

Confusing an Explanation with an Excuse

Consensus Gentium